Implement the Hashable interface on Container classes ?
Hi,
is there a reason to not implement the Hashable interface on the "container" classes? I think that it will be very useful in order to compare containers.
It seems that the current implementation forces users to loop over collections to compare them. I know that if this comparison is implemented as a method, it will work the same way, but I think is better to implement the comparison one single time rather than reimplement it again and again in "userland".
I like this, but I'm not sure how I would handle the hash value. I would prefer to implement an equals method on Collection, but not implement Hashable.
Hi @rtheunissen
any plan about this equals feature ? Can help ?
Just see that new \Ds\Set([1, 3]) == new \Ds\Set([1, 2, 3]) returns true and it makes my unit tests very sad.
Ok, this is not a BC break feature... any chance to get it backported to 1.x or does it need a big internal refactoring?
Doesn't need internal refactoring, I'm really just lumping everything into 2.0 because it's my active branch. Happy to implement in 1.x now and just merge into 2.0.
See #67 for status on equals.
If we did implement Hashable on collections, they would simply use spl_object_hash and $this === $that, which is what the extension already falls back on.
We either have to use and support equals everywhere, or not at all.
An example of a consistency I want to avoid at all costs:
$a = new Ds\Set([1, 2, 3]);
$b = new Ds\Set(["1", "2", "3"]);
$a == $b; // true, because of loose property comparison.
$a === $b; // false, because they are not the same instance.
$a->equals($b); // false, because the sets don't contain the same values.
The inconsistency is between the == and the equals, which is something we can fix in the extension, but not the polyfill. This is the primary blocker. There is no way to implement "not the same instance but strict comparison of properties" in userland PHP as at 7.1.
Like https://github.com/atoum/atoum/issues/671 mentions, even if we implement equals on Collection,== won't behave as expected and won't resolve the problem outlined in this issue.
As mentioned on another ticket it might be good to extract the equals method into an Equatable interface which Hashable extends from. I'm not sure collections should be Hashable but they ought to be able to check for equality.
The issues with ==, ===, and equals are not that important in my opinion. Users cannot use == and === on userland objects reliably already, so providing equals and documenting it is the best way forward, in my opinion.
Usually an Equatable, Equalable, Identifiable, … interface goes hand in hand with the hash method because they actually just compare the hashes of the objects.
interface Identifiable {
function equals($other): bool;
function hash(): string;
}
trait IdentifiableTrait {
public function equals($other): bool {
return $other instanceof $this && $other->hash() === $this->hash();
}
public function hash(): string {
return \spl_object_hash();
}
}
final class SomeEntity implements Identifiable {
use IdentifiableTrait;
private $id;
public function hash(): string {
return \md5(__CLASS__ . $this->id, \false);
}
}
final class SomeValueObject implements Identifiable {
use IdentifiableTrait;
private $some_prop;
public function hash(): string {
return \md5(\var_export($this, \true), \false);
}
}
This might be a way how you could implement it.
Usually an Equatable, Equalable, Identifiable, … interface goes hand in hand with the hash method because they actually just compare the hashes of the objects.
I'm not sure where you get that idea but that is fundamentally broken. The whole reason equals is required by Hashable is to distinguish between two objects whose hashes collide but they aren't the same. The hash gets you into the correct bucket or region and then you iterate until you find one that equals matches.
Edit to expand on this a bit: If $lhs->equals($rhs) is true then $lhs->hash() == $rhs->hash() must also be true. However the latter does not imply the former, meaning that just because two hashes are considered equal does not mean the objects are. This is because fast hash values (which we want) are typically lossy. Their goal is to get us into the correct bucket or region and then we can use the more expensive but completely accurate equals to narrow in on the exact location.
The purpose of the hash function as it is used here in the extension is what you describe. However, that is different in programming languages that have equality built-in, e.g.:
Also, as explained in #67, lhs.equals(rhs) does not imply lhs.hash == rhs.hash for all types, only for most.
Also, as explained in #67, lhs.equals(rhs) does not imply lhs.hash == rhs.hash for all types, only for most.
Can you directly link me to the explanation? I see some type-juggling issues but not where two objects which are of the same type, are Hashable, and are equal by their equals call do not have the same hash code.
@morrisonlevi there you go: https://github.com/php-ds/extension/issues/67#issuecomment-347663830
Floating point values are not objects, do not implement Hashable, and thus do not have an equals method. While INF and NAN semantics are something to be aware of it's not the same case as I'm talking about here. In every case where you have a Hashable object you must use the semantics outlined or your type is not well-behaved and will not work properly in the collection.
Not in PHP, that is true. What about two DateTime instances that have the same time but are in different time zones? Are they equal? Probably. Do they have the same hash? Probably not.
I’m only trying to make clear that partial equality and ordering is something that is true for some types. The constraint you are laying out is normal for the keys of hash map implementations (if x == y then x.hash == y.hash) because they are broken if that does not hold.
I think the decision here is that only Hashable types should be supported by hash-based structures, and the contract assumes that if x == y then x.hash == y.hash. Containers that are mutable will not implement Hashable. You would need to convert to an immutable type first.
Hashable and Equatable should be separated.