Why to only use immutable variables for Java hashCode()

Many websites tell Java developers that whenever you override equals(), you also have to override hashCode(). You need to override both methods especially when you use JPA. Eclipse is a nice tool that helps you generate these "custom implementations".

But beware of hashCode() methods generated this way. They will ruin your day when you use HashSet or Hashtable containers and expect them to behave as they did with the default implementation.

The rule of thumb is:

"Only include those variables in the hash code calculation that will not change while the object is in use."

Why? Just think about a simple class "Foo" that has a mutable (that is, ordinary non-final) property named "value" and an Eclipse-generated hashCode() implementation (which uses the field value to calculate the hash). Given this class, what will this little test print?

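    import java.util.HashSet;

    public class HashCodeTest {

        // Sketch of Foo as described above. The hashCode() body is an assumption:
        // it mirrors what Eclipse typically generates for a single int field.
        static class Foo {
            int value;

            @Override
            public boolean equals(Object o) {
                return o instanceof Foo && ((Foo) o).value == value;
            }

            @Override
            public int hashCode() {
                return 31 + value;
            }
        }

        public static void main(String[] args) {
            Foo f = new Foo();
            f.value = 100;

            HashSet<Foo> set = new HashSet<>();
            set.add(f); // stored under the hash code for value == 100

            f.value = 101; // the hash code of f changes while it sits in the set

            if (set.contains(f)) {
                System.out.println("All fine!");
            } else {
                System.err.println("WTF!?");
            }
        }
    }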

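The right answer is: "WTF!?"

Believe it or not, the HashSet no longer recognises the instance f. HashSet and Hashtable first use hashCode() to locate where the object should be stored, and only then use equals() to compare against the entries found there. Because the hash code of f changed after it was added, the lookup no longer matches the entry that was stored under the old hash code.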
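To make this visible, here is a small sketch that assumes a Foo whose equals() and hashCode() both depend on the mutable field (the class name LostElementDemo and the exact hashCode() body are illustrative assumptions, not the original code). It shows that the instance is still stored in the set and becomes findable again as soon as the old value is restored:

```java
import java.util.HashSet;
import java.util.Set;

public class LostElementDemo {

    // Hypothetical Foo: equals() and hashCode() both depend on the mutable field.
    static class Foo {
        int value;

        @Override
        public boolean equals(Object o) {
            return o instanceof Foo && ((Foo) o).value == value;
        }

        @Override
        public int hashCode() {
            return 31 + value;
        }
    }

    public static void main(String[] args) {
        Foo f = new Foo();
        f.value = 100;

        Set<Foo> set = new HashSet<>();
        set.add(f);

        f.value = 101; // hash code changes after insertion

        // The hash-based lookup no longer finds the instance ...
        System.out.println("contains after change: " + set.contains(f));

        // ... although iterating over the set shows it is still stored.
        boolean stillStored = false;
        for (Foo each : set) {
            if (each == f) {
                stillStored = true;
            }
        }
        System.out.println("still stored: " + stillStored);

        f.value = 100; // restore the value the object was stored with
        System.out.println("contains after restore: " + set.contains(f));
    }
}
```

The object was never removed; it is merely filed under a hash code it no longer reports.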

So only use immutable variables, such as ids, that will not change while the object is in use. Such a hashCode() method still complies with the contract:

"Equal objects must produce the same hash code as long as they are equal; however, unequal objects need not produce distinct hash codes."
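As a rough illustration of the rule, the following sketch assumes a Foo with an additional final field "id" (the field, the constructor and the class name StableHashDemo are assumptions made for this example). Here equals() compares id and value, while hashCode() uses only the id. Equal objects necessarily share the same id, so they still produce the same hash code and the contract holds:

```java
import java.util.HashSet;
import java.util.Set;

public class StableHashDemo {

    // Hypothetical Foo with an immutable id and a mutable value.
    static class Foo {
        private final long id;
        int value;

        Foo(long id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Foo)) {
                return false;
            }
            Foo other = (Foo) o;
            return other.id == id && other.value == value;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(id); // depends only on the immutable id
        }
    }

    public static void main(String[] args) {
        Foo f = new Foo(1L);
        f.value = 100;

        Set<Foo> set = new HashSet<>();
        set.add(f);

        f.value = 101; // state changes, hash code stays the same

        System.out.println(set.contains(f)); // prints "true"
    }
}
```

The price is that objects with different values but the same id land in the same place in the hash table, which the contract explicitly allows.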

2 thoughts on "Why to only use immutable variables for Java hashCode()"

matthiaslathe says:

September 12, 2012 at 2:17 am

Using just an immutable property sounds nice but in practice won’t work. Not all objects will have immutable properties, and even if they did it might not make sense. Consider two Image objects that have the same binary data in them but different immutable ids. I think most people would expect these to be equal (and therefore have the same hashCode).

Regarding your statement that "HashSet and HashTable will use the hashCode() method before the equals() method": that makes sense when you think about how a hash map works. It uses the hashCode() to get the location to store the object. I suspect that when you do "contains(…)", HashSet uses the hashCode() to find the group of objects that match, and then iterates over them for an exact equals().

see:
http://stackoverflow.com/questions/9463483/weird-hashset-contains-behaviour
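The lookup suspected in this comment can be sketched roughly as follows. TinyHashSet is a deliberately simplified, hypothetical class written for illustration; it is not the actual java.util.HashSet source, which differs in many details:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical hash set; not the real java.util.HashSet code.
public class TinyHashSet<E> {

    private final List<List<E>> buckets = new ArrayList<>();

    public TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: hashCode() selects the bucket.
    private List<E> bucketFor(Object o) {
        return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
    }

    public boolean add(E e) {
        if (contains(e)) {
            return false;
        }
        bucketFor(e).add(e);
        return true;
    }

    // Step 2: equals() is only asked within that one bucket.
    public boolean contains(Object o) {
        for (E candidate : bucketFor(o)) {
            if (candidate.equals(o)) {
                return true;
            }
        }
        return false;
    }
}
```

If the hash code of a stored object changes, step 1 may pick a different bucket, and equals() is never called on the stored entry.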

1. kevinfleischer says:
  
  September 12, 2012 at 10:38 pm
  
  For the image example this may work. But the behavior is unacceptable for, e.g., an email application, where it would "forget" about an email just because you answered it (and changed its state), or a game where it would "forget" about a player just because he has more points, etc.
  
  Maybe I was not using the HashSet correctly, but for me it very often was a kind of "registry for instances". With the "wrong" hashCode() implementation this is not possible, because it will not recognize instances after changes inside them.
  

Kevin Fleischer's Weblog
