How HashMap works in Java? (original) (raw)
Hello guys, if you are looking for an answer of popular Java interview question, how HashMap works in Java? or How HashMap works internally in Java then you have come to the right place. In this article, I will not just answer those question but also many related questions like How get() and put() method works in Java? what role equals() and hashcode() play in HashMap, How HashMap resize itself, and why key object of HashMap should be Immutable in Java. Let's start with the basics first. HashMap in Java works on hashing principles. It is a data structure that allows us to store object and retrieve it in constant time O(1) provided we know the key. Also this is one of my most popular article with more than 6 million views and I have now updated it to include latest information like how HashMap now uses a balanced tree instead of linked list for storing collision objects and new diagrams on HashMap working which provide visual explanations.
In hashing, hash functions are used to link keys and values in HashMap. Objects are stored by the calling put(key, value) method of HashMap and retrieved by calling the get(key) method.
When we call the put method, the hashcode() method of the key object is called so that the hash function of the map can find a bucket location to store the value object, which is actually an index of the internal array, known as the table.
By the way, you can easily verify how HashMap works by looking at the code of HashMap.java in your Eclipse IDE if you know how to attach the source code of JDK in Eclipse.
How HashMap Internally Works in Java?
How HashMap works in Java or sometimes how does get method works in HashMap is a very common question on Java interviews nowadays. Almost everybody who worked in Java knows about HashMap, where to use HashMap, and the difference between Hashtable and HashMap then why did this interview question become so special? Because of the depth, it offers.
It has become a very popular Java interview question in almost any senior or mid-senior level Java interview. Investment banks mostly prefer to ask this question and sometimes even ask you to implement your own HashMap based upon your coding aptitude.
The introduction of ConcurrentHashMap and other concurrent collections has also made these questions a starting point to delve into a more advanced feature. let's start the journey.
Interview questions start with a simple statement:
1. Have you used HashMap before or What is HashMap? Why do you use it?
Almost everybody answers this with yes and then the interviewee keeps talking about common facts about HashMap like HashMap accepts null while Hashtable doesn't, HashMap is not synchronized, HashMap is fast, and so on along with basics like its stores key and value pairs, etc.
This shows that the person has used HashMap and is quite familiar with the functionality it offers, but the interview takes a sharp turn from here and the next set of follow-up questions gets more detailed about the fundamentals involved with HashMap in Java. Interviewer strike back with questions like:
2. Do you Know how HashMap works in Java or How does get () method of HashMap works in Java
And then you get answers like, I don't bother its standard Java API, you better look code on Java source or Open JDK; I can find it out in Google at any time, etc.
But some interviewee definitely answers this and will say HashMap works on the principle of hashing, we have put(key, value) and get(key) method for storing and retrieving Objects from HashMap.
When we pass Key and Value object to put() method on Java HashMap, HashMap implementation calls hashCode method on Key object and applies returned hashcode into its own hashing function to find a bucket location for storing Entry object, the important point to mention is that HashMap in Java stores both key and value object as Map.Entry in a bucket is essential to understand the retrieving logic.
Here is also a nice flowchart diagram which shows how does get() method of HashMap or Hashtable works in Java and how does it retrieve value:
If candidate fails to recognize this and say it only stores Value in the bucket they will fail to explain the retrieving logic of any object stored in Java HashMap.
Though, this answer is very much acceptable and does make sense that the interviewee has a fair bit of knowledge on how hashing works and how HashMap works in Java.
But this is just the start of the story and confusion increases when you put the candidate on scenarios faced by Java developers on day by day basis. The next question could be about collision detection and collision resolution in Java HashMap like:
3. What will happen if two different objects have the same hashcode?
Now from here onwards real confusion starts, sometime candidate will say that since hashcode is equal, both objects are equal and HashMap will throw an exception or not store them again, etc, Then you might want to remind them about equals() and hashCode() contract that two unequal objects in Java can have the same hashcode.
Some will give up at this point and few will move ahead and say "Since hashcode is same, bucket location would be same and collision will occur in HashMap Since HashMap uses LinkedList to store object, this entry (object of Map.Entry comprise key and value ) will be stored in LinkedList.
Great !! this answer makes sense though there are many collision resolution methods available like linear probing and chaining, this is the simplest, and HashMap in Java does follow this. But the story does not end here and the interviewer asks further questions
Btw, here is a nice flowchart diagram which shows what will happen if two different objects hashcode() method will return same value while storing in HashMap
4. How will you retrieve the Value object if two Keys will have the same hashcode?
The interviewee will say we will call the get() method and then HashMap will use Key Object's hashcode to find out bucket location and retrieves the Value object but then you need to remind him that there are two Value objects which are stored in the same bucket, so they will say about traversal in LinkedList until we find the value object.
At that point you can ask how do you identify value object because you don't have value object to compare until they know that HashMap stores both Key and Value in LinkedList node or as Map.Entry they won't be able to resolve this issue and will try and fail.
But those bunch of people who remember this key information will say that after finding bucket location, we will call keys.equals() method to identify a correct node in LinkedList and return associated value object for that key in Java HashMap. Perfect this is the correct answer.
In many cases, interviewee fails at this stage because they get confused betweenhashCode() and equals() or keys and values object in Java HashMap which is pretty obvious because they are dealing with the hashcode() in all previous questions and equals() come in the picture only in case of retrieving value object from HashMap in Java.
Good and experienced Java developer will point out here that using an immutable, final object with proper equals() and hashcode() implementation would act as perfect Java HashMap keys and improve the performance of Java HashMap by reducing collision.
Immutability also allows caching their hashcode of different keys which makes the overall retrieval process very fast and suggests that String and various wrapper classes e.g. Integer very good keys in Java HashMap.
Now if you clear this entire Java HashMap interview, You will be surprised by this very interesting question "What happens On HashMap in Java if the size of the HashMap exceeds a given threshold defined by load factor ?".
Until you know how HashMap works exactly you won't be able to answer this question. If the size of the Map exceeds a given threshold defined by load-factor e.g. if the load factor is .75 it will act to re-size the map once it filled 75%.
Similar to other collection classes like ArrayList, Java HashMap re-size itself by creating a new bucket array of size twice the previous size of HashMap and then start putting every old element into that new bucket array. This process is called rehashing because it also applies the hash function to find a new bucket location.
If you manage to answer this question on HashMap in Java you will be greeted by "do you see any problem with resizing of HashMap in Java", you might not be able to pick the context, and then he will try to give you a hint about multiple threads accessing the Java HashMap and potentially looking for race condition on HashMap in Java.
So the answer is Yes there is a potential race condition that exists while resizing HashMap in Java, if two thread at the same time found that now HashMap needs resizing and they both try to resize.
On the process of resizing HashMap in Java, the element in the bucket which is stored in the linked list get reversed in order during their migration to the new bucket because Java HashMap doesn't append the new element at the tail instead it append new element at the head to avoid tail traversing.
If race condition happens then you will end up with an infinite loop. Though this point, you can potentially argue that what the hell makes you think to use HashMap in a multi-threaded environment to interviewer :)
Other Popular Hashtable and HashMap Questions
Here are few Java HashMap interview questions which is contributed by readers of the Javarevisited blog and I have included them on main article for everyone's benefit.
5. Why String, Integer, and other wrapper classes are considered good keys?
String, Integer, and other wrapper classes are natural candidates of the HashMap key, and String is the most frequently used key as well because String is immutable and final, and overrides equals and hashcode() method. Other wrapper class also shares similar property.
Immutability is required, in order to prevent changes on fields used to calculate hashCode() because if a key object returns different hashCode during insertion and retrieval then it won't be possible to get an object from HashMap.
Immutability is best as it offers other advantages as well like thread-safety, If you can keep your hashCode the same by only making certain fields final, then you go for that as well.
Since the equals() and hashCode() method is used during retrieval of value objects from HashMap, it's important that key objects correctly override these methods and follow contact.
If an unequal object returns different hashcode then the chances of collision will be less which subsequently improves the performance of HashMap.
6. Can we use any custom object as a key in HashMap?
This is an extension of previous questions. Of course, you can use any Object as a key in Java HashMap provided it follows equals and hashCode contract and its hashCode should not vary once the object is inserted into Map.
If the custom object is Immutable then this will be already taken care of because you can not change it once created.
7. Can we use ConcurrentHashMap in place of Hashtable?
This is another question which getting popular due to the increasing popularity of ConcurrentHashMap. Since we know Hashtable is synchronized ConcurrentHashMap provides better concurrency by only locking a portion of the map determined by concurrency level.
ConcurrentHashMap is certainly introduced as Hashtable and can be used in place of it, but Hashtable provides stronger thread-safety than ConcurrentHashMap. See my post difference between Hashtable and ConcurrentHashMap for more details.
Personally, I like this question because of its depth and the number of concepts it touches indirectly if you look at questions asked during the interview this HashMap question has verified
- The concept of hashing
- Collision resolution in HashMap
- Use of equals () and hashCode () and their importance in HashMap?
- The benefit of the immutable object?
- Race condition on HashMap in Java
- Resizing of Java HashMap
Just to summarize here are the answers which do make sense for the above questions
8. How does HashMap work in Java?
HashMap works on the principle of hashing, we have put() and get() method for storing and retrieving objects from HashMap.
When we pass both key and value to put() method to store on HashMap, it uses key object hashcode() method to calculate hashcode and them by applying hashing on that hashcode it identifies bucket location for storing value object.
While retrieving it uses the key object equals method to find out the correct key-value pair and return the value object associated with that key. HashMap uses a linked list in case of collision and objects will be stored in the next node of the linked list.
9. What will happen if two different HashMap key objects have the same hashcode?
They will be stored in the same bucket but no next node of the linked list. And keys equals () method will be used to identify the correct key-value pair in HashMap.
10. How null key is handled in HashMap? Since equals() and hashCode() are used to store and retrieve values, how does it work in the case of the null key?
The null key is handled specially in HashMap, there are two separate methods for that putForNullKey(V value) and getForNullKey(). Later is an offloaded version of get() to look up null keys.
Null keys always map to index 0.
This null case is split out into separate methods for the sake of performance in the two most commonly used operations (get and put), but incorporated with conditionals in others. In short, the equals() and hashcode() methods are not used in the case of null keys in HashMap.
here is how nulls are retrieved from HashMap
private V getForNullKey() {
if (size == 0) {
return null;
}
for (Entry<K,V> e = table[0]; e != null; e = e.next) {
if (e.key == null)
return e.value;
}
return null;
}
In terms of usage, Java HashMap is very versatile and I have mostly used HashMap as a cache in an electronic trading application I have worked on.
Since the finance domain used Java heavily and due to performance reasons we need caching HashMap and ConcurrentHashMap comes as very handy there.
If you prefer to watch than read then I have also explained this concept of HashMap internal working on our YouTube channel where we have talked about the internal working of HashMap and How HashMap works in general.
HashMap Changes in JDK 1.7 and JDK 1.8
There is some performance improvement done on HashMap and ArrayList from JDK 1.7, which reduces memory consumption. Due to this empty Maps are lazily initialized and will cost you less memory. Earlier, when you create HashMap like new HashMap() it automatically creates an array of default length e.g. 16.
After some research, the Java team found that most of these Map are temporary and never use that many elements, and only end up wasting memory. Also, From JDK 1.8 onwards HashMap has introduced an improved strategy to deal with a high collision rate.
Since a poor hash function e.g. which always return the location of the same bucket, can turn a HashMap into a linked list, like converting the get() method to perform in O(n) instead of O(1) and someone can take advantage of this fact, Java now internally replace linked list to a binary true once a certain threshold is breached.
This ensures performance or order O(log(n)) even in the worst case where a hash function is not distributing keys properly.
Related post:
- Difference between Hashtable and HashMap in Java
- Difference between SynchrnizedHashMap and ConcurrentHashMap in Java
- 10 examples of Hashtable in Java
- How get() method of Hashtable internally works in Java
- How to sort HashMap in Java by values
- 25 Examples of ConcurrentHashMap in Java
- How does HashMap or LinkedHashMap handle collision in Java?
- 10 ConcurrentHashMap Interview Questions with Answers
- Difference between HashMap, TreeMap, and LinkedHashMap in Java
- Java HashMap containsKey() and containsValue() Example