How does HashMap work internally and how does it handle collisions?

How does it work?:

HashMap in Java works by storing key-value pairs in an array of nodes, where each node is a bucket. The key's hash code is used to determine the index of the bucket where the value will be stored. The hash code is computed using the hashCode() method of the key object and then processed to find the appropriate index within the array.

Handling Collisions:

Collisions occur when two different keys generate the same hash code and therefore map to the same bucket. HashMap handles collisions using a technique called chaining. Instead of storing a single entry in each bucket, HashMap stores a linked list (or a balanced tree if the list becomes too long) of entries. When a collision occurs, the new entry is added to this list.

Two Things to Keep in Mind:

  1. Performance Impact:
  2. As the number of collisions increases, the performance of HashMap can degrade because the time to find an element depends on the length of the linked list or tree in the bucket. To mitigate this, Java 8 introduced a change where the linked list in a bucket is converted to a balanced tree if it exceeds a certain threshold (usually 8).

  3. Rehashing:

  4. When the number of elements in the HashMap exceeds a certain load factor (default is 0.75), the HashMap will resize itself by doubling the number of buckets and rehashing all existing entries into the new buckets. This helps in maintaining good performance by reducing collisions.

Example:

Imagine storing employee records in a HashMap where the employee ID is the key. If two employees have IDs that produce the same hash code, both entries will be placed in the same bucket, and a linked list or tree will be used to store them. When retrieving an employee by ID, the HashMap will compute the hash code, locate the bucket, and traverse the list or tree to find the correct entry.