Mar 26, 2014

How HashSet works internally - Internal implementation

The internal implementation of a collection (HashMap, HashSet, LinkedHashMap, etc.) is a common interview question for senior Java developers, and it is highly recommended to brush up on the internal implementations of collections before going for an interview. Here I am discussing how HashSet works internally. Before reading this article you should know how HashMap works internally, since HashMap is the building block of the internal working of HashSet.
A basic understanding of HashMap tells us that it stores both key and value in the form of a Map.Entry object. So our first befitting reply should be: "HashSet internally uses HashMap to store objects". The HashSet constructor instantiates a HashMap when it is called. The sample code below, from java.util.HashSet, creates a HashMap named map (HashSet also has overloaded constructors).
private transient HashMap<E,Object> map;
//Constructs a new, empty set; the backing HashMap instance 
//has default initial capacity (16) and load factor (0.75).
public HashSet() {
   map = new HashMap<E,Object>();
}
The very next question is: since HashMap stores key/value pairs while HashSet stores only objects (set.add(someObject)), how is an object stored in HashSet?
Answer: the backing HashMap stores the object (any value the user is trying to store in the set) as the key and a pre-defined constant as the value. Because keys in a map are unique, HashSet stores only unique values (duplicates are not allowed). In other words, when we call set.add(someObject) the call is redirected and the map stores someObject as the key and PRESENT (a constant) as the value. In the sample code below, from java.util.HashSet, PRESENT is the constant (marked final) value used with the HashMap.

// Dummy value to associate with an Object in the backing Map
 private static final Object PRESENT = new Object();
 //add operation 
 public boolean add(E e) {
   return map.put(e, PRESENT)==null; 
}
Please note that the return type of the add method is boolean: after putting the object into the map, the result is compared with null, because put() returns the previous value associated with the key, or null if there was no mapping for the key. Returning true therefore means that no such element was present and it was added successfully.
Similarly, when we delete an object from HashSet using the remove method, the call is redirected to HashMap to remove the object, and a boolean is returned based on whether the object was removed or not. Below is the sample code for the remove method:
public boolean remove(Object o) {
     return map.remove(o)==PRESENT;
  }
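To make the delegation concrete, here is a minimal sketch of a set built on top of a HashMap in the same way. SimpleSet and SimpleSetDemo are hypothetical names used only for illustration; this is not the java.util.HashSet source, just the same idea reduced to a few methods.

```java
import java.util.HashMap;

// Hypothetical, simplified set that mirrors the delegation used by java.util.HashSet.
class SimpleSet<E> {
    // Shared dummy value stored against every key.
    private static final Object PRESENT = new Object();
    private final HashMap<E, Object> map = new HashMap<E, Object>();

    // put() returns null when the key was absent, so null means "newly added".
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    // remove() returns the old value (PRESENT) only when the key existed.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    int size() {
        return map.size();
    }
}

class SimpleSetDemo {
    public static void main(String[] args) {
        SimpleSet<String> set = new SimpleSet<String>();
        System.out.println(set.add("a"));      // true: first insert
        System.out.println(set.add("a"));      // false: duplicate
        System.out.println(set.contains("a")); // true
        System.out.println(set.remove("a"));   // true
        System.out.println(set.size());        // 0
    }
}
```

Every set operation here is a single map operation, which is why the set inherits the uniqueness of the map's keys.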
This is how HashSet works internally with the help of HashMap. Now try this question:
import java.util.HashSet;

class TryHashSet{
 public static void main(String[] args) {
   HashSet<String> set = new HashSet<String>();
   boolean stat1 = set.add("Topper of class");
   boolean stat2 = set.add("Second ranker");
   boolean stat3 = set.add("Topper of class");
   boolean stat4 = set.remove("Topper of class");
   boolean stat5 = set.remove("Topper of class");
   System.out.println(stat1 + " " + stat2 + " " +stat3 + " "+ stat4  + " " +stat5);
 }
}
Sample output : true true false true false

===============End of article==========================
Happy learning!!
Nikhil


Mar 16, 2014

How to find beginning of the cycle in linked list in Java - Floyd cycle detection algorithm

In continuation of the previous post, How to detect cycle in a linked list, the main agenda of this post is to discuss how to find the node where the cycle starts. In Floyd's algorithm, the hare moves two steps for every step of the tortoise. If the tortoise and hare ever meet, there is a cycle, and the meeting point is part of the cycle, but not necessarily the first node in the cycle.
Floyd's approach for finding the first node of the cycle: once the tortoise and hare meet, put the tortoise back at the beginning of the list and keep the hare where they met. If we now move both the tortoise and the hare at the same speed (1 step each), the first node at which they meet again is the beginning of the cycle. Below is the sample code for finding the starting node of the cycle.

static public Node findStartOfCycle(Node head) {
 if (head == null)
  return null;
 Node fastReference = head;
 Node slowReference = head;
 boolean cycleFound = false;
 /* Iterate linked list to detect cycle */
 while (fastReference != null && fastReference.next != null) {
  /* fast reference - crosses 2 nodes at a time */
  fastReference = fastReference.next.next;
  /* slow reference - crosses 1 node at a time */
  slowReference = slowReference.next;
  if (fastReference == slowReference) {
   cycleFound = true;
   break;
  }
 }
 /* fast reference reached the end of the list, so there is no cycle */
 if (!cycleFound)
  return null;
 /*
  * Once cycle detected, place slowRef to start of linked list and traverse
  * linked list one node at a time - both slow & fast ref meet again at
  * start node of cycle
  */
 slowReference = head;
 while (fastReference != slowReference) {
  slowReference = slowReference.next;
  fastReference = fastReference.next;
 }
 return slowReference;
}

For more detail, read up on how Floyd's algorithm works, in particular why, after the cycle is detected, moving the slow reference and the fast reference at the same speed leads them to the starting node of the cycle.
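The reason the second phase works can be shown with a short derivation. The symbols are introduced here only for illustration: \mu is the number of steps from the head to the start of the cycle, \lambda is the length of the cycle, k is the distance from the start of the cycle to the first meeting point, and a and b count how many full laps the tortoise and the hare have completed inside the cycle when they meet.

```latex
\begin{align*}
\text{tortoise steps} &= \mu + k + a\lambda, \qquad
\text{hare steps} = \mu + k + b\lambda \\
\text{hare steps} &= 2 \cdot \text{tortoise steps} \\
\Rightarrow\quad 2(\mu + k + a\lambda) &= \mu + k + b\lambda \\
\Rightarrow\quad \mu + k &= (b - 2a)\lambda \\
\Rightarrow\quad \mu &= (b - 2a)\lambda - k
\end{align*}
```

So \mu + k is a whole number of laps. A reference that starts at the meeting point (k steps past the cycle start) and moves \mu steps ends up exactly at the cycle start, and a reference that starts at the head also needs exactly \mu steps to get there, which is why the two meet at the first node of the cycle.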
Below is the complete program to detect a cycle and find its starting node. Consider the following sample linked list.
HEAD-->12-->18-->172-->62---->632
                        ^      |
                        |      v
                       121<---192
public class FindStartNodeofCycle {
 static Node head;

 static public Node findStartOfCycle(Node head) {
  if (head == null)
   return null;
  Node fastReference = head;
  Node slowReference = head;
  boolean cycleFound = false;
  /* Iterate linked list to detect cycle */
  while (fastReference != null && fastReference.next != null) {
   /* fast reference - crosses 2 nodes at a time */
   fastReference = fastReference.next.next;
   /* slow reference - crosses 1 node at a time */
   slowReference = slowReference.next;
   if (fastReference == slowReference) {
    cycleFound = true;
    break;
   }
  }
  /* fast reference reached the end of the list, so there is no cycle */
  if (!cycleFound)
   return null;
  /*
   * Once cycle detected, place slowRef to start of linked list and
   * traverse linked list one by one - they will meet again at start
   * node of cycle
   */
  slowReference = head;
  while (fastReference != slowReference) {
   slowReference = slowReference.next;
   fastReference = fastReference.next;
  }
  return slowReference;
 }

 class Node {
  public int data;
  public Node next; // reference(handler) for next Node.

  public Node(int data) {
   this.data = data;
   this.next = null;
  }
 }

 public static void main(String[] args) {
  FindStartNodeofCycle cyc = new FindStartNodeofCycle();
  Node node1 = cyc.new Node(12);
  head = node1;
  Node node2 = cyc.new Node(18);
  Node node3 = cyc.new Node(172);
  Node node4 = cyc.new Node(62);
  Node node5 = cyc.new Node(632);
  Node node6 = cyc.new Node(192);
  Node node7 = cyc.new Node(121);

  head.next = node2;
  node2.next = node3;
  node3.next = node4;
  node4.next = node5;
  node5.next = node6;
  node6.next = node7;
  node7.next = node4; // cycle formed from node 7 back to node 4 - 62 is start of cycle
  
  Node startOfCycle = findStartOfCycle(head);
  System.out.println("Data value at start of cycle is "
    + startOfCycle.data);

 }

}
======Sample output=========
Data value at start of cycle is 62
=========================

Mar 8, 2014


How to detect cycle in linked list in java - Floyd cycle detection algorithm

Problem statement: how do we detect whether a given linked list has a cycle or not?
A cycle in a linked list means that when we iterate over the list we never reach its end, because the last node of the list holds a reference to some intermediate node. See the following diagram:

In order to detect a cycle in a linked list, we can use the Floyd cycle detection algorithm (the tortoise and hare algorithm): the hare moves two steps for every step of the tortoise. The idea is that we iterate over the given linked list with two references moving at different speeds and check whether the fast and slow references ever point to the same node. If the linked list has a cycle it will be detected this way; otherwise the iteration completes when the fast reference reaches the end of the list. Below is the sample code for cycle detection. It returns true if a cycle is detected; the question may also be modified to return the node at which the cycle starts.
private boolean checkCycleinLinkedList(Node head) {
 if (head == null)
  return false;
 Node fastReference = head;
 Node slowReference = head;
 while (fastReference != null && fastReference.next != null) {
  /* hare - fast reference (crosses 2 nodes at a time) */
  fastReference = fastReference.next.next;
  /* tortoise - slow reference (crosses 1 node at a time) */
  slowReference = slowReference.next;
  if (fastReference == slowReference) {
   return true;
  }
 }
 /* hare reached the end of the list, so there is no cycle */
 return false;
}
Note: In Floyd's algorithm, the hare moves two steps for every step of the tortoise. If the tortoise and hare ever meet, there is a cycle, and the meeting point is part of the cycle, but not necessarily the first node in the cycle. Another problem associated with the cycle detection algorithm is finding the start node of the cycle.

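Time and space complexity: the slow reference takes at most a number of steps proportional to the number of nodes n before the two references meet or the fast reference reaches the end of the list, so the time complexity is O(n). Only two references are kept, so the space complexity is O(1).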
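To see what the O(1) space bound buys, it can be compared with a different technique (not Floyd's algorithm): remembering every visited node in a HashSet. The sketch below is only for contrast; ListNode and VisitedSetCycleCheck are hypothetical names, and this approach needs O(n) extra memory because in the worst case every node ends up in the set.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical node type for this sketch; it does not override equals() or
// hashCode(), so the set compares nodes by identity.
class ListNode {
    int data;
    ListNode next;

    ListNode(int data) {
        this.data = data;
    }
}

class VisitedSetCycleCheck {
    static boolean hasCycle(ListNode head) {
        Set<ListNode> visited = new HashSet<ListNode>();
        for (ListNode n = head; n != null; n = n.next) {
            // add() returns false when the node is already in the set,
            // which means we have come back to a node seen earlier.
            if (!visited.add(n)) {
                return true;
            }
        }
        return false; // reached the end of the list
    }

    public static void main(String[] args) {
        ListNode a = new ListNode(12);
        ListNode b = new ListNode(18);
        ListNode c = new ListNode(172);
        a.next = b;
        b.next = c;
        System.out.println(hasCycle(a)); // false
        c.next = b;                      // cycle formed here
        System.out.println(hasCycle(a)); // true
    }
}
```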

Below is the complete sample program for cycle detection in linked list.
public class CheckCyclicLinkedList {
 static Node head;

 static private boolean checkCycleinLinkedList(Node head) {
  if (head == null)
   return false;
  Node fastReference = head;
  Node slowReference = head;
  while (fastReference != null && fastReference.next != null) {
   /* fast reference - crosses 2 nodes at a time */
   fastReference = fastReference.next.next;
   /* slow reference - crosses 1 node at a time */
   slowReference = slowReference.next;
   if (fastReference == slowReference) {
    return true;
   }
  }
  /* fast reference reached the end of the list, so there is no cycle */
  return false;
 }

class Node {
 public int data;
 public Node next; // reference(handler) for next Node.

 public Node(int data) {
  this.data = data;
  this.next = null;
 }
}

public static void main(String[] args) {
 CheckCyclicLinkedList cyc = new CheckCyclicLinkedList();
 Node node1 = cyc.new Node(12);
 head = node1;
 Node node2 = cyc.new Node(18);
 head.next = node2;
 Node node3 = cyc.new Node(172);
 head.next.next = node3;
 Node node4 = cyc.new Node(62);
 head.next.next.next = node4;
 head.next.next.next.next = node3;// cycle formed here
 boolean cycleStat = checkCycleinLinkedList(head);
 if (cycleStat)
  System.out.println("Cycle found!!");
 else
  System.out.println("NO Cycle found!!");

 /* second call cycle is removed */
 head.next.next.next.next = null;// cycle removed
 boolean cycleStat2 = checkCycleinLinkedList(head);
 if (cycleStat2)
  System.out.println("Cycle found!!");
 else
  System.out.println("NO Cycle found!!");
 }

}
=======Sample output========
Cycle found!!
NO Cycle found!!
==========================

Mar 3, 2014


How HashMap works internally - Internal implementation of HashMap

HashMap is one of the concrete implementations of Map (an associative array data structure). HashMap stores each key/value pair in the form of an Entry object (a static class managed by HashMap with the fields key, value, next and hash). Its basic operations, get() and put(), run in O(1) (constant) time, assuming the hash function disperses the elements properly among the buckets; with a poor hashCode implementation a bucket can degenerate into a long linked list. This post mainly focuses on the internal implementation and working of HashMap. It is one of the favorite questions of interviewers when dealing with collections for senior Java developers. As we progress, we will go into the detail of the building blocks of HashMap and discuss their importance in the HashMap implementation.
The first question that pops up is: what is the underlying data structure used by HashMap to store its key/value pairs? Answer: it is a resizable array of Entry objects (Entry[] table) which acts as the container for the key/value pairs. This table is also called the hash table, because the index into the table is calculated by a hashing mechanism before the Entry object is stored. Two factors control the resizing of the hash table (Entry[] table): the initial capacity and the load factor.
The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased.
When is the table size increased, i.e. when does rehashing occur? Answer: when the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
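As a small worked example of this rule, the sketch below computes the resize threshold (capacity * load factor) for the default values and for the next few doublings. ResizeThresholdDemo and its threshold method are hypothetical names for illustration, not part of java.util.HashMap.

```java
class ResizeThresholdDemo {
    // Number of entries above which a table of this capacity is resized.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;        // default initial capacity
        float loadFactor = 0.75f; // default load factor
        for (int i = 0; i < 4; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, loadFactor));
            capacity *= 2; // the table roughly doubles on each resize
        }
    }
}
```

With the defaults the thresholds are 12, 24, 48 and 96, so the first resize happens when the 13th entry is added.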
By default the initial capacity of this table is 16 (until Java 6), and the initial capacity can be changed by using the appropriate constructor. Please note that from Java 7 update 40 an empty table is created, and the initialization of the table has moved out of the constructor to the line where the table is declared. When the put method is called for the first time, inflateTable(threshold) is called to inflate the table. The sample code below, from java.util.HashMap (Java 6), shows the Entry table (hash table) and a constructor:
public class HashMap<K,V> extends AbstractMap<K,V> 
                implements Map<K,V>, Cloneable, Serializable
 { 
    // ... various CONSTANTS for default initial capacity(16), 
    //     loadfactor(0.75f) etc. 
   transient Entry[] table; //A resizeable table whose length 
                           //MUST Always be a power of two.
   final float loadFactor;//The load factor for the hash table.
   int threshold; // The next size value at which
                    // to resize (capacity * load factor).
   public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    threshold = (int)(DEFAULT_INITIAL_CAPACITY 
                           * DEFAULT_LOAD_FACTOR);
    table = new Entry[DEFAULT_INITIAL_CAPACITY];
    init();
   }
   // ... other supporting methods of HashMap 
  }
Now we have an idea of where key/value pairs are stored in the form of Entry objects. It's time to investigate the Entry class. Entry is a static class maintained inside HashMap and it has four fields: key, value, next and hash.
key/value - the user data being stored in HashMap.
hash - an integer value (the dispersed/modified hash code calculated by hash(key.hashCode())).
next - a reference of type Entry. It comes into the picture when more than one Entry lands at the same index of the hash table, so that the bucket becomes a linked list; next then points to the following Entry object at that location. (We will revisit it later with a diagram.) Here is the sample code for the Entry class:
static class Entry<K,V> implements Map.Entry<K,V> {
    final K key;
    V value;
    Entry<K,V> next;
    final int hash;
    //constructor for creating new Entry object
    Entry(int h, K k, V v, Entry<K,V> n) { 
       value = v;
       next = n;
       key = k;
       hash = h;
    } 
    // ..... many more supporting methods  
 }
By now we have a fair idea of what an Entry object looks like and where it is stored. Now the question arises: how are Entry objects stored and how do we retrieve them? The answer lies in the put() and get() methods of HashMap. We will discuss the put() method first, followed by get().
Say it loudly: HashMap, like Hashtable, works on a hashing mechanism. It means that for storing an Entry object we first find the hashCode of the key and then the actual table index (bucket number) where the Entry object is stored. It is a three-step process:
Step 1: Calculate the hash code of the key using key.hashCode() -> hashCode
Step 2: Disperse the bits of the hashCode to reduce collisions using hash(hashCode) -> modified hash value
Step 3: Finally, find the actual index in the table where the Entry object is stored by passing this dispersed hash value to indexFor(hash, tableLength).
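The three steps can be sketched as below. The bodies of hash() and indexFor() are written from memory of the Java 6 java.util.HashMap source, so treat them as an illustration rather than an exact copy; HashStepsDemo and bucketFor are hypothetical names added for this example.

```java
class HashStepsDemo {
    // Step 2: mix the higher bits of the hash code into the lower bits,
    // because only the lower bits are used to pick the bucket.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Step 3: the table length is a power of two, so (length - 1) is a bit mask
    // and the result is always in the range 0 .. length - 1.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Hypothetical helper that chains the three steps for a non-null key.
    static int bucketFor(Object key, int tableLength) {
        int hashCode = key.hashCode();            // Step 1
        int dispersed = hash(hashCode);           // Step 2
        return indexFor(dispersed, tableLength);  // Step 3
    }

    public static void main(String[] args) {
        System.out.println(bucketFor("Topper of class", 16));
        System.out.println(bucketFor("Second ranker", 16));
    }
}
```

The bit mask in indexFor() is the reason the table length must always be a power of two.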
Now we have found the index (bucket) location in the table, so there are two possibilities: either that location is empty or some element is already there (please note that two distinct keys can have the same hashCode and the same bucket index). If some Entry is already there, we say that a collision has occurred in HashMap. Before creating a new Entry object at that index, HashMap checks whether an Entry with the same hash and an equal key is already present. If such an Entry is found, its value is replaced by the new value, the old value is returned, and no new Entry object is created; otherwise a new Entry object is created and null is returned.
What will happen when the bucket number is the same but the key is different? In this case, HashMap maintains a linked list at that bucket location and the next field of Entry comes into the picture. A new Entry object is created and linked to the Entry objects already in that bucket through next (see the diagram below). In such a scenario the cost of a put operation degenerates from O(1) to O(n), where n is the number of entries in that bucket. Below is the sample code of the put() operation, covering the index (bucket) calculation described above and the duplicate-key check in HashMap:
public V put(K key, V value) {
  if (key == null)
    return putForNullKey(value);
  int hash = hash(key.hashCode());// Step 1 and Step 2 
  int i = indexFor(hash, table.length);// Step 3
  //loop for Checking Entry with same key is there or not
  for (Entry<K,V> e = table[i]; e != null; e = e.next) { 
   Object k;
   if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
   {
     V oldValue = e.value;
     e.value = value;
     e.recordAccess(this);
     return oldValue;
   }
  }
  // ...... some other stuff
  addEntry(hash, key, value, i); //Create Entry Object
  return null;// return null, if new key and value is added.
 }
One important takeaway from the above discussion is that put() returns null if a new Entry object is created for the key/value, else it returns the value previously associated with the key. There are Java puzzles related to this concept (Question 3 and Question 4). Consider the following diagram for a pictorial representation of our understanding:
Entry objects 1 and 2 belong to the same bucket location since their hash value (H1) is the same, so they form a linked list and the next of Entry 1 points to Entry 2. If only one element is present at a bucket location, its next points to null.
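The return value of put() and the handling of two keys that share a bucket can be checked with a few lines against java.util.HashMap itself. The strings "Aa" and "BB" are chosen because they are distinct keys with the same hashCode (2112); PutReturnDemo is a hypothetical class name.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<String, Integer>();
        // Same hash code, different keys: both entries go to the same bucket.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
        System.out.println(map.put("Aa", 1)); // null: new entry created
        System.out.println(map.put("BB", 2)); // null: collision, but a new key
        System.out.println(map.put("Aa", 3)); // 1: key existed, old value returned
        System.out.println(map.get("Aa"));    // 3
        System.out.println(map.get("BB"));    // 2
        System.out.println(map.size());       // 2
    }
}
```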
Before moving to the get() operation, I would like to add one more point about put(). As we know, null is allowed as a key in HashMap, with a null or non-null value. How does HashMap deal with a null key? HashMap has an offloaded version of the put() method, putForNullKey(value), to deal with the null key. When the put method finds that the key is null, it simply calls this method.
What does putForNullKey(value) do? It is important to note that a null key always maps to hash 0, and thus index 0. In other words, if we make an entry with a null key and some value object, it will always be stored at index 0 of the hash table. As in the put() operation described above, before creating a new Entry at bucket 0, putForNullKey() checks whether an Entry with a null key already exists at index 0. If it finds one, it replaces the value and returns the value previously associated with the null key; otherwise it creates a new Entry object at that location and returns null. Please note that only one null key is allowed in HashMap.
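The behaviour of the null key is easy to verify against java.util.HashMap. In this small sketch NullKeyDemo is a hypothetical class name and the values are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<String, String>();
        System.out.println(map.put(null, "first"));  // null: no previous mapping
        System.out.println(map.put(null, "second")); // first: old value returned
        System.out.println(map.get(null));           // second
        System.out.println(map.size());              // 1: only one null key
    }
}
```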
We have understood how an Entry object is stored in the hash table using the hashing mechanism. The next question is: how are values retrieved from HashMap using the get(key) method?
When a get(key) request comes to HashMap, it calculates the hashCode and finds the bucket location (using the same three-step process discussed above). Once the bucket location is known, it iterates over the linked list at that location: if the key and the dispersed/modified hash value match, it returns the value associated with the key. Otherwise, it returns null (meaning the key was not found in HashMap). Sample code for the get(key) method:
public V get(Object key) {
   if (key == null)
     return getForNullKey();
   //Find dispersed/modified hashCode that
   //is stored in Entry object
    int hash = hash(key.hashCode());
   //Find bucket no and iterate over linkedlist
   for (Entry<K,V> e = table[indexFor(hash, table.length)]; 
         e != null;
         e = e.next) {
    Object k;
    if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
      return e.value; //Return value associated with key
    }
  return null; //Return null, if key is not found in map
 }
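To tie put() and get() together, here is a heavily simplified chained hash table. It is a sketch under stated assumptions, not the JDK code: TinyHashMap, its Node class and the find helper are hypothetical names, the table is fixed at 16 buckets with no resizing, null keys are not supported, the extra bit dispersal step is skipped, and linking new nodes at the head of the bucket is simply a choice made for this sketch.

```java
// Hypothetical, heavily simplified hash table: fixed 16 buckets,
// no resizing, no null keys, and no extra bit dispersal step.
class TinyHashMap<K, V> {
    private static class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    private int indexFor(int hash) {
        return hash & (table.length - 1);
    }

    // Walks the bucket's list and returns the node holding an equal key, or null.
    private Node<K, V> find(Object key, int hash) {
        for (Node<K, V> e = table[indexFor(hash)]; e != null; e = e.next) {
            if (e.hash == hash && (e.key == key || key.equals(e.key))) {
                return e;
            }
        }
        return null;
    }

    V put(K key, V value) {
        int hash = key.hashCode();
        Node<K, V> e = find(key, hash);
        if (e != null) { // key already present: replace the value
            V old = e.value;
            e.value = value;
            return old;
        }
        int i = indexFor(hash);
        table[i] = new Node<K, V>(hash, key, value, table[i]); // link at bucket head
        return null;
    }

    V get(Object key) {
        Node<K, V> e = find(key, key.hashCode());
        return e == null ? null : e.value;
    }
}

class TinyHashMapDemo {
    public static void main(String[] args) {
        TinyHashMap<String, Integer> map = new TinyHashMap<String, Integer>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashCode as "Aa", so the same bucket
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
        System.out.println(map.get("CC")); // null
    }
}
```

Both put() and get() share the same lookup: hash comparison first, then an identity or equals() check on the key.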
What will happen when get(null) is called? Here an offloaded version of get(), getForNullKey(), is used to look up the null key. Null keys map to index 0, so it iterates over the linked list at bucket location 0. If an Entry with a null key is found, its associated value is returned; otherwise null is returned (indicating that no null key was found).
Before winding up the get(key) operation, we need to discuss one important interview question: what is the role of equals() and hashCode() in the working of HashMap?
As we can see in the get(key) sample code above, key.hashCode() is used to compute the hash code and then find the bucket location. Similarly, the equals() method is used to check whether the requested key and a key in the linked list are the same or not. In simple words, hashCode() and equals() are used together to retrieve the correct Entry object, the one with the same hash code and an equal key, as requested by get(key).
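A short experiment shows why both methods matter. In this sketch the class names GoodKey, HashOnlyKey and EqualsHashCodeDemo are hypothetical: GoodKey overrides both hashCode() and equals(), while HashOnlyKey overrides only hashCode(), so its equals() is still the identity comparison inherited from Object.

```java
import java.util.HashMap;
import java.util.Map;

class EqualsHashCodeDemo {
    // Key that overrides both hashCode() and equals().
    static class GoodKey {
        final int id;

        GoodKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id == id;
        }
    }

    // Key that overrides only hashCode(); equals() stays identity-based.
    static class HashOnlyKey {
        final int id;

        HashOnlyKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return id;
        }
    }

    public static void main(String[] args) {
        Map<Object, String> map = new HashMap<Object, String>();
        map.put(new GoodKey(1), "good");
        map.put(new HashOnlyKey(2), "hash-only");
        // Same bucket and an equal key: the entry is found.
        System.out.println(map.get(new GoodKey(1)));     // good
        // Same bucket, but equals() fails for a different instance: not found.
        System.out.println(map.get(new HashOnlyKey(2))); // null
    }
}
```

The second lookup reaches the right bucket through hashCode(), but without a matching equals() the entry inside the bucket can never be recognised.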
This is all about the internal class structure of HashMap and its internal working.
Here we have discussed some important concepts and questions related to HashMap.
=====================End of article======================

Happy Learning
Nikhil