Wednesday, August 25, 2021

About Collections and Maps in Java

Today we are gonna have a brief view over the Collection and Map interfaces in Java:

As following diagram shows, while they are all used to save a group of objects of same types, Sets, Lists, Queues and Deques belong to the Collection interface, whereas Maps have their own interface.

Reference: https://www.codejava.net/java-core/collections/java-map-collection-tutorial-and-examples

----------------------------------------------------------------------------------------------------------------------

Collection

root interface of collection framework. Its has following sub-interfaces 
    • Set:
                    No duplicates. At most one null element.
                    Implementations:
      • EnumSet:
      • HashSet: Fastest but no ordering guarantees. O(1) for most operations. Initial capacity and load factor.
      • LinkedHashSet: Speed between HashSet and TreeSet. Provides insertion-ordered iteration. 
      • TreeSet: O(logn).

    • List:
                All implementing classes:
AbstractList, AbstractSequentialList, ArrayList, AttributeList, CopyOnWriteArrayList, LinkedList, RoleList, RoleUnresolvedList, Stack (deprecated), Vector(deprecated)
    • Queue:
                All implementing classes:
AbstractQueue, ArrayBlockingQueue, ArrayDeque, ConcurrentLinkedDeque, ConcurrentLinkedQueue, DelayQueue, LinkedBlockingDeque, LinkedBlockingQueue, LinkedList, LinkedTransferQueue, PriorityBlockingQueue, PriorityQueue, SynchronousQueue
    • Deque: LIFO interface


                    Prefer Deque over Stack class. (Since Java doesn't support multiple inheritance, we can't extend Stack and other class at the same time.) 
 
Using the Stack class, we can manipulate elements in it just like working with an array. This has broken the LIFO contract. Using Deque(double ended Queue), we can add/delete on both ends of the queue.


Properties of different Collections:

Collections that additionally guarantee that no change in the Collection object will be visible are referred to as immutable. 

Lists that support fast (generally constant time) indexed element access are known as random access lists. 

Lists that do not support fast indexed element access are known as sequential access lists.


p.s. 

1. Great care must be exercised when using mutable elements in a Collection, because the behavior of equals is undefined when the property of element is changed. 

2. The contains() method would call equals(), which has following rules:

  1. reflexive: an object must equal itself 
  2. symmetric: x.equals(y) must return the same result as y.equals(x) 
  3. transitive: if x.equals(y) and y.equals(z) then also x.equals(z) 
  4. consistent: the value of equals() should change only if a property that is contained in equals() changes (no randomness allowed) 
An attempt to overwrite the equals by class inheritance may violate the equals() rules. Therefore, we favor composition over inheritance. (This is actually one of the principles of OOP)

Also, if we override the equals() method, we also have to override hashCode(). The hashcode contract has following properties:
  1. internal consistency: the value of hashCode() may only change if a property that is in equals() changes equals 
  2. consistency: objects that are equal to each other must return the same hashCode 
  3. collisions: unequal objects may have the same hashCode 
We can EqualsVerifier Maven test dependency to verify our implementation.
----------------------------------------------------------------------------------------------------------------------

Map

Cannot contain duplicate keys; each key can map to at most one value. Map provides 3 different views: (If map modified during the iteration, the results of these methods are undefined.)
    • set of keys: keySet()
    • collection of values: values()
    • set of key-value mappings: entrySet()
  • The order of a map = iterators return order. Some map implementations, like the TreeMap class, make specific guarantees as to their order; others, like the HashMap class, do not.
p.s.
  1.  If mutable objects are used as keys, beware that undefined behavior by equals() method may appear if the values of an object changed in such a map.
  2. A map cannot contain itself as key, but as value. However, the equals and hashcode are no longer well-defined in such map.

Note on methods of Map:
V put (K,V): If the map previously contained a mapping for the key, the old value is replaced by the specified value. (So the Key is re-inserted into map)

Sub-interfaces of Map interface:

    • SortedMap: All keys must implement the Comparable interface. 
    • NavigableMap: extends SortedMap<K,V>. provides navigation methods (lowerKey/Entry, floorKey/Entry, ceilingKey/Entry, and higherKey/Entry) returning closest matches for given search targets.
    • ConcurrentMap: offer thread safety and atomicity
    • ConcurrentNavigableMap: offer thread safety and atomicity

Implementations of Map Interface:




________________________________________________________________________________
    • HashMap: Hash table based implementation of the Map interface. It's equivalent to the obsolete implementation java.util.Hashtable, except that HashMap accepts null keys and values, and it is unsynchronized. No guarantee on the constant order over time. O(1) for put/get. 
                    It has two performance factors:
      1. initial capacity: Default = 16 (number of "buckets")
      2. load factor: A measure of how full the hash table is allowed to get before its capacity is automatically increased. (default = 0.75) The lower, more free buckets, less collision, lower time complexity.
                            p.s. If stored elements > capacity*load factor, then capacity*2, and the HashMap is rehashed.(Rehashing = Redistributing items equally over the buckets, and is an expensive process)
                            p.s. Collision: When same bucket index is returned for two different keys.

        Internal structure of HashMap:                        
        HashMap initially uses the LinkedList. If collision happens, the key is added at the head of the LinkedList. Worst case performance is therefore O(n). However, when the number of entries crosses a certain threshold, it will replace a LinkedList with a balanced binary tree (Red–black tree). The TREEIFY_THRESHOLD (by default=8) constant decides this threshold value. (Worst-case performance O(log n).)
________________________________________________________________________________

    • LinkedHashMap: (extends HashMap)
                Hash table and linked list implementation of the Map interface. The order of this structure is guaranteed, which is usually the insertion-order (even re-insertion of existing key won't affect the insertion-order!). 
O(1) on add, remove, contains. Performance sliently below HashMap due to extra overhead of maintaining LinkedList. With one exception: iteration over collection is, unlike HashMap, proportional to its size (number of stored elements), not capacity, so basically cheaper.
Unsynchronized: One way to externally synchronize it is to use a wrapper:
Map m = Collections.synchronizedMap(new LinkedHashMap(...));

________________________________________________________________________________

    • TreeMap: implements NavigableMap
A Red-Black tree based NavigableMap implementation. Unlike LinkedHashMap, which returns as insertion order, a TreeMap is sorted by its natural order (or customized Comparator).
Log(n) on containsKey, get, put and remove operations.





References:
1. Collections: https://docs.oracle.com/javase/7/docs/api/java/util/Collection.html
2. Maps: https://docs.oracle.com/javase/8/docs/api/java/util/Map.html


No comments:

Post a Comment