行僧

参与开源,努力提升。 我的GitHub地址:https://github.com/playingjoker

HashMap数据结构

1)HashMap概述

HashMap是基于哈希表的map接口的非同步实现,此实现提供所有可选的映射操作,并允许使用null值和null键。
此类不保证映射的顺序,特别是它不保证该顺序恒久不变。

本案例基于jdk1.8版本

2)HashMap数据结构

在java语言编程中,最基本的数据结构就两种:数组和引用,其他所有的数据结构都可以通过这两个基本的数据结构来实现,在jkd 1.7以前,hashmap就是一个链表散列的结构,但是在jdk1.8发布后,hashmap的链表长度大于一定值过后,则变成红黑树.
HashMap数据结构
java中采用的便是链地址法,便是每个数组元素上都是一个链表。
当key被hash后,得到数组下标,将数据放在对应数组下标的链表上,其中每个元素都用node节点表示:

/**
 * Basic hash bin node, used for most entries.  (See below for 
 * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)            
 */
static class Node<K,V> implements Map.Entry<K,V> {
	final int hash;
	final K key;
	V value;
	Node<K,V> next;

	Node(int hash, K key, V value, Node<K,V> next) {
		this.hash = hash;
		this.key = key;
		this.value = value;
		this.next = next;
	}

	public final K getKey()        { return key; }
	public final V getValue()      { return value; }
	public final String toString() { return key + "=" + value; }

	public final int hashCode() {
		return Objects.hashCode(key) ^ Objects.hashCode(value);
	}

	public final V setValue(V newValue) {
		V oldValue = value;
		value = newValue;
		return oldValue;
	}

	public final boolean equals(Object o) {
		if (o == this)
			return true;
		if (o instanceof Map.Entry) {
			Map.Entry,?> e = (Map.Entry,?>)o;
		if (Objects.equals(key, e.getKey()) &&
			Objects.equals(value, e.getValue()))
			return true;
		}
		return false;
	}
}

node是hashmap的一个内部类,用来储存数据和保持链表结构的。它的本质就是一个映射(键值对)。
当然,会产生两个key值产生同一个位置,(最主要的便是因为index的产生原理,当然也有可能是产生了一样的hash值)这种情况叫哈希碰撞。当然hash算法计算结果越分散均匀,发生hash碰撞的机率就越小,map的存储效率就越高。
hashmap中又一个很重要的字段就是Node[] table。如上图所示,这就是hashmap的基本结构,构成链表的数组。

如果哈希桶数组很大,即使较差的Hash算法也会比较分散,如果哈希桶数组数组很小,即使好的Hash算法也会出现较多碰撞,所以就需要在空间成本和时间成本之间权衡,其实就是在根据实际情况确定哈希桶数组的大小,并在此基础上设计好的hash算法减少Hash碰撞。那么通过什么方式来控制map使得Hash碰撞的概率又小,哈希桶数组(Node[] table)占用空间又少呢?答案就是好的Hash算法和扩容机制。
下面是HashMap中一些很重要的参数,调整这些参数,我们可以在Hash碰撞的性能和占用空间中权衡一个平衡值。

  /**
   * The default initial capacity - MUST be a power of two. 
   */
   static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

  /**
   * The maximum capacity, used if a higher value is implicitly specified 
   * by either of the constructors with arguments. 
   * MUST be a power of two <= 1<<30. 
   */
   static final int MAXIMUM_CAPACITY = 1 << 30;

  /**
   * The load factor used when none specified in constructor. 
   */
   static final float DEFAULT_LOAD_FACTOR = 0.75f;

  /**
   * The bin count threshold for using a tree rather than list for a 
   * bin.  Bins are converted to trees when adding an element to a 
   * bin with at least this many nodes. The value must be greater 
   * than 2 and should be at least 8 to mesh with assumptions in 
   * tree removal about conversion back to plain bins upon 
   * shrinkage. 
   */
   static final int TREEIFY_THRESHOLD = 8;

  /**
   * The bin count threshold for untreeifying a (split) bin during a 
   * resize operation. Should be less than TREEIFY_THRESHOLD, and at 
   * most 6 to mesh with shrinkage detection under removal. 
   */
   static final int UNTREEIFY_THRESHOLD = 6;

  /**
   * The smallest table capacity for which bins may be treeified. 
   * (Otherwise the table is resized if too many nodes in a bin.) 
   * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts 
   * between resizing and treeification thresholds. 
   */
   static final int MIN_TREEIFY_CAPACITY = 64;

3)确认hashmap索引位置

hash算法

static final int hash(Object key) {
	  int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

4)hashmap的put过程(单指putVal方法),当容量已满会出现扩容算法,不在此篇文章讨论范围内:

  1. 对key做hash算法求值,得出index;
    1. 如果没有碰撞,则放到对应的table中去,作为firstValue;
    2. 如果发生碰撞,则以链表形式存入next节点;
    3. 如果碰撞导致链表过长(>=TREEIFY_THRESHOLD-1),就把链表转为TreeNode
    4. 如果节点已经存在,就替换oldValue
    5. 如果超过threshold(复杂算法,单独贴代码),就重新resize().
      源码如下:
/**
 * Implements Map.put and related methods * * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't change existing value
 * @param evict if false, the table is in creation mode.
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
	Node<K,V>[] tab; Node<K,V> p; int n, i;
	if ((tab = table) == null || (n = tab.length) == 0)
		n = (tab = resize()).length;
	if ((p = tab[i = (n - 1) & hash]) == null)
		tab[i] = newNode(hash, key, value, null);
	else {
		Node<K,V> e; K k;
		if (p.hash == hash &&
				((k = p.key) == key || (key != null && key.equals(k))))
			e = p;
		else if (p instanceof TreeNode)
			e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
		else {
			for (int binCount = 0; ; ++binCount) {
				if ((e = p.next) == null) {
					p.next = newNode(hash, key, value, null);
					if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
						  treeifyBin(tab, hash);
						break;
				}
				if (e.hash == hash &&
					  ((k = e.key) == key || (key != null && key.equals(k))))
					break;
				p = e;
			}
		}
		if (e != null) { // existing mapping for key
			V oldValue = e.value;
			if (!onlyIfAbsent || oldValue == null)
				e.value = value;
			afterNodeAccess(e);
			return oldValue;
		}
	}
	++modCount;
	if (++size > threshold)
		resize();
	afterNodeInsertion(evict);
	return null;
}

threshold算法代码:

/**
 * Returns a power of two size for the given target capacity. */static final int tableSizeFor(int cap) {
    int n = cap - 1;
	n |= n >>> 1;
	n |= n >>> 2;
	n |= n >>> 4;
	n |= n >>> 8;
	n |= n >>> 16;
	return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

初始化过程会先计算threshold,后续的resize会重新计算.

5)hashmap的get方法:

  1. 会优先检查第一个节点,如果第一个节点就是目标值,则直接返回;
  2. 如果为树节点,则递归树getTreeNode->find(),
    否则遍历链表,做key.equals(k),复杂度为O(n);
    源码:
/**
 * Implements Map.get and related methods * * @param hash hash for key
 * @param key the key
 * @return the node, or null if none
 */
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
	if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
		if (first.hash == hash && // always check first node
			((k = first.key) == key || (key != null && key.equals(k))))
			return first;
		if ((e = first.next) != null) {
			if (first instanceof TreeNode)
				return ((TreeNode<K,V>)first).getTreeNode(hash, key);
			do {
				if (e.hash == hash &&
					((k = e.key) == key || (key != null && key.equals(k))))
					return e;
			} while ((e = e.next) != null);
		}
	}
	return null;
}

评论
validate
337 浏览