
Analyzing Map Traversal Performance in Java

Date: 2022-08-09 17:25:59

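Contents
1. Introduction
2. Iterator test
3. Exploring the iterator source
4. Other traversal methods
    4.1 Enhanced for loop
    4.2 Map.forEach
    4.3 Stream.forEach
5. Summary

1. Introduction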

We know that resizing a Java HashMap has a cost. To reduce the number and cost of resizes, you can give the HashMap an initial capacity, as follows:

    HashMap<String, Integer> map0 = new HashMap<String, Integer>(100000);

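In actual use, however, performance did not improve; it dropped noticeably. The only thing the code did with the HashMap was iterate over it, so iteration had to be the problem. I ran some tests and reached the following conclusion: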

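The cost of iterating a HashMap depends on its capacity (set here by the initial capacity), not just on its size: a traversal takes time proportional to the number of buckets plus the number of entries.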

2. Iterator test

Here is the test code:

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    public class MapForEachTest {

        public static void main(String[] args) {
            // map0 requests a large initial capacity; map1 uses the default
            HashMap<String, Integer> map0 = new HashMap<String, Integer>(100000);
            initDataAndPrint(map0);
            HashMap<String, Integer> map1 = new HashMap<String, Integer>();
            initDataAndPrint(map1);
        }

        private static void initDataAndPrint(HashMap map) {
            initData(map);
            long start = System.currentTimeMillis();
            // traverse the same map 100 times
            for (int i = 0; i < 100; i++) {
                forEach(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("");
            System.out.println("HashMap Size: " + map.size() + " elapsed: " + (end - start) + " ms");
        }

        private static void forEach(HashMap map) {
            for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext();) {
                Map.Entry<String, Integer> item = it.next();
                System.out.print(item.getKey());
                // do something
            }
        }

        private static void initData(HashMap map) {
            map.put("a", 0);
            map.put("b", 1);
            map.put("c", 2);
            map.put("d", 3);
            map.put("e", 4);
            map.put("f", 5);
        }
    }

Here is the result of the run:

[Figure: console output of the iterator test]

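We initialize the first map with a capacity of 100,000 and leave the second at the default (actually 16). Both hold the same data, yet iterating over them 100 times shows very different performance: 36 ms versus 4 ms. The real gap is even larger, because the 4 ms is mostly the cost of 600 System.out.print calls (100 traversals of 6 entries). Let's comment out the print and try again: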

    for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext();) {
        Map.Entry<String, Integer> item = it.next();
        // System.out.print(item.getKey());
        // do something
    }

The output is as follows:

[Figure: console output of the iterator test with printing disabled]

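The second map now takes almost 0 ms, while the first still takes 28 ms even though nothing is done during the traversal. Now that the link to the initial capacity is confirmed, the next step is to find out why, so let's look at the source of the Map iterator.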
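The test above mixes traversal with console output and relies on millisecond timestamps. As a minimal sketch of the same comparison without printing, the class below sums the values instead (so the loop body cannot be discarded) and times the loop with System.nanoTime(). The class and method names are hypothetical, and this is not a rigorous benchmark: numbers will vary by machine and JVM, and a harness such as JMH would be needed for trustworthy figures.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class IterationTimingSketch {
    // Written to after each run so the traversal result is actually used.
    static long sink;

    // Fills the given map with the same six entries as the article's test.
    static Map<String, Integer> sixEntries(Map<String, Integer> map) {
        String[] keys = {"a", "b", "c", "d", "e", "f"};
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return map;
    }

    // Traverses the map `rounds` times with an explicit iterator and sums the values.
    static long traverse(Map<String, Integer> map, int rounds) {
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext();) {
                sum += it.next().getValue();
            }
        }
        return sum;
    }

    // Returns the elapsed time of `rounds` traversals in nanoseconds.
    static long timeNanos(Map<String, Integer> map, int rounds) {
        long start = System.nanoTime();
        sink += traverse(map, rounds);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<String, Integer> large = sixEntries(new HashMap<>(100000));
        Map<String, Integer> small = sixEntries(new HashMap<>());
        System.out.println("initial capacity 100000: " + timeNanos(large, 100) / 1_000 + " us");
        System.out.println("default capacity:        " + timeNanos(small, 100) / 1_000 + " us");
    }
}
```

Both maps hold identical data, so any difference in the printed times comes from the traversal itself rather than from output.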

3. Exploring the iterator source

Let's look at the source of Map.entrySet().iterator():

    public final Iterator<Map.Entry<K,V>> iterator() {
        return new EntryIterator();
    }

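EntryIterator is an inner class of HashMap that extends the abstract inner class HashIterator. There is not much source, so I paste all of HashIterator here with comments: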

    abstract class HashIterator {
        // The next node
        Node<K,V> next;        // next entry to return
        // The current node
        Node<K,V> current;     // current entry
        // Expected modification count. A HashMap can have several iterators
        // (each call to iterator() creates a new one), but only one of them
        // may remove elements; otherwise the others fail immediately (fail-fast).
        int expectedModCount;  // for fast-fail
        // Array index of the current slot. HashMap stores its data in an array;
        // read the HashMap source first if this is unfamiliar.
        int index;             // current slot

        HashIterator() {
            // Initialize expectedModCount
            expectedModCount = modCount;
            // Local reference to the map's table (no data is copied)
            Node<K,V>[] t = table;
            current = next = null;
            index = 0;
            // If the map is not empty, scan the array for the first occupied
            // slot and assign it to next
            if (t != null && size > 0) { // advance to first entry
                do {} while (index < t.length && (next = t[index++]) == null);
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Node<K,V> nextNode() {
            // Local alias for table; it has no other purpose
            Node<K,V>[] t;
            // e holds the node to return; it is returned once the following
            // node has been located
            Node<K,V> e = next;
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            if (e == null)
                throw new NoSuchElementException();
            // Point current at e (the node returned this time) and set
            // next = current.next, which is usually null; in that case keep
            // scanning the array for the next occupied slot
            if ((next = (current = e).next) == null && (t = table) != null) {
                do {} while (index < t.length && (next = t[index++]) == null);
            }
            return e;
        }

        /**
         * Removes an element. Not covered here; it calls HashMap's removeNode
         * and does nothing special.
         **/
        public final void remove() {
            Node<K,V> p = current;
            if (p == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            current = null;
            K key = p.key;
            removeNode(hash(key), key, null, false, false);
            // Keeps the fail-fast check consistent after our own removal
            expectedModCount = modCount;
        }
    }

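The code makes it clear: every time the iterator looks for the next element it scans forward through the array, so one full traversal reads every slot. If the initial capacity is very large, table.length is large as well (HashMap rounds the requested capacity up to a power of two, so 100000 becomes 131072), and walking all of those mostly empty slots is what costs time.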
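To make the arithmetic concrete, here is a small sketch that models the scan rather than calling into HashMap. The `tableSizeFor` helper mirrors the power-of-two rounding used by recent OpenJDK versions (treat the exact formula as an assumption about your JDK), and `slotReadsForOneTraversal` walks a plain Object array the same way HashIterator walks the table. All names are hypothetical, and collision chains are ignored.

```java
final class BucketScanSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Rounds a requested capacity up to the next power of two.
    static int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    // Counts how many array slots are read while visiting every occupied slot once.
    static long slotReadsForOneTraversal(Object[] table) {
        long reads = 0;
        int index = 0;
        Object next = null;
        // Constructor step: advance to the first occupied slot.
        while (index < table.length) {
            reads++;
            if ((next = table[index++]) != null) break;
        }
        // nextNode step: after returning an entry, advance to the following one.
        while (next != null) {
            next = null; // one entry per occupied slot in this model
            while (index < table.length) {
                reads++;
                if ((next = table[index++]) != null) break;
            }
        }
        return reads;
    }

    public static void main(String[] args) {
        for (int initialCapacity : new int[] {16, 100000}) {
            Object[] table = new Object[tableSizeFor(initialCapacity)];
            for (int i = 0; i < 6; i++) {
                table[i * 2] = "entry" + i; // six occupied slots, as in the test data
            }
            System.out.println("requested " + initialCapacity
                    + " -> table length " + table.length
                    + ", slot reads per traversal " + slotReadsForOneTraversal(table));
        }
    }
}
```

In this model a traversal always reads table.length slots regardless of how many are occupied, so 100 traversals of the large table read 13,107,200 slots, compared with 1,600 for the default table.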

The size of the table array is set in the resize() method:

    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;

4. Other traversal methods

Note that the code uses Map.entrySet().iterator(), which in practice behaves the same as keySet().iterator() and values().iterator(). The source:

    final class KeyIterator extends HashIterator
        implements Iterator<K> {
        public final K next() { return nextNode().key; }
    }

    final class ValueIterator extends HashIterator
        implements Iterator<V> {
        public final V next() { return nextNode().value; }
    }

    final class EntryIterator extends HashIterator
        implements Iterator<Map.Entry<K,V>> {
        public final Map.Entry<K,V> next() { return nextNode(); }
    }

I won't analyze the other two separately; the performance is the same.
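Because all three iterators delegate to the same nextNode() walk, the three views of a HashMap visit entries in the same order. The sketch below checks this by stepping the three iterators in lockstep; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class ViewOrderSketch {
    // Returns true if keySet(), values() and entrySet() yield matching elements in step.
    static boolean viewsAgree(Map<String, Integer> map) {
        Iterator<String> keys = map.keySet().iterator();
        Iterator<Integer> values = map.values().iterator();
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            if (!keys.hasNext() || !values.hasNext()) {
                return false;
            }
            if (!entry.getKey().equals(keys.next()) || !entry.getValue().equals(values.next())) {
                return false;
            }
        }
        // All three views must be exhausted at the same time.
        return !keys.hasNext() && !values.hasNext();
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>(100000);
        String[] names = {"a", "b", "c", "d", "e", "f"};
        for (int i = 0; i < names.length; i++) {
            map.put(names[i], i);
        }
        System.out.println("views agree: " + viewsAgree(map));
    }
}
```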

In practice there are a few other ways to traverse a collection:

- Ordinary for loop with an index
- Enhanced for loop
- Map.forEach
- Stream.forEach

An ordinary for loop with an index does not apply to Map, so it is not discussed here.

4.1 Enhanced for loop

The enhanced for loop is actually implemented with an iterator. Let's look at how the two are related.

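Source: the forEach (explicit iterator) and forEach0 (enhanced for loop) methods listed in the appendix.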

Compiled bytecode:

    // access flags 0xA
    private static forEach(Ljava/util/HashMap;)V
     L0
      LINENUMBER 41 L0
      ALOAD 0
      INVOKEVIRTUAL java/util/HashMap.entrySet ()Ljava/util/Set;
      INVOKEINTERFACE java/util/Set.iterator ()Ljava/util/Iterator; (itf)
      ASTORE 1
     L1
     FRAME APPEND [java/util/Iterator]
      ALOAD 1
      INVOKEINTERFACE java/util/Iterator.hasNext ()Z (itf)
      IFEQ L2
     L3
      LINENUMBER 42 L3
      ALOAD 1
      INVOKEINTERFACE java/util/Iterator.next ()Ljava/lang/Object; (itf)
      CHECKCAST java/util/Map$Entry
      ASTORE 2
     L4
      LINENUMBER 43 L4
      GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
      ALOAD 2
      INVOKEINTERFACE java/util/Map$Entry.getKey ()Ljava/lang/Object; (itf)
      CHECKCAST java/lang/String
      INVOKEVIRTUAL java/io/PrintStream.print (Ljava/lang/String;)V
     L5
      LINENUMBER 45 L5
      GOTO L1
     L2
      LINENUMBER 46 L2
     FRAME CHOP 1
      RETURN
     L6
      LOCALVARIABLE item Ljava/util/Map$Entry; L4 L5 2
      // signature Ljava/util/Map$Entry<Ljava/lang/String;Ljava/lang/Integer;>;
      // declaration: item extends java.util.Map$Entry<java.lang.String, java.lang.Integer>
      LOCALVARIABLE it Ljava/util/Iterator; L1 L2 1
      // signature Ljava/util/Iterator<Ljava/util/Map$Entry<Ljava/lang/String;Ljava/lang/Integer;>;>;
      // declaration: it extends java.util.Iterator<java.util.Map$Entry<java.lang.String, java.lang.Integer>>
      LOCALVARIABLE map Ljava/util/HashMap; L0 L6 0
      MAXSTACK = 2
      MAXLOCALS = 3

    // access flags 0xA
    // signature (Ljava/util/HashMap<Ljava/lang/String;Ljava/lang/Integer;>;)V
    // declaration: void forEach0(java.util.HashMap<java.lang.String, java.lang.Integer>)
    private static forEach0(Ljava/util/HashMap;)V
     L0
      LINENUMBER 50 L0
      ALOAD 0
      INVOKEVIRTUAL java/util/HashMap.entrySet ()Ljava/util/Set;
      INVOKEINTERFACE java/util/Set.iterator ()Ljava/util/Iterator; (itf)
      ASTORE 1
     L1
     FRAME APPEND [java/util/Iterator]
      ALOAD 1
      INVOKEINTERFACE java/util/Iterator.hasNext ()Z (itf)
      IFEQ L2
      ALOAD 1
      INVOKEINTERFACE java/util/Iterator.next ()Ljava/lang/Object; (itf)
      CHECKCAST java/util/Map$Entry
      ASTORE 2
     L3
      LINENUMBER 51 L3
      GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
      ALOAD 2
      INVOKEINTERFACE java/util/Map$Entry.getKey ()Ljava/lang/Object; (itf)
      INVOKEVIRTUAL java/io/PrintStream.print (Ljava/lang/Object;)V
     L4
      LINENUMBER 52 L4
      GOTO L1
     L2
      LINENUMBER 53 L2
     FRAME CHOP 1
      RETURN
     L5
      LOCALVARIABLE entry Ljava/util/Map$Entry; L3 L4 2
      LOCALVARIABLE map Ljava/util/HashMap; L0 L5 0
      // signature Ljava/util/HashMap<Ljava/lang/String;Ljava/lang/Integer;>;
      // declaration: map extends java.util.HashMap<java.lang.String, java.lang.Integer>
      MAXSTACK = 2
      MAXLOCALS = 3

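You don't even need to look closely: apart from the local variables, the bytecode of the two methods is almost identical. From this we can conclude that the enhanced for loop performs the same as the iterator. The actual run results are also the same, so I won't show them; if you are interested, copy the code from the beginning and end of the article and try it yourself.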
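The same relationship can be observed at run time without reading bytecode. The sketch below wraps an Iterable in a hypothetical counting wrapper and runs an enhanced for loop over it; the loop calls iterator() once, then hasNext() and next() exactly as a hand-written iterator loop would.

```java
import java.util.Arrays;
import java.util.Iterator;

final class EnhancedForSketch {
    // Returns {iterator() calls, hasNext() calls, next() calls} made by one enhanced for loop.
    static int[] countCalls(Iterable<String> source) {
        int[] calls = new int[3];
        Iterable<String> counting = () -> {
            calls[0]++;
            Iterator<String> it = source.iterator();
            return new Iterator<String>() {
                @Override public boolean hasNext() { calls[1]++; return it.hasNext(); }
                @Override public String next() { calls[2]++; return it.next(); }
            };
        };
        for (String ignored : counting) {
            // The body is irrelevant; only the calls made by the loop itself are counted.
        }
        return calls;
    }

    public static void main(String[] args) {
        int[] calls = countCalls(Arrays.asList("a", "b", "c"));
        System.out.println("iterator(): " + calls[0]
                + ", hasNext(): " + calls[1]
                + ", next(): " + calls[2]);
    }
}
```

For three elements the loop calls hasNext() four times: three that return true and a final one that ends the loop.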


4.2 Map.forEach

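First, why not run all the methods together and print their timings side by side? Because CPU caching and some JVM optimizations distort the comparison; the full test results in the appendix explain this.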

Let's go straight to the source:

    @Override
    public void forEach(BiConsumer<? super K, ? super V> action) {
        Node<K,V>[] tab;
        if (action == null)
            throw new NullPointerException();
        if (size > 0 && (tab = table) != null) {
            int mc = modCount;
            for (int i = 0; i < tab.length; ++i) {
                for (Node<K,V> e = tab[i]; e != null; e = e.next)
                    action.accept(e.key, e.value);
            }
            if (modCount != mc)
                throw new ConcurrentModificationException();
        }
    }

The source is short, so I won't annotate it. A few things are easy to read from it:

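- The method is also fail-fast; elements cannot be removed during traversal
- It has to traverse the whole array
- BiConsumer is annotated with @FunctionalInterface, so a lambda can be used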

The third point has nothing to do with performance; it is only mentioned in passing.

From this we can tell that its performance also depends on the size of the table array.

In actual testing, however, it turned out to be considerably slower than the iterator:

[Figure: console output of the Map.forEach test]
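The fail-fast behaviour read from the Map.forEach source can be observed directly. In the sketch below (class and method names are hypothetical) the lambda removes an entry from the map it is traversing. With the OpenJDK source shown above, the modCount check after the loop then throws ConcurrentModificationException. The HashMap documentation describes fail-fast behaviour as best-effort, so treat this as a demonstration rather than something to rely on.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

final class ForEachFailFastSketch {
    // Returns true if removing an entry inside Map.forEach triggers the fail-fast check.
    static boolean removalDuringForEachFails() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 0);
        map.put("b", 1);
        map.put("c", 2);
        try {
            map.forEach((key, value) -> {
                if (value == 1) {
                    map.remove(key); // structural modification during traversal
                }
            });
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fail-fast triggered: " + removalDuringForEachFails());
    }
}
```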

4.3 Stream.forEach

What Stream and Map.forEach have in common is that both use lambda expressions. Their source code, however, shares nothing.

You may be getting tired by now, so here are the test results first:

[Figure: console output of the Stream.forEach test]

It takes a little longer than Map.forEach.

Next comes the source behind a sequential Stream.forEach. It is not complicated, but if you are tired, skip to the summary first.

Stream.forEach is carried out by a spliterator. The spliterator for HashMap entries lives in the HashMap class itself, as a static nested class named EntrySpliterator.

This is the method that a sequential stream executes:

    public void forEachRemaining(Consumer<? super Map.Entry<K,V>> action) {
        int i, hi, mc;
        if (action == null)
            throw new NullPointerException();
        HashMap<K,V> m = map;
        Node<K,V>[] tab = m.table;
        if ((hi = fence) < 0) {
            mc = expectedModCount = m.modCount;
            hi = fence = (tab == null) ? 0 : tab.length;
        }
        else
            mc = expectedModCount;
        if (tab != null && tab.length >= hi &&
            (i = index) >= 0 && (i < (index = hi) || current != null)) {
            Node<K,V> p = current;
            current = null;
            do {
                if (p == null)
                    p = tab[i++];
                else {
                    action.accept(p);
                    p = p.next;
                }
            } while (p != null || i < hi);
            if (m.modCount != mc)
                throw new ConcurrentModificationException();
        }
    }

From this source it is also easy to see that the traversal scans the whole array sequentially.
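The spliterator can also be driven by hand, which makes the link between the stream and the entry set visible. The sketch below (names are hypothetical) sums the values once through entrySet().spliterator().forEachRemaining(...) and once through entrySet().stream().forEach(...); both visit every entry exactly once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Spliterator;

final class SpliteratorSketch {
    // Sums the values by driving the entry set's spliterator directly.
    static int sumViaSpliterator(Map<String, Integer> map) {
        int[] sum = {0};
        Spliterator<Map.Entry<String, Integer>> spliterator = map.entrySet().spliterator();
        spliterator.forEachRemaining(entry -> sum[0] += entry.getValue());
        return sum[0];
    }

    // Sums the values through a sequential stream over the same entry set.
    static int sumViaStream(Map<String, Integer> map) {
        int[] sum = {0};
        map.entrySet().stream().forEach(entry -> sum[0] += entry.getValue());
        return sum[0];
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>(100000);
        String[] names = {"a", "b", "c", "d", "e", "f"};
        for (int i = 0; i < names.length; i++) {
            map.put(names[i], i);
        }
        System.out.println("spliterator sum: " + sumViaSpliterator(map));
        System.out.println("stream sum:      " + sumViaStream(map));
    }
}
```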

5. Summary

That covers all four ways of traversing a Map. We can draw two simple conclusions:

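- Map traversal performance depends on the size of the internal table array, and therefore on the commonly used initial capacity parameter. This holds for every traversal method.
- Performance, from fastest to slowest: iterator == enhanced for loop > Map.forEach > Stream.forEach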

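I won't quote "N times faster" figures here; such numbers are meaningless without the data set size. When no initial capacity is specified, all four methods take 3 ms, and those 3 ms are spent on console output; the traversal itself takes effectively 0 ms. So for small data sets it does not matter which method you use. As the run without output (under 1 ms) shows, the real cost usually lies in the business logic executed during traversal. This article is not meant to make you rewrite forEach calls as iterators. In most scenarios you do not need to care about the cost of iteration itself, and the readability gained from Stream and lambdas matters more.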

So treat this article as background knowledge. Beyond the traversal performance issue above, you should also take away the following points:

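- HashMap stores its data in the table array
- The table array is initialized by the resize method; creating a new Map does not allocate the array
- Traversal walks the table array from index 0 upward, so the order follows bucket positions rather than insertion order
- keySet().iterator(), values().iterator() and entrySet().iterator() do not differ in any essential way; they all use the same underlying iterator
- Of all the traversal methods, only the iterator can remove elements. The enhanced for loop is also an iterator underneath, but the syntactic sugar hides the remove method
- Every call to iterator() creates a new iterator, but only one of them may modify the map
- Map.forEach and Stream.forEach look alike, but their implementations are different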
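As an illustration of the point about removal, here is a minimal sketch that deletes entries while traversing, using the explicit iterator's remove() method. The class and method names are hypothetical; the test data matches the six entries used throughout the article.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class IteratorRemoveSketch {
    // Builds the six-entry map used in the article's tests.
    static Map<String, Integer> sixEntries() {
        Map<String, Integer> map = new HashMap<>();
        String[] keys = {"a", "b", "c", "d", "e", "f"};
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }
        return map;
    }

    // Removes every entry with an odd value while iterating, via Iterator.remove().
    static Map<String, Integer> removeOddValues(Map<String, Integer> map) {
        for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext();) {
            if (it.next().getValue() % 2 != 0) {
                it.remove(); // keeps expectedModCount in sync, so no exception is thrown
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(removeOddValues(sixEntries()).keySet());
    }
}
```

Calling map.remove(...) inside an enhanced for loop instead would change modCount behind the iterator's back, which is what the fail-fast check is designed to detect.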

Appendix: source of the four traversal methods

    // Explicit iterator
    private static void forEach(HashMap map) {
        for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext();) {
            Map.Entry<String, Integer> item = it.next();
            // System.out.print(item.getKey());
            // do something
        }
    }

    // Enhanced for loop
    private static void forEach0(HashMap<String, Integer> map) {
        for (Map.Entry entry : map.entrySet()) {
            System.out.print(entry.getKey());
        }
    }

    // Map.forEach
    private static void forEach1(HashMap<String, Integer> map) {
        map.forEach((key, value) -> {
            System.out.print(key);
        });
    }

    // Stream.forEach
    private static void forEach2(HashMap<String, Integer> map) {
        map.entrySet().stream().forEach(e -> {
            System.out.print(e.getKey());
        });
    }

Appendix: full test class, test results, and one odd problem

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    public class MapForEachTest {

        public static void main(String[] args) {
            HashMap<String, Integer> map0 = new HashMap<String, Integer>(100000);
            HashMap<String, Integer> map1 = new HashMap<String, Integer>();
            initData(map0);
            initData(map1);
            testIterator(map0);
            testIterator(map1);
            testFor(map0);
            testFor(map1);
            testMapForeach(map0);
            testMapForeach(map1);
            testMapStreamForeach(map0);
            testMapStreamForeach(map1);
        }

        private static void testIterator(HashMap map) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < 100; i++) {
                forEach(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("");
            System.out.println("HashMap Size: " + map.size() + " iterator elapsed: " + (end - start) + " ms");
        }

        private static void testFor(HashMap map) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < 100; i++) {
                forEach0(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("");
            System.out.println("HashMap Size: " + map.size() + " enhanced for elapsed: " + (end - start) + " ms");
        }

        private static void testMapForeach(HashMap map) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < 100; i++) {
                forEach1(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("");
            System.out.println("HashMap Size: " + map.size() + " MapForeach elapsed: " + (end - start) + " ms");
        }

        private static void testMapStreamForeach(HashMap map) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < 100; i++) {
                forEach2(map);
            }
            long end = System.currentTimeMillis();
            System.out.println("");
            System.out.println("HashMap Size: " + map.size() + " MapStreamForeach elapsed: " + (end - start) + " ms");
        }

        // Explicit iterator
        private static void forEach(HashMap map) {
            for (Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator(); it.hasNext();) {
                Map.Entry<String, Integer> item = it.next();
                System.out.print(item.getKey());
                // do something
            }
        }

        // Enhanced for loop
        private static void forEach0(HashMap<String, Integer> map) {
            for (Map.Entry entry : map.entrySet()) {
                System.out.print(entry.getKey());
            }
        }

        // Map.forEach
        private static void forEach1(HashMap<String, Integer> map) {
            map.forEach((key, value) -> {
                System.out.print(key);
            });
        }

        // Stream.forEach
        private static void forEach2(HashMap<String, Integer> map) {
            map.entrySet().stream().forEach(e -> {
                System.out.print(e.getKey());
            });
        }

        private static void initData(HashMap map) {
            map.put("a", 0);
            map.put("b", 1);
            map.put("c", 2);
            map.put("d", 3);
            map.put("e", 4);
            map.put("f", 5);
        }
    }

Test results:

[Figure: console output of the full test class]

If you read the article carefully, you will notice something off in these results:

MapStreamForeach seems to take less time than before.

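I can tell you that the data is not the cause. Judging from my test results, the direct cause is that Map.forEach ran first: if you swap the execution order of MapForeach and MapStreamForeach, you will find that whichever runs second takes less time.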
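One hedged way to reduce this order effect is to run each variant untimed a number of times before measuring it, on the assumption that one-time costs (JIT compilation, first use of the lambda machinery) contribute to the gap. That assumption is not verified here, and the harness below is only a sketch with hypothetical names, not a replacement for a proper benchmark tool.

```java
import java.util.HashMap;
import java.util.Map;

final class WarmUpSketch {
    // Written to by the tasks so their work is actually used.
    static long sink;

    // Runs the task warmUpRounds times untimed, then returns the time of measuredRounds runs.
    static long measureNanos(Runnable task, int warmUpRounds, int measuredRounds) {
        for (int i = 0; i < warmUpRounds; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRounds; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>(100000);
        String[] keys = {"a", "b", "c", "d", "e", "f"};
        for (int i = 0; i < keys.length; i++) {
            map.put(keys[i], i);
        }

        Runnable mapForEach = () -> map.forEach((key, value) -> sink += value);
        Runnable streamForEach = () -> map.entrySet().stream().forEach(e -> sink += e.getValue());

        // Measure in both orders to see whether the order still matters after warm-up.
        System.out.println("Map.forEach    (1st): " + measureNanos(mapForEach, 200, 100) / 1_000 + " us");
        System.out.println("Stream.forEach (2nd): " + measureNanos(streamForEach, 200, 100) / 1_000 + " us");
        System.out.println("Stream.forEach (3rd): " + measureNanos(streamForEach, 200, 100) / 1_000 + " us");
        System.out.println("Map.forEach    (4th): " + measureNanos(mapForEach, 200, 100) / 1_000 + " us");
    }
}
```

If the timings for each variant are similar regardless of position, the difference seen in the original test is more likely a start-up effect than a property of the traversal method.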
