分析Java中ArrayList与LinkedList列表结构的源码_JAVA教程

一、ArrayList源码分析（JDK7）

ArrayList内部维护了一个动态的Object数组，ArrayList的动态增删就是对这个对组的动态的增加和删除。

1、ArrayList构造以及初始化

				?

									ArrayList实例变量

									//ArrayList默认容量

									private static final int DEFAULT_CAPACITY = 10;

									//默认空的Object数组， 用于定义空的ArrayList

									private static final Object[] EMPTY_ELEMENTDATA = {};

									//ArrayList存放存放元素的Object数组

									private transient Object[] elementData;

									//ArrayList中元素的数量

									private int size;

ArrayList构造函数：

无参构造函数: 即构造一个空的Object[]

				?

									public ArrayList() {

									  super();

									  this.elementData = EMPTY_ELEMENTDATA;

									}

指定容量大小构造：

				?

									public ArrayList(int initialCapacity) {

									  super();

									  if (initialCapacity < 0)

									    throw new IllegalArgumentException("Illegal Capacity: "+

									                      initialCapacity);

									  this.elementData = new Object[initialCapacity];

									}

指定某一实现Collection接口的集合构造：

				?

									public ArrayList(Collection<? extends E> c) {

									  elementData = c.toArray();

									  size = elementData.length;

									  // c.toArray might (incorrectly) not return Object[] (see 6260652)

									  if (elementData.getClass() != Object[].class)

									    elementData = Arrays.copyOf(elementData, size, Object[].class);

									}

这里也说明了Collection的作用， java-collection-framwork设计Collection接口而不是直接使用List，Set等接口的原因。

2、ArrayList的容量分配机制

ArrayList的容量上限： ArrayList容量是有上限的，理论允许分配Integer.Max_VALUE - 8大小的容量。但是能分配多少还跟堆栈设置有关，需要设置VM参数

				?

									private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

调用Add方法时扩容规则

				?

									public boolean add(E e) {

									    ensureCapacityInternal(size + 1); // Increments modCount!!

									    elementData[size++] = e;

									    return true;

									  }

ensureCapacityInternal(int)方法实际上确定一个最小扩容大小。

				?

									private void ensureCapacityInternal(int minCapacity) {

									    if (elementData == EMPTY_ELEMENTDATA) {

									      minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);

									    }

									    ensureExplicitCapacity(minCapacity);

									  }

									private void ensureExplicitCapacity(int minCapacity) {

									    modCount++;

									    // overflow-conscious code

									    if (minCapacity - elementData.length > 0)

									      grow(minCapacity);

									  }

关于modCount： modCount定义在抽象类AbstratList中，源码的注释基本说明了它的用处:在使用迭代器遍历的时候，用来检查列表中的元素是否发生结构性变化（列表元素数量发生改变的一个计数）了，主要在多线程环境下需要使用，防止一个线程正在迭代遍历，另一个线程修改了这个列表的结构。
grow方法为真正的扩容方法

				?

									private void grow(int minCapacity) {

									    // overflow-conscious code

									    int oldCapacity = elementData.length;

									    int newCapacity = oldCapacity + (oldCapacity >> 1);

									    if (newCapacity - minCapacity < 0)

									      newCapacity = minCapacity;

									    if (newCapacity - MAX_ARRAY_SIZE > 0)

									      newCapacity = hugeCapacity(minCapacity);

									    // minCapacity is usually close to size, so this is a win:

									    elementData = Arrays.copyOf(elementData, newCapacity);

									  }

其中对大容量扩容多少还有个hugeCapacity方法

				?

									private static int hugeCapacity(int minCapacity) {

									    if (minCapacity < 0) // overflow

									      throw new OutOfMemoryError();

									    return (minCapacity > MAX_ARRAY_SIZE) ?

									      Integer.MAX_VALUE :

									      MAX_ARRAY_SIZE;

									  }

总结：
每次扩容都会伴随着数组的复制操作，因此一次给定恰当的容量会提高一点性能。
下图是我归纳的整个扩容流程：

分析Java中ArrayList与LinkedList列表结构的源码

3.ArrayList迭代器

ArrayList的迭代器主要有两种Itr和ListItr, 但是在jDK1.8中还添加了一个ArrayListSpliterator, 下面分别学一下Itr和ListItr的源码分析。

（1）Itr:只能向后遍历

				?

									private class Itr implements Iterator<E> {

									    int cursor;    // index of next element to return

									    int lastRet = -1; // index of last element returned; -1 if no such

									    //expectedModCount 是modCount的一个副本

									    int expectedModCount = modCount;

									    public boolean hasNext() {

									      return cursor != size;

									    }

									    @SuppressWarnings("unchecked")

									    public E next() {

									      checkForComodification();

									      //记录当前位置

									      int i = cursor;

									      if (i >= size)

									        throw new NoSuchElementException();

									      Object[] elementData = ArrayList.this.elementData;

									      if (i >= elementData.length)

									        throw new ConcurrentModificationException();

									      //下一个元素的位置

									      cursor = i + 1;

									      return (E) elementData[lastRet = i];

									    }

									    //使用迭代器的remove方法

									    public void remove() {

									      if (lastRet < 0)

									        throw new IllegalStateException();

									      checkForComodification();

									      try {

									        //注意内部类调用外部类的方式

									        ArrayList.this.remove(lastRet);

									        //remove之后需要重新调整各个指针的位置

									        cursor = lastRet;

									        lastRet = -1;

									        expectedModCount = modCount;

									      } catch (IndexOutOfBoundsException ex) {

									        throw new ConcurrentModificationException();

									      }

									    }

									    final void checkForComodification() {

									      if (modCount != expectedModCount)

									        throw new ConcurrentModificationException();

									    }

									  }

从源码中可以看出Itr迭代器是向前迭代器，它提供了一个next方法用于获取ArrayList中的元素。
checkForComodification是java-collection-framwork中的一种fail-fast的错误检测机制。在多线程环境下对同一个集合操作，就可能触发fail-fast机制，抛出ConcurrentModificationException异常。

Itr迭代器定义了一个expectedModCount记录modCount副本。在ArrayList执行改变结构的操作的时候例如Add， remove， clear方法时modCount的值会改变。

通过Itr源码可以看出调用next和remove方法会触发fail-fast检查。此时如果在遍历该集合时，存在其他线程正在执行改变该集合结构的操作时就会发生异常。

（2）ListItr：支持向前和向后遍历，下面看看ListItr的源码：

				?

									private class ListItr extends Itr implements ListIterator<E> {

									    ListItr(int index) {

									      super();

									      cursor = index;

									    }

									    public boolean hasPrevious() {

									      return cursor != 0;

									    }

									    public int nextIndex() {

									      return cursor;

									    }

									    public int previousIndex() {

									      return cursor - 1;

									    }

									    @SuppressWarnings("unchecked")

									    public E previous() {

									      checkForComodification();

									      //arrayList前一个元素的位置

									      int i = cursor - 1;

									      if (i < 0)

									        throw new NoSuchElementException();

									      Object[] elementData = ArrayList.this.elementData;

									      if (i >= elementData.length)

									        throw new ConcurrentModificationException();

									      cursor = i;

									      return (E) elementData[lastRet = i];

									    }

									    //该迭代器中添加了set方法

									    public void set(E e) {

									      if (lastRet < 0)

									        throw new IllegalStateException();

									      checkForComodification();

									      try {

									        ArrayList.this.set(lastRet, e);

									      } catch (IndexOutOfBoundsException ex) {

									        throw new ConcurrentModificationException();

									      }

									    }

									    //该迭代器添加了add方法

									    public void add(E e) {

									      checkForComodification();

									      try {

									        int i = cursor;

									        ArrayList.this.add(i, e);

									        //重新标记指针位置

									        cursor = i + 1;

									        lastRet = -1;

									        expectedModCount = modCount;

									      } catch (IndexOutOfBoundsException ex) {

									        throw new ConcurrentModificationException();

									      }

									    }

									  }

ListItr的实现基本与Itr一致，添加了可以先前遍历的方法以及add与set方法。

（3）使用java.util.concurrent中的CopyOnWriteArrayList解决fast-fail问题

CopyOnWriteArrayList是线程安全的，具体看一下它的add方法源码：

				?

									public boolean add(E e) {

									    final ReentrantLock lock = this.lock;

									    lock.lock();

									    try {

									      Object[] elements = getArray();

									      int len = elements.length;

									      Object[] newElements = Arrays.copyOf(elements, len + 1);

									      newElements[len] = e;

									      setArray(newElements);

									      return true;

									    } finally {

									      lock.unlock();

									    }

									  }

CopyOnWriteArrayList就是写时复制的ArrayList。当开始写数据的操作时候， Arrays.copyOf一个新的数组，这样不会影响读操作。
这样的代价就是会损耗内存，带来性能的问题。CopyOnWriteArrayList写的时候在内存中生成一个副本对象，同时原来的对象仍然存在。
CopyOnWriteArrayList无法保证数据实时的一致，只能保证结果的一致。适用于并发下读多写少得场景，例如缓存。

（4）ArrayList的其他方法源码：

一个私有方法batchRemove(Collection<?>c, boolean complement)，即批量移除操作

				?

									private boolean batchRemove(Collection<?> c, boolean complement) {

									    //下面会提到使用final的原因

									    final Object[] elementData = this.elementData;

									    int r = 0, w = 0;

									    boolean modified = false;

									    try {

									      //遍历List中的元素，进行验证

									      for (; r < size; r++)

									        if (c.contains(elementData[r]) == complement)

									          elementData[w++] = elementData[r];

									    } finally {

									      //try中如果出现异常，则保证数据一致性执行下面的copy操作

									      if (r != size) {

									        System.arraycopy(elementData, r,

									                 elementData, w,

									                 size - r);

									        w += size - r;

									      }

									      //清理无用的元素， 通知GC回收

									      if (w != size) {

									        // clear to let GC do its work

									        for (int i = w; i < size; i++)

									          elementData[i] = null;

									        modCount += size - w;

									        size = w;

									        modified = true;

									      }

									    }

									    return modified;

									  }

final修饰的变量指的是同一个引用，为了后面保持数据的一致性。
此方法，想保留Collection c中的元素时， complement值为true；想移除c中的元素时， complement值为false。这样就分别变成了retainAll和removeAll方法。

swap：交换arrayList中的某两个位置的

二、LinkedList源码分析（JDK7）

LinkedList即链表，相对于顺序表，链表存储数据不需要使用地址连续的内存单元。减少了修改容器结构而带来的移动元素的问题，顺序访问相对高效。

1、结点（Node）的定义

JDK中的LinkedList是双向链表，每个结点分别存有上一个结点和下一个结点的信息。它的定义如下：

				?

									private static class Node<E> {

									  E item;

									  Node<E> next;

									  Node<E> prev;

									  Node<E> (Node<E> prev, E element, Node<E> next) {

									    this.item = element;

									    this.next = next;

									    this.prev = prev;

									  }

									}

2、LinkedList构造以及初始化

成员： LinkedList中维护了3个成员变量，用以记录链表中结点的个数，结点的前驱以及后继

				?

									transient int size = 0;

									transient Node<E> first;

									transient Node<E> last;

构造函数：默认构造函数即构造一个空的LinkedList

				?

									public LinkedList() {}

或者根据其他容器进行构造，后面我们会自己写一个构造一个有序的链表

				?

									public LinkedList(Collection<? extends E> c) {

									    this();

									    addAll(c);

									}

这里给出一点补充，关于泛型修饰符? super T 与 ? extends T的区别，参见这篇文章泛型中? super T和? extends T的区别

3、LinkedList的结构操作

头插法：即在链表头插入一个元素

				?

									private void linkFirst(E e) {

									  final Node<E> f = first;

									  final Node<E> newNode = new Node<>(null, e, f);

									  first = newNode;

									  //判断是否是空链表

									  if (f == null)

									    last = newNode;

									  else

									    f.prev = newNode;

									  size++;

									  modCount++;

									  }

尾插法：即在链表尾部插入一个元素

				?

									void linkLast(E e) {

									  final Node<E> l = last;

									  final Node<E> newNode = new Node<>(l, e, null);

									  last = newNode;

									  if (l == null)

									    first = newNode;

									  else

									    l.next = newNode;

									  size++;

									  modCount++;

									}

插入到当前结点之前：找当前结点的前驱

				?

									void linkBefore(E e, Node<E> succ) {

									    //确定当然结点非空

									    final Node<E> pred = succ.prev;

									    final Node<E> newNode = new Node<>(pred, e, succ);

									    succ.prev = newNode;

									    //判断当前结点是否是第一个结点

									    if (pred == null)

									      first = newNode;

									    else

									      pred.next = newNode;

									    size++;

									    modCount++;

									  }

头删法：删除链表的第一个结点

				?

									private E unlinkFirst(Node<E> f) {

									  // assert f == first && f != null;

									  final E element = f.item;

									  final Node<E> next = f.next;

									  f.item = null;

									  f.next = null; // help GC

									  first = next;

									  if (next == null)

									    last = null;

									  else

									    next.prev = null;

									  size--;

									  modCount++;

									  return element;

									}

尾删法：删除链表的最后一个结点

				?

									private E unlinkLast(Node<E> l) {

									  //保证l==last 并且l != null

									  final E element = l.item;

									  final Node<E> prev = l.prev;

									  l.item = null;

									  l.prev = null; // help GC

									  last = prev;

									  if (prev == null)

									    first = null;

									  else

									    prev.next = null;

									  size--;

									  modCount++;

									  return element;

									}

4、保持List接口与Deque的一致性

List接口允许使用下标来实现对容器的随机访问，对于数组这种实现随机访问是很容易的。对于链表，JDK也从逻辑上利用链表中结点的计数给出了随机访问的实现

				?

									Node<E> node(int index) {

									    //确保index的正确性

									    if (index < (size >> 1)) {

									      Node<E> x = first;

									      for (int i = 0; i < index; i++)

									        x = x.next;

									      return x;

									    } else {

									      Node<E> x = last;

									      for (int i = size - 1; i > index; i--)

									        x = x.prev;

									      return x;

									    }

									  }

index 属于前半部分的计数，从头遍历查找。index属于后半部分的计数，从末尾遍历查找。充分利用双向链表的特点。
因此，add(int index, T t), get(int), set(int)等方法就可以很容易的实现。

LinkedList实现了Deque接口，也就是LinkedList实现了双端队列容器的方法，下面给出一些API的总结。

5、LinkedList的遍历

既然LinkedList是双向链表，自然就可以前后遍历。与ArrayList同样，涉及到多线程操作容器的时候LinkedList也会出现fail-fast问题。
对于fail-fast问题上篇已经讲解过，这里就不说了。

关于迭代器，LinkedList有listIterator双向迭代器，和DescendingIterator逆序迭代器。都很简单。源码不在分析

如果遍历元素的话，随机访问的代价是比较大得。

三、LinkedList，ArrayList， Vector总结

1、LinkedList与ArrayList

ArrayList是实现了基于动态数组的数据结构，LinkedList基于链表的数据结构。

对于随机访问get和set，ArrayList觉得优于LinkedList，因为LinkedList要移动指针。

对于新增和删除操作add和remove，LinedList比较占优势，因为ArrayList要移动数据。这一点要看实际情况的。若只对单条数据插入或删除，ArrayList的速度反而优于LinkedList。但若是批量随机的插入删除数据，LinkedList的速度大大优于ArrayList. 因为ArrayList每插入一条数据，要移动插入点及之后的所有数据。

2、ArrayList与Vector

vector是线程同步的，所以它也是线程安全的，而arraylist是线程异步的，是不安全的。如果不考虑到线程的安全因素，一般用arraylist效率比较高。

如果集合中的元素的数目大于目前集合数组的长度时，vector增长率为目前数组长度的100%,而arraylist增长率为目前数组长度的50%.如过在集合中使用数据量比较大的数据，用vector有一定的优势。

如果查找一个指定位置的数据，vector和arraylist使用的时间是相同的，都是0(1),这个时候使用vector和arraylist都可以。而如果移动一个指定位置的数据花费的时间为0(n-i)n为总长度，这个时候就应该考虑到使用linklist,因为它移动一个指定位置的数据所花费的时间为0(1),而查询一个指定位置的数据时花费的时间为0(i)。

ArrayList 和Vector是采用数组方式存储数据，此数组元素数大于实际存储的数据以便增加和插入元素，都允许直接序号索引元素，但是插入数据要设计到数组元素移动等内存操作，所以索引数据快插入数据慢，Vector由于使用了synchronized方法（线程安全）所以性能上比ArrayList要差，LinkedList使用双向链表实现存储，按序号索引数据需要进行向前或向后遍历，但是插入数据时只需要记录本项的前后项即可，所以插入数度较快！