Using ziplist, Redis's purpose-built linked list

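Posing the problem

Have you used Python's list? It is the data structure that can hold values of any type and supports random access.
If you haven't, well, there is not much I can do about that.

In principle a list like this can be built on top of an array or a linked list. I never checked what Python uses underneath (for the record, CPython uses a dynamically resized array of object pointers).
Redis's list, however, is built on neither a plain linked list nor a plain array, and the reasons are easy to see. An array is awkward: make it two-dimensional? Flexible? How would you even write that? A linked list would work: a generic linked list with a void* in the data field gives you list behavior. But its drawback is just as obvious: it easily causes memory fragmentation.

Against this backdrop, and guided by the principle of "save wherever you can", please design a data structure.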
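Before designing anything, it helps to put a rough number on what the generic linked list costs per element. The sketch below is only an illustration: `generic_node`, `linked_cost`, and `link_overhead_ratio` are made-up names, not Redis types, and allocator bookkeeping is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic doubly linked node; not a Redis type. */
typedef struct generic_node {
    struct generic_node *prev;
    struct generic_node *next;
    void *value;               /* payload lives in its own allocation */
} generic_node;

/* Bytes spent on one element: the node itself plus the separately
 * allocated payload. Allocator bookkeeping is ignored. */
size_t linked_cost(size_t payload_len) {
    return sizeof(generic_node) + payload_len;
}

/* Bookkeeping bytes per payload byte. */
double link_overhead_ratio(size_t payload_len) {
    assert(payload_len > 0);
    return (double)sizeof(generic_node) / (double)payload_len;
}
```

For a one-byte payload the node's three pointers dwarf the data itself, and every element is a separate allocation. That is the waste a compact layout tries to avoid.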

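Structure design

[Figure: ziplist structure diagram]

One thing to note in this figure: the right-hand side does not record the "size of the current element".

The figure is quite detailed, which saves me from explaining every field. Nice.

Everything else is spelled out in the comment at the top of the file (ziplist.c).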

    /* The ziplist is a specially encoded dually linked list that is designed
     * to be very memory efficient. It stores both strings and integer values,
     * where integers are encoded as actual integers instead of a series of
     * characters. It allows push and pop operations on either side of the list
     * in O(1) time. However, because every operation requires a reallocation of
     * the memory used by the ziplist, the actual complexity is related to the
     * amount of memory used by the ziplist.
     *
     * ----------------------------------------------------------------------------
     *
     * ZIPLIST OVERALL LAYOUT
     * ======================
     *
     * The general layout of the ziplist is as follows:
     *
     * <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
     *
     * NOTE: all fields are stored in little endian, if not specified otherwise.
     *
     * <uint32_t zlbytes> is an unsigned integer to hold the number of bytes that
     * the ziplist occupies, including the four bytes of the zlbytes field itself.
     * This value needs to be stored to be able to resize the entire structure
     * without the need to traverse it first.
     *
     * <uint32_t zltail> is the offset to the last entry in the list. This allows
     * a pop operation on the far side of the list without the need for full
     * traversal.
     *
     * <uint16_t zllen> is the number of entries. When there are more than
     * 2^16-2 entries, this value is set to 2^16-1 and we need to traverse the
     * entire list to know how many items it holds.
     *
     * <uint8_t zlend> is a special entry representing the end of the ziplist.
     * Is encoded as a single byte equal to 255. No other normal entry starts
     * with a byte set to the value of 255.
     *
     * ZIPLIST ENTRIES
     * ===============
     *
     * Every entry in the ziplist is prefixed by metadata that contains two pieces
     * of information. First, the length of the previous entry is stored to be
     * able to traverse the list from back to front. Second, the entry encoding is
     * provided. It represents the entry type, integer or string, and in the case
     * of strings it also represents the length of the string payload.
     * So a complete entry is stored like this:
     *
     * <prevlen> <encoding> <entry-data>
     *
     * Sometimes the encoding represents the entry itself, like for small integers
     * as we'll see later. In such a case the <entry-data> part is missing, and we
     * could have just:
     *
     * <prevlen> <encoding>
     *
     * The length of the previous entry, <prevlen>, is encoded in the following way:
     * If this length is smaller than 254 bytes, it will only consume a single
     * byte representing the length as an unsigned 8 bit integer. When the length
     * is greater than or equal to 254, it will consume 5 bytes. The first byte is
     * set to 254 (FE) to indicate a larger value is following. The remaining 4
     * bytes take the length of the previous entry as value.
     *
     * So practically an entry is encoded in the following way:
     *
     * <prevlen from 0 to 253> <encoding> <entry>
     *
     * Or alternatively if the previous entry length is greater than 253 bytes
     * the following encoding is used:
     *
     * 0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
     *
     * The encoding field of the entry depends on the content of the
     * entry. When the entry is a string, the first 2 bits of the encoding first
     * byte will hold the type of encoding used to store the length of the string,
     * followed by the actual length of the string. When the entry is an integer
     * the first 2 bits are both set to 1. The following 2 bits are used to specify
     * what kind of integer will be stored after this header. An overview of the
     * different types and encodings is as follows. The first byte is always enough
     * to determine the kind of entry.
     *
     * |00pppppp| - 1 byte
     *      String value with length less than or equal to 63 bytes (6 bits).
     *      "pppppp" represents the unsigned 6 bit length.
     * |01pppppp|qqqqqqqq| - 2 bytes
     *      String value with length less than or equal to 16383 bytes (14 bits).
     *      IMPORTANT: The 14 bit number is stored in big endian.
     * |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
     *      String value with length greater than or equal to 16384 bytes.
     *      Only the 4 bytes following the first byte represents the length
     *      up to 2^32-1. The 6 lower bits of the first byte are not used and
     *      are set to zero.
     *      IMPORTANT: The 32 bit number is stored in big endian.
     * |11000000| - 3 bytes
     *      Integer encoded as int16_t (2 bytes).
     * |11010000| - 5 bytes
     *      Integer encoded as int32_t (4 bytes).
     * |11100000| - 9 bytes
     *      Integer encoded as int64_t (8 bytes).
     * |11110000| - 4 bytes
     *      Integer encoded as 24 bit signed (3 bytes).
     * |11111110| - 2 bytes
     *      Integer encoded as 8 bit signed (1 byte).
     * |1111xxxx| - (with xxxx between 0001 and 1101) immediate 4 bit integer.
     *      Unsigned integer from 0 to 12. The encoded value is actually from
     *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
     *      subtracted from the encoded 4 bit value to obtain the right value.
     * |11111111| - End of ziplist special entry.
     *
     * Like for the ziplist header, all the integers are represented in little
     * endian byte order, even when this code is compiled in big endian systems.
     *
     * EXAMPLES OF ACTUAL ZIPLISTS
     * ===========================
     *
     * The following is a ziplist containing the two elements representing
     * the strings "2" and "5". It is composed of 15 bytes, that we visually
     * split into sections:
     *
     *  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
     *        |             |          |       |       |     |
     *     zlbytes        zltail    entries   "2"     "5"   end
     *
     * The first 4 bytes represent the number 15, that is the number of bytes
     * the whole ziplist is composed of. The second 4 bytes are the offset
     * at which the last ziplist entry is found, that is 12, in fact the
     * last entry, that is "5", is at offset 12 inside the ziplist.
     * The next 16 bit integer represents the number of elements inside the
     * ziplist, its value is 2 since there are just two elements inside.
     * Finally "00 f3" is the first entry representing the number 2. It is
     * composed of the previous entry length, which is zero because this is
     * our first entry, and the byte F3 which corresponds to the encoding
     * |1111xxxx| with xxxx between 0001 and 1101. We need to remove the "F"
     * higher order bits 1111, and subtract 1 from the "3", so the entry value
     * is "2". The next entry has a prevlen of 02, since the first entry is
     * composed of exactly two bytes. The entry itself, F6, is encoded exactly
     * like the first entry, and 6-1 = 5, so the value of the entry is 5.
     * Finally the special entry FF signals the end of the ziplist.
     *
     * Adding another element to the above string with the value "Hello World"
     * allows us to show how the ziplist encodes small strings. We'll just show
     * the hex dump of the entry itself. Imagine the bytes as following the
     * entry that stores "5" in the ziplist above:
     *
     * [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
     *
     * The first byte, 02, is the length of the previous entry. The next
     * byte represents the encoding in the pattern |00pppppp| that means
     * that the entry is a string of length <pppppp>, so 0B means that
     * an 11 bytes string follows. From the third byte (48) to the last (64)
     * there are just the ASCII characters for "Hello World".
     *
     * ----------------------------------------------------------------------------
     *
     * Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com>
     * Copyright (c) 2009-2017, Salvatore Sanfilippo <antirez at gmail dot com>
     * All rights reserved.
     */

Finished reading? Next come the basic operations. For any data structure these boil down to insert, delete, look up, and modify.
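If the header layout still feels abstract, the 15-byte example from the comment can be decoded by hand. A minimal sketch, assuming nothing beyond the little-endian layout described above; `zl_header`, `read_le`, `parse_header`, and `example_zl` are illustrative names, not part of ziplist.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the three header fields; not a Redis type. */
typedef struct zl_header {
    uint32_t zlbytes;  /* total bytes, header and end marker included */
    uint32_t zltail;   /* offset of the last entry */
    uint16_t zllen;    /* number of entries */
} zl_header;

/* Read an n-byte little-endian unsigned integer (n <= 4). */
uint32_t read_le(const unsigned char *p, size_t n) {
    assert(n <= 4);
    uint32_t v = 0;
    for (size_t i = 0; i < n; i++) v |= (uint32_t)p[i] << (8 * i);
    return v;
}

/* Decode <zlbytes> <zltail> <zllen> from the first 10 bytes. */
zl_header parse_header(const unsigned char *zl) {
    zl_header h;
    h.zlbytes = read_le(zl, 4);
    h.zltail  = read_le(zl + 4, 4);
    h.zllen   = (uint16_t)read_le(zl + 8, 2);
    return h;
}

/* The example from the comment: a ziplist holding 2 and 5. */
const unsigned char example_zl[15] = {
    0x0f, 0x00, 0x00, 0x00,   /* zlbytes = 15 */
    0x0c, 0x00, 0x00, 0x00,   /* zltail  = 12 */
    0x02, 0x00,               /* zllen   = 2  */
    0x00, 0xf3,               /* entry "2": prevlen 0, immediate 3-1 */
    0x02, 0xf6,               /* entry "5": prevlen 2, immediate 6-1 */
    0xff                      /* end marker */
};
```

Calling `parse_header(example_zl)` gives 15, 12, and 2, and `example_zl[12]` is indeed the first byte of the last entry, which is how a pop from the tail can skip the traversal.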

The actual node

    typedef struct zlentry {
        unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
        unsigned int prevrawlen;     /* Previous entry len. */
        unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                        For example strings have a 1, 2 or 5 bytes
                                        header. Integers always use a single byte.*/
        unsigned int len;            /* Bytes used to represent the actual entry.
                                        For strings this is just the string length
                                        while for integers it is 1, 2, 3, 4, 8 or
                                        0 (for 4 bit immediate) depending on the
                                        number range. */
        unsigned int headersize;     /* prevrawlensize + lensize. */
        unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                        the entry encoding. However for 4 bits
                                        immediate integers this can assume a range
                                        of values and must be range-checked. */
        unsigned char *p;            /* Pointer to the very start of the entry, that
                                        is, this points to prev-entry-len field. */
    } zlentry;
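Note that zlentry is a decoded view that exists only while the code is working on an entry; the bytes in the ziplist itself stay packed. The sketch below fills a trimmed-down copy of the struct from raw bytes. It is deliberately partial: `mini_entry`, `mini_decode`, and `mini_immediate_value` are hypothetical names, and only short strings (|00pppppp|) and 4-bit immediate integers are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down cousin of zlentry. */
typedef struct mini_entry {
    unsigned int prevrawlensize, prevrawlen;
    unsigned int lensize, len, headersize;
    unsigned char encoding;
    const unsigned char *p;
} mini_entry;

/* Decode the entry starting at p. Returns 0 on success, -1 for the end
 * marker or for an encoding this sketch does not handle. */
int mini_decode(const unsigned char *p, mini_entry *e) {
    assert(p != NULL && e != NULL);
    e->p = p;
    if (p[0] < 0xfe) {                 /* 1-byte prevlen */
        e->prevrawlensize = 1;
        e->prevrawlen = p[0];
    } else if (p[0] == 0xfe) {         /* 0xFE + 4 little-endian bytes */
        e->prevrawlensize = 5;
        e->prevrawlen = (unsigned int)p[1] | ((unsigned int)p[2] << 8) |
                        ((unsigned int)p[3] << 16) | ((unsigned int)p[4] << 24);
    } else {
        return -1;                     /* 0xff is the end marker */
    }
    unsigned char enc = p[e->prevrawlensize];
    e->lensize = 1;
    if ((enc & 0xc0) == 0x00) {        /* |00pppppp| short string */
        e->encoding = 0x00;
        e->len = enc & 0x3f;
    } else if (enc >= 0xf1 && enc <= 0xfd) {  /* |1111xxxx| immediate */
        e->encoding = enc;
        e->len = 0;                    /* value lives in the encoding byte */
    } else {
        return -1;                     /* other encodings: not covered here */
    }
    e->headersize = e->prevrawlensize + e->lensize;
    return 0;
}

/* Value of a 4-bit immediate integer: low nibble minus one. */
int mini_immediate_value(const mini_entry *e) {
    assert(e->encoding >= 0xf1 && e->encoding <= 0xfd);
    return (e->encoding & 0x0f) - 1;
}
```

Fed the bytes `00 f3` from the example, this yields prevrawlen 0, headersize 2, len 0, and the value 2.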

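Basic operations

I think this figure is worth showing once more:
[Figure: ziplist structure diagram]
Again, note that the right-hand side does not record the "size of the current element".

Insert

The function that actually performs the insertion is this one:

Honestly, it makes my scalp tingle a little. So we will stick to the usual routine and take it apart step by step.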

    /* Insert item at "p". */
    unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
        size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), reqlen;
        unsigned int prevlensize, prevlen = 0;
        size_t offset;
        int nextdiff = 0;
        unsigned char encoding = 0;
        long long value = 123456789; /* initialized to avoid warning. Using a value
                                        that is easy to see if for some reason
                                        we use it uninitialized. */
        zlentry tail;

        /* Find out prevlen for the entry that is inserted. */
        if (p[0] != ZIP_END) {
            ZIP_DECODE_PREVLEN(p, prevlensize, prevlen);
        } else {
            unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
            if (ptail[0] != ZIP_END) {
                prevlen = zipRawEntryLength(ptail);
            }
        }

        /* See if the entry can be encoded */
        if (zipTryEncoding(s,slen,&value,&encoding)) {
            /* 'encoding' is set to the appropriate integer encoding */
            reqlen = zipIntSize(encoding);
        } else {
            /* 'encoding' is untouched, however zipStoreEntryEncoding will use the
             * string length to figure out how to encode it. */
            reqlen = slen;
        }
        /* We need space for both the length of the previous entry and
         * the length of the payload. */
        reqlen += zipStorePrevEntryLength(NULL,prevlen);
        reqlen += zipStoreEntryEncoding(NULL,encoding,slen);

        /* When the insert position is not equal to the tail, we need to
         * make sure that the next entry can hold this entry's length in
         * its prevlen field. */
        int forcelarge = 0;
        nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;
        if (nextdiff == -4 && reqlen < 4) {
            nextdiff = 0;
            forcelarge = 1;
        }

        /* Store offset because a realloc may change the address of zl. */
        offset = p-zl;
        zl = ziplistResize(zl,curlen+reqlen+nextdiff);
        p = zl+offset;

        /* Apply memory move when necessary and update tail offset. */
        if (p[0] != ZIP_END) {
            /* Subtract one because of the ZIP_END bytes */
            memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

            /* Encode this entry's raw length in the next entry. */
            if (forcelarge)
                zipStorePrevEntryLengthLarge(p+reqlen,reqlen);
            else
                zipStorePrevEntryLength(p+reqlen,reqlen);

            /* Update offset for tail */
            ZIPLIST_TAIL_OFFSET(zl) =
                intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+reqlen);

            /* When the tail contains more than one entry, we need to take
             * "nextdiff" in account as well. Otherwise, a change in the
             * size of prevlen doesn't have an effect on the *tail* offset. */
            zipEntry(p+reqlen, &tail);
            if (p[reqlen+tail.headersize+tail.len] != ZIP_END) {
                ZIPLIST_TAIL_OFFSET(zl) =
                    intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+nextdiff);
            }
        } else {
            /* This element will be the new tail. */
            ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(p-zl);
        }

        /* When nextdiff != 0, the raw length of the next entry has changed, so
         * we need to cascade the update throughout the ziplist */
        if (nextdiff != 0) {
            offset = p-zl;
            zl = __ziplistCascadeUpdate(zl,p+reqlen);
            p = zl+offset;
        }

        /* Write the entry */
        p += zipStorePrevEntryLength(p,prevlen);
        p += zipStoreEntryEncoding(p,encoding,slen);
        if (ZIP_IS_STR(encoding)) {
            memcpy(p,s,slen);
        } else {
            zipSaveInteger(p,value,encoding);
        }
        ZIPLIST_INCR_LENGTH(zl,1);
        return zl;
    }

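How many steps does it take to insert into a "linked list"?
1. Walk to the position
2. Put the node in
3. Stitch the links back together

This "list" is a bit special. Special how? It is compact, and there are really only two data types: integer or string. So its steps are:
1. Re-encode the data
2. Parse the data and allocate space
3. Write the data in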
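What sets these steps apart from a linked-list insert is that everything lives in one contiguous buffer, so "allocate space" means growing the whole buffer and shifting the tail out of the way. A stripped-down sketch of just that mechanic, with none of the ziplist bookkeeping; `flat_insert` is a hypothetical helper, not a Redis function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Insert n bytes from src at offset off into a heap buffer that currently
 * holds *len bytes. Returns the (possibly moved) buffer, or NULL if the
 * allocation fails, in which case the original buffer is left untouched. */
unsigned char *flat_insert(unsigned char *buf, size_t *len, size_t off,
                           const unsigned char *src, size_t n) {
    assert(buf != NULL && len != NULL && off <= *len);
    if (n == 0) return buf;

    /* realloc may move the block, so work with offsets, not old pointers. */
    unsigned char *grown = realloc(buf, *len + n);
    if (grown == NULL) return NULL;

    /* Shift everything from the insert point onward to open a gap. */
    memmove(grown + off + n, grown + off, *len - off);

    /* Fill the gap with the new bytes. */
    memcpy(grown + off, src, n);
    *len += n;
    return grown;
}
```

The same pattern shows up in __ziplistInsert: it saves `offset = p-zl` before resizing, recomputes `p` afterwards, and then calls memmove.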

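Re-encoding

What is re-encoding? To insert an element you have to work out three things, "the size of the previous element, the element's own size, and the element's encoding", and then write them in together. That is what gets encoded.

There are only three possible insert positions: head, middle, and tail.
Head: the previous-element size is 0, because nothing comes before it.
Middle: it is the "previous-element size" recorded by the element currently sitting at the insert position. After the insert, the new element's own size in turn becomes the "previous-element size" seen by that following element.
Tail: add up the three fields of the current last element (its prevlen bytes, encoding bytes, and payload bytes).

I won't go through exactly how the re-encoding is done. This post is already long enough.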
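For readers who do want a peek, the prevlen part is small enough to sketch. The code follows the 1-byte / 5-byte rule from the file comment; `prevlen_size`, `prevlen_store`, and `prevlen_load` are illustrative names, and passing NULL to only measure the size imitates how __ziplistInsert calls zipStorePrevEntryLength(NULL,prevlen).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode a previous-entry length. */
unsigned int prevlen_size(uint32_t prevlen) {
    return prevlen < 254 ? 1 : 5;
}

/* Write prevlen at p and return the bytes used. With p == NULL nothing is
 * written and only the size is returned. */
unsigned int prevlen_store(unsigned char *p, uint32_t prevlen) {
    unsigned int n = prevlen_size(prevlen);
    if (p == NULL) return n;
    if (n == 1) {
        p[0] = (unsigned char)prevlen;
    } else {
        p[0] = 0xfe;  /* marker: a 4-byte little-endian length follows */
        for (int i = 0; i < 4; i++)
            p[1 + i] = (unsigned char)(prevlen >> (8 * i));
    }
    return n;
}

/* Read a previous-entry length back from the start of an entry. */
uint32_t prevlen_load(const unsigned char *p) {
    assert(p != NULL && p[0] != 0xff);  /* 0xff is the end marker */
    if (p[0] < 0xfe) return p[0];
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) v |= (uint32_t)p[1 + i] << (8 * i);
    return v;
}
```

A previous length of 300 is stored as `fe 2c 01 00 00`, while a length of 2 fits in the single byte `02` seen in the examples.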

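Parsing the data

Next comes parsing the data.
The function first tries to parse the data as an integer. If that works, the value is stored using one of the ziplist integer encodings; if it fails, the value is stored using a ziplist byte-array (string) encoding.

After parsing, the numeric value is held in value and the encoding in encoding. If parsing succeeded, the number of bytes the integer occupies is computed as well. The variable reqlen holds the space the element's payload needs; adding the sizes of the other two fields gives the total space the new node requires.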
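The reqlen arithmetic can be reproduced in a few lines. This is a simplified sketch, not the Redis implementation: `try_int`, `int_payload_size`, `str_header_size`, and `required_len` are made-up names, and the integer parser gives up beyond 18 digits to stay clear of overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Strictly parse s as a decimal integer: no spaces, no leading zeros,
 * no "-0". Assumption of this sketch: at most 18 digits. */
int try_int(const unsigned char *s, size_t slen, long long *out) {
    assert(out != NULL && (s != NULL || slen == 0));
    size_t i = 0;
    int neg = 0;
    long long v = 0;
    if (slen == 0) return 0;
    if (s[0] == '-') { neg = 1; i = 1; }
    size_t ndigits = slen - i;
    if (ndigits == 0 || ndigits > 18) return 0;
    if (s[i] == '0' && ndigits > 1) return 0;
    for (; i < slen; i++) {
        if (s[i] < '0' || s[i] > '9') return 0;
        v = v * 10 + (s[i] - '0');
    }
    if (neg && v == 0) return 0;
    *out = neg ? -v : v;
    return 1;
}

/* Payload bytes for an integer, following the encoding table above. */
unsigned int int_payload_size(long long v) {
    if (v >= 0 && v <= 12) return 0;                  /* 4-bit immediate */
    if (v >= INT8_MIN && v <= INT8_MAX) return 1;
    if (v >= INT16_MIN && v <= INT16_MAX) return 2;
    if (v >= -8388608 && v <= 8388607) return 3;      /* 24-bit signed */
    if (v >= INT32_MIN && v <= INT32_MAX) return 4;
    return 8;
}

/* Encoding header bytes for a string of slen bytes. */
unsigned int str_header_size(size_t slen) {
    if (slen <= 63) return 1;
    if (slen <= 16383) return 2;
    return 5;
}

/* Total bytes for a new entry: prevlen field + encoding + payload. */
size_t required_len(const unsigned char *s, size_t slen, uint32_t prevlen) {
    long long v;
    size_t prevlen_bytes = prevlen < 254 ? 1 : 5;
    if (try_int(s, slen, &v))
        return prevlen_bytes + 1 + int_payload_size(v);
    return prevlen_bytes + str_header_size(slen) + slen;
}
```

With a previous length of 2, "5" needs 2 bytes and "Hello World" needs 13, matching the `[02 f6]` and `[02] [0b] [...]` dumps in the comment.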

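Reallocating space

Judging by the comment, what, there might be no room to squeeze it in?

Let's take a look.

The allocation here is not simply "allocate as many bytes as the new data takes". If you did not read the English comment above carefully, well, you may want to go back and reread how a node is put together. In particular this part: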

    /*
     * The length of the previous entry, <prevlen>, is encoded in the following way:
     * If this length is smaller than 254 bytes, it will only consume a single
     * byte representing the length as an unsigned 8 bit integer. When the length
     * is greater than or equal to 254, it will consume 5 bytes. The first byte is
     * set to 254 (FE) to indicate a larger value is following. The remaining 4
     * bytes take the length of the previous entry as value.
     */
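The consequence of this rule is that the entry after the insert point may itself change size: its prevlen field has to hold the new entry's length, so it can grow from 1 byte to 5 or shrink from 5 to 1. That difference is what nextdiff carries in __ziplistInsert. A minimal sketch of the calculation, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width of the prevlen field needed to hold a given length. */
unsigned int prevlen_field_size(uint32_t len) {
    return len < 254 ? 1 : 5;
}

/* Width of the prevlen field the entry at p currently has. */
unsigned int current_prevlen_width(const unsigned char *p) {
    assert(p != NULL && p[0] != 0xff);  /* 0xff is the end marker */
    return p[0] < 0xfe ? 1 : 5;
}

/* Extra bytes the entry at `next` needs (+4), saves (-4), or neither (0)
 * once an entry of reqlen bytes is placed in front of it. */
int prevlen_byte_diff(const unsigned char *next, uint32_t reqlen) {
    return (int)prevlen_field_size(reqlen) - (int)current_prevlen_width(next);
}
```

Placing a 300-byte entry in front of `02 f6` gives +4, so the buffer must grow by reqlen plus 4. Since that entry is now 4 bytes longer, the entry after it may need a wider prevlen field too, which is why the function ends by calling __ziplistCascadeUpdate when nextdiff is nonzero.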