A variable-size list is defined by two buffers (a validity bitmap and an offsets buffer) and a child array. Whereas in variable-length binary arrays the offsets point into a value buffer, in a List<T> they point to locations in the child array, which can itself be a nested array.

List and LargeList

A List uses 32-bit offsets; lists with 64-bit offsets are called LargeList.
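As a rough illustration of what the offset width costs, the sketch below packs the same offsets as 32-bit and as 64-bit little-endian integers using Python's struct module. The offsets values are a hypothetical example and the variable names are illustrative; this is not how an Arrow library allocates its buffers, only a way to see the size difference.

```python
import struct

# Hypothetical offsets for a four-element list array.
offsets = [0, 3, 3, 7, 7]

# List: 32-bit signed offsets ("i" is 4 bytes in struct's standard mode).
list_offsets = struct.pack(f"<{len(offsets)}i", *offsets)

# LargeList: 64-bit signed offsets ("q" is 8 bytes).
large_list_offsets = struct.pack(f"<{len(offsets)}q", *offsets)

print(len(list_offsets), len(large_list_offsets))  # 20 40

# The largest child-array position a 32-bit signed offset can address.
max_list_offset = 2**31 - 1
```

The wider offsets double the size of the offsets buffer, which is the price of being able to address more than 2^31 - 1 child elements.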

This is an example of how the List<Int8> array [[12, -7, 25], null, [0, -127, 127, 50], []] is represented:

  • The offsets buffer has exactly one more element than the List array it belongs to: there are four elements in our List<Int8> array and five elements in the offsets buffer.
  • Each value in the offsets buffer represents the starting slot in the child array for the list at the corresponding index, i. Looking closer at the offsets buffer, we notice that 3 and 7 are repeated, indicating that those lists are either null or empty (have a length of 0).
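The layout described above can be sketched in plain Python. This is a minimal model, not a real Arrow implementation: the validity bitmap is represented as a list of booleans rather than packed bits, and the names validity, offsets, child_values, and decode_list_array are illustrative assumptions.

```python
# Illustrative model of a List<Int8> array holding
# [[12, -7, 25], null, [0, -127, 127, 50], []]
validity = [True, False, True, True]            # one flag per list slot
offsets = [0, 3, 3, 7, 7]                       # len(validity) + 1 entries
child_values = [12, -7, 25, 0, -127, 127, 50]   # the Int8 child array


def decode_list_array(validity, offsets, child_values):
    """Rebuild Python lists from the validity flags, offsets, and child values."""
    result = []
    for i, is_valid in enumerate(validity):
        if not is_valid:
            # A null slot: the offsets still exist, but the value is absent.
            result.append(None)
            continue
        start, end = offsets[i], offsets[i + 1]
        result.append(child_values[start:end])
    return result


decoded = decode_list_array(validity, offsets, child_values)
print(decoded)  # [[12, -7, 25], None, [0, -127, 127, 50], []]
```

Note that slots 1 and 3 both span zero child elements; only the validity flag tells the null list apart from the empty one.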

Computing the list length

To find the length of the list at a given slot i, take the difference between the offset for that slot and the offset after it: offsets[i + 1] - offsets[i].
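As a small sketch of that subtraction, assuming the offsets are available as a Python list (the helper name list_length is hypothetical, and the offsets are those of the four-element example [[12, -7, 25], null, [0, -127, 127, 50], []]):

```python
def list_length(offsets, i):
    """Length of the list in slot i: next offset minus this slot's offset."""
    return offsets[i + 1] - offsets[i]


# Offsets for [[12, -7, 25], null, [0, -127, 127, 50], []]
offsets = [0, 3, 3, 7, 7]

lengths = [list_length(offsets, i) for i in range(len(offsets) - 1)]
print(lengths)  # [3, 0, 4, 0]
```

A length of 0 does not by itself say whether the slot is null or an empty list; the validity bitmap has to be consulted for that.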

FixedSizeList

The FixedSizeList<T>[N] works nearly the same as the variable-size list, except there’s no need for an offsets buffer. The child array of a fixed-size list type is the values array, complete with its own validity buffer. The value in slot j of a fixed-size list array is stored in an N-long slice of the values array, starting at offset j*N, as shown in the following figure.

Determining the values for a given slot of a FixedSizeList doesn’t require any lookups into a separate offsets buffer, making it more efficient if you know that your lists will always be a specific size. You also save the memory that an offsets buffer would otherwise occupy.
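The slot lookup for a fixed-size list can be sketched as pure arithmetic on the values array. The data below is a made-up FixedSizeList<Int8>[3] example, and the function name fixed_size_list_slot and the boolean-list validity are illustrative assumptions rather than part of any Arrow API.

```python
def fixed_size_list_slot(values, list_size, j, validity=None):
    """Return the list stored in slot j of a fixed-size list array.

    The slice starts at j * list_size, so no offsets buffer is consulted.
    """
    if validity is not None and not validity[j]:
        return None
    start = j * list_size
    return values[start:start + list_size]


# Hypothetical data: three lists of size N = 3, the middle one null.
# A null slot still occupies N positions in the values array.
N = 3
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
validity = [True, False, True]

slots = [fixed_size_list_slot(values, N, j, validity) for j in range(3)]
print(slots)  # [[1, 2, 3], None, [7, 8, 9]]
```

Because the start position is computed rather than looked up, every slot access costs one multiplication, regardless of how many lists precede it.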