PostgreSQL/Index GIN

GIN (Generalized Inverted Index) supports data types that are divisible into smaller components, e.g., elements of an array, words of a text document, or properties of a JSON object. We call them compound data types. Unlike B-Trees, GIN does not generate a single index entry for the complete value but one index entry for each component.

Every index entry consists of the value of the individual component plus the tuple ID (TID). Please notice that TIDs don't contain the sequence number of the component within the complete value. They contain only the number of the physical page in the data file plus the number of the row within the page.

Useable operators in a  clause depend on the data type:
 * Arrays:  (is contained in),   (contains),   (equal), and   (overlaps / has some common elements).
 * Text queries (lexems):  (contains).
 * JSON:  (JSON object field with the given key),   (JSON object field with the given key, as text).

Description
A GIN index consists of different sub-structures:
 * The Meta Page
 * One Entry B-Tree
 * Some Posting B-trees
 * One Pending List

Among others, the Meta Page contains pointers to the Entry B-Tree and the Pending List.

The Entry B-Tree implements a tree where the keys consist of the original component's values, e.g., the value of a single array element. At the non-leaf levels, their pointers point to child pages at the next level. At the leaf level, pages consist of two types of entries: First, there are Posting Lists. They consist of the key, followed by a list of TIDs. Second, if such a list of TIDs exceeds the capacity of the physical page, the list gets rearranged to a Posting B-Tree, which is stored on one or more other pages. The original Posting List gets replaced by a pointer to the new Posting B-Tree.

A Posting B-Tree is part of one or more physical pages and contains a B-Tree over tuple IDs which all point to such rows in the data file where its key can be found as the value of one of its components.

The implementation of the two B-Tree types differs from PostgreSQL's standard B-Tree implementation.

The Pending List is a list of pages where keys (component's values) and their dedicated TIDs are stored sequentially. The Pending List exists for optimization purposes; see below.



Optimizations
GIN indices use two special optimizations. First, if the value of different components (possibly in different rows) is used often, the set of TIDs is rearranged to a B-Tree within the GIN. Consider the case of many text documents: Many words will likely be repeatedly used in the same and in different documents. In this case, the list of TIDs may grow to the thousands, and a tree is better suited to manage them than a list.

Second, s or  s of compound data creates many index entries: one per component, e.g., one per word of a text. In the first step, such new index entries are collected in a separate Pending List outside of the index tree (in opposite to the original  command). During the next -run, the entries are moved from the pending list to the GIN tree structure using the same bulk insert technique used during initial index creation. This bulk technique speeds up the process, and - even more helpful - the work is delegated to a background process. Of course, there is a drawback: Every query must scan the pending list in addition to traversing the index tree.