行存储 VS 列存储

每行是一条数据。
每列是一个field。

按行存储

示例:

1
2
3
[ {A: 1, B:42},
{A: 5, B:12},
{A: 5, B:6} ],

按行获取数据比较方便,获取一列数据比较慢。

按列存储

示例:

1
2
{ A: [1, 5, 5],
B: [42, 12, 6]}.

按列获取数据比较方便,获取一行数据比较慢。

通常情况下,只有少数field会经常用。按列存储还能够更好利用cpu缓存。

  • more efficient for queries that target a limited number of fields by making better use of CPU caches。
    -

实例 - lucene

Doc values are a Lucene 4.x feature which allow for storing field values on disk in a column stride fashion, which is filesystem cache friendly and suitable for custom scoring, sorting or faceting, exactly like field data.

https://www.elastic.co/blog/disk-based-field-data-a-k-a-doc-values

实例