Rowles.LeanCorpus.Codecs.DocValues
Classes
BinaryDocValuesReader
Reads multi-valued binary DocValues from a .dvb sidecar file.
BinaryDocValuesWriter
Writes multi-valued binary DocValues in a column-stride format (.dvb).
FieldLengthReader
Reads exact per-field per-doc token counts from a .fln file. Returns Dictionary<string, int[]> keyed by field name. Supports both VarInt (v2) and fixed ushort (v1) formats. Falls back gracefully when the file does not exist (caller should use quantised norms).
FieldLengthWriter
Writes exact per-field per-doc token counts to a .fln file. Layout v2: [Header][FieldCount:int32]([FieldNameLen:int32][FieldNameUTF8][DocCount:int32][VarInt * DocCount])*. Uses VarInt encoding: 1 byte for lengths < 128, 2 bytes for < 16384.
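The byte sizes quoted for VarInt (1 byte below 128, 2 bytes below 16384) match a 7-bits-per-byte scheme with a continuation flag in the high bit. The following Python sketch illustrates that scheme; it is not the library's C# code, and the low-group-first byte order and the function names `encode_varint` and `decode_varint` are assumptions made for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit groups, lowest group first.

    Assumption: the high bit of each byte flags that another byte follows.
    """
    if value < 0:
        raise ValueError("varint values must be non-negative")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # low 7 bits plus continuation flag
        value >>= 7
    out.append(value)  # final byte, continuation flag clear
    return bytes(out)


def decode_varint(buf: bytes, pos: int = 0):
    """Return (value, next_pos) for the varint that starts at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:  # no continuation flag: this was the last byte
            return result, pos
        shift += 7
```

Under these assumptions a token count of 127 occupies one byte and 128 through 16383 occupy two, which is why short fields dominate the savings over a fixed ushort per document.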
NormsReader
Reads quantised per-field norm values as compact byte arrays. Uses memory-mapped I/O to avoid loading the entire file into a managed byte[].
NormsWriter
Quantises float norms to single bytes and writes them to disc. Writes per-field norms for accurate BM25 field-length normalisation.
NumericDocValuesReader
Reads per-document numeric values from a column-stride .dvn file. Returns the dense value arrays alongside per-field presence bitmaps (v2 files only). A null presence entry means all documents carry a value for that field.
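To make the presence-bitmap convention concrete, here is a small Python sketch of how a caller might test whether a document carries a value. It is an illustration rather than the library's C# API: the helper names `build_presence` and `has_value` are hypothetical, and the bit order (document `i` stored in byte `i // 8`, bit `i % 8`, least significant bit first) is an assumption the source does not confirm.

```python
from typing import List, Optional


def build_presence(flags: List[bool]) -> Optional[bytes]:
    """Build a presence bitmap, or None when every document has a value."""
    if all(flags):
        return None  # mirrors the "null presence entry" convention
    bitmap = bytearray((len(flags) + 7) // 8)
    for doc_id, present in enumerate(flags):
        if present:
            bitmap[doc_id >> 3] |= 1 << (doc_id & 7)  # assumed LSB-first order
    return bytes(bitmap)


def has_value(presence: Optional[bytes], doc_id: int) -> bool:
    """True when doc_id carries a value; a None bitmap means all docs do."""
    if presence is None:
        return True
    byte_index, bit_index = divmod(doc_id, 8)
    if byte_index >= len(presence):
        return False
    return (presence[byte_index] >> bit_index) & 1 == 1
```

Returning `None` for the all-present case keeps the common path free of any bitmap lookup, which is presumably why the reader exposes a null entry instead of a fully set bitmap.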
NumericDocValuesWriter
Writes per-document numeric values in a compact column-stride format (.dvn). Layout per field (v2): [fieldName] [presenceByteCount: int32] [presenceBitmap: bytes if count > 0] [docCount: int32] [minValue: int64] [bitsPerValue: byte] [packed values...]. Version 1 (legacy): no presence block. Version 2+: a presence block precedes the values; a presenceByteCount of 0 means all docs are present.
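The [minValue][bitsPerValue][packed values...] triple suggests offset-plus-bit-packing: each value is stored as its distance from the field minimum, using only as many bits as the largest distance needs. The Python sketch below shows that idea under stated assumptions; it is not the library's C# writer, the names `pack_values` and `unpack_values` are hypothetical, and the little-endian bit order is a guess.

```python
from typing import List, Tuple


def pack_values(values: List[int]) -> Tuple[int, int, bytes]:
    """Return (min_value, bits_per_value, packed_bytes) for a value column."""
    if not values:
        return 0, 0, b""
    min_value = min(values)
    deltas = [v - min_value for v in values]
    bits = max(deltas).bit_length()  # 0 when every value is identical
    acc = 0
    for i, delta in enumerate(deltas):
        acc |= delta << (i * bits)  # value i occupies bits [i*bits, (i+1)*bits)
    n_bytes = (len(values) * bits + 7) // 8
    return min_value, bits, acc.to_bytes(n_bytes, "little")


def unpack_values(min_value: int, bits: int, packed: bytes, count: int) -> List[int]:
    """Invert pack_values for a column of `count` documents."""
    if bits == 0:
        return [min_value] * count
    acc = int.from_bytes(packed, "little")
    mask = (1 << bits) - 1
    return [min_value + ((acc >> (i * bits)) & mask) for i in range(count)]
```

For example, four values between 1000 and 1007 need 3 bits each, so the column packs into 2 bytes instead of 32 bytes of raw int64.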
SortedDocValuesReader
Reads per-document string values from a column-stride .dvs file. Returns the dense value arrays alongside per-field presence bitmaps (v2 files only). A null presence entry means all documents carry a value for that field.
SortedDocValuesWriter
Writes per-document string values in a column-stride format (.dvs). Layout (v2): [fieldName] [presenceByteCount: int32] [presenceBitmap: bytes if count > 0] [docCount: int32] [ordCount: int32] [ord table: length-prefixed strings] [ords: packed ints]. Deduplicates values via an ordinal table. Null entries in the values array indicate absent docs.
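Deduplication via an ordinal table can be sketched as follows: distinct strings are stored once, and each document stores only a small integer pointing into that table. This Python sketch is illustrative and not the library's C# writer. The name `build_ordinals` is hypothetical, sorting the table is an assumption (the source only says values are deduplicated), and storing ordinal 0 as a placeholder for absent documents is likewise assumed.

```python
from typing import List, Optional, Tuple


def build_ordinals(
    values: List[Optional[str]],
) -> Tuple[List[str], List[int], List[bool]]:
    """Return (ord_table, ords, present) for a per-document string column."""
    # Each distinct string appears once in the table (sorted by assumption).
    table = sorted({v for v in values if v is not None})
    index = {term: i for i, term in enumerate(table)}
    present = [v is not None for v in values]
    # Absent docs get a placeholder ordinal; the presence flags disambiguate.
    ords = [index[v] if v is not None else 0 for v in values]
    return table, ords, present


def lookup(table: List[str], ords: List[int], present: List[bool], doc_id: int) -> Optional[str]:
    """Resolve a document's string, or None when the doc has no value."""
    return table[ords[doc_id]] if present[doc_id] else None
```

With a low-cardinality field such as a status or colour, the ordinals need only a few bits each, so the packed ords column stays small however long the strings are.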
SortedNumericDocValuesReader
Reads multi-valued numeric DocValues from a .dsn sidecar file.
SortedNumericDocValuesWriter
Writes multi-valued numeric DocValues in a column-stride format (.dsn).
SortedSetDocValuesReader
Reads multi-valued string DocValues from a .dss sidecar file.
SortedSetDocValuesWriter
Writes multi-valued string DocValues in a column-stride format (.dss).