
Internal sealed class FSTReader

Namespace
Rowles.LeanLucene.Codecs.Fst
Assembly
Rowles.LeanLucene.dll

Reads a v2 .dic file, a compact byte-keyed sorted dictionary. All data is loaded into contiguous arrays at open time, so no per-term string is allocated. Binary search operates on raw UTF-8 bytes (~3× faster than char-span comparison).
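
The binary search on raw UTF-8 bytes can be illustrated with a minimal sketch. It is not the reader's actual code: the `FindTerm` helper and the `byte[][]` layout are assumptions made for brevity (the real reader keeps its data in contiguous arrays).

```csharp
using System;
using System.Text;

// Hypothetical helper: binary search over keys sorted by unsigned byte order.
static int FindTerm(byte[][] sortedKeys, ReadOnlySpan<byte> key)
{
    int lo = 0, hi = sortedKeys.Length - 1;
    while (lo <= hi)
    {
        int mid = lo + ((hi - lo) >> 1);
        // SequenceCompareTo compares the two byte sequences lexicographically.
        int cmp = ((ReadOnlySpan<byte>)sortedKeys[mid]).SequenceCompareTo(key);
        if (cmp == 0) return mid;
        if (cmp < 0) lo = mid + 1; else hi = mid - 1;
    }
    return ~lo; // not found: bitwise complement of the insertion point
}

// Assumed sample data: qualified keys of the form "field\0term", already sorted.
byte[][] keys =
{
    Encoding.UTF8.GetBytes("body\0apple"),
    Encoding.UTF8.GetBytes("body\0banana"),
    Encoding.UTF8.GetBytes("title\0zebra"),
};
Console.WriteLine(FindTerm(keys, Encoding.UTF8.GetBytes("body\0banana"))); // prints 1
```

Comparing bytes directly avoids decoding each candidate key to chars during the search.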

internal sealed class FSTReader

Public method EnumerateAllTerms()

Enumerates all terms and their postings offsets in sorted order.

Public method GetAllTermsForField(string)

Returns all terms for a field (all terms with prefix "field\0").
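
As a rough illustration of the "field\0" convention, the sketch below builds qualified terms and filters them by field. `Qualify` and `BareTermsForField` are hypothetical helpers, and returning the bare term (without the prefix) is an assumption; the real method may return qualified terms and works on bytes rather than strings.

```csharp
using System;
using System.Collections.Generic;

// Assumed key layout, taken from the "field\0" prefix described above:
// qualified term = field name + '\0' + bare term text.
static string Qualify(string field, string bareTerm) => field + "\0" + bareTerm;

static List<string> BareTermsForField(IEnumerable<string> qualifiedTerms, string field)
{
    string prefix = field + "\0";
    var result = new List<string>();
    foreach (string term in qualifiedTerms)
    {
        // Ordinal comparison keeps the match exact and culture-independent.
        if (term.StartsWith(prefix, StringComparison.Ordinal))
            result.Add(term.Substring(prefix.Length));
    }
    return result;
}

var terms = new[] { Qualify("body", "apple"), Qualify("body", "banana"), Qualify("title", "apple") };
Console.WriteLine(string.Join(", ", BareTermsForField(terms, "body"))); // prints "apple, banana"
```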

Public method GetFuzzyMatches(string, ReadOnlySpan<char>, int, int)

Returns all terms for a field within maxEdits Levenshtein distance of the query, together with their edit distances. Uses prefix-sharing DP on sorted terms: consecutive terms that share a prefix reuse the Levenshtein rows computed up to their longest common prefix. Dead prefixes (row minimum > maxEdits) are skipped via binary search. When more than maxExpansions terms match, only the closest are kept.
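
The prefix-sharing idea can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `FuzzyMatches` is a hypothetical function, it works on char strings instead of UTF-8 bytes, and it skips dead prefixes with a linear scan where the description above uses binary search.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static List<(string Term, int Distance)> FuzzyMatches(
    string[] sortedTerms, string query, int maxEdits, int maxExpansions)
{
    var matches = new List<(string Term, int Distance)>();
    // rows[d] is the Levenshtein row after consuming d characters of a term.
    var rows = new List<int[]> { Enumerable.Range(0, query.Length + 1).ToArray() };
    string previous = "";
    int validDepth = 0;  // rows[0..validDepth] describe a prefix of `previous`
    bool dead = false;   // true when rows[validDepth] has minimum > maxEdits

    foreach (string term in sortedTerms)
    {
        // Longest common prefix with the previous term, capped by the rows we have.
        int lcp = 0;
        int limit = Math.Min(Math.Min(term.Length, previous.Length), validDepth);
        while (lcp < limit && term[lcp] == previous[lcp]) lcp++;

        // The term still starts with the dead prefix: it cannot match.
        if (dead && lcp == validDepth) continue;

        dead = false;
        int depth = lcp;
        while (depth < term.Length)
        {
            int[] prev = rows[depth];
            var cur = new int[query.Length + 1];
            cur[0] = prev[0] + 1;
            for (int j = 1; j <= query.Length; j++)
            {
                int cost = query[j - 1] == term[depth] ? 0 : 1;
                cur[j] = Math.Min(Math.Min(prev[j] + 1, cur[j - 1] + 1), prev[j - 1] + cost);
            }
            depth++;
            if (depth < rows.Count) rows[depth] = cur; else rows.Add(cur);
            // The row minimum never decreases with depth, so this prefix is hopeless.
            if (cur.Min() > maxEdits) { dead = true; break; }
        }
        validDepth = depth;
        previous = term;
        if (!dead && rows[depth][query.Length] <= maxEdits)
            matches.Add((term, rows[depth][query.Length]));
    }
    // Keep only the closest matches when there are more than maxExpansions.
    return matches.OrderBy(m => m.Distance)
                  .ThenBy(m => m.Term, StringComparer.Ordinal)
                  .Take(maxExpansions)
                  .ToList();
}

var sample = new[] { "cart", "cat", "cats", "dog", "dot" };
foreach (var (term, distance) in FuzzyMatches(sample, "cat", 1, 10))
    Console.WriteLine($"{term} {distance}");
```

In this sample, "cat" reuses the rows already computed for the shared prefix "ca" of "cart", and "dot" is skipped without any DP work because "do" was already found dead while processing "dog".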

Public method GetTermsInRange(string, string?, string?, bool, bool)

Returns terms whose bare value falls within a lexicographic range.
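
A minimal sketch of the range check, assuming the two `string?` parameters are the lower and upper bounds (with `null` meaning unbounded) and the two `bool` parameters control whether each bound is inclusive. `TermsInRange` is a hypothetical helper; it compares UTF-16 code units ordinally, which can differ from UTF-8 byte order for text containing surrogate pairs.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

static List<string> TermsInRange(
    IEnumerable<string> sortedBareTerms, string? lower, string? upper,
    bool includeLower, bool includeUpper)
{
    var result = new List<string>();
    foreach (string term in sortedBareTerms)
    {
        if (lower != null)
        {
            int c = string.CompareOrdinal(term, lower);
            if (c < 0 || (c == 0 && !includeLower)) continue; // below the range
        }
        if (upper != null)
        {
            int c = string.CompareOrdinal(term, upper);
            // Input is sorted, so nothing after this point can qualify.
            if (c > 0 || (c == 0 && !includeUpper)) break;
        }
        result.Add(term);
    }
    return result;
}

var fruit = new[] { "apple", "banana", "cherry", "date" };
Console.WriteLine(string.Join(", ", TermsInRange(fruit, "banana", "date", true, false)));
// prints "banana, cherry"
```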

Public method GetTermsMatching(string, ReadOnlySpan<char>)

Returns all terms matching a wildcard pattern for a given field.
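
The pattern syntax is not stated here. The sketch below assumes the common convention that `*` matches any run of characters and `?` matches exactly one; `WildcardMatch` is a hypothetical helper, not the reader's matcher.

```csharp
using System;

// Iterative wildcard match with single-star backtracking (assumed syntax: '*' and '?').
static bool WildcardMatch(ReadOnlySpan<char> text, ReadOnlySpan<char> pattern)
{
    int t = 0, p = 0;
    int starP = -1, starT = 0; // position of the last '*' and the text index it covers up to
    while (t < text.Length)
    {
        if (p < pattern.Length && pattern[p] == '*')
        {
            starP = p++;   // remember the star, first try matching it against nothing
            starT = t;
        }
        else if (p < pattern.Length && (pattern[p] == '?' || pattern[p] == text[t]))
        {
            t++; p++;
        }
        else if (starP >= 0)
        {
            p = starP + 1; // backtrack: let the last star absorb one more character
            t = ++starT;
        }
        else
        {
            return false;
        }
    }
    while (p < pattern.Length && pattern[p] == '*') p++; // trailing stars match empty text
    return p == pattern.Length;
}

Console.WriteLine(WildcardMatch("test", "te*t"));  // prints True
Console.WriteLine(WildcardMatch("toast", "t?st")); // prints False
```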

Public method GetTermsMatchingRegex(string, Regex)

Returns terms for a field whose bare text matches the given compiled regex.

Public method GetTermsWithPrefix(ReadOnlySpan<char>)

Returns all terms sharing the given qualified prefix.
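
Because the keys are sorted, all terms sharing a prefix form one contiguous run. A sketch of one way to find that run is shown below: a lower-bound binary search followed by a forward scan. `IndicesWithPrefix` and the `byte[][]` layout are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static List<int> IndicesWithPrefix(byte[][] sortedKeys, ReadOnlySpan<byte> prefix)
{
    // Lower bound: first key that is >= prefix in byte order.
    int lo = 0, hi = sortedKeys.Length;
    while (lo < hi)
    {
        int mid = lo + ((hi - lo) >> 1);
        if (((ReadOnlySpan<byte>)sortedKeys[mid]).SequenceCompareTo(prefix) < 0) lo = mid + 1;
        else hi = mid;
    }
    // Keys with the prefix are contiguous from here; stop at the first miss.
    var hits = new List<int>();
    for (int i = lo; i < sortedKeys.Length && ((ReadOnlySpan<byte>)sortedKeys[i]).StartsWith(prefix); i++)
        hits.Add(i);
    return hits;
}

byte[][] keys =
{
    Encoding.UTF8.GetBytes("body\0apple"),
    Encoding.UTF8.GetBytes("body\0banana"),
    Encoding.UTF8.GetBytes("title\0apple"),
};
Console.WriteLine(string.Join(", ", IndicesWithPrefix(keys, Encoding.UTF8.GetBytes("body\0")))); // prints "0, 1"
```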

Public method IntersectAutomaton(string, IAutomaton)

Intersects the term dictionary with an automaton, returning matching terms. Operates on bare term bytes (after fieldPrefix). Uses CanMatch for pruning.
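
The pruning role of CanMatch can be sketched with two delegates standing in for the automaton. The shape of `IAutomaton` is not shown in this page, so `canMatch` (could some accepted term start with this prefix?) and `accepts` (is this whole term accepted?) are assumptions, as is the `Intersect` helper itself; the sketch also uses strings where the real method works on bare term bytes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

static List<string> Intersect(
    string[] sortedBareTerms, Func<string, bool> canMatch, Func<string, bool> accepts)
{
    var result = new List<string>();
    string? deadPrefix = null;
    foreach (string term in sortedBareTerms)
    {
        // Every term under a rejected prefix is skipped without consulting the automaton.
        if (deadPrefix != null && term.StartsWith(deadPrefix, StringComparison.Ordinal)) continue;
        deadPrefix = null;
        for (int len = 1; len <= term.Length; len++)
        {
            string prefix = term.Substring(0, len);
            if (!canMatch(prefix)) { deadPrefix = prefix; break; }
        }
        if (deadPrefix == null && accepts(term)) result.Add(term);
    }
    return result;
}

// Toy "automaton": accepts terms that start with "ca" and end with "t".
Func<string, bool> canMatch = p =>
    "ca".StartsWith(p, StringComparison.Ordinal) || p.StartsWith("ca", StringComparison.Ordinal);
Func<string, bool> accepts = t =>
    t.StartsWith("ca", StringComparison.Ordinal) && t.EndsWith("t", StringComparison.Ordinal);

var dictionary = new[] { "cab", "cart", "cat", "dog", "dot" };
Console.WriteLine(string.Join(", ", Intersect(dictionary, canMatch, accepts))); // prints "cart, cat"
```

In this sample, "dog" is rejected at its first character, and "dot" is then skipped because it shares the dead prefix "d".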

Public method Open(IndexInput)

Opens a v2 dictionary from an IndexInput positioned just after the codec header.

Public method TryGetPostingsOffset(ReadOnlySpan<byte>, out long)

Looks up a postings offset by UTF-8 byte key using a hash lookup: O(1) on average, falling back to a chain walk on collision.

Public method TryGetPostingsOffset(ReadOnlySpan<char>, out long)

Looks up a postings offset with an O(log N) binary search. Accepts a char span, which is encoded to UTF-8 internally.