Public sealed class IndexWriterConfig

Namespace
Rowles.LeanLucene.Index.Indexer
Assembly
Rowles.LeanLucene.dll

Configuration for the IndexWriter.

public sealed class IndexWriterConfig

Properties

Public property AnalyserInternCacheSize

Maximum number of entries in the StandardAnalyser token intern cache. Larger caches reduce per-token string allocation for repeated terms. Default: 4096.

Public property BKDMaxLeafSize

Maximum number of point values in a BKD tree leaf node. Smaller leaves give faster range queries at the cost of larger index files. Default: 512.

Public property BuildHnswOnFlush

Build an HNSW graph for every vector field at flush time. Disable to fall back to flat brute-force scan (useful for tiny indices where the build overhead outweighs benefit). Default: true.

Public property CharFilters

Character-level filters applied to text before tokenisation. The filters run in the order listed, before the analyser. Default: empty (no char filters).

Public property CompressionPolicy

Compression algorithm for stored fields. Options: None, Lz4, Zstandard. Default: Lz4 (fast decompression).

Public property DefaultAnalyser

Default analyser used for fields without a specific mapping.

Public property DeletionPolicy

Deletion policy applied after each commit. Default: keep only the latest commit.

Public property DurableCommits

When true (default), Commit() flushes file contents and directory metadata to disk via fsync before and after the segments_N rename, guaranteeing the commit survives a power loss. Disable only for write-heavy benchmarks where durability is not required; correctness suffers if the host crashes mid-commit.
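
As a minimal sketch of the benchmark case described above, the fragment below turns durable commits off. It assumes that IndexWriterConfig has a public parameterless constructor and that DurableCommits is a settable bool; this page confirms neither.

```csharp
using Rowles.LeanLucene.Index.Indexer;

// Assumption: parameterless constructor and a settable bool property.
var benchmarkConfig = new IndexWriterConfig
{
    // Skips the fsync calls around the segments_N rename.
    // Unsuitable for any index that must survive a crash or power loss.
    DurableCommits = false,
};
```

Leaving the property unset keeps the documented default of true.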

Public property FieldAnalysers

Per-field analyser overrides. Key is the field name.

Public property HnswBuildConfig

HNSW build configuration applied to every vector field. See HnswBuildConfig.

Public property HnswSeed

Optional deterministic seed for HNSW graph construction. When null, a random seed is generated per segment and persisted into the .hnsw file. Set explicitly for reproducible builds.
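
A reproducible build might be configured along the following lines. This is a sketch only: it assumes HnswSeed is a nullable integer property that can be set through an object initialiser, and the value 42 is arbitrary.

```csharp
using Rowles.LeanLucene.Index.Indexer;

// Assumption: HnswSeed is a settable nullable integer (its exact type is not shown on this page).
var reproducibleConfig = new IndexWriterConfig
{
    BuildHnswOnFlush = true, // documented default, stated here for clarity
    HnswSeed = 42,           // arbitrary fixed seed; leave null for a random seed per segment
};
```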

Public property IndexSort

Optional index-time sort order. When set, documents within each segment are physically reordered at flush time. Default: null (insertion order).

Public property MaxBufferedDocs

Maximum number of buffered documents before an automatic flush.

Public property MaxQueuedDocs

Maximum number of documents that can be queued for indexing before AddDocument blocks. Provides backpressure to prevent unbounded memory growth. Set to 0 to disable (not recommended). Default: 2 × MaxBufferedDocs.
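
The relationship between the two limits can be made explicit in configuration. The sketch below assumes both properties are settable integers; the figures are illustrative and simply reproduce the documented 2 × ratio.

```csharp
using Rowles.LeanLucene.Index.Indexer;

// Assumption: both properties are settable integers.
const int bufferedDocs = 10_000; // illustrative value, not a documented default

var config = new IndexWriterConfig
{
    MaxBufferedDocs = bufferedDocs,
    // Same ratio as the documented default (2 x MaxBufferedDocs), written out
    // so that the queue limit does not depend on how the default is derived.
    MaxQueuedDocs = 2 * bufferedDocs,
};
```

Setting MaxQueuedDocs explicitly avoids relying on whether the default is recomputed when MaxBufferedDocs changes, which this page does not state.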

Public property MaxTokensPerDocument

Maximum number of tokens allowed per text field per document. 0 means unlimited (no budget enforcement). Default: 0.

Public property MergeThreshold

Segment count threshold that triggers a tiered merge. When the number of segments at a given size tier reaches this value, the smallest are merged. Default: 10.

Public property MergeThrottleSegments

Maximum number of unmerged segments before AddDocument blocks until a merge completes. Provides backpressure to prevent unbounded segment accumulation. Default: 0 (disabled).
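
Merge triggering and merge throttling are configured separately. A hedged sketch, assuming both properties are settable integers; the throttle value of 40 is an illustrative choice, not a recommendation from the library.

```csharp
using Rowles.LeanLucene.Index.Indexer;

// Assumption: both properties are settable integers.
var mergeConfig = new IndexWriterConfig
{
    MergeThreshold = 10,        // documented default: merge when a size tier reaches 10 segments
    MergeThrottleSegments = 40, // illustrative: AddDocument blocks at this many unmerged segments
};
```

With MergeThrottleSegments left at its default of 0, no throttling is applied.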

Public property Metrics

Metrics collector for flush, merge, and commit latency tracking. Default: NullMetricsCollector (no-op).

Public property NormaliseVectors

Whether vector fields should be normalised (L2) at index time. When true, dot product equals cosine similarity, enabling cheaper search. Default: true.
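
The equivalence behind this setting follows directly from the definition of cosine similarity. For two vectors a and b that both have unit L2 norm:

```latex
\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert_2 \, \lVert b \rVert_2},
\qquad
\lVert a \rVert_2 = \lVert b \rVert_2 = 1
\;\Longrightarrow\;
\cos(a, b) = a \cdot b
```

The two norm computations and the division can therefore be skipped at search time. The identity holds only when both vectors are normalised; whether the query vector is also normalised by the library is not stated on this page.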

Public property PostingsSkipInterval

Skip interval for postings lists. Every N-th doc ID gets a skip pointer for O(log N) advance. Must be consistent between write and merge paths. Default: 128.

Public property RamBufferSizeMB

RAM buffer size in megabytes before an automatic flush.

Public property Schema

Optional schema defining per-field types and validation rules. When null (default), documents are accepted without schema validation.

Public property Similarity

Scoring model used by IndexSearcher. Default: BM25.

Public property StopWords

Custom stop words for the default StandardAnalyser. When null, the built-in English stop word list is used. Set to an empty list to disable stop word removal.

Public property StorePayloads

Whether to store per-position payloads in the postings.

Public property StoreTermVectors

Whether to store term vectors for text fields.

Public property StoredFieldBlockSize

Number of documents per stored field block. Larger blocks compress better but increase random-access cost. Default: 16.

Public property TokenBudgetPolicy

Action taken when a document exceeds MaxTokensPerDocument. Default: Truncate.
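
A token budget could be enabled as sketched below. The fragment assumes MaxTokensPerDocument is a settable integer, uses an illustrative limit, and leaves TokenBudgetPolicy untouched because the type of that property is not shown here.

```csharp
using Rowles.LeanLucene.Index.Indexer;

// Assumption: MaxTokensPerDocument is a settable integer.
var budgetedConfig = new IndexWriterConfig
{
    // Illustrative limit; 0 (the documented default) means unlimited.
    MaxTokensPerDocument = 10_000,
    // TokenBudgetPolicy is left at its documented default (Truncate).
};
```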