
StopWordFilter
- Namespace
- Rowles.LeanCorpus.Analysis.Filters
- Assembly
- Rowles.LeanCorpus.dll
Removes common English stop words from a token list using a frozen set for fast, allocation-free lookups.
public sealed class StopWordFilter : ITokenFilter
- Implements
- ITokenFilter
Constructors
StopWordFilter()
Initialises a new StopWordFilter using the default English stop word list.
StopWordFilter(IEnumerable<string>?)
Initialises a new StopWordFilter with a custom stop word list.
Fields
DefaultStopWords
The classic 33-word English stop word list used by the default analyser. This is the default list used by StandardAnalyser and aligns with the external benchmark baseline behaviour for maximum compatibility.
ExtendedStopWords
An extended English stop word list that removes a broader range of function words, prepositions, pronouns, modal verbs, and adverbs. This list is more aggressive than DefaultStopWords and will suppress indexing of terms such as after, before, could, how, and when. Use it only when reduced index size matters more than full-text compatibility. Pass this to IndexWriterConfig.StopWords to opt in.
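To illustrate the trade-off between the two lists, here is a minimal, standalone sketch. It does not use the library. `StopWordListComparison`, `DefaultSubset`, `ExtendedSubset`, and `Surviving` are hypothetical names, and both sets are small placeholder subsets rather than the real contents of DefaultStopWords or ExtendedStopWords. Only the five terms after, before, could, how, and when come from the description above. The rest are assumed for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class StopWordListComparison
{
    // Placeholder standing in for DefaultStopWords (assumed contents, not the real 33-word list).
    public static readonly HashSet<string> DefaultSubset =
        new(StringComparer.Ordinal) { "a", "the", "to" };

    // Placeholder standing in for ExtendedStopWords: the assumed default words
    // plus the five terms named in the field description.
    public static readonly HashSet<string> ExtendedSubset =
        new(StringComparer.Ordinal)
        {
            "a", "the", "to", "after", "before", "could", "how", "when"
        };

    // Returns the tokens that would still be indexed under the given stop word set.
    public static List<string> Surviving(IEnumerable<string> tokens, HashSet<string> stopWords)
    {
        var kept = new List<string>();
        foreach (var token in tokens)
        {
            if (!stopWords.Contains(token))
            {
                kept.Add(token);
            }
        }
        return kept;
    }
}
```

With the tokens how, to, restore, the, index, after, a, crash, the placeholder default subset keeps how, restore, index, after, crash, while the placeholder extended subset keeps only restore, index, crash. Under the more aggressive list, a search that depends on "how" or "after" can no longer match those terms.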
Methods
Apply(List<Token>)
Applies the filter to the token list, modifying it in place.
IsStopWord(ReadOnlySpan<char>)
Returns true if the given term span is a stop word, without allocating.
IsStopWord(string)
Returns true if the given term is a stop word.
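The members above can be approximated by the following standalone sketch of a frozen-set stop word filter. It is not the library's implementation. `StopWordSketch` is a hypothetical class, tokens are modelled as plain strings rather than the library's Token type, and ordinal (case-sensitive) comparison is an assumption. The span-based lookup relies on `FrozenSet<T>.GetAlternateLookup`, which requires .NET 9 or later.

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

// Hypothetical stand-in for a stop word filter; not the actual StopWordFilter.
public sealed class StopWordSketch
{
    private readonly FrozenSet<string> _words;

    // Lets a ReadOnlySpan<char> be looked up without first allocating a string.
    private readonly FrozenSet<string>.AlternateLookup<ReadOnlySpan<char>> _spanLookup;

    public StopWordSketch(IEnumerable<string> words)
    {
        // Assumption: ordinal comparison. The library's case handling is not documented here.
        _words = words.ToFrozenSet(StringComparer.Ordinal);
        _spanLookup = _words.GetAlternateLookup<ReadOnlySpan<char>>();
    }

    public bool IsStopWord(string term) => _words.Contains(term);

    public bool IsStopWord(ReadOnlySpan<char> term) => _spanLookup.Contains(term);

    // Removes stop words from the list in place, keeping the remaining order.
    public void Apply(List<string> tokens) => tokens.RemoveAll(t => _words.Contains(t));
}
```

The span overload is useful when a term is a slice of a larger buffer, for example `filter.IsStopWord(text.AsSpan(0, 3))`, because no substring has to be created for the check.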