Lucene
API
index module :star:
write the DB
search module :star:
search DB
analysis module (reader -> token stream) :star:
queryParser module
parse query string
document module
store module
storage reader
util module
basic data type
Byte
VInt
Uint32 / Uint64
Chars
String
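The VInt type listed above is a variable-length integer: each byte carries 7 bits of the value, low-order bits first, and the high bit signals that another byte follows. A minimal Python sketch of that scheme (the function names `encode_vint` and `decode_vint` are illustrative, not Lucene identifiers):

```python
from typing import Tuple


def encode_vint(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low-order bits first."""
    if value < 0:
        raise ValueError("VInt holds non-negative integers only")
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)  # last byte has the high bit clear
    return bytes(out)


def decode_vint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_position) for the VInt starting at data[pos]."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7
```

Small values cost one byte, which is why counts and deltas in the index files are typically written this way.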
index file structure
lock
write lock
segments_N
the file with the largest N (generation) is the active one
older ones may still be needed temporarily
compound (cfs)
now appears to be stored per segment
segments.gen
generation information for segments
files in each segment
fnm
field name
segment relative
fdx
field index
pointer for each document to position of its fields in .fdt
fdt
field data (for each doc)
field count
field number (for each field)
values (for each field)
Bits (state flag)
tis
term data
term -> field
term text (shared prefix length + suffix)
term info
document frequency (number)
frequency file delta
proximity file delta
tii
term index
entirely read into memory
term info & pointer to term data (in tis)
frq
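The term data notes above (term text split into prefix and suffix, file pointers stored as deltas) can be sketched as follows. This is a simplified illustration, not the on-disk layout: the tuple shapes and the names `compress_terms` and `decompress_terms` are assumptions, and only the frequency-file pointer is modelled.

```python
def compress_terms(entries):
    """entries: sorted list of (term_text, doc_freq, freq_pointer).

    Each term is stored as (shared prefix length, suffix, doc_freq,
    pointer delta), all relative to the previous term.
    """
    records = []
    prev_text, prev_ptr = "", 0
    for text, doc_freq, freq_ptr in entries:
        prefix = 0
        limit = min(len(prev_text), len(text))
        while prefix < limit and prev_text[prefix] == text[prefix]:
            prefix += 1
        records.append((prefix, text[prefix:], doc_freq, freq_ptr - prev_ptr))
        prev_text, prev_ptr = text, freq_ptr
    return records


def decompress_terms(records):
    """Rebuild the full (term_text, doc_freq, freq_pointer) entries."""
    entries = []
    prev_text, prev_ptr = "", 0
    for prefix, suffix, doc_freq, freq_delta in records:
        text = prev_text[:prefix] + suffix
        ptr = prev_ptr + freq_delta
        entries.append((text, doc_freq, ptr))
        prev_text, prev_ptr = text, ptr
    return entries
```

Because terms are sorted, neighbours share long prefixes and pointers grow slowly, so both the suffixes and the deltas stay small.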
document
break into fields
Text
Keyword
UnIndexed
UnStored
word breaking
ngrams
NLP
included data
body
path
have unique id
assigned in increasing order
gaps removed when merging
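The four field kinds above differ in whether a value is stored, indexed, and broken into words. A rough sketch, assuming they behave like the early Lucene `Field` factory methods of the same names (for string values); `FieldType`, `FIELD_TYPES` and `plan_field` are hypothetical names, and whitespace splitting stands in for a real analyzer:

```python
from typing import NamedTuple


class FieldType(NamedTuple):
    stored: bool     # value kept literally so it can be returned with hits
    indexed: bool    # value is searchable
    tokenized: bool  # value is broken into words before indexing


# Assumed flag combinations for each field kind.
FIELD_TYPES = {
    "Text": FieldType(stored=True, indexed=True, tokenized=True),
    "Keyword": FieldType(stored=True, indexed=True, tokenized=False),
    "UnIndexed": FieldType(stored=True, indexed=False, tokenized=False),
    "UnStored": FieldType(stored=False, indexed=True, tokenized=True),
}


def plan_field(kind, value):
    """Return (stored_value, index_terms) for a field of the given kind."""
    field_type = FIELD_TYPES[kind]
    stored = value if field_type.stored else None
    if not field_type.indexed:
        terms = []
    elif field_type.tokenized:
        terms = value.lower().split()  # stand-in for real word breaking
    else:
        terms = [value]  # indexed as a whole, e.g. a path
    return stored, terms
```

For the included data listed above, a body would typically be a Text or UnStored field, and a path a Keyword field.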
indexing
new small indexing files
segments are complete indices
merging indexing files
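The indexing flow above (write new small segments, each a complete index, then merge them) can be sketched in memory. This is a toy model under stated assumptions: a segment is a plain dict, `build_segment` and `merge_segments` are illustrative names, and deleted documents are dropped at merge time so that ids are renumbered without gaps.

```python
from collections import defaultdict


def build_segment(docs):
    """Index a small batch of documents into a self-contained segment."""
    postings = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term in sorted(set(text.lower().split())):
            postings[term].append(doc_id)
    return {"docs": list(docs), "deleted": set(), "postings": dict(postings)}


def merge_segments(segments):
    """Merge segments into one, dropping deleted docs and renumbering ids."""
    merged_docs = []
    merged_postings = defaultdict(list)
    for segment in segments:
        # Map each surviving local id to its id in the merged segment.
        remap = {}
        for old_id, doc in enumerate(segment["docs"]):
            if old_id in segment["deleted"]:
                continue
            remap[old_id] = len(merged_docs)
            merged_docs.append(doc)
        for term, doc_ids in segment["postings"].items():
            for old_id in doc_ids:
                if old_id in remap:
                    merged_postings[term].append(remap[old_id])
    return {"docs": merged_docs, "deleted": set(),
            "postings": dict(merged_postings)}
```

Usage: build one segment per batch, mark deletions in `segment["deleted"]`, and call `merge_segments` on the list when there are too many small segments.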
inverted document
field
stored literally
terms
indexed in inverted form (term -> documents)
separated or as a whole
segment
stores
document data
field_name -> field_value
pair list
some are deleted but retained
term dictionary
term frequency
term proximity
position in document
normalization factors
term vector
terms
frequency
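The per-segment data above (term dictionary, term frequency, term proximity, term vector) can be illustrated with a small in-memory sketch. The names `index_positions` and `term_vector` are hypothetical, and whitespace splitting stands in for the analysis module.

```python
from collections import Counter, defaultdict


def index_positions(docs):
    """Build term -> {doc_id: [positions]} for a list of document texts.

    The number of documents per term is its document frequency; the
    length of each position list is the term frequency in that document.
    """
    postings = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for position, term in enumerate(text.lower().split()):
            postings[term].setdefault(doc_id, []).append(position)
    return dict(postings)


def term_vector(text):
    """Per-document view: term -> frequency within this one document."""
    return dict(Counter(text.lower().split()))
```

The postings answer "which documents contain this term, and where", while the term vector answers "which terms does this document contain, and how often".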