# Kelondro
Kelondro is the database system used in YaCy. It stores its data in sets
of files.
On the local host it is used to store the RWI (reverse word index), the
citation index, queues, bookmarks, etc.
All log entries associated with its operation are marked KELONDRO in
the log.
The system is designed for rotational disks and tries to minimize costly
I/O. Data is first dumped into small files, which are later merged into
larger files, either in the background or during start-up (an I/O- and
memory-heavy operation).
Files are named after the timestamp of their creation, e.g.:
DATA/INDEX/freeworld/SEGMENTS/default/citation.index.20250309095834126.blob
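As a rough illustration of the naming scheme, the following sketch extracts the creation time from such a file name. It assumes that the 17 digits before `.blob` follow the layout yyyyMMddHHmmssSSS, which is inferred from the example above and not confirmed by the YaCy sources. The class `BlobFileName` and its method are hypothetical helpers, not part of YaCy.

```java
import java.time.LocalDateTime;

// Hypothetical helper, not part of YaCy.
final class BlobFileName {

    // Assumption: the last dot-separated part before ".blob" is a
    // 17-digit timestamp in the layout yyyyMMddHHmmssSSS.
    static LocalDateTime creationTime(String fileName) {
        if (!fileName.endsWith(".blob")) {
            throw new IllegalArgumentException("not a blob file: " + fileName);
        }
        String stem = fileName.substring(0, fileName.length() - ".blob".length());
        String stamp = stem.substring(stem.lastIndexOf('.') + 1);
        if (stamp.length() != 17 || !stamp.chars().allMatch(Character::isDigit)) {
            throw new IllegalArgumentException("no 17-digit timestamp in: " + fileName);
        }
        int year   = Integer.parseInt(stamp.substring(0, 4));
        int month  = Integer.parseInt(stamp.substring(4, 6));
        int day    = Integer.parseInt(stamp.substring(6, 8));
        int hour   = Integer.parseInt(stamp.substring(8, 10));
        int minute = Integer.parseInt(stamp.substring(10, 12));
        int second = Integer.parseInt(stamp.substring(12, 14));
        int millis = Integer.parseInt(stamp.substring(14, 17));
        // LocalDateTime expects nanoseconds, so convert the milliseconds.
        return LocalDateTime.of(year, month, day, hour, minute, second, millis * 1_000_000);
    }
}
```

Under that assumption, `BlobFileName.creationTime("citation.index.20250309095834126.blob")` yields 9 March 2025, 09:58:34.126.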
When YaCy stops, the .gap and .idx data is dumped (presumably from memory;
this is unconfirmed) and written to files.
If the system crashed or was switched off abruptly, the indexes and gaps
have to be regenerated upon startup, which can be time- and I/O-consuming,
depending on the size of the files. For large datasets this may take
several hours.
In the case of RWIs, for example, files are created as you index. They are
dumped from time to time (when the crawl pauses or the cache is full) and a
new file is started. Smaller files are merged into larger ones in the
background; more extensive merging is done upon startup/restart.
segments, blobs, heaps?
Blobs, as described in a source code comment:
/*
* This class implements a BLOB using a set of Heap objects
* In addition to a Heap this BLOB can delete large amounts of data using a given time limit.
* This is realized by creating separate BLOB files. New Files are created when either
* - a given time limit is reached
* - a given space limit is reached
* To organize such an array of BLOB files, the following file name structure is used:
* <BLOB-Name>/<YYYYMMDDhhmm>.blob
* That means all BLOB files are inside a directory that has the name of the BLOBArray.
* To delete content that is out-dated, one special method is implemented that deletes content by a given
* time-out. Deletions are not made automatically, they must be triggered using this method.
*/
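The rollover rule in the comment above (start a new BLOB file once a time limit or a space limit is reached, and name it `<BLOB-Name>/<YYYYMMDDhhmm>.blob`) can be sketched roughly as follows. This is a minimal illustration and not the YaCy implementation: the class `BlobRollover`, its fields and its method names are hypothetical, and the limits are plain parameters.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of the rollover rule, not the YaCy class itself.
final class BlobRollover {
    // Matches the <YYYYMMDDhhmm> part of the file name structure.
    private static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    private final String blobName;  // directory name of the BLOB array
    private final Duration maxAge;  // the "time limit"
    private final long maxBytes;    // the "space limit"

    BlobRollover(String blobName, Duration maxAge, long maxBytes) {
        this.blobName = blobName;
        this.maxAge = maxAge;
        this.maxBytes = maxBytes;
    }

    // A new file is needed when either limit is reached.
    boolean needsNewFile(LocalDateTime created, long currentBytes, LocalDateTime now) {
        boolean tooOld = Duration.between(created, now).compareTo(maxAge) >= 0;
        boolean tooBig = currentBytes >= maxBytes;
        return tooOld || tooBig;
    }

    // Builds the relative path <BLOB-Name>/<YYYYMMDDhhmm>.blob for a new file.
    String newFilePath(LocalDateTime now) {
        return blobName + "/" + STAMP.format(now) + ".blob";
    }
}
```

Because every file carries its creation time in its name, outdated content can be dropped by deleting whole files, which is what makes the time-out deletion described in the comment cheap.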
Upon shrink, there are 4 stages:
best match?
the smallest blobs are merged
up to the maximum size (4 GB?)
blobs older than 1 month are rewritten (why? it costs I/O!)
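The "smallest blobs are merged, up to the maximum size" step could look roughly like the sketch below. It only illustrates the selection idea; the real shrink logic in YaCy may differ. `ShrinkPlanner`, `Blob` and `smallestPairToMerge` are hypothetical names, and the maximum size is passed in as a parameter because the 4 GB figure above is unconfirmed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of picking merge candidates, not YaCy code.
final class ShrinkPlanner {

    // Minimal stand-in for a blob file: its name and size in bytes.
    static final class Blob {
        final String name;
        final long bytes;

        Blob(String name, long bytes) {
            this.name = name;
            this.bytes = bytes;
        }
    }

    // Returns the two smallest blobs if their combined size does not
    // exceed maxBytes; otherwise returns an empty list (nothing to merge).
    static List<Blob> smallestPairToMerge(List<Blob> blobs, long maxBytes) {
        List<Blob> sorted = new ArrayList<>(blobs);
        sorted.sort(Comparator.comparingLong((Blob b) -> b.bytes));
        List<Blob> pair = new ArrayList<>();
        if (sorted.size() < 2) {
            return pair;
        }
        Blob first = sorted.get(0);
        Blob second = sorted.get(1);
        if (first.bytes + second.bytes <= maxBytes) {
            pair.add(first);
            pair.add(second);
        }
        return pair;
    }
}
```

Merging the smallest files first keeps the amount of data rewritten per merge low, which fits the stated goal of minimizing I/O on rotational disks.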