stellar-core History module
The module design attempts to satisfy several key constraints:
Tolerable latency for two main scenarios: "new peer joins network" and "peer temporarily de-synchronized, needs to catch up".
The history store is served to clients over HTTP.
In the root of the history store is a history archive state file (class HistoryArchiveState), which stores the most recent checkpoint (including version info, most recent ledger number and most recent bucket hashes). As per RFC 5785, this checkpoint is stored at .well-known/stellar-history.json as a JSON file.
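As a small illustration of the well-known location, the sketch below builds the URL a client would request for the most recent checkpoint. The archive base URL is a hypothetical placeholder and `state_url` is an illustrative helper, not part of stellar-core.

```python
from urllib.parse import urljoin

# Well-known location of the most recent checkpoint, as given in the text.
WELL_KNOWN_STATE = ".well-known/stellar-history.json"


def state_url(archive_base: str) -> str:
    """Return the URL of the history archive state file under archive_base.

    archive_base is the HTTP root of a history store; the trailing slash
    is normalized so that urljoin appends rather than replaces the last
    path segment.
    """
    return urljoin(archive_base.rstrip("/") + "/", WELL_KNOWN_STATE)


# Hypothetical archive root, used only for illustration.
url = state_url("https://history.example.org/archive")
```

A client would then issue an ordinary HTTP GET for that URL and parse the response body as JSON.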
The history module is responsible for storing and retrieving "historical records" in longer-term (public, internet) archives.
Historical records take two forms:
Buckets from the BucketList: checkpoints of "full ledger state".
A sequential log of ledger headers and transactions, stored as separate ledger and transaction files.
Checkpoints are made every 64 ledgers, which (at 5s ledger close time) is 320s or about 5m20s.
There will be about 11 checkpoints per hour (11.25 exactly), 270 per day, and 98,550 per year.
Counting checkpoints within a 32-bit value gives about 43,581 years of service for the system.
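The figures above can be reproduced with a few lines of arithmetic. This is only a check of the numbers in the text, assuming a constant 5-second ledger close time and a 365-day year; the variable names are illustrative.

```python
# Checkpoint frequency arithmetic, assuming a constant ledger close time.
LEDGERS_PER_CHECKPOINT = 64
LEDGER_CLOSE_SECONDS = 5  # assumption from the text; real close times vary

checkpoint_seconds = LEDGERS_PER_CHECKPOINT * LEDGER_CLOSE_SECONDS  # 320 s
per_hour = 3600 / checkpoint_seconds    # 11.25, quoted as "about 11"
per_day = 86400 // checkpoint_seconds   # 270
per_year = per_day * 365                # 98,550

# Years until a 32-bit checkpoint counter would be exhausted.
years_of_service = 2**32 // per_year    # 43,581

print(checkpoint_seconds, per_hour, per_day, per_year, years_of_service)
```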
While the most recent checkpoint is in .well-known/stellar-history.json, each checkpoint is also stored permanently at a path whose name includes the last ledger number in the checkpoint (as a 32-bit hex string) and stored in a 3-level deep directory tree of hex digit prefixes.
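One way such a path could be derived is sketched below. The text only specifies the 8-digit hex name and the 3-level prefix tree; the use of two hex digits per directory level, the "history" category prefix and the ".json" extension are assumptions made for illustration, and `checkpoint_path` is a hypothetical helper.

```python
def checkpoint_path(last_ledger: int, category: str = "history",
                    ext: str = "json") -> str:
    """Return an illustrative permanent path for a checkpoint.

    last_ledger is the last ledger number in the checkpoint. It is
    rendered as a zero-padded 32-bit hex string, and the first three
    byte-sized prefixes of that string name the directory levels
    (two hex digits per level is an assumption, not stated in the text).
    """
    if not 0 <= last_ledger < 2**32:
        raise ValueError("ledger number must fit in 32 bits")
    hex_name = f"{last_ledger:08x}"
    levels = [hex_name[i:i + 2] for i in (0, 2, 4)]
    return "/".join([category, *levels, f"{category}-{hex_name}.{ext}"])


# The first checkpoint ends at ledger 0x3f.
first = checkpoint_path(0x3F)
```

Spreading files across prefix directories keeps any single directory from growing without bound as checkpoints accumulate.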
Boundary conditions and counts
There is no ledger 0 -- that's the sequence number of a ledger with no content, before "ledger 1, the genesis ledger" -- so the initial ledger block (block 0x0000003f) has 63 "real" ledger objects in it, not 64 as in all subsequent blocks.
We could, instead, shift all arithmetic in the system to "count from 1" and have ledger blocks run from [1,64] and [65,128] and so forth; but the disadvantages of propagating counts-from-1 arithmetic all through the system seem worse than the disadvantage of having to special-case the first history block, so we stick with "counting from 0" for now.
So all ledger blocks start on a multiple of 64 and run until one less than the next multiple of 64, inclusive: [0,63], [64,127], [128,191], etc.
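The block boundaries and the special case for the first block can be expressed as a short sketch. The function names are illustrative and not taken from stellar-core; the only facts used are the block size of 64, counting from 0, and the absence of a real ledger 0.

```python
BLOCK_SIZE = 64  # ledgers per block, as stated in the text


def block_bounds(ledger_seq: int) -> tuple[int, int]:
    """Return the inclusive (first, last) ledger numbers of the block
    containing ledger_seq, counting from 0."""
    first = (ledger_seq // BLOCK_SIZE) * BLOCK_SIZE
    return first, first + BLOCK_SIZE - 1


def real_ledger_count(ledger_seq: int) -> int:
    """Return how many real ledgers the containing block holds.

    Ledger 0 has no content, so the first block [0,63] holds only
    ledgers 1..63; every later block holds a full 64.
    """
    first, last = block_bounds(ledger_seq)
    return last - max(first, 1) + 1


print(block_bounds(1), real_ledger_count(1))      # first block
print(block_bounds(100), real_ledger_count(100))  # second block
```

The last ledger of each block (0x3f, 0x7f, 0xbf, and so on) is the number that names the corresponding checkpoint.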
The catchup algorithm
When catching up, it's useful to denote the ledgers involved symbolically. Consider the following timeline:

GENESIS LAST RESUME INIT NEXT TIP
[-- buffered --]

The network's view of time begins at ledger GENESIS and, sometime thereafter, we assume this peer lost synchronization with its neighbour peers at ledger LAST.