Storage

CamusDB stores relational data on top of a distributed key/value layer provided by Kahuna. Tables, rows, indexes, schema metadata, locks, and transaction state are mapped to persistent key/value entries.

The design keeps SQL as the user-facing model while using a storage layout that can be routed, replicated, locked, and committed by the distributed KV layer.

Storage Stack

CamusDB's storage path has three layers:

| Layer | Responsibility |
| --- | --- |
| SQL engine | Plans statements, validates schema, applies constraints, and decides which rows or indexes are touched. |
| CamusDB KV mapping | Encodes table rows, index entries, and schema metadata as deterministic key/value entries. |
| Kahuna | Persists keys, coordinates locks and transactions, and relies on Raft-backed partition ownership in cluster mode. |

Kahuna supports embedded storage backends such as RocksDB and SQLite. In the current CamusDB source, standalone databases are opened with a SQLite-backed embedded Kahuna node, and the included Docker cluster configuration also uses SQLite-backed KV and WAL paths. The storage wrapper also contains persistent RocksDB/SQLite construction helpers for deployment and testing scenarios.

For the lower-level backend details, see Kahuna's storage overview. For the recovery path, see WAL And Recovery.

Database Create And Open

Databases must be created explicitly before use. At creation time, CamusDB allocates an immutable database id and registers the name-to-id mapping.

When a database is opened in standalone mode, CamusDB uses the id-based storage directories created for that database:

{data_dir}/{database_id}/kv
{data_dir}/{database_id}/wal

It then starts an embedded Kahuna node, waits for the local partition leader, flushes recovered WAL state, and loads schema metadata from KV storage.

The human-readable database name is not used as the storage directory. Renaming a database updates the registry binding but leaves the id, directories, table ids, row keys, and index keys unchanged.
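The separation between name and storage location can be illustrated with a minimal sketch. The `registry` dictionary and both function names are hypothetical stand-ins, not CamusDB APIs; only the `{data_dir}/{database_id}/kv` and `{data_dir}/{database_id}/wal` shapes come from the text above.

```python
from pathlib import PurePosixPath


def storage_dirs(data_dir, database_id):
    """Return the (kv, wal) directories derived from the database id."""
    base = PurePosixPath(data_dir) / database_id
    return base / "kv", base / "wal"


def rename_database(registry, old_name, new_name):
    """Rebind a name to the same id; hypothetical registry as a plain dict."""
    if new_name in registry:
        raise ValueError(f"database name already in use: {new_name}")
    registry[new_name] = registry.pop(old_name)


# The id (an assumed placeholder value) stays fixed across the rename,
# so the derived directories do not move.
registry = {"sales": "db01"}
dirs_before = storage_dirs("/data", registry["sales"])
rename_database(registry, "sales", "orders")
dirs_after = storage_dirs("/data", registry["orders"])
```

Because the paths are a function of the id alone, a rename in this sketch touches only the registry entry.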

In cluster mode, CamusDB uses a process-level shared Kahuna node. The cluster node is started during server startup, then all opened databases share that distributed storage layer. WAL replay and flush happen at node startup so recovered committed entries are available before database metadata is used.

Key Layout

Rows and indexes are stored under table-specific prefixes. The table prefix is important because it lets the underlying routing layer consistently place table data on the expected partition.

| Object | Key shape | Value |
| --- | --- | --- |
| Row | {tableId}:r/{rowId} | Serialized row bytes. |
| Unique index entry | {tableId}:i:{indexId}/{encodedKey} | Row id as UTF-8 text. |
| Non-unique index entry | {tableId}:i:{indexId}/{encodedKey}{rowId} | Row id as UTF-8 text. |
| Schema metadata | {databaseId}/meta/schema | Serialized table schema map. |
| System metadata | {databaseId}/meta/system | Serialized system schema. |

Non-unique index keys append the row id directly after the encoded key. The row id has a fixed 24-character representation, so CamusDB can split it back out while preserving sortable index keys.
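The key shapes above can be sketched as plain string builders. The function names are illustrative, and the sketch assumes that table and index ids contain no `/` character; the 24-character row id width is taken from the layout description.

```python
ROW_ID_LEN = 24  # fixed-width row id text, per the key layout


def row_key(table_id, row_id):
    return f"{table_id}:r/{row_id}"


def unique_index_key(table_id, index_id, encoded_key):
    return f"{table_id}:i:{index_id}/{encoded_key}"


def non_unique_index_key(table_id, index_id, encoded_key, row_id):
    if len(row_id) != ROW_ID_LEN:
        raise ValueError("row id must be exactly 24 characters")
    # The row id is appended directly, with no separator.
    return f"{table_id}:i:{index_id}/{encoded_key}{row_id}"


def split_non_unique_key(key):
    """Recover (encoded_key, row_id) from a non-unique index key."""
    _prefix, _, tail = key.partition("/")  # assumes ids contain no "/"
    return tail[:-ROW_ID_LEN], tail[-ROW_ID_LEN:]
```

Because the row id width is fixed, the split needs no delimiter, which keeps the encoded key bytes contiguous and leaves the sort order of the index prefix untouched.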

Row Values

Each row is stored as a compact binary value. The row payload includes:

  • Schema version.
  • Row object id.
  • One encoded value for each column in schema order.

The schema version lets CamusDB deserialize older row payloads through the schema history attached to the table. Column values are encoded by type:

| Column type | Stored representation |
| --- | --- |
| OID | 12-byte object id. |
| INT64 | 8-byte signed integer. |
| FLOAT64 | 8-byte double. |
| STRING | Length-prefixed UTF-16 string. |
| BOOL | Boolean marker byte. |
| NULL | Null marker byte. |
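A rough sketch of such a payload follows. The field widths for OID, INT64, and FLOAT64 come from the table; everything else is an assumption made for illustration: little-endian byte order, a 4-byte schema version, a 4-byte byte-count prefix for strings, the specific marker byte values, and object ids written as 24 hexadecimal characters. The real CamusDB wire format may differ on each of these points.

```python
import struct

# Hypothetical marker values; the actual bytes are not given in this page.
NULL_MARKER = b"\x00"
TRUE_MARKER = b"\x01"
FALSE_MARKER = b"\x02"


def encode_value(column_type, value):
    """Encode one column value (encode-only sketch, no decoder)."""
    if value is None:
        return NULL_MARKER
    if column_type == "OID":
        raw = bytes.fromhex(value)  # assumes 24 hex characters
        if len(raw) != 12:
            raise ValueError("object id must be 12 bytes")
        return raw
    if column_type == "INT64":
        return struct.pack("<q", value)  # 8-byte signed integer
    if column_type == "FLOAT64":
        return struct.pack("<d", value)  # 8-byte double
    if column_type == "STRING":
        data = value.encode("utf-16-le")
        return struct.pack("<I", len(data)) + data  # assumed 4-byte prefix
    if column_type == "BOOL":
        return TRUE_MARKER if value else FALSE_MARKER
    raise ValueError(f"unsupported column type: {column_type}")


def encode_row(schema_version, row_id_hex, columns):
    """columns is a list of (column_type, value) pairs in schema order."""
    out = struct.pack("<I", schema_version) + bytes.fromhex(row_id_hex)
    for column_type, value in columns:
        out += encode_value(column_type, value)
    return out
```

Writing the schema version first means a reader can pick the matching entry in the schema history before interpreting any column bytes.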

Index Encoding

Index keys must sort the same way SQL values sort. CamusDB uses an order-preserving encoder for composite index values:

  • NULL sorts before present values.
  • INT64 flips the sign bit and stores fixed-width hexadecimal text.
  • FLOAT64 applies an order-preserving transform to IEEE-754 bits.
  • BOOL stores 0 or 1.
  • STRING and OID values use terminators and escaping so prefixes sort correctly.

This lets CamusDB scan index keys in lexicographic KV order and get SQL-order results for the indexed columns.
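The rules above can be approximated with a short sketch. The INT64 rule (flip the sign bit, emit fixed-width hexadecimal) follows the text directly. The rest is assumed for illustration: the FLOAT64 transform shown is the common "invert all bits for negatives, set the sign bit otherwise" technique, the `0`/`1` presence tags are hypothetical, and the string scheme (NUL terminator, embedded NUL escaped) is one well-known option rather than CamusDB's confirmed format. NaN handling is left out.

```python
import struct

MASK64 = (1 << 64) - 1
SIGN_BIT = 1 << 63


def encode_int64(value):
    if not -SIGN_BIT <= value < SIGN_BIT:
        raise ValueError("value out of INT64 range")
    # Flipping the sign bit maps signed order onto unsigned order.
    return format((value & MASK64) ^ SIGN_BIT, "016x")


def encode_float64(value):
    (bits,) = struct.unpack(">Q", struct.pack(">d", value))
    if bits & SIGN_BIT:
        bits = ~bits & MASK64  # negatives: invert so larger magnitude sorts first
    else:
        bits |= SIGN_BIT       # non-negatives: move above all negatives
    return format(bits, "016x")


def encode_string(value):
    # Escape embedded NUL, then terminate so "a" sorts before "ab".
    return value.replace("\x00", "\x00\xff") + "\x00"


def encode_component(value):
    if value is None:
        return "0"  # hypothetical tag: NULL sorts before present values
    if isinstance(value, bool):  # check bool before int
        return "1" + ("1" if value else "0")
    if isinstance(value, int):
        return "1" + encode_int64(value)
    if isinstance(value, float):
        return "1" + encode_float64(value)
    if isinstance(value, str):
        return "1" + encode_string(value)
    raise TypeError(f"unsupported index value: {value!r}")


def encode_composite(values):
    return "".join(encode_component(v) for v in values)
```

Sorting a list of tuples with `key=encode_composite` should then give the same order as comparing the tuples column by column with NULL first, which is the property a lexicographic KV scan depends on.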

Writes And Locks

Write paths use persistent KV entries and explicit transaction state:

  1. Start a transaction.
  2. Acquire an exclusive lock for each row, index, or metadata key that will be written.
  3. Write or delete the affected keys.
  4. Track acquired locks and modified keys in the transaction object.
  5. Commit or roll back through Kahuna's transaction API.

Cross-partition writes use two-phase commit. Transactions are Serializable by default and rely on committed MVCC reads, conflict detection, and tracked write intents to coordinate atomic commits.
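The five steps can be sketched with a toy in-memory store. `ToyKV`, `ToyTransaction`, and `LockConflict` are hypothetical names that mirror the shape of the write path only; they are not Kahuna's transaction API, and the sketch omits MVCC, persistence, and two-phase commit entirely.

```python
class LockConflict(Exception):
    pass


class ToyKV:
    def __init__(self):
        self.data = {}   # committed key -> value
        self.locks = {}  # key -> id of the transaction holding the lock


class ToyTransaction:
    _next_id = 0

    def __init__(self, kv):  # step 1: start a transaction
        ToyTransaction._next_id += 1
        self.id = ToyTransaction._next_id
        self.kv = kv
        self.locks = set()   # step 4: acquired locks
        self.writes = {}     # step 4: modified keys (None means delete)

    def _lock(self, key):    # step 2: exclusive lock per written key
        owner = self.kv.locks.setdefault(key, self.id)
        if owner != self.id:
            raise LockConflict(key)
        self.locks.add(key)

    def put(self, key, value):  # step 3: write
        self._lock(key)
        self.writes[key] = value

    def delete(self, key):      # step 3: delete
        self._lock(key)
        self.writes[key] = None

    def commit(self):           # step 5: apply staged writes, release locks
        for key, value in self.writes.items():
            if value is None:
                self.kv.data.pop(key, None)
            else:
                self.kv.data[key] = value
        self._release()

    def rollback(self):         # step 5: discard staged writes, release locks
        self.writes.clear()
        self._release()

    def _release(self):
        for key in self.locks:
            self.kv.locks.pop(key, None)
        self.locks.clear()
```

Tracking locks and modified keys on the transaction object is what lets both commit and rollback clean up without scanning the store.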

Scans

Full table scans read the row bucket prefix:

{tableId}:r

Index scans read the index bucket prefix:

{tableId}:i:{indexId}

Because row ids and encoded index keys preserve sort order, CamusDB can stream rows or index entries from KV storage in deterministic order before applying query filtering, projection, sorting, limits, and aggregation.
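A prefix scan over ordered keys can be sketched as follows. The sorted Python list stands in for the KV store's ordered key space, and `prefix_scan` is an illustrative helper, not a CamusDB or Kahuna function.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix):
    """Yield keys that start with prefix, in stored (lexicographic) order."""
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        yield key


# Example key space with one table (id 7), two rows, and one index (id 1).
keys = sorted([
    "7:r/000000000000000000000002",
    "7:r/000000000000000000000001",
    "7:i:1/abc",
    "8:r/000000000000000000000001",
])
rows = list(prefix_scan(keys, "7:r"))
index_entries = list(prefix_scan(keys, "7:i:1"))
```

Filtering, projection, sorting, limits, and aggregation would then run over the streamed entries rather than inside the scan itself.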

Standalone vs Cluster Mode

Standalone mode creates a local embedded Kahuna node for each opened database. This is the simplest setup for tutorials and local development.

Cluster mode creates one process-level shared storage node and wires it to real inter-node communication and static discovery. Data is partitioned across Raft partitions, and each partition elects its own leader through Kommander.

See Cluster Mode for startup commands and configuration.