Implementing a Basic Database Engine in C for Embedded Systems

Embedded systems often require local data management without the overhead of full database servers. Implementing a lightweight database engine in C gives developers direct control over memory, performance, and storage. This article walks through the design and implementation of a simple embedded database engine, covering data structures, CRUD operations, indexing, and persistence strategies for resource-constrained environments.

Core Requirements for an Embedded Database Engine

An embedded database engine must operate within tight limits on RAM, flash, and processing speed. Typical requirements include deterministic behavior, minimal code footprint, and no external dependencies. The engine should support basic operations: insert, retrieve, update, delete, and search. Many embedded databases also need to survive power loss and store data on non-volatile memory such as EEPROM, SPI flash, or SD cards.

Choosing the right data structure is the first design decision. Arrays are simple but limited by static sizes. Linked lists allow dynamic growth but add pointer overhead. For balanced performance, a hybrid approach using fixed-size record pools with a free-list can work well. SQLite's design principles offer useful insights even for much simpler engines.

Designing the Record Storage Layer

The storage layer manages how records are laid out in memory or on disk. A common pattern is to treat each record as a fixed-length structure to simplify pointer arithmetic and allow direct indexing. Variable-length records complicate fragmentation and require a memory manager.

Fixed-Length Records with a Record Pool

Define a maximum number of records (e.g., MAX_RECORDS) and allocate a static array. A bitmap or free-list tracks which slots are used. When a record is deleted, its slot returns to the pool. This approach avoids dynamic allocation and guarantees O(1) allocation time.

#define MAX_RECORDS 256
typedef struct {
    int id;
    char name[32];
    float value;
    int active; // 1 if slot in use
} Record;

Record pool[MAX_RECORDS];

For persistent storage, the pool can be backed by a file or a region of flash. On start-up, the engine reads the pool from non-volatile memory into RAM, and on shutdown (or periodically) it writes it back.

Data Integrity Checks

Add a simple checksum field to each record to detect corruption. CRC-32 is a good choice for embedded systems, balancing complexity with error detection strength.

Implementing Basic CRUD Operations

With the record pool defined, implement functions to insert, find, update, and delete records. Search operations are often the performance bottleneck, so a naive linear search is acceptable only for small databases (a few hundred records).

Insert with Free-List Management

Maintain a free-list of indices. On insert, pop an index from the free-list, fill the record, and mark it active. The free-list itself can be a simple stack using an array of integers.

int free_list[MAX_RECORDS];
int free_count = MAX_RECORDS;
for (int i = 0; i < MAX_RECORDS; i++) free_list[i] = i;

int db_insert(int id, const char* name, float value) {
    if (free_count == 0) return -1; // no space
    int idx = free_list[--free_count];
    pool[idx].id = id;
    strncpy(pool[idx].name, name, sizeof(pool[idx].name)-1);
    pool[idx].value = value;
    pool[idx].active = 1;
    return idx;
}

Search and Update

A simple search iterates over the pool, checking only active records. For updates, locate the record, modify fields, and optionally re-check the free-list if the record is deleted.

int db_find_by_id(int id) {
    for (int i = 0; i < MAX_RECORDS; i++) {
        if (pool[i].active && pool[i].id == id) return i;
    }
    return -1;
}

void db_update(int idx, float new_value) {
    if (idx >= 0 && idx < MAX_RECORDS && pool[idx].active)
        pool[idx].value = new_value;
}

Advanced Concepts: Indexing and Persistence

As the number of records grows, linear search becomes expensive. Adding a simple index—like a sorted array of key pointers or a binary search tree—improves retrieval time. For embedded systems, a static array sorted on the key with binary search is often sufficient if inserts are infrequent.

Sorted Index with Binary Search

Maintain a parallel array of record indices sorted by the search key (e.g., ID). When inserting a new record, insert its index into the sorted array using a binary insertion. Then search becomes O(log n) via binary search. Deletions require shifting the index array, but for small databases this is acceptable.

Persistent Storage Using File I/O

On microcontrollers without a file system, raw flash memory writes are common. On Linux-based embedded systems, standard POSIX open()/read()/write() work well. Use a simple file format: write a header (magic number, version, record count) followed by the raw pool array. For crash resilience, a write-ahead log (WAL) can help, but for basic engines a single atomic write (that fits in one flash page) is enough. FreeRTOS+FAT is a lightweight file system option for deeply embedded systems.

void db_save(const char* filename) {
    FILE* fp = fopen(filename, "wb");
    if (!fp) return;
    fwrite(pool, sizeof(pool), 1, fp);
    fclose(fp);
}

void db_load(const char* filename) {
    FILE* fp = fopen(filename, "rb");
    if (!fp) return;
    fread(pool, sizeof(pool), 1, fp);
    fclose(fp);
    // Rebuild free-list from pool
    free_count = 0;
    for (int i = 0; i < MAX_RECORDS; i++) {
        if (!pool[i].active) free_list[free_count++] = i;
    }
}

Handling Constraints and Trade-offs

Embedded database engines face a constant trade-off between features and resource usage. Choosing which features to include depends on the application:

ACID compliance – Usually not needed. Simple atomic writes suffice for most sensor data logging.
Indexing – Adds insert cost but speeds up reads. For write-heavy workloads, skip indexes.
Concurrency – Most embedded systems run a single thread. Leverage mutexes if using an RTOS.
Memory footprint – Static allocation is safer than dynamic malloc(). Use #define constants for buffer sizes.
Power loss – For flash storage, avoid frequent small writes. Batch updates and use a double-buffer scheme.

Practical Example: A Temperature Logger Database

Consider an IoT temperature sensor that records readings every minute and stores them locally for 24 hours. The database engine must handle 1440 records (one per minute). Each record might contain a timestamp (Unix epoch), a temperature (float), and a sensor ID. Using the fixed-record pool with 256 slots is too small; here we need MAX_RECORDS = 1440. With 28 bytes per record (4+4+4+4 for overhead), the pool uses about 40 KB, feasible on many microcontrollers with 128 KB RAM.

The engine can store data in a ring-buffer fashion: when the pool is full, the oldest record is overwritten. Implement a "head" pointer for the next write slot and a "tail" for the oldest active record. This avoids free-list logic and provides O(1) insert. Search can be optimized with a binary search on timestamps if the records are stored in chronological order.

typedef struct {
    uint32_t timestamp;
    float temp_c;
    uint8_t sensor_id;
    uint8_t active; // not needed if using ring buffer
} TemperatureRecord;

#define MAX_LOGS 1440
TemperatureRecord logs[MAX_LOGS];
uint16_t head = 0;   // next write position
uint16_t count = 0;  // number of valid records

Inserting a reading: write to logs[head], increment head modulo MAX_LOGS, increment count (capped at MAX_LOGS). Searching for a specific timestamp: if count == MAX_LOGS, the log is a contiguous sequence from head to head-1 (wrap). Use binary search after computing the virtual start. This pattern is extremely lightweight and widely used in telemetry systems. Ring buffer basics provide additional implementation options.

Testing and Optimization on Target Hardware

Always test the database engine on actual embedded hardware. Emulators miss timing constraints, especially for flash write cycles and power loss scenarios. Monitor RAM usage with a profiling tool and verify edge cases: full storage, corrupted data, reset mid-write. A simple test harness runs thousands of random inserts, searches, and deletes while comparing to a golden model.

Flash wear leveling – If writing to EEPROM or NOR flash, limit total writes to a few hundred thousand. A circular buffer with a wear-leveling layer extends lifespan.
Power-fail safe – Use a commit mark: write a flag byte after a complete batch of records. On restart, check the flag; if missing, discard the last batch and revert to previous state.
Memory pooling – Avoid deep recursion. Keep function call stacks shallow. Use static buffers for file I/O.

Conclusion

Building a basic database engine in C for embedded systems is a practical approach to managing data in resource-limited devices. By focusing on simple data structures like fixed-record pools and ring buffers, developers achieve efficient CRUD operations with minimal overhead. Adding optional indexing and basic persistence turns a simple array into a reliable local datastore. The techniques described here scale from small sensor loggers to more sophisticated systems, and they provide a foundation for understanding how larger embedded databases like SQLite or Berkeley DB operate under the hood.