
Compressed Delta (`cMdT`) Format

Binary container for regularly-sampled signed-integer multichannel signal data. A file is a fixed 28-byte header followed by an encoded samples block.

The samples block goes through two optional, independent transforms before storage:

  • Coding — per-sample delta encoding (1st or 2nd order) followed by zig-zag, mapping signed deltas onto small unsigned integers.
  • Compression — generic compression (Zstandard or zlib) over the coded bytes.

Both stages are selected per file via header fields.

All multi-byte fields are little-endian. The header is packed (no padding) and must be exactly 28 bytes:

Off  Size  Field            C type   Constraint
0    4     magic            uint32   0x54644D63 (ASCII c M d T)
4    8     payload_size     uint64   size of samples block in bytes
12   1     total_channels   uint8    ≥ 1
13   4     total_samples    uint32   ≥ 1, per channel
17   8     sample_rate      float64  finite
25   1     bits_per_sample  uint8    8, 16, 24, or 32
26   1     coding           uint8    see Coding
27   1     compression      uint8    see Compression
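As an illustration of the packed layout, here is a minimal sketch that writes and reads the 28 header bytes one field at a time, so it does not depend on host byte order or struct packing. The names `CmdtHeaderFields`, `cmdt_header_write`, and `cmdt_header_read` are hypothetical and not part of the format. The sketch assumes `double` is IEEE 754 binary64 with the same byte order as the host's integers.

```c
#include <stdint.h>
#include <string.h>

#define CMDT_HEADER_SIZE 28u

/* Hypothetical in-memory form of the header; fields follow the table above. */
typedef struct {
    uint32_t magic;
    uint64_t payload_size;
    uint8_t  total_channels;
    uint32_t total_samples;
    double   sample_rate;
    uint8_t  bits_per_sample;
    uint8_t  coding;
    uint8_t  compression;
} CmdtHeaderFields;

/* Store the low `count` bytes of `value`, least significant byte first. */
static void put_le(uint8_t *p, uint64_t value, unsigned count) {
    for (unsigned i = 0; i < count; ++i) {
        p[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Load `count` bytes as an unsigned little-endian integer. */
static uint64_t get_le(const uint8_t *p, unsigned count) {
    uint64_t value = 0;
    for (unsigned i = 0; i < count; ++i) {
        value |= (uint64_t)p[i] << (8 * i);
    }
    return value;
}

/* Serialize the header at the offsets given in the table. */
void cmdt_header_write(uint8_t out[CMDT_HEADER_SIZE], const CmdtHeaderFields *h) {
    uint64_t rate_bits;
    memcpy(&rate_bits, &h->sample_rate, sizeof rate_bits); /* assumes binary64 double */
    put_le(out + 0, h->magic, 4);
    put_le(out + 4, h->payload_size, 8);
    out[12] = h->total_channels;
    put_le(out + 13, h->total_samples, 4);
    put_le(out + 17, rate_bits, 8);
    out[25] = h->bits_per_sample;
    out[26] = h->coding;
    out[27] = h->compression;
}

/* Parse the 28 header bytes; performs no validation. */
void cmdt_header_read(const uint8_t in[CMDT_HEADER_SIZE], CmdtHeaderFields *h) {
    const uint64_t rate_bits = get_le(in + 17, 8);
    h->magic = (uint32_t)get_le(in + 0, 4);
    h->payload_size = get_le(in + 4, 8);
    h->total_channels = in[12];
    h->total_samples = (uint32_t)get_le(in + 13, 4);
    memcpy(&h->sample_rate, &rate_bits, sizeof rate_bits);
    h->bits_per_sample = in[25];
    h->coding = in[26];
    h->compression = in[27];
}
```

With `magic` set to 0x54644D63, the first four bytes written are `c`, `M`, `d`, `T`, which matches the ASCII reading in the table.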

The samples block immediately follows the header and spans payload_size bytes.

Samples are stored channel-major (non-interleaved). After full decoding — reversing compression then reversing coding — the layout is:

c0[0] c0[1] ... c0[M-1] c1[0] c1[1] ... c1[M-1] ... c(N-1)[M-1]

where N = total_channels and M = total_samples.

Each sample is a signed integer of bits_per_sample bits, stored little-endian. 24-bit samples occupy three consecutive bytes; readers sign-extend from bit 23 when widening to 32 bits.

The fully-decoded payload size equals total_channels × total_samples × (bits_per_sample / 8) bytes.
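The layout rules above can be sketched as three small helpers: the decoded size, the byte offset of a sample in the channel-major block, and a width-generic little-endian load with sign extension. The function names are hypothetical. The final unsigned-to-signed conversion relies on two's-complement behavior, which mainstream compilers provide.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of the fully decoded samples block. */
size_t cmdt_raw_size(uint8_t total_channels, uint32_t total_samples,
                     uint8_t bits_per_sample) {
    return (size_t)total_channels * total_samples * (bits_per_sample / 8u);
}

/* Byte offset of sample `index` of `channel` in the decoded block.
 * Channel-major: all samples of channel 0, then all of channel 1, ... */
size_t cmdt_sample_offset(uint32_t total_samples, uint8_t bits_per_sample,
                          uint8_t channel, uint32_t index) {
    return ((size_t)channel * total_samples + index) * (bits_per_sample / 8u);
}

/* Load one little-endian sample of 8, 16, 24, or 32 bits and
 * sign-extend it to 32 bits. */
int32_t cmdt_load_sample(const uint8_t *p, uint8_t bits_per_sample) {
    const unsigned bytes = bits_per_sample / 8u;
    uint32_t raw = 0;
    for (unsigned i = 0; i < bytes; ++i) {
        raw |= (uint32_t)p[i] << (8 * i);
    }
    if (bits_per_sample < 32) {
        /* XOR-then-subtract propagates the top bit (bit 23 for 24-bit). */
        const uint32_t sign_bit = 1u << (bits_per_sample - 1);
        raw = (raw ^ sign_bit) - sign_bit;
    }
    return (int32_t)raw;
}
```

For example, with 4 samples per channel at 24 bits, sample 2 of channel 1 starts at byte (1 × 4 + 2) × 3 = 18.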

The coding field selects per-sample preprocessing applied independently per channel before cross-channel concatenation.

Value  Symbol               Per-channel transform
0      TWOS_COMPLEMENT      none (samples stored as raw signed ints)
1      DELTA_ZIGZAG         1st-order delta, then zig-zag
2      DOUBLE_DELTA_ZIGZAG  2nd-order delta, then zig-zag

Storage is in bits_per_sample-wide slots in all three modes; the bit-width never changes.

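Zig-zag maps signed integers onto unsigned integers so that values of small magnitude get small codes: 0 → 0, -1 → 1, 1 → 2, -2 → 3, and so on. For a signed integer n that is B = bits_per_sample bits wide:

encode(n) = (n << 1) ^ (n >> (B - 1)) // B = 8/16/24/32, arithmetic right shift, result truncated to B bits
decode(z) = (z >> 1) ^ -(z & 1) // logical right shift on the unsigned code z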
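A minimal sketch of both directions for any of the four widths follows. The function names are hypothetical. To avoid relying on the implementation-defined right shift of negative values, the sign mask is derived from a comparison, which gives the same result as the arithmetic shift in the formula.

```c
#include <stdint.h>

/* Mask with the low `bits` bits set; bits is 8, 16, 24, or 32. */
static uint32_t width_mask(unsigned bits) {
    return bits == 32 ? 0xFFFFFFFFu : (1u << bits) - 1u;
}

/* Zig-zag encode a value that fits in `bits` bits (held sign-extended). */
uint32_t zigzag_encode_n(int32_t n, unsigned bits) {
    const uint32_t u = (uint32_t)n;
    /* All-ones for negative n, zero otherwise: same as the arithmetic shift. */
    const uint32_t sign = (n < 0) ? 0xFFFFFFFFu : 0u;
    return ((u << 1) ^ sign) & width_mask(bits);
}

/* Inverse mapping; returns the value sign-extended to 32 bits. */
int32_t zigzag_decode_n(uint32_t z, unsigned bits) {
    /* 0u - (z & 1u) is all-ones when the low bit is set. */
    const uint32_t u = (z >> 1) ^ (0u - (z & 1u));
    const uint32_t sign_bit = 1u << (bits - 1);
    const uint32_t low = u & width_mask(bits);
    return (int32_t)((low ^ sign_bit) - sign_bit);
}
```

At 8 bits the extremes map as 127 → 254 and -128 → 255, so every signed value has a distinct code within the same slot width.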

In DELTA_ZIGZAG mode, for a channel with raw samples x[0..M-1], the encoder produces deltas d[0..M-1]:

d[0] = x[0]
d[i] = x[i] - x[i-1] for i ≥ 1

The encoder then writes y[i] = zigzag(d[i]) to the output slot for sample i. The first sample is the raw seed; subsequent samples are first-order differences.
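The decoder side of the first-order mode is a running sum. The sketch below shows it for 16-bit slots, assuming the slots have already been loaded into host `uint16_t` values. The name `delta_zigzag_decode16` is hypothetical. The running sample is kept unsigned so that wraparound is well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Inverse zig-zag for a 16-bit slot; returns the delta's two's-complement bits. */
static uint16_t unzigzag16(uint16_t z) {
    return (uint16_t)((z >> 1) ^ (0u - (z & 1u)));
}

/* Rebuild x[0..count-1] from the stored codes y[0..count-1]. */
void delta_zigzag_decode16(const uint16_t *y, int16_t *x, size_t count) {
    uint16_t prev = 0; /* x[-1] is taken as 0, so x[0] = d[0] (the raw seed) */
    for (size_t i = 0; i < count; ++i) {
        const uint16_t d = unzigzag16(y[i]);
        prev = (uint16_t)(prev + d); /* x[i] = x[i-1] + d[i], modulo 2^16 */
        x[i] = (int16_t)prev;
    }
}
```

As a worked case, samples 100, 102, 101, 101, 98 have deltas 100, 2, -1, 0, -3 and are stored as 200, 4, 1, 0, 5. Feeding those five codes to the function returns the original samples.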

In DOUBLE_DELTA_ZIGZAG mode, the encoder keeps two raw seed samples, then takes second-order differences:

d[0] = x[0]
d[1] = x[1]
d[i] = (x[i] - x[i-1]) - (x[i-1] - x[i-2]) for i ≥ 2

The encoder then writes y[i] = zigzag(d[i]) for each i. Note that both d[0] and d[1] are raw samples, not first-order deltas, because the decoder needs two consecutive raw samples to seed the second-order recurrence.

Sample arithmetic is in bits_per_sample-wide signed two’s-complement; overflow wraps modulo 2^bits_per_sample and the inverse transform on decode recovers the original x[i].
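To make the wraparound claim concrete, here is a sketch of the second-order transform and its inverse for 8-bit samples, using the algebraically equivalent form d[i] = x[i] - 2·x[i-1] + x[i-2]. The function names are hypothetical. All arithmetic is done on unsigned bytes so that overflow wraps modulo 2^8.

```c
#include <stddef.h>
#include <stdint.h>

/* 8-bit zig-zag on the two's-complement bits of a delta. */
static uint8_t zigzag8(uint8_t d) {
    return (uint8_t)((d << 1) ^ ((d & 0x80u) ? 0xFFu : 0x00u));
}

static uint8_t unzigzag8(uint8_t z) {
    return (uint8_t)((z >> 1) ^ ((z & 1u) ? 0xFFu : 0x00u));
}

/* Encode one channel: two raw seeds, then second-order differences. */
void double_delta_encode8(const int8_t *x, uint8_t *y, size_t count) {
    uint8_t prev1 = 0, prev2 = 0;
    for (size_t i = 0; i < count; ++i) {
        const uint8_t cur = (uint8_t)x[i];
        uint8_t d = cur; /* d[0] = x[0], d[1] = x[1] */
        if (i >= 2) {
            d = (uint8_t)(cur - 2u * prev1 + prev2); /* wraps modulo 256 */
        }
        y[i] = zigzag8(d);
        prev2 = prev1;
        prev1 = cur;
    }
}

/* Decode one channel: x[i] = d[i] + 2*x[i-1] - x[i-2], modulo 256. */
void double_delta_decode8(const uint8_t *y, int8_t *x, size_t count) {
    uint8_t prev1 = 0, prev2 = 0;
    for (size_t i = 0; i < count; ++i) {
        const uint8_t d = unzigzag8(y[i]);
        uint8_t cur = d;
        if (i >= 2) {
            cur = (uint8_t)(d + 2u * prev1 - prev2);
        }
        x[i] = (int8_t)cur;
        prev2 = prev1;
        prev1 = cur;
    }
}
```

A linear ramp 10, 20, 30, 40 is stored as 20, 40, 0, 0. The extreme input 127, -128, 127, -128 has a true second difference of 510 at index 2, which wraps to -2 and is stored as 3; decoding still returns the original samples.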

The compression field selects how the coded bytes are stored.

Value  Symbol          Storage
0      NO_COMPRESSION  coded bytes stored verbatim
1      ZSTD            coded bytes compressed with Zstandard
2      ZLIB            coded bytes compressed with zlib (deflate)

For NO_COMPRESSION, payload_size equals the raw size total_channels × total_samples × (bits_per_sample / 8).

For ZSTD and ZLIB, payload_size is the compressor’s output size; the decompressor must yield exactly the raw size.

A conforming reader must reject inputs that violate any of the following:

  • Fewer than 28 bytes are available for the header.
  • magic != 0x54644D63.
  • bits_per_sample is not 8, 16, 24, or 32.
  • coding > 2.
  • compression > 2.
  • total_channels == 0.
  • total_samples == 0.
  • sample_rate is NaN or ±Inf.
  • For NO_COMPRESSION: fewer bytes than the raw size remain after the header.
  • For ZSTD / ZLIB: fewer than payload_size bytes remain after the header.
  • For ZSTD: payload does not start with a valid Zstandard frame header.
  • For ZLIB: payload does not start with a valid zlib stream header.

The validation order is at the reader’s discretion, but cheap checks (header size, magic, enum ranges) should run before stream-position-dependent checks (payload size, frame headers).
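The checks above can be sketched as one function over an in-memory file image, with the cheap header checks first. The names `CmdtStatus` and `cmdt_validate` are hypothetical. The stream-header tests are deliberately shallow: the Zstandard test only compares the frame magic number 0xFD2FB528, and the zlib test only applies the two-byte header rules (compression method 8, window size field at most 7, header value divisible by 31). A full reader would still hand the payload to the decompressor. The finiteness test assumes IEEE 754 binary64.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    CMDT_OK = 0,
    CMDT_ERR_TRUNCATED_HEADER,
    CMDT_ERR_MAGIC,
    CMDT_ERR_BITS,
    CMDT_ERR_CODING,
    CMDT_ERR_COMPRESSION,
    CMDT_ERR_COUNTS,
    CMDT_ERR_SAMPLE_RATE,
    CMDT_ERR_TRUNCATED_PAYLOAD,
    CMDT_ERR_STREAM_HEADER
} CmdtStatus;

/* Load `count` bytes as an unsigned little-endian integer. */
static uint64_t load_le(const uint8_t *p, unsigned count) {
    uint64_t value = 0;
    for (unsigned i = 0; i < count; ++i) {
        value |= (uint64_t)p[i] << (8 * i);
    }
    return value;
}

CmdtStatus cmdt_validate(const uint8_t *data, size_t size) {
    /* Cheap checks: header size, magic, enum ranges, counts, sample rate. */
    if (size < 28) return CMDT_ERR_TRUNCATED_HEADER;
    if (load_le(data, 4) != 0x54644D63u) return CMDT_ERR_MAGIC;
    const uint64_t payload_size = load_le(data + 4, 8);
    const uint8_t channels = data[12];
    const uint32_t samples = (uint32_t)load_le(data + 13, 4);
    const uint64_t rate_bits = load_le(data + 17, 8);
    const uint8_t bits = data[25], coding = data[26], compression = data[27];

    if (bits != 8 && bits != 16 && bits != 24 && bits != 32) return CMDT_ERR_BITS;
    if (coding > 2) return CMDT_ERR_CODING;
    if (compression > 2) return CMDT_ERR_COMPRESSION;
    if (channels == 0 || samples == 0) return CMDT_ERR_COUNTS;
    /* An all-ones exponent field means NaN or ±Inf in binary64. */
    if (((rate_bits >> 52) & 0x7FFu) == 0x7FFu) return CMDT_ERR_SAMPLE_RATE;

    /* Position-dependent checks: payload length, then stream header. */
    const uint64_t remaining = (uint64_t)size - 28;
    const uint64_t raw_size = (uint64_t)channels * samples * (bits / 8u);
    const uint8_t *payload = data + 28;
    if (compression == 0) {
        if (remaining < raw_size) return CMDT_ERR_TRUNCATED_PAYLOAD;
        return CMDT_OK;
    }
    if (remaining < payload_size) return CMDT_ERR_TRUNCATED_PAYLOAD;
    if (compression == 1) {
        if (payload_size < 4 || load_le(payload, 4) != 0xFD2FB528u) {
            return CMDT_ERR_STREAM_HEADER;
        }
    } else {
        if (payload_size < 2) return CMDT_ERR_STREAM_HEADER;
        const unsigned cmf = payload[0], flg = payload[1];
        if ((cmf & 0x0Fu) != 8 || (cmf >> 4) > 7 || ((cmf << 8) | flg) % 31 != 0) {
            return CMDT_ERR_STREAM_HEADER;
        }
    }
    return CMDT_OK;
}
```

Returning a distinct code per rule is a choice made for this sketch; the format itself only requires that such inputs be rejected.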

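The reference encoder below is a single-call C function that handles any valid combination of coding, compression, and bits_per_sample. The input must be channel-major, with each sample stored in its native bits_per_sample width (24-bit samples packed into three bytes). The function allocates the result on the heap; the caller releases it with free. It returns NULL on invalid arguments, allocation failure, or compressor failure.

The listing defines no main, so compile it as an object file and link the final program with -lzstd -lz:

cc -std=c17 -O3 -c cmdt_encode.c -o cmdt_encode.o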

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#include <zstd.h>
#include <zlib.h>

#define CMDT_MAGIC 0x54644D63u
#define CMDT_CODING_TWOS 0
#define CMDT_CODING_DELTA 1
#define CMDT_CODING_DOUBLE 2
#define CMDT_COMPRESS_NONE 0
#define CMDT_COMPRESS_ZSTD 1
#define CMDT_COMPRESS_ZLIB 2

#pragma pack(push, 1)
typedef struct {
    uint32_t magic;
    uint64_t payload_size;
    uint8_t total_channels;
    uint32_t total_samples;
    double sample_rate;
    uint8_t bits_per_sample;
    uint8_t coding;
    uint8_t compression;
} CmdtHeader;
#pragma pack(pop)

static inline uint32_t zigzag_encode(int32_t value, uint8_t bits) {
    return ((uint32_t)value << 1) ^ (uint32_t)(value >> (bits - 1));
}

static inline int32_t read_signed24(const uint8_t* p) {
    int32_t value = p[0] | ((int32_t)p[1] << 8) | ((int32_t)p[2] << 16);
    return (value & 0x00800000) ? (value | ~0x00FFFFFF) : value;
}

static inline void write_unsigned24(uint8_t* p, uint32_t value) {
    p[0] = (uint8_t)value;
    p[1] = (uint8_t)(value >> 8);
    p[2] = (uint8_t)(value >> 16);
}

// Wrap value to a signed N-bit integer (modular two's-complement)
static inline int32_t truncate_signed(int32_t value, uint8_t bits) {
    if (bits == 32) {
        return value;
    }
    const uint32_t mask = (1u << bits) - 1;
    const uint32_t sign_bit = 1u << (bits - 1);
    const uint32_t low = (uint32_t)value & mask;
    return (int32_t)((low ^ sign_bit) - sign_bit);
}

static int32_t read_sample(const uint8_t* src, uint32_t sample_index, uint8_t bits_per_sample) {
    switch (bits_per_sample) {
        case 8: return (int8_t)src[sample_index];
        case 16: return *(const int16_t*)(src + sample_index * 2);
        case 24: return read_signed24(src + sample_index * 3);
        case 32: return *(const int32_t*)(src + sample_index * 4);
    }
    return 0;
}

static void write_sample(uint8_t* dst, uint32_t sample_index, uint8_t bits_per_sample, uint32_t value) {
    switch (bits_per_sample) {
        case 8: dst[sample_index] = (uint8_t)value; break;
        case 16: *(uint16_t*)(dst + sample_index * 2) = (uint16_t)value; break;
        case 24: write_unsigned24(dst + sample_index * 3, value); break;
        case 32: *(uint32_t*)(dst + sample_index * 4) = value; break;
    }
}

uint8_t* cmdt_encode(const void* input_samples,
                     uint8_t total_channels,
                     uint32_t total_samples,
                     double sample_rate,
                     uint8_t bits_per_sample,
                     uint8_t coding,
                     uint8_t compression,
                     size_t* output_size) {
    if (!input_samples || !output_size || !total_channels || !total_samples) {
        return NULL;
    }
    if (!isfinite(sample_rate)) {
        return NULL;
    }
    if (bits_per_sample != 8 && bits_per_sample != 16 && bits_per_sample != 24 && bits_per_sample != 32) {
        return NULL;
    }
    if (coding > CMDT_CODING_DOUBLE || compression > CMDT_COMPRESS_ZLIB) {
        return NULL;
    }
    const size_t bytes_per_sample = bits_per_sample / 8;
    const size_t raw_size = (size_t)total_channels * total_samples * bytes_per_sample;

    // Apply per-channel coding into a contiguous channel-major buffer
    uint8_t* coded = (uint8_t*)malloc(raw_size);
    if (!coded) {
        return NULL;
    }
    for (uint8_t channel = 0; channel < total_channels; ++channel) {
        const uint8_t* src = (const uint8_t*)input_samples + (size_t)channel * total_samples * bytes_per_sample;
        uint8_t* dst = coded + (size_t)channel * total_samples * bytes_per_sample;
        for (uint32_t sample_index = 0; sample_index < total_samples; ++sample_index) {
            const int32_t x = read_sample(src, sample_index, bits_per_sample);
            int32_t d = x;
            // Deltas use unsigned arithmetic: wraparound must stay defined for 32-bit samples
            if (coding == CMDT_CODING_DELTA && sample_index >= 1) {
                const int32_t x_prev = read_sample(src, sample_index - 1, bits_per_sample);
                d = (int32_t)((uint32_t)x - (uint32_t)x_prev);
            } else if (coding == CMDT_CODING_DOUBLE && sample_index >= 2) {
                const int32_t x_prev1 = read_sample(src, sample_index - 1, bits_per_sample);
                const int32_t x_prev2 = read_sample(src, sample_index - 2, bits_per_sample);
                d = (int32_t)(((uint32_t)x - (uint32_t)x_prev1) - ((uint32_t)x_prev1 - (uint32_t)x_prev2));
            }
            // Wrap delta to bits_per_sample width before zigzag
            d = truncate_signed(d, bits_per_sample);
            const uint32_t y = (coding == CMDT_CODING_TWOS) ? (uint32_t)d : zigzag_encode(d, bits_per_sample);
            write_sample(dst, sample_index, bits_per_sample, y);
        }
    }

    // Optionally compress the coded buffer
    const uint8_t* final_data = coded;
    size_t final_size = raw_size;
    uint8_t* compressed_buffer = NULL;
    if (compression == CMDT_COMPRESS_ZSTD) {
        const size_t bound = ZSTD_compressBound(raw_size);
        compressed_buffer = (uint8_t*)malloc(bound);
        if (!compressed_buffer) {
            free(coded);
            return NULL;
        }
        const size_t result = ZSTD_compress(compressed_buffer, bound, coded, raw_size, 3);
        if (ZSTD_isError(result)) {
            free(compressed_buffer);
            free(coded);
            return NULL;
        }
        final_data = compressed_buffer;
        final_size = result;
    } else if (compression == CMDT_COMPRESS_ZLIB) {
        uLong bound = compressBound((uLong)raw_size);
        compressed_buffer = (uint8_t*)malloc(bound);
        if (!compressed_buffer) {
            free(coded);
            return NULL;
        }
        if (compress(compressed_buffer, &bound, coded, (uLong)raw_size) != Z_OK) {
            free(compressed_buffer);
            free(coded);
            return NULL;
        }
        final_data = compressed_buffer;
        final_size = (size_t)bound;
    }

    // Build the final blob: header followed by samples block
    const size_t blob_size = sizeof(CmdtHeader) + final_size;
    uint8_t* blob = (uint8_t*)malloc(blob_size);
    if (!blob) {
        free(compressed_buffer);
        free(coded);
        return NULL;
    }
    CmdtHeader* header = (CmdtHeader*)blob;
    header->magic = CMDT_MAGIC;
    header->payload_size = final_size;
    header->total_channels = total_channels;
    header->total_samples = total_samples;
    header->sample_rate = sample_rate;
    header->bits_per_sample = bits_per_sample;
    header->coding = coding;
    header->compression = compression;
    memcpy(blob + sizeof(CmdtHeader), final_data, final_size);
    free(compressed_buffer);
    free(coded);
    *output_size = blob_size;
    return blob;
}
  • The encoder produces a host-byte-order header. On big-endian platforms the multi-byte header fields and 16/24/32-bit samples must be byte-swapped to little-endian before writing the bytes out. The decoder reads bytes verbatim and assumes little-endian.
  • For brevity, this encoder reads each input sample twice when computing deltas. A production encoder would maintain prev1 / prev2 running variables to make the inner loop a single read per sample.
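The running-variable approach from the last note could look like the sketch below, which codes one channel in a single pass. It is not the reference implementation: the name `cmdt_code_channel` is hypothetical, and it assumes samples have already been widened to `int32_t` and that the caller packs each returned code into a bits_per_sample-wide little-endian slot.

```c
#include <stddef.h>
#include <stdint.h>

/* Code one channel with a single read per sample.
 * x: input samples, sign-extended to 32 bits
 * y: output codes, each masked to `bits` bits
 * bits: 8, 16, 24, or 32
 * coding: 0 = TWOS_COMPLEMENT, 1 = DELTA_ZIGZAG, 2 = DOUBLE_DELTA_ZIGZAG */
void cmdt_code_channel(const int32_t *x, uint32_t *y, size_t count,
                       unsigned bits, unsigned coding) {
    const uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : (1u << bits) - 1u;
    const uint32_t sign_bit = 1u << (bits - 1);
    uint32_t prev1 = 0, prev2 = 0; /* x[i-1] and x[i-2] as unsigned bits */

    for (size_t i = 0; i < count; ++i) {
        const uint32_t cur = (uint32_t)x[i];
        uint32_t d = cur; /* raw seed unless a delta applies below */

        /* Unsigned subtraction keeps wraparound well defined. */
        if (coding == 1 && i >= 1) {
            d = cur - prev1;
        } else if (coding == 2 && i >= 2) {
            d = (cur - prev1) - (prev1 - prev2);
        }
        d &= mask; /* wrap the delta to the sample width */

        if (coding != 0) {
            /* Zig-zag within the sample width. */
            const uint32_t sign = (d & sign_bit) ? mask : 0u;
            d = ((d << 1) ^ sign) & mask;
        }
        y[i] = d;

        prev2 = prev1;
        prev1 = cur;
    }
}
```

For 16-bit samples 100, 102, 101, 101, 98 in first-order mode this yields 200, 4, 1, 0, 5, the same codes the two-read loop above produces.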