EDF / EDF+ (European Data Format)

External polysomnography and polygraphic recording format. A file is a variable-length ASCII header followed by a stream of fixed-duration data records that carry the samples. There is no magic number; identification is by file extension and by the value of the first header field.

Two dialects share the same wire layout:

  • EDF (Kemp et al., 1992) — a single uninterrupted polygraphic recording with one or more signals sampled at fixed rates per signal.
  • EDF+ (Kemp & Olivan, 2003) — backward-compatible extension that introduces the EDF Annotations signal, allows optional gaps between data records, and fixes the structure of patient/recording identification fields. EDF+ files open in EDF readers; the differences are conventions on field content, not on framing. The only technical incompatibility is the discontinuous variant (EDF+D), which an EDF reader will render as if it were continuous.

A file has two structural parts:

  • Header — fully ASCII, fixed-width fields. Carries patient and recording identification, the start date and time, signal count, data-record geometry, and per-signal scaling and labelling.
  • Data records — fixed-size binary blocks. Within each record, signals are stored channel-major: all samples of signal 0, then all samples of signal 1, …, then all samples of signal ns - 1. Each sample is a 16-bit signed integer in little-endian two’s complement.

The reference texts are the EDF, EDF+, and EDF+ standard-texts pages at https://www.edfplus.info/specs/edf.html, https://www.edfplus.info/specs/edfplus.html, and https://www.edfplus.info/specs/edftexts.html.

EDF has no magic bytes. Readers detect the format by:

  • Extension: .edf or .EDF. EDF+ requires this; plain EDF conventionally uses the same.
  • Version field: the first 8 header bytes are the ASCII string "0       " (digit 0 at offset 0 followed by seven space bytes 0x20). EDF+ keeps the same value so that EDF-era viewers still open EDF+ files.

The EDF / EDF+C / EDF+D dialect is encoded in the main reserved field at bytes 192..235: empty (all spaces) or free local text for EDF, beginning with the ASCII tag "EDF+C" for continuous EDF+, or "EDF+D" for discontinuous EDF+.
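
The detection rules above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name detect_dialect is hypothetical, and the sketch assumes the caller has already read at least the first 256 bytes of the file.

```python
def detect_dialect(main_header: bytes) -> str:
    """Classify a 256-byte main header as 'EDF', 'EDF+C', or 'EDF+D'."""
    if len(main_header) < 256:
        raise ValueError("main header is shorter than 256 bytes")
    # Version field: digit 0 followed by seven spaces.
    if main_header[0:8] != b"0" + b" " * 7:
        raise ValueError("version field is not '0' followed by seven spaces")
    tag = main_header[192:197]  # first five bytes of the reserved field
    if tag == b"EDF+C":
        return "EDF+C"
    if tag == b"EDF+D":
        return "EDF+D"
    if tag.startswith(b"EDF+"):
        raise ValueError("reserved field names an unknown EDF+ variant")
    return "EDF"  # all spaces or free local text
```

A reader would typically call this once, right after reading the main header, and branch on the result before interpreting record timing.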

These rules apply uniformly to every ASCII field in the main header and in the per-signal bands.

  • Width is fixed. Every field is exactly its declared byte width. Shorter values are right-padded with the space byte 0x20. Writers must truncate over-width values at the field boundary.
  • Left-justified. Useful content starts at the first byte of the field; padding is on the right.
  • Printable US-ASCII only. EDF+ restricts every header byte to the range 0x20..0x7E (decimal 32..126). Plain EDF readers in practice also expect printable ASCII.
  • Numbers are plain decimal strings. Digits 0..9, an optional leading - for negatives, and . as the decimal separator when a fraction is required. No digit-grouping characters (,, _, thin space). The comma , must never be used as a decimal separator.
  • X for unknown subfields. Where an EDF+ subfield is unknown, inapplicable, or anonymized, write a single ASCII X.
  • No spaces inside subfields. Where an 80-byte EDF+ identification field is split into space-separated subfields, the subfield text itself must not contain spaces. Replace internal spaces with another character (the EDF+ spec uses _ in its examples).
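
A writer can apply these rules with two small helpers. The sketch below is illustrative only; format_field and format_subfield are hypothetical names, and replacing spaces with _ simply follows the convention the spec uses in its examples.

```python
from typing import Optional


def format_field(value: str, width: int) -> bytes:
    """Left-justify, truncate at the field boundary, right-pad with spaces."""
    raw = value.encode("ascii")  # raises UnicodeEncodeError on non-ASCII text
    if any(b < 0x20 or b > 0x7E for b in raw):
        raise ValueError("header fields must be printable US-ASCII")
    return raw[:width].ljust(width, b" ")


def format_subfield(value: Optional[str]) -> str:
    """Render one EDF+ subfield: X when unknown, no internal spaces."""
    if not value:
        return "X"
    return value.replace(" ", "_")
```

For example, format_field("768", 8) yields the eight bytes of a header_bytes field for a two-signal file.
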
Off   Size  Field            Content
0     8     version          "0       " (digit 0 plus seven spaces). EDF+ keeps the same value
8     80    patient_id       Local patient identification (see below)
88    80    recording_id     Local recording identification (see below)
168   8     start_date       Recording start date, dd.mm.yy
176   8     start_time       Recording start time, hh.mm.ss
184   8     header_bytes     Total header length in bytes; must equal 256 * (ns + 1)
192   44    reserved         Empty/local text for EDF; "EDF+C" or "EDF+D" prefix for EDF+
236   8     record_count     Number of data records (nr), or -1 only while the file is still being written
244   8     record_duration  Duration of one data record in seconds
252   4     signal_count     Number of signals (ns) in each data record, including any EDF Annotations signals
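
The fixed offsets and widths make the main header easy to slice. The following is a minimal sketch under the layout listed above; MAIN_FIELDS and parse_main_header are hypothetical names, and the function returns a plain dict rather than any particular library's header type.

```python
# (field name, width in bytes) in file order; the widths sum to 256.
MAIN_FIELDS = [
    ("version", 8), ("patient_id", 80), ("recording_id", 80),
    ("start_date", 8), ("start_time", 8), ("header_bytes", 8),
    ("reserved", 44), ("record_count", 8), ("record_duration", 8),
    ("signal_count", 4),
]


def parse_main_header(block: bytes) -> dict:
    """Slice the 256-byte main header into stripped ASCII fields."""
    if len(block) < 256:
        raise ValueError("main header is shorter than 256 bytes")
    text = block[:256].decode("ascii")
    fields = {}
    pos = 0
    for name, width in MAIN_FIELDS:
        fields[name] = text[pos:pos + width].rstrip(" ")
        pos += width
    for name in ("header_bytes", "record_count", "signal_count"):
        fields[name] = int(fields[name])
    fields["record_duration"] = float(fields["record_duration"])
    # Cross-check the declared header length against the signal count.
    if fields["header_bytes"] != 256 * (fields["signal_count"] + 1):
        raise ValueError("header_bytes does not equal 256 * (ns + 1)")
    return fields
```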

Field-by-field detail follows.

Fixed literal "0 ". Eight bytes, written as the digit 0 left-justified and padded with seven spaces. EDF+ deliberately keeps the EDF value so that legacy EDF readers will open EDF+ files; EDF+ readers tell the dialects apart by inspecting reserved.

Eighty-byte field for local patient identification.

Plain EDF leaves this as free local text.

EDF+ requires the field to begin with four mandatory subfields in this order, separated by single space bytes:

  1. Hospital administration code: the code by which the patient is known to the hospital. No spaces inside the code.
  2. Sex: single character F or M (English).
  3. Birth date: dd-MMM-yyyy, with the month as a three-letter English abbreviation in capitals: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC. The day must be two digits with a leading zero where needed (02-AUG-1951 is valid, 2-AUG-1951 is not).
  4. Patient name: local patient name. Internal spaces must be replaced with another character such as _.

Additional installation-specific subfields may follow the four mandatory ones.

Unknown / inapplicable / anonymized subfields are written as X. If everything is unknown the field starts with X X X X.

Examples:

MCH-0234567 F 02-MAY-1951 Haagse_Harry
MCH-0234567 F 16-SEP-1987 Haagse_Harry
X X X X
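
As an illustration of the subfield rules, the sketch below splits an EDF+ patient_id field and decodes the birth date. The names parse_edfplus_date and parse_patient_id are hypothetical, and the returned dict layout is an assumption made for this example; unknown (X) subfields are mapped to None.

```python
from datetime import date
from typing import Optional

MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def parse_edfplus_date(text: str) -> Optional[date]:
    """Decode dd-MMM-yyyy; X means unknown."""
    if text == "X":
        return None
    day, month, year = text.split("-")  # ValueError if not three parts
    if len(day) != 2 or len(year) != 4 or month not in MONTHS:
        raise ValueError("date is not in dd-MMM-yyyy form: %r" % text)
    return date(int(year), MONTHS.index(month) + 1, int(day))


def parse_patient_id(field: str) -> dict:
    """Split the four mandatory EDF+ subfields of patient_id."""
    parts = field.rstrip(" ").split(" ")
    if len(parts) < 4:
        raise ValueError("patient_id has fewer than four subfields")
    code, sex, birth, name = parts[:4]
    if sex not in ("F", "M", "X"):
        raise ValueError("sex must be F, M, or X")
    return {
        "code": None if code == "X" else code,
        "sex": None if sex == "X" else sex,
        "birthdate": parse_edfplus_date(birth),
        "name": None if name == "X" else name,
        "extra": parts[4:],  # installation-specific subfields, if any
    }
```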

Eighty-byte field for local recording identification.

Plain EDF leaves this as free local text.

EDF+ requires the field to begin with five mandatory subfields in this order, separated by single spaces:

  1. The literal text Startdate.
  2. Recording start date in dd-MMM-yyyy, using the same English uppercase month abbreviations as patient_id. This is the authoritative four-digit-year start date; the two-digit-year start_date of the main header is a redundant lossy view.
  3. Hospital administration investigation code (e.g. EEG number, PSG number).
  4. Investigator or technician code.
  5. Equipment code identifying the recording apparatus.

Additional subfields may follow. The same space-substitution and X-for-unknown rules apply. If every value after Startdate is unknown, the field starts with Startdate X X X X.

Examples:

Startdate 02-MAR-2002 PSG-1234/2002 NN Telemetry03
Startdate 16-SEP-1987 PSG-1234/1987 NN Telemetry03
Startdate X X X X

Recording start date in the format dd.mm.yy, using . as the separator. Eight bytes, no padding required.

EDF / EDF+ apply a fixed Y2K pivot:

  • yy in 85..99 represents years 1985..1999.
  • yy in 00..84 represents years 2000..2084.
  • After 2084 the two year characters must be the literal text yy; the real four-digit date is then carried only by the Startdate subfield of recording_id.

Only digits 0..9 and . are allowed in this field (with the single exception of the post-2084 yy literal).
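
The pivot rule can be expressed directly. This is a minimal sketch with a hypothetical function name; it returns None for the year when the field carries the post-2084 yy literal, leaving the caller to fetch the real year from recording_id.

```python
def parse_start_date(field: str):
    """Return (year, month, day) from dd.mm.yy using the 1985..2084 window.

    The year is None when the two year characters are the literal 'yy'.
    """
    if len(field) != 8 or field[2] != "." or field[5] != ".":
        raise ValueError("start_date is not in dd.mm.yy form")
    dd, mm, yy = field[0:2], field[3:5], field[6:8]
    if not (dd.isascii() and dd.isdigit() and mm.isascii() and mm.isdigit()):
        raise ValueError("day and month must be two ASCII digits each")
    if yy == "yy":
        year = None
    elif yy.isascii() and yy.isdigit():
        n = int(yy)
        year = 1900 + n if n >= 85 else 2000 + n
    else:
        raise ValueError("year must be two digits or the literal 'yy'")
    return year, int(mm), int(dd)
```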

Recording start time in the format hh.mm.ss, using . as the separator. Eight bytes, no padding required. Midnight is 00.00.00.

The value is local wall-clock time at the patient’s location when the recording started, not UTC. No timezone information is encoded.

For EDF+ the start_date / start_time pair denotes the absolute whole second in which the first data record begins. Any sub-second offset of the first sample is carried by the first time-keeping TAL (see Time-keeping annotation).

Total header length in bytes, written as an ASCII decimal integer padded with spaces. The value must equal

header_bytes = 256 * (signal_count + 1)

i.e. one main 256-byte block plus signal_count * 256 bytes of per-signal bands. Readers must verify this before walking the per-signal section.

Example: "768 " for signal_count = 2 (= 256 + 2 * 256).

Forty-four-byte field whose first five bytes encode the EDF+ dialect:

First 5 bytes                 Dialect  Timeline
all spaces or local EDF text  EDF      one continuous recording
"EDF+C"                       EDF+C    contiguous: each record starts at the end of the previous
"EDF+D"                       EDF+D    discontinuous: gaps between records are allowed

For EDF+C, record k + 1 starts at start(record k) + record_duration. For EDF+D the true start time of each record is carried by the record’s time-keeping TAL.

The remaining 39 bytes are unused by the spec; write spaces.

This is the only field whose plain-EDF free-text use is not EDF+ compatible. An EDF reader will display an EDF+D file’s records back-to-back as if continuous, which is the only semantically lossy case in the EDF / EDF+ compatibility story.

ASCII decimal integer giving the number of data records (nr) in the file body. May be -1 only while the file is still being written and the final count is not yet known. As soon as the file is closed the writer must seek back to this field and overwrite it with the correct value; -1 is not a valid value for a closed file.

ASCII decimal value giving the duration of one data record in seconds.

For EDF+, the byte size of a data record (2 * sum_i samples_per_rec[i], or 3 * sum_i samples_per_rec[i] for BDF/BDF+) must not exceed 61440 bytes.

EDF+ relaxations relative to EDF:

  • Data records may be shorter than one second without any further restriction.
  • Data records in EDF+D need not be temporally contiguous (the duration covers the samples within a record, not the gap to the next record).
  • record_duration = 0 is permitted for two specific cases:
    1. The file contains no ordinary signals at all, only annotations (for example a sleep-scoring or QRS-onset-only file).
    2. The extreme EDF+D case where every ordinary signal contributes exactly one sample per data record and the records sit at arbitrary times. In both cases the time-keeping TAL of each record is the only timing information; the per-signal sampling rate formula does not apply.

ASCII decimal integer in 4 bytes giving ns, the number of signals present in every data record. EDF+ annotation signals occupy ordinary signal slots and are counted here; an annotations-only file therefore has ns >= 1. The 4-byte width caps ns at 9999.

After the main 256-byte block come ten contiguous fixed-width bands. Each band concatenates one per-signal field across all ns signals in signal order. The bands appear in this sequence:

#   Base offset     Width per signal  Field               Content
1   256             16                label[i]            Signal label (e.g. EEG Fpz-Cz, Body temp, EDF Annotations)
2   256 + 16 * ns   80                transducer[i]       Sensor or transducer description (e.g. AgAgCl electrode)
3   256 + 96 * ns   8                 physical_dim[i]     Physical unit (e.g. uV, mV, degreeC, or spaces if uncalibrated)
4   256 + 104 * ns  8                 physical_min[i]     Physical value corresponding to digital_min[i]
5   256 + 112 * ns  8                 physical_max[i]     Physical value corresponding to digital_max[i]
6   256 + 120 * ns  8                 digital_min[i]      Lower digital extremum for this signal
7   256 + 128 * ns  8                 digital_max[i]      Upper digital extremum for this signal
8   256 + 136 * ns  80                prefiltering[i]     Filter description (e.g. HP:0.1Hz LP:75Hz N:50Hz)
9   256 + 216 * ns  8                 samples_per_rec[i]  Number of samples this signal contributes to each data record
10  256 + 224 * ns  32                signal_reserved[i]  Per-signal reserved (spaces; spaces for EDF Annotations signals)

The byte after band 10 is 256 + 256 * ns, which must equal header_bytes from the main header.

The layout is banded, not interleaved: the file stores all ns labels first, then all ns transducer fields, and so on. A reader cannot extract one signal’s metadata by reading a contiguous 256-byte block at offset 256 + i * 256; it must seek into each band individually with field_base + i * field_width.
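
The band arithmetic can be sketched as follows. BANDS, band_offsets, and read_signal_field are hypothetical names used only for this illustration; the sketch assumes the whole header (header_bytes bytes) is already in memory.

```python
# (field name, width per signal) in band order; the widths sum to 256.
BANDS = [
    ("label", 16), ("transducer", 80), ("physical_dim", 8),
    ("physical_min", 8), ("physical_max", 8), ("digital_min", 8),
    ("digital_max", 8), ("prefiltering", 80), ("samples_per_rec", 8),
    ("signal_reserved", 32),
]


def band_offsets(ns: int) -> dict:
    """Map each field name to (base offset, width) for a file with ns signals."""
    offsets = {}
    base = 256
    for name, width in BANDS:
        offsets[name] = (base, width)
        base += width * ns
    if base != 256 + 256 * ns:
        raise RuntimeError("band widths do not sum to 256 bytes per signal")
    return offsets


def read_signal_field(header: bytes, ns: int, i: int, name: str) -> str:
    """Read one field of signal i: field_base + i * field_width."""
    base, width = band_offsets(ns)[name]
    start = base + i * width
    return header[start:start + width].decode("ascii").rstrip(" ")
```

With ns = 2, the label of signal 1 lives at offset 256 + 16 = 272, not at 256 + 256 = 512 as an interleaved layout would suggest.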

Field-by-field detail follows.

Sixteen ASCII bytes per signal.

The label identifies the signal type and, where applicable, the electrode montage. Plain EDF leaves the label as free local text.

EDF+ requires the standard-texts vocabulary at https://www.edfplus.info/specs/edftexts.html and uses the convention "<signal-type> <specification>", with the signal type as a short token (EEG, EOG, EMG, ECG, Resp, etc.) and the specification giving electrode locations or further qualification.

Examples (banded, left-justified, space-padded to 16):

EEG Fpz-Cz
ECG V2R
EMG RAT
Resp nasal
Body temp
Temp rectal
EDF Annotations

The label EDF Annotations is reserved: ordinary signals must not use it. EDF+ files must contain at least one signal with this label (see EDF Annotations signal).

For EXG-style labels (EEG, EOG, EMG, ECG), electrode pairs are written with a - between the two electrode names; the decoded physical value is the potential at the first electrode minus the potential at the second. When the reference is unknown, irrelevant, or too long for the 16-byte field, use Ref, Ref1, Ref2, etc.

Eighty ASCII bytes per signal describing the applied sensor.

Examples:

AgAgCl electrode
AgAgCl cup electrodes
thermistor
Rectal thermistor

For EDF+ EDF Annotations signals this band is filled with spaces.

Eight ASCII bytes per signal naming the physical unit.

Examples:

uV
mV
V
degreeC
degC

Both degreeC and degC appear as illustrative values in the spec text and worked example respectively. Use the ASCII letter u for micro, not the non-ASCII micro sign µ. Powers use ^ (e.g. m^3, m/s^2); evaluation order is prefix, then powers, then multiplication, then division (so Km^2 is (1000 m)^2, not 1000 * m^2).

For uncalibrated signals leave the band as eight spaces. The physical_min and physical_max fields must still differ (see below).

For EDF+ EDF Annotations signals this band is filled with spaces.

Eight ASCII bytes per signal each, in the unit declared by physical_dim. The two values fix the physical endpoints of the linear scaling map (see Digital-to-physical scaling).

Constraints:

  • physical_max != physical_min (strict). Equal values would cause a division by zero in the inverse map; some viewers refuse to open such files.
  • A negative amplifier gain is expressed by physical_max < physical_min. Do not swap the digital limits to express negative gain.
  • For uncalibrated signals with physical_dim set to spaces, the two physical extrema must still differ.

Example values from a worked EEG / temperature recording: EEG uses -440 / 510 (in uV), rectal temperature uses 34.4 / 40.2 (in degC).

Eight ASCII bytes per signal each, ASCII decimal integers.

Constraints:

  • digital_max > digital_min (strict).
  • For ordinary signals, both values must fit in 16-bit signed two’s complement (-32768..+32767), since each stored sample is a 16-bit signed integer.
  • For EDF+ EDF Annotations signals, digital_min and digital_max must be exactly -32768 and +32767 respectively, regardless of any physical interpretation.

The two values mark the digital endpoints of the linear scaling map. They often equal the A/D converter’s output range (e.g. -2048 / 2047 for a 12-bit ADC).

Eighty ASCII bytes per signal describing the analog filters applied to the signal during acquisition.

EDF+ uses the structured form

HP:0.1Hz LP:75Hz N:50Hz

with HP for high-pass, LP for low-pass, and N for notch filtering. Order is not pinned. Filters that were not applied may be omitted from the description. The full grammar is informal.

Example from a temperature channel:

LP:0.1Hz (first order)

For EDF+ EDF Annotations signals this band is filled with spaces.

Eight ASCII bytes per signal, ASCII decimal positive integer.

For ordinary signals this is the number of 16-bit samples the signal contributes to each data record. The per-signal sampling rate is

sample_rate[i] = samples_per_rec[i] / record_duration

provided record_duration > 0. Signals may carry different samples_per_rec values within the same file, so different sample rates per signal are normal.

For an EDF Annotations signal this field is the number of 2-byte slots reserved for annotation bytes in each data record; the annotation byte buffer per record is 2 * samples_per_rec[i] bytes.

Thirty-two ASCII bytes per signal. Reserved; write spaces, including for EDF Annotations signals.

Each data record is a contiguous binary block. Within one record the signals are stored channel-major in signal order:

record[k] = sig[0][0 .. s0-1] sig[1][0 .. s1-1] ... sig[ns-1][0 .. sns-1]

where s_i = samples_per_rec[i].

Each ordinary sample is a 16-bit signed integer in little-endian two’s complement — least significant byte first. The total size of one record is

record_bytes = 2 * sum_i samples_per_rec[i]

The file body starts immediately after the header at offset header_bytes and contains record_count * record_bytes bytes when record_count >= 0.
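
Decoding one record then reduces to walking the per-signal sample counts. The sketch below uses Python's struct module with the "<h" (little-endian signed 16-bit) format; record_size and split_record are hypothetical names. It decodes every slice as integers, so a real reader would keep the raw bytes of any EDF Annotations signal instead of using the decoded values.

```python
import struct
from typing import List, Sequence


def record_size(samples_per_rec: Sequence[int]) -> int:
    """Byte size of one data record: 2 * sum of samples_per_rec."""
    return 2 * sum(samples_per_rec)


def split_record(record: bytes, samples_per_rec: Sequence[int]) -> List[List[int]]:
    """Split one channel-major record into per-signal lists of int16 samples."""
    if len(record) != record_size(samples_per_rec):
        raise ValueError("record length does not match samples_per_rec")
    signals = []
    pos = 0
    for n in samples_per_rec:
        # "<%dh": n little-endian signed 16-bit integers.
        signals.append(list(struct.unpack_from("<%dh" % n, record, pos)))
        pos += 2 * n
    return signals
```

Record k of the file starts at byte offset header_bytes + k * record_size(samples_per_rec).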

For ordinary signals, samples are equispaced in time within a single data record. For EDF+D the interval from the last sample of record k to the first sample of record k + 1 may differ from the in-record sample interval and is set by the time-keeping TALs. Data records that follow each other in time must also follow each other in file order, in both EDF+C and EDF+D.

The four per-signal extremes (digital_min, digital_max, physical_min, physical_max) define a linear map from the stored 16-bit integer to the physical quantity:

gain = (physical_max - physical_min) / (digital_max - digital_min)
offset = physical_min - gain * digital_min
physical = gain * digital + offset

or equivalently

physical = physical_min + (digital - digital_min) * gain

The inverse map used by writers:

digital = round((physical - offset) / gain)
= round((physical - physical_min) / gain + digital_min)

Writers must clip or reject physical values whose digital equivalent falls outside digital_min..digital_max. A negative-gain channel is expressed by physical_max < physical_min; the formula above is unchanged.

Worked example for digital_min/max = -2048 / 2047, physical_min/max = -440 / 510 uV (EEG):

gain = (510 - (-440)) / (2047 - (-2048)) = 950 / 4095 ~= 0.232 uV/LSB
offset = -440 - 0.232 * (-2048) ~= 35 uV

i.e. an EEG offset of about 35 uV and a “digital per physical” gain of 4095/950 ~= 4.31 LSB/uV.
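
The scaling map and its inverse can be written out as below. This is a sketch with hypothetical function names; it rejects out-of-range values rather than clipping them, which is one of the two options named above, and it relies on Python's built-in round, which rounds exact halves to the nearest even integer.

```python
def scaling(physical_min: float, physical_max: float,
            digital_min: int, digital_max: int):
    """Return (gain, offset) of the linear digital-to-physical map."""
    if digital_max <= digital_min:
        raise ValueError("digital_max must be greater than digital_min")
    if physical_max == physical_min:
        raise ValueError("physical_max must differ from physical_min")
    gain = (physical_max - physical_min) / (digital_max - digital_min)
    offset = physical_min - gain * digital_min
    return gain, offset


def to_physical(digital: int, gain: float, offset: float) -> float:
    return gain * digital + offset


def to_digital(physical: float, gain: float, offset: float,
               digital_min: int, digital_max: int) -> int:
    """Inverse map; raises instead of clipping when out of range."""
    digital = round((physical - offset) / gain)
    if not digital_min <= digital <= digital_max:
        raise ValueError("value falls outside the declared digital range")
    return digital
```

For the worked EEG example, scaling(-440, 510, -2048, 2047) gives a gain of 950 / 4095 and an offset of roughly 35 uV.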

Every EDF+ file must contain at least one signal with the exact label EDF Annotations (15 characters, padded with one trailing space to fill the 16-byte label band), even when there are no user-visible annotations, because this signal carries the start time of every data record.

Required values in the signal’s per-signal header bands:

Field            Required value for EDF Annotations
label            EDF Annotations left-justified, padded with spaces
transducer       spaces
physical_dim     spaces
physical_min     any value, must differ from physical_max
physical_max     any value, must differ from physical_min
digital_min      exactly -32768
digital_max      exactly +32767
prefiltering     spaces
samples_per_rec  number of 2-byte slots allocated for annotation bytes
signal_reserved  spaces

-1 / 1 is the conventional minimal pair for physical_min / physical_max, but any two distinct ASCII decimal values are valid.

Within a data record the annotation signal’s samples_per_rec 2-byte slots are not interpreted as integers. They are a contiguous byte buffer of 2 * samples_per_rec bytes that holds annotation bytes left-to-right in source order; the 2-byte sample grouping is an artefact of the EDF framing and is irrelevant to the annotation parser. For example, the text abc is stored as the byte sequence 97 98 99.

Multiple EDF Annotations signals may appear in one file; each is subject to the same header constraints. The first such signal in signal order is the time-keeping signal (see Time-keeping annotation); the rest are auxiliary annotation channels with the same TAL grammar.

Sizing rule of thumb: choose samples_per_rec so that 2 * samples_per_rec covers the largest expected TAL payload of any record, including the trailing 0x00 padding that fills the unused tail.

The annotation byte buffer of each data record holds a sequence of TALs. A TAL is the unit of annotation: one timestamp followed by zero or more text annotations sharing that timestamp.

Reserved control bytes:

Byte  Decimal  Role
0x14  20       Annotation separator and end-of-timestamp marker
0x15  21       Onset/Duration separator (only present when Duration is)
0x00  0        TAL terminator and unused-byte fill

TAL        = Onset ( 0x15 Duration )? 0x14 ( Annotation 0x14 )* 0x00
Onset      = ( "+" | "-" ) Digits ( "." Digits )?
Duration   = Digits ( "." Digits )?
Digits     = digit ( digit )*
Annotation = UTF-8 text with restrictions described below

The bytes outside the regular grammar are forbidden inside their respective slots: 0x14 cannot appear inside an Annotation, 0x15 cannot appear inside Onset or Duration, 0x00 cannot appear anywhere except at the TAL terminator.

ASCII text using only bytes 0x2B (+), 0x2D (-), 0x2E (.), and 0x30..0x39 (digits 0..9).

Rules:

  • Must start with + or -.
  • + means the event occurs at or after file start; - means before.
  • The decimal point and fractional digits are optional. When present, the . appears exactly once and is followed by at least one digit; include it only when sub-second precision is meaningful.
  • Fractional precision is arbitrary.

Examples: +0, +180, +1800.2, -0.065, +3456.789.

The value is elapsed seconds from the file’s start date / time pair in the main header.

Optional. ASCII text using only bytes 0x2E (.) and 0x30..0x39 (digits). No sign character is allowed.

When Duration is omitted, the preceding 0x15 byte is also omitted and the timestamp is just Onset 0x14.

The value is the event length in seconds.

Each annotation is the byte sequence between two 0x14 bytes (or between the trailing 0x14 of the timestamp and the next 0x14).

Rules:

  • Annotation text is UTF-8 and may carry any Unicode Basic Multilingual Plane character (the first 65 534 code points).
  • The byte 0x14 cannot appear inside an annotation; it would close the annotation early.
  • US-ASCII C0 control bytes 0x00..0x1F are forbidden except for 0x09 (TAB), 0x0A (LF), and 0x0D (CR), which are explicitly permitted to allow multi-line text and tables.
  • For repeating events the same annotation string must be used every time so that automated averaging, superimposition, and re-montaging can group them. Different events must use different strings.
  • An event spanning multiple data records is annotated once, in the record that contains its start; not repeated in following records.
  • The first TAL of a data record begins at byte 0 of the annotation signal’s slot for that record.
  • Subsequent TALs follow immediately after the preceding TAL’s terminating 0x00.
  • A TAL, including its trailing 0x00, must fit entirely inside one data record. TALs cannot span a record boundary.
  • Unused bytes at the end of the slot are filled with 0x00.
  • Annotations that describe events occurring inside a given record must be placed in that record’s annotation slot, even when the event onset is before the record’s own start (e.g. a pre-interval stimulus). Such annotations must follow the time-keeping TAL of the record, never precede it.
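
The grammar and placement rules above translate into a short parser. The following is a sketch rather than a validated implementation; Tal and parse_tals are hypothetical names. It relies on the rule that 0x00 appears only as TAL terminator or padding, so splitting the buffer on 0x00 isolates each TAL.

```python
import re
from typing import List, NamedTuple, Optional

ONSET_RE = re.compile(rb"[+-][0-9]+(\.[0-9]+)?")
DURATION_RE = re.compile(rb"[0-9]+(\.[0-9]+)?")
ALLOWED_CONTROLS = (0x09, 0x0A, 0x0D)  # TAB, LF, CR


class Tal(NamedTuple):
    onset: float
    duration: Optional[float]
    annotations: List[str]


def parse_tals(buffer: bytes) -> List[Tal]:
    """Parse the annotation byte buffer of one data record."""
    chunks = buffer.split(b"\x00")
    if chunks[-1] != b"":
        raise ValueError("last TAL has no terminating 0x00")
    tals = []
    padding_seen = False
    for chunk in chunks[:-1]:
        if chunk == b"":
            padding_seen = True  # 0x00 fill at the end of the slot
            continue
        if padding_seen:
            raise ValueError("TAL found after 0x00 padding")
        if not chunk.endswith(b"\x14"):
            raise ValueError("TAL does not end with 0x14 before its terminator")
        parts = chunk[:-1].split(b"\x14")
        stamp, texts = parts[0], parts[1:]
        onset_raw, sep, duration_raw = stamp.partition(b"\x15")
        if not ONSET_RE.fullmatch(onset_raw):
            raise ValueError("malformed onset")
        if sep and not DURATION_RE.fullmatch(duration_raw):
            raise ValueError("malformed duration")
        for text in texts:
            if any(b < 0x20 and b not in ALLOWED_CONTROLS for b in text):
                raise ValueError("forbidden control byte in annotation")
        tals.append(Tal(
            float(onset_raw.decode("ascii")),
            float(duration_raw.decode("ascii")) if sep else None,
            [text.decode("utf-8") for text in texts],
        ))
    return tals
```

Because each call receives exactly one record's annotation slot, a TAL that would span a record boundary shows up here as a missing terminator.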

Two simultaneous events at t = 180 s carried in one TAL:

"+180" 0x14 "Lights off" 0x14 "Close door" 0x14 0x00

The same content split into two single-annotation TALs:

"+180" 0x14 "Lights off" 0x14 0x00
"+180" 0x14 "Close door" 0x14 0x00

A 25.5 second apnea starting at t = 1800.2 s:

"+1800.2" 0x15 "25.5" 0x14 "Apnea" 0x14 0x00

A pre-stimulus beep 65 ms before file start:

"-0.065" 0x14 "Pre-stimulus beep 1000Hz" 0x14 0x00

The first TAL in the first EDF Annotations signal of every data record is the record’s time-keeping TAL. Its first annotation is empty, so the timestamp is followed directly by two 0x14 bytes (the 0x14 that ends the timestamp plus the 0x14 that closes the empty annotation) and then the TAL terminator:

Onset 0x14 0x14 0x00

The onset is the offset in seconds (possibly fractional) from the file’s start_date / start_time pair to the first sample of that data record.

Conventions and edge cases:

  • The first data record normally uses +0:
    "+0" 0x14 0x14 0x00
  • If the first record starts a fraction X seconds after the absolute whole second named by start_time, the first time-keeping TAL uses +0.X. The start_date / start_time pair is interpreted as the absolute whole second containing the first record’s start. If X is zero, the .X may be omitted.
  • For EDF+D, the gap after record k is onset[k+1] - onset[k] - record_duration (with record_duration > 0). When record_duration = 0, the onset values themselves define record times.
  • When a data record contains no ordinary signals (only annotations), the non-empty annotation immediately following the empty time-keeping annotation in the same TAL must name the event whose timing defines that record’s start. Example:
    "+3456.789" 0x14 0x14 "R-wave" 0x14 0x00
    states that this record starts at the occurrence of an R-wave 3456.789 s after file start.
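
A reader that only needs the record timeline can extract the time-keeping onset without parsing the rest of the buffer. The sketch below uses hypothetical names (record_onset, gaps) and assumes the time-keeping TAL carries no Duration, as in the examples above.

```python
import re
from typing import List, Sequence

ONSET_RE = re.compile(rb"[+-][0-9]+(\.[0-9]+)?")


def record_onset(annotation_buffer: bytes) -> float:
    """Onset in seconds of a record, read from its time-keeping TAL."""
    end = annotation_buffer.find(b"\x00")
    if end < 0:
        raise ValueError("no TAL terminator in annotation buffer")
    stamp, sep, rest = annotation_buffer[:end].partition(b"\x14")
    # The first annotation must be empty: Onset 0x14 0x14 ...
    if not sep or not rest.startswith(b"\x14"):
        raise ValueError("first TAL is not a time-keeping TAL")
    if not ONSET_RE.fullmatch(stamp):
        raise ValueError("malformed time-keeping onset")
    return float(stamp.decode("ascii"))


def gaps(onsets: Sequence[float], record_duration: float) -> List[float]:
    """Gap after each record k: onset[k+1] - onset[k] - record_duration."""
    return [b - a - record_duration for a, b in zip(onsets, onsets[1:])]
```

For an EDF+C file every gap should come out as zero, up to the precision of the onset strings.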

Discontinuous recordings and non-equidistant data

EDF+D allows arbitrary inter-record gaps. The ordering rules still apply: records in temporal order are also in file order, and within a single record ordinary signal samples remain equispaced at the per-signal in-record rate.

Readers placing samples on the absolute timeline must use each record’s time-keeping onset, not the record index, to compute the absolute time of the first sample.

record_duration = 0 covers two degenerate cases (see also record_duration):

  • The file contains no ordinary signals, only annotations. There are no signal samples to place on the timeline, so the file is essentially a stream of annotation records.
  • Each ordinary signal contributes exactly one sample per data record, and the records sit at arbitrary times. The per-record sampling-rate formula does not apply; each sample’s absolute time comes from the record’s time-keeping TAL.

EDF+ requires the label band to follow the form

<signal-type> <specification>

with one space between the components, left-justified and space-padded to 16 bytes. The signal-type token names the modality (ECG, EEG, EOG, EMG, Resp, Temp, SaO2, etc.); the specification names the lead, electrode pair, or sensor site. The full vocabulary lives at https://www.edfplus.info/specs/edftexts.html.

For an ECG channel the signal-type token is ECG and the specification is the lead name. The recognized lead names are:

ECG I ECG II ECG III
ECG aVR ECG aVL ECG aVF ECG -aVR
ECG V1 ECG V2 ECG V3 ECG V4 ECG V5 ECG V6
ECG V2R ECG V3R ECG V4R
ECG V7 ECG V8 ECG V9
ECG X ECG Y ECG Z

The full 16-byte label is the string above, space-padded. For example a right-sided V2R lead stores the bytes ECG V2R (ECG V2R followed by 9 trailing spaces).

Notes on lead semantics:

  • -aVR is the inverted augmented-vector lead.
  • V2R, V3R, V4R are the right-sided precordial leads.
  • V7, V8, V9 are the posterior precordial leads.
  • X, Y, Z are the orthogonal Frank-lead set.

Channels labelled with any other signal-type token (EEG, EMG, Resp, Temp, SaO2, EOG, ERG, MEG, MCG, EP, Sound, Light, Event, …) are not ECG and must be excluded from the ECG signal set.

physical_dim decomposes into an optional SI prefix followed by a basic dimension. ECG values use V, mV, or uV. The prefix multiplies the basic dimension by a power of ten:

Prefix  Power  Name
K       3      kilo
m       -3     milli
u       -6     micro
n       -9     nano

Case is significant: M is mega, m is milli. The micro prefix is the ASCII letter u (0x75); the Unicode micro sign µ is forbidden because header bytes must remain in printable US-ASCII 0x20..0x7E. The full SI table from yotta (Y, 10^24) to yocto (y, 10^-24) is at https://www.edfplus.info/specs/edftexts.html; the prefixes above are the ones that appear in ECG dimension strings.
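
For ECG channels the prefix table is enough to normalise values to volts. The sketch below is deliberately narrow: it handles only the basic dimension V with the four prefixes listed above, and volt_scale is a hypothetical name. General physical_dim strings (powers, compound units, other base dimensions) are outside its scope.

```python
PREFIX_EXPONENT = {"K": 3, "m": -3, "u": -6, "n": -9}


def volt_scale(physical_dim: str) -> float:
    """Factor that converts a value in the given voltage unit to volts."""
    unit = physical_dim.rstrip(" ")
    if unit == "V":
        return 1.0
    if len(unit) == 2 and unit[1] == "V" and unit[0] in PREFIX_EXPONENT:
        return 10.0 ** PREFIX_EXPONENT[unit[0]]
    raise ValueError("not a supported voltage unit: %r" % unit)
```

A decoded sample of 510 with physical_dim "uV" then corresponds to 510 * volt_scale("uV") volts.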

The EDF Annotations signal carries free-form UTF-8 text. EDF+ reserves a fixed vocabulary so software can auto-detect events without parsing free text. The cardiac-relevant strings are:

  • Sinus Tachycardia
  • WC tachycardia (wide-complex)
  • NC tachycardia (narrow-complex)
  • Bradycardia
  • Asystole
  • Atrial fibrillation

Plus two universally useful boundary markers:

  • Recording starts
  • Recording ends

The full vocabulary covers sleep stages, respiratory events, movement events, arousals, and other non-cardiac entries; see the standard-texts reference.

A cardiac annotation must carry a Duration in its TAL timestamp (the 0x15 <duration> segment) since these events have a temporal extent. The recording-boundary strings are instantaneous and omit Duration.

An annotation may be bound to a specific signal channel by appending @@ followed by the target channel’s exact label (after stripping trailing spaces). For an ECG file:

Atrial fibrillation@@ECG II

ties the annotation to the lead-II channel.
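
Splitting the binding off an annotation string is a one-liner. In this sketch split_binding is a hypothetical name, and splitting at the first @@ is an assumption; the text above does not say how an annotation containing more than one @@ should be read.

```python
from typing import Optional, Tuple


def split_binding(annotation: str) -> Tuple[str, Optional[str]]:
    """Return (event text, bound channel label or None)."""
    text, sep, label = annotation.partition("@@")
    return text, (label if sep else None)
```

The returned label can be compared against each signal's label after stripping that label's trailing spaces.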

A conforming reader must reject inputs that violate any of the following:

  • The header is shorter than 256 bytes.
  • Bytes 0..7 are not "0       " (digit 0 followed by seven spaces).
  • Any header byte is outside 0x20..0x7E (for EDF+; plain EDF in practice also expects printable ASCII).
  • signal_count is missing, zero, negative, or not a parseable integer; or its decimal form does not fit in 4 ASCII characters.
  • header_bytes is missing, unparseable, or not equal to 256 * (signal_count + 1).
  • A per-signal band is truncated; any band cannot be parsed.
  • For any signal, digital_max <= digital_min.
  • For any signal, physical_max == physical_min.
  • For any signal, samples_per_rec is missing, zero, or negative.
  • record_duration is missing, negative, NaN, infinite, or unparseable; 0 is permitted only in EDF+ for annotation-only or single-sample-per-record use.
  • record_count is neither a non-negative integer nor -1.
  • record_count is -1 in a closed file (best-effort; readers may treat this as a warning rather than a hard error and still attempt to scan the body).
  • The body is shorter than record_count * record_bytes when record_count >= 0.
  • The reserved field starts with "EDF+" followed by any character other than 'C' or 'D'.
  • An ordinary signal has the reserved label EDF Annotations.
  • An EDF+ file has no EDF Annotations signal.
  • An EDF Annotations signal has digital_min != -32768, digital_max != +32767, physical_min == physical_max, or any non-space byte in transducer, physical_dim, prefiltering, or signal_reserved.
  • A TAL has a missing or malformed onset, a malformed duration (when present), an annotation byte equal to 0x14, or no terminating 0x00 before the record’s annotation slot ends.
  • A TAL spans a data-record boundary.
  • An annotation contains a forbidden control byte (any byte in 0x00..0x1F other than 0x09, 0x0A, 0x0D, plus the always-forbidden 0x14).
  • The first TAL of the first EDF Annotations signal in any data record is not a valid time-keeping TAL: missing onset, non-empty first annotation when the record carries ordinary signals, missing start-defining annotation when the record carries no ordinary signals, etc.

The validation order is at the reader’s discretion.
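
As one possible shape for the per-signal part of these checks, the sketch below collects violations for a single signal instead of failing on the first one. The function name check_signal and the dict layout (numeric fields already parsed, text fields already right-stripped) are assumptions made for this example.

```python
ANNOTATION_LABEL = "EDF Annotations"
BLANK_FOR_ANNOTATIONS = ("transducer", "physical_dim",
                         "prefiltering", "signal_reserved")


def check_signal(sig: dict) -> list:
    """Return a list of rule violations for one signal (empty when valid)."""
    problems = []
    if sig["digital_max"] <= sig["digital_min"]:
        problems.append("digital_max must be greater than digital_min")
    if sig["physical_max"] == sig["physical_min"]:
        problems.append("physical_max must differ from physical_min")
    if sig["samples_per_rec"] <= 0:
        problems.append("samples_per_rec must be positive")
    if sig["label"] == ANNOTATION_LABEL:
        if (sig["digital_min"], sig["digital_max"]) != (-32768, 32767):
            problems.append("annotation signal needs digital range -32768..32767")
        for name in BLANK_FOR_ANNOTATIONS:
            if sig[name].strip(" "):
                problems.append(name + " must be spaces for EDF Annotations")
    elif sig["digital_min"] < -32768 or sig["digital_max"] > 32767:
        problems.append("digital range must fit in 16-bit signed integers")
    return problems
```

Returning a list lets a reader decide which violations are fatal and which are only reported as warnings.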

  • EDF and EDF+ store samples in channel-major order within each data record. Formats like WFDB interleave channels sample by sample; EDF does not. A reader extracting a single signal can skip the uninteresting bytes inside each record without reordering.
  • The 16-bit sample width is fixed. BDF and BDF+ are separate formats with a 24-bit sample width and a different version-field value, and are not covered here.
  • The version field is kept at "0 " for both EDF and EDF+ so that EDF-era viewers continue to open EDF+ files. The only semantic incompatibility is the discontinuous variant EDF+D: an EDF viewer will render an EDF+D file’s records back-to-back, hiding the gaps.
  • The 1985..2084 Y2K window is intrinsic to the two-digit-year start_date. For dates outside this window readers must consult the four-digit-year Startdate subfield of recording_id (EDF+ only).
  • Header bytes are printable US-ASCII. UTF-8 appears only inside annotation text inside data records.