HTK format files consist of a contiguous sequence of *samples*
preceded by a header. Each sample is a vector of either 2-byte integers or
4-byte floats. 2-byte integers are used for compressed forms as described
below and for vector quantised data as described later in
section 5.11. HTK format data files can also be used to store
speech waveforms as described in section 5.8.

The HTK file format header is 12 bytes long and contains the following data

nSamples - number of samples in file (4-byte integer)

sampPeriod - sample period in 100ns units (4-byte integer)

sampSize - number of bytes per sample (2-byte integer)

parmKind - a code indicating the sample kind (2-byte integer)

The parameter kind consists of a 6 bit code representing the basic parameter kind plus additional bits for each of the possible qualifiers. The basic parameter kind codes are

| Code | Kind | Description |
|------|------|-------------|
| 0 | `WAVEFORM` | sampled waveform |
| 1 | `LPC` | linear prediction filter coefficients |
| 2 | `LPREFC` | linear prediction reflection coefficients |
| 3 | `LPCEPSTRA` | LPC cepstral coefficients |
| 4 | `LPDELCEP` | LPC cepstra plus delta coefficients |
| 5 | `IREFC` | LPC reflection coef in 16 bit integer format |
| 6 | `MFCC` | mel-frequency cepstral coefficients |
| 7 | `FBANK` | log mel-filter bank channel outputs |
| 8 | `MELSPEC` | linear mel-filter bank channel outputs |
| 9 | `USER` | user defined sample kind |
| 10 | `DISCRETE` | vector quantised data |

and the bit-encoding for the qualifiers (in octal) is

| Qualifier | Bit (octal) | Meaning |
|-----------|-------------|---------|
| `_E` | 000100 | has energy |
| `_N` | 000200 | absolute energy suppressed |
| `_D` | 000400 | has delta coefficients |
| `_A` | 001000 | has acceleration coefficients |
| `_C` | 002000 | is compressed |
| `_Z` | 004000 | has zero mean static coef. |
| `_K` | 010000 | has CRC checksum |
| `_O` | 020000 | has 0'th cepstral coef. |
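The header fields and the parameter kind encoding described above can be illustrated with a short Python sketch. It is not taken from the HTK sources: the function names `parse_header` and `decode_parm_kind` are hypothetical, big-endian byte order is an assumption (pass `"<"` for little-endian data), and the order in which qualifier suffixes are appended simply follows the qualifier list above.

```python
import struct

# Basic parameter kind codes 0..10 and qualifier bits, as listed above.
BASE_KINDS = ["WAVEFORM", "LPC", "LPREFC", "LPCEPSTRA", "LPDELCEP", "IREFC",
              "MFCC", "FBANK", "MELSPEC", "USER", "DISCRETE"]
QUALIFIERS = [("_E", 0o000100), ("_N", 0o000200), ("_D", 0o000400),
              ("_A", 0o001000), ("_C", 0o002000), ("_Z", 0o004000),
              ("_K", 0o010000), ("_O", 0o020000)]


def parse_header(data, byte_order=">"):
    """Unpack the 12-byte header into its four fields.

    byte_order=">" (big-endian) is an assumption of this sketch.
    parmKind is read as unsigned because it is used as a bit field.
    """
    n_samples, samp_period, samp_size, parm_kind = struct.unpack(
        byte_order + "iihH", data[:12])
    return n_samples, samp_period, samp_size, parm_kind


def decode_parm_kind(parm_kind):
    """Split parmKind into the 6-bit base code plus qualifier suffixes."""
    base = parm_kind & 0o77  # low 6 bits hold the basic parameter kind
    name = BASE_KINDS[base] if base < len(BASE_KINDS) else f"UNKNOWN({base})"
    for suffix, bit in QUALIFIERS:
        if parm_kind & bit:
            name += suffix
    return name
```

For example, a `parmKind` value of `6 | 0o100 | 0o400` decodes to `MFCC_E_D`, and a `sampPeriod` of 100000 corresponds to a 10 ms frame period, since the unit is 100ns.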

All parameterised forms of HTK data files consist of a sequence of vectors.
Each vector is organised as shown by the examples in Fig 5.4,
where various qualified forms are listed. As can be seen, an energy
value, if present, immediately follows the base coefficients. If delta
coefficients are added, these follow the base coefficients and energy value.
Note that the base form `LPC` is used in this figure only as an example;
the same layout applies to all base sample kinds. If the 0'th order cepstral
coefficient is included as well as energy, then it is inserted immediately
before the energy coefficient; otherwise it replaces it.
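The ordering rules in this paragraph can be made concrete with a small sketch that lists the components of one vector in stored order. The function `vector_layout` and its labels are hypothetical. Placing acceleration coefficients after the deltas is an assumption made by analogy, and the effect of the `_N` qualifier on the layout is not modelled here.

```python
def vector_layout(n_base, qualifiers=""):
    """Return component labels for one parameter vector, in stored order.

    n_base     -- number of base coefficients (c1..cN)
    qualifiers -- string such as "_E_D"; only E, O, D and A are interpreted
    """
    quals = set(qualifiers.split("_"))
    static = [f"c{i}" for i in range(1, n_base + 1)]
    if "O" in quals:
        static.append("c0")  # 0'th cepstral coef: before energy, or in its place
    if "E" in quals:
        static.append("E")   # energy immediately follows the base coefficients
    layout = list(static)
    if "D" in quals:
        layout += ["d" + s for s in static]  # deltas follow base and energy
    if "A" in quals:
        layout += ["a" + s for s in static]  # assumed to follow the deltas
    return layout
```

For instance, `vector_layout(3, "_E_D")` gives three base coefficients, the energy, then the deltas of all four static components.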

For external storage of speech parameter files, two compression methods are
provided. For LP coding only, the `IREFC` parameter kind exploits the
fact that the reflection coefficients are bounded by $\pm 1$ and hence they can
be stored as scaled integers such that +1.0 is stored as 32767 and -1.0
is stored as -32767. For other types of parameterisation, a more general
compression facility indicated by the
`_C` qualifier is used.
HTK compressed parameter files consist of a set of compressed parameter
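vectors stored as shorts such that for parameter *x*

$$x_{short} = A \cdot x_{float} - B$$

The coefficients *A* and *B* are defined as

$$A = \frac{2I}{x_{max} - x_{min}}, \qquad B = \frac{(x_{max} + x_{min})\,I}{x_{max} - x_{min}}$$

where $x_{max}$ is the maximum value of parameter *x* in the whole file and
$x_{min}$ is the corresponding minimum. *I* is the maximum range of a 2-byte
integer, i.e. 32767. With these definitions $x_{max}$ maps to $+I$ and
$x_{min}$ maps to $-I$. The values of *A* and *B* are stored as two floating
point vectors prepended to the start of the file immediately after the header.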
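Both compression schemes can be sketched in a few lines of Python. The sketch assumes the linear mapping `x_short = A * x - B` with `A = 2I / (x_max - x_min)` and `B = (x_max + x_min) * I / (x_max - x_min)`, which sends the per-parameter maximum to +32767 and the minimum to -32767. The function names are hypothetical, rounding to the nearest integer is an assumption, and the guard for a constant parameter is an illustrative choice rather than documented HTK behaviour.

```python
I_MAX = 32767  # maximum range of a 2-byte integer


def to_irefc(k):
    """Scale a reflection coefficient in [-1, 1] to a 16-bit integer."""
    return round(k * I_MAX)


def compress(frames):
    """Return (A, B, shorts) for a list of equal-length float vectors."""
    n = len(frames[0])
    A, B = [], []
    for j in range(n):
        column = [f[j] for f in frames]      # parameter j over the whole file
        x_max, x_min = max(column), min(column)
        span = x_max - x_min
        if span == 0:
            span = 1.0  # hypothetical guard: a constant parameter is stored as 0
        A.append(2 * I_MAX / span)
        B.append((x_max + x_min) * I_MAX / span)
    shorts = [[round(A[j] * f[j] - B[j]) for j in range(n)] for f in frames]
    return A, B, shorts


def decompress(A, B, shorts):
    """Invert the mapping: x = (x_short + B) / A."""
    return [[(s[j] + B[j]) / A[j] for j in range(len(s))] for s in shorts]
```

Decompression recovers each value only to within the quantisation step of roughly `(x_max - x_min) / 65534` per parameter.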

When a HTK tool writes out a speech file to external storage, no further
signal conversions are performed. Thus, for most purposes, the target
parameter kind specifies both the required internal representation and the form
of the written output, if any. However, there is a distinction in the way that
the external data is actually stored. Firstly, it can be compressed as
described above by setting the configuration parameter `SAVECOMPRESSED`
to true. If the target kind is `LPREFC` then this compression is
implemented by converting to `IREFC` otherwise the general compression
algorithm described above is used. Secondly, in order to avoid data corruption
problems, externally stored HTK parameter files can have a cyclic redundancy
checksum appended. This is indicated by the qualifier
`_K` and it is generated by setting
the configuration parameter `SAVEWITHCRC` to true. The principal tool
which uses these output conversions is HCOPY (see
section 5.13).
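The effect of these two options on the stored parameter kind can be summarised in a short sketch. It only restates the rule given in this paragraph and is not the HTK implementation: `stored_parm_kind` is a hypothetical helper, the two boolean arguments stand in for `SAVECOMPRESSED` and `SAVEWITHCRC`, and the numeric codes are those of the parameter kind and qualifier lists earlier in this section.

```python
BASE_MASK = 0o77       # low 6 bits: basic parameter kind
C_BIT = 0o002000       # _C: is compressed
K_BIT = 0o010000       # _K: has CRC checksum
LPREFC, IREFC = 2, 5   # basic parameter kind codes


def stored_parm_kind(target_kind, save_compressed=False, save_with_crc=False):
    """Return the parmKind code expected in the written file."""
    base = target_kind & BASE_MASK
    quals = target_kind & ~BASE_MASK
    if save_compressed:
        if base == LPREFC:
            base = IREFC       # LPREFC is compressed by converting to IREFC
        else:
            quals |= C_BIT     # otherwise the general _C compression is used
    if save_with_crc:
        quals |= K_BIT         # a checksum is appended and flagged with _K
    return base | quals
```

For a target kind of `MFCC_E` (code `6 | 0o100`), enabling both options yields the code for `MFCC_E_C_K`.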