Many tools need to input parameterised speech data, and HTK provides a number of different methods for doing this.
All HTK speech input is controlled by configuration parameters, which specify the processing operations to apply to each input speech file or audio source. This chapter describes speech input/output in HTK. The general mechanisms are explained and the various configuration parameters are defined. The facilities for signal pre-processing, linear prediction-based processing, Fourier-based processing and vector quantisation are presented, and the supported file formats are given. Also described are the facilities for augmenting the basic speech parameters with energy measures, delta coefficients and acceleration (delta-delta) coefficients, and for splitting each parameter vector into multiple data streams to form observations. The chapter concludes with a brief description of the tools HLIST and HCOPY, which are provided for viewing, manipulating and encoding speech files.
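As a rough illustration of how configuration parameters control speech input, the fragment below sketches a configuration file that asks for waveform input to be encoded as MFCCs with energy, delta and acceleration coefficients appended. The particular values are illustrative assumptions, not settings taken from this chapter; each parameter is defined in the sections that follow.

```
# Illustrative configuration only; all values here are assumed examples
SOURCEKIND   = WAVEFORM      # input is a sampled waveform
SOURCEFORMAT = HTK           # input file format
TARGETKIND   = MFCC_E_D_A    # MFCCs plus energy, deltas and accelerations
TARGETRATE   = 100000.0      # frame period in 100 ns units (10 ms)
WINDOWSIZE   = 250000.0      # analysis window in 100 ns units (25 ms)
USEHAMMING   = T             # apply a Hamming window to each frame
PREEMCOEF    = 0.97          # pre-emphasis coefficient
NUMCHANS     = 26            # number of filterbank channels
NUMCEPS      = 12            # number of cepstral coefficients
```

A file of this kind would typically be passed to a tool with the -C option, so that the same encoding is applied to every input file that the tool reads.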