LIBCWAV -- Waveform I/O Library

NAME

LIBCWAV - Waveform I/O Library

DESCRIPTION

libcwav is a library of functions for reading, writing, and organizing waveform data from speech and other signals. This library is used by most of the speech processing software developed by the ASEL Speech Research group. The main features of the library are:

transparent access to a variety of common sample formats (e.g., 16-bit twos-complement integers, 12-bit and 8-bit offset binary integers)
architectural independence (the same binary data files are readable on PC, Sparc, SGI, etc.)
ability to read/write data in either local (ASEL .wav) or RIFF (Microsoft .wav) file formats
storage of and access to multi-channel data either as individual channels or as multi-sample records
segmentation and labeling information stored with the sample data, allowing I/O routines to reference named portions of larger files

Presently, library versions are available for Fortran 77 and C compilers on Unix, MS-DOS, and Windows 3.1 platforms.

Waveform Filename Specifications

Some of the most useful features of the libcwav routines are exposed as extensions to normal file naming conventions. These extensions allow a name to specify a named region of a file, an individual channel of a multi-channel file, or both. Specifically, a fully qualified waveform file name has the syntax:

[path]basename[.wav][$segment][#channel]

where only the basename of the file is required. The name can optionally contain an extension (.wav by default), a segment name up to 6 characters in length indicating a waveform region that is defined within the file (see edw(1) on how segments are defined), and a channel number ranging from 0 to N-1, where N is the number of channels of data represented in each record of the waveform file. More information on using this syntax is available under the topic Specifying waveform segments and channels.
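The syntax above can be illustrated with a small splitter. This is only a sketch: WavName and split_wav_name are hypothetical names, not libcwav functions, and the code assumes that '$' and '#' appear only as the segment and channel delimiters. It does not append the default .wav extension.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parts of a waveform file name;
   this is not a libcwav type. */
typedef struct {
    char file[256];   /* [path]basename[.wav], exactly as given */
    char segment[7];  /* up to 6 characters plus terminator; "" if absent */
    int  channel;     /* channel number, or -1 if absent */
} WavName;

/* Split "[path]basename[.wav][$segment][#channel]" into its parts.
   Returns 0 on success, -1 if a part is malformed or too long. */
int split_wav_name(const char *spec, WavName *out)
{
    assert(spec != NULL && out != NULL);

    const char *end = spec + strlen(spec);
    out->segment[0] = '\0';
    out->channel = -1;

    /* Optional "#channel" suffix: digits only, up to the end of the string. */
    const char *hash = strrchr(spec, '#');
    if (hash != NULL) {
        char *rest;
        long value;
        if (!isdigit((unsigned char)hash[1]))
            return -1;
        value = strtol(hash + 1, &rest, 10);
        if (*rest != '\0' || value > INT_MAX)
            return -1;
        out->channel = (int)value;
        end = hash;
    }

    /* Optional "$segment" part: 1 to 6 characters before the channel. */
    const char *dollar = memchr(spec, '$', (size_t)(end - spec));
    if (dollar != NULL) {
        size_t seg_len = (size_t)(end - dollar) - 1;
        if (seg_len < 1 || seg_len > 6)
            return -1;
        memcpy(out->segment, dollar + 1, seg_len);
        out->segment[seg_len] = '\0';
        end = dollar;
    }

    /* Whatever remains is [path]basename[.wav]. */
    size_t file_len = (size_t)(end - spec);
    if (file_len < 1 || file_len >= sizeof out->file)
        return -1;
    memcpy(out->file, spec, file_len);
    out->file[file_len] = '\0';
    return 0;
}
```

For example, splitting "data/s1.wav$vowel#1" with this sketch yields the file part "data/s1.wav", the segment "vowel", and channel 1.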

Waveform Data Structures

Waveforms are sampled functions of time, described in terms of their sampling rate (in samples per second or, equivalently, Hz) and sample resolution (in bits per sample). For digitized speech signals, values typically range from an 8000 Hz sample rate with 8-bit samples (roughly telephone quality) to a 44100 Hz sample rate with 16-bit samples (CD quality). For multi-channel data such as stereo sound or physiological data (e.g., EEG data), each sample period consists of a record containing one sample for each channel; thus the 'sample rate' of multi-channel data refers to the number of multi-sample records per second. libcwav describes these waveform properties in a data structure called a Waveform Data Block, defined as:



     typedef struct {
         short           format;
         short           file_type;
         short           smplbits;
         unsigned short  sample_rate;
         short           ref_level;
         short           n_segments;
         short           n_per_rec;
         float           scale;
         float           bias;
         short           error;
     } WDB;

The WDB structure is set by the open waveform function opnwav when existing files are opened, or initialized by an application program and passed to opnwav when a new file is to be created. The elements of a WDB are:
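As a rough sketch of the second case, an application might fill in a WDB as follows before handing it to opnwav. The helper describe_new_waveform is hypothetical, and the typedef is repeated here only to keep the example self-contained; a real program would take it from the library header. Zeroing the remaining fields and using an identity scale are assumptions, not documented requirements, and the call to opnwav itself is omitted because its signature is not given in this section.

```c
#include <string.h>

/* Repeated from the definition above so the sketch compiles on its own. */
typedef struct {
    short           format;
    short           file_type;
    short           smplbits;
    unsigned short  sample_rate;
    short           ref_level;
    short           n_segments;
    short           n_per_rec;
    float           scale;
    float           bias;
    short           error;
} WDB;

/* Hypothetical helper: describe a waveform that is about to be created.
   format and file_type are expected to be library constants such as
   PC_format and RIFF. */
WDB describe_new_waveform(short format, short file_type,
                          unsigned short sample_rate, short n_per_rec)
{
    WDB wdb;

    /* Assumption: zero is an acceptable starting value for fields
       that are not set explicitly below. */
    memset(&wdb, 0, sizeof wdb);

    wdb.format      = format;
    wdb.file_type   = file_type;
    wdb.sample_rate = sample_rate;
    wdb.n_per_rec   = n_per_rec;

    /* Identity scaling: real_value equals sample_value. */
    wdb.scale = 1.0f;
    wdb.bias  = 0.0f;
    return wdb;
}
```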

WDB.format

Binary format of data in file. Possible values:

O8_format: 8-bit Offset Binary
OB_format: 12-bit Offset Binary
PC_format: 16-bit Twos Complement
TC_format: 12-bit Twos Complement
GT_format: Generic Twos Complement (bits per sample in WDB.smplbits)
GO_format: Generic Offset Binary (bits per sample in WDB.smplbits)
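The difference between the two encoding families can be sketched in a few lines of C. The functions below are illustrative helpers, not libcwav routines, and they assume the usual conventions: an offset binary code of 2^(bits-1) represents zero, and a twos-complement code is sign-extended from its top bit.

```c
#include <assert.h>
#include <stdint.h>

/* Convert an offset binary sample of the given width to a signed value
   by subtracting the midpoint code 2^(bits-1). */
int32_t offset_to_signed(uint32_t raw, int bits)
{
    assert(bits >= 1 && bits <= 16);
    uint32_t mask = (UINT32_C(1) << bits) - 1;
    int32_t midpoint = (int32_t)(UINT32_C(1) << (bits - 1));
    return (int32_t)(raw & mask) - midpoint;
}

/* Sign-extend a twos-complement sample of the given width. */
int32_t twos_to_signed(uint32_t raw, int bits)
{
    assert(bits >= 1 && bits <= 16);
    uint32_t mask = (UINT32_C(1) << bits) - 1;
    uint32_t sign = UINT32_C(1) << (bits - 1);
    raw &= mask;
    if (raw & sign)
        return (int32_t)raw - (int32_t)(mask + 1);
    return (int32_t)raw;
}
```

The bits argument plays the role that WDB.smplbits plays for the generic formats: it gives the width of the code being decoded.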

WDB.file_type

Storage format of the waveform in file. Possible values:

RIFF: A waveform file format using the standard IBM/Microsoft Resource Interchange File Format (RIFF).
ASEL: A waveform file using the ASEL internal storage format.

WDB.smplbits

Number of bits per sample (1-16). Only needed for the GT_format and GO_format formats.

WDB.sample_rate

The sample rate in Hz.

WDB.n_segments

Number of segments defined in the file (only applicable to existing waveforms). See wavgst for a description of segment tables.

WDB.n_per_rec

Total number of channels of data (i.e., samples) per record.
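The relationship between records, channels, and time can be sketched with two small helpers. Both are hypothetical, not part of libcwav, and the second assumes that records are held in memory one after another with channel 0 first, a layout this section does not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Duration in seconds of n_records multi-sample records.
   sample_rate is the number of records per second. */
double records_to_seconds(size_t n_records, unsigned sample_rate)
{
    assert(sample_rate > 0);
    return (double)n_records / (double)sample_rate;
}

/* Index of one channel's sample in a buffer of consecutive records
   (assumed layout: record 0 channels 0..N-1, then record 1, ...). */
size_t interleaved_index(size_t record, int n_per_rec, int channel)
{
    assert(n_per_rec > 0 && channel >= 0 && channel < n_per_rec);
    return record * (size_t)n_per_rec + (size_t)channel;
}
```

Under these assumptions, 4000 records at 8000 Hz span half a second regardless of how many channels each record holds.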

WDB.scale

Together with WDB.bias allows linear scaling from integer sample values to real-world coordinates. Sample values are intended to be transformed as:

real_value = WDB.scale * sample_value + WDB.bias;

WDB.bias

The intercept for linear transformation of sample values to real-world values.
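Applied to a buffer of samples, the transformation might look like the sketch below. The function scale_samples is a hypothetical helper, not a libcwav routine, and it assumes samples have already been read into an array of short values.

```c
#include <assert.h>
#include <stddef.h>

/* Apply real_value = scale * sample_value + bias to n samples.
   scale and bias would normally come from WDB.scale and WDB.bias. */
void scale_samples(const short *in, float *out, size_t n,
                   float scale, float bias)
{
    assert(n == 0 || (in != NULL && out != NULL));
    for (size_t i = 0; i < n; i++)
        out[i] = scale * (float)in[i] + bias;
}
```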

WDB.error

One of the following constants will be returned by opnwav if an error occurs in opening a file:

NO_FILE: File could not be found or created
INVALID_FORMAT: Format not one of those defined
NO_SEGMENT: Segment not found in file
INVALID_CHANNEL: Channel number exceeds the channels available per record
INSUF_MEMORY: Memory allocation failure
INSUF_BUFFER: Predefined buffer too small for record
INVALID_FILE: File neither RIFF nor ASEL format

AUTHOR

Shirley Peters, H.T.Bunnell
