NAME
LIBCWAV - Waveform I/O Library
DESCRIPTION
libcwav is a library of functions for reading, writing, and
organizing waveform data from speech and other signals. This library
is used by most of the speech processing software developed by the
ASEL Speech Research group. The main features of the library are:
- transparent access to a variety of common sample formats (e.g., 16-bit
twos-complement integers, 12-bit and 8-bit offset binary integers)
- architectural independence (the same binary data files are
readable on PC, Sparc, SGI, etc.)
- ability to read/write data in either local (ASEL .wav) or RIFF
(Microsoft .wav) file formats
- multi-channel data can be stored and accessed either as individual
channels or as multi-sample records
- segmentation and labeling information is stored with the sample data,
allowing I/O routines to reference named portions of larger files
At present, versions of the library are available for Fortran 77 and C
compilers on Unix, MS-DOS, and Windows 3.1 platforms.
Waveform Filename Specifications
Some of the most useful features of the libcwav routines are
provided as extensions to normal file naming conventions. These
extensions allow a file name to specify a named region of a file, an
individual channel in a multi-channel file, or both. Specifically, a fully
qualified waveform file name has the syntax:
[path]basename[.wav][$segment][#channel]
where only the basename of the file is required. The name can
optionally contain an extension (.wav by default), a segment name of up
to 6 characters indicating a waveform region that is
defined within the file (see edw(1) for how segments are defined), and
a channel number ranging from 0 to N-1, where N is the number of
channels of data represented in each record of the waveform file.
More information on using this syntax is available under the topic
"Specifying waveform segments and channels".
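The syntax above can be illustrated with a small parser. This is only a sketch of how such a name might be split into its parts, not the code libcwav itself uses. The WavSpec type and the parse_wav_spec function are hypothetical names introduced for this example, and the choice to split on the last '$' and '#' characters is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SEGMENT_LEN 6   /* segment names are at most 6 characters */

/* Hypothetical container for the parsed pieces; not a libcwav type. */
typedef struct {
    char file[256];                     /* [path]basename[.wav] */
    char segment[MAX_SEGMENT_LEN + 1];  /* empty if no $segment was given */
    int  channel;                       /* -1 if no #channel was given */
} WavSpec;

/* Split "[path]basename[.wav][$segment][#channel]" into its parts.
   Returns 0 on success, -1 if the specification is malformed. */
int parse_wav_spec(const char *spec, WavSpec *out)
{
    const char *end = spec + strlen(spec);
    const char *hash = strrchr(spec, '#');
    const char *dollar = NULL;
    char *base;
    size_t file_len, seg_len;

    out->segment[0] = '\0';
    out->channel = -1;

    /* Optional trailing "#channel": a non-negative decimal number. */
    if (hash != NULL) {
        char *stop;
        long ch;
        if (!isdigit((unsigned char)hash[1])) return -1;
        ch = strtol(hash + 1, &stop, 10);
        if (*stop != '\0' || ch > 32767) return -1;
        out->channel = (int)ch;
        end = hash;
    }

    /* Optional "$segment": look for the last '$' before the channel part. */
    for (const char *p = end; p > spec; --p) {
        if (p[-1] == '$') { dollar = p - 1; break; }
    }
    if (dollar != NULL) {
        seg_len = (size_t)(end - dollar - 1);
        if (seg_len == 0 || seg_len > MAX_SEGMENT_LEN) return -1;
        memcpy(out->segment, dollar + 1, seg_len);
        out->segment[seg_len] = '\0';
        end = dollar;
    }

    /* What remains is the file name itself. */
    file_len = (size_t)(end - spec);
    if (file_len == 0 || file_len >= sizeof out->file) return -1;
    memcpy(out->file, spec, file_len);
    out->file[file_len] = '\0';

    /* Locate the basename and append the default .wav extension if the
       basename has no extension of its own. */
    base = out->file;
    for (char *p = out->file; *p != '\0'; ++p) {
        if (*p == '/' || *p == '\\') base = p + 1;
    }
    if (*base == '\0') return -1;
    if (strchr(base, '.') == NULL) {
        if (file_len + 4 >= sizeof out->file) return -1;
        strcat(out->file, ".wav");
    }
    return 0;
}
```

With this sketch, a name such as data/vowels.wav$seg1#1 would be split into the file data/vowels.wav, the segment seg1, and channel 1. Whether the channel number is valid for a given file can only be checked once the number of channels per record is known.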
Waveform Data Structures
Waveforms are sampled functions of time, described in terms
of their sampling rate (in samples per second or equivalently Hz) and
sample resolution (in bits per sample). For digitized speech signals,
values typically range from an 8000 Hz sample rate with 8-bit samples
(roughly telephone quality) to a 44100 Hz sample rate with 16-bit
samples (CD quality). For multi-channel data such as stereo sound or
physiological data (e.g., EEG data), each sample period consists of a
record containing one sample for each channel. The 'sample rate' for
multi-channel data therefore refers to the number of multi-sample
records per second. libcwav describes these waveform properties with a
data structure called a Waveform Data Block (WDB), defined as:
typedef struct {
    short format;
    short file_type;
    short smplbits;
    unsigned short sample_rate;
    short ref_level;
    short n_segments;
    short n_per_rec;
    float scale;
    float bias;
    short error;
} WDB;
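The idea of multi-sample records can be illustrated with a short sketch. It assumes the records are already in memory as 16-bit integers, stored one record after another with one sample per channel. That in-memory layout is an assumption made for this example, and extract_channel and duration_seconds are hypothetical helpers, not libcwav routines.

```c
#include <assert.h>
#include <stddef.h>

/* Copy one channel out of a buffer of multi-sample records.
   records   - n_records * n_per_rec samples, record after record
   n_per_rec - channels per record (compare WDB.n_per_rec)
   channel   - channel index in the range 0 .. n_per_rec-1
   Returns the number of samples written to out, or 0 if the channel
   index is out of range. */
size_t extract_channel(const short *records, size_t n_records,
                       int n_per_rec, int channel, short *out)
{
    assert(n_per_rec > 0);
    if (channel < 0 || channel >= n_per_rec) return 0;
    for (size_t r = 0; r < n_records; ++r) {
        out[r] = records[r * (size_t)n_per_rec + (size_t)channel];
    }
    return n_records;
}

/* For multi-channel data the sample rate counts records per second,
   so the duration depends on the record count, not on the total
   number of samples. */
double duration_seconds(size_t n_records, unsigned short sample_rate)
{
    assert(sample_rate > 0);
    return (double)n_records / (double)sample_rate;
}
```

For stereo data with the records {10, -10}, {20, -20}, {30, -30}, extracting channel 1 under these assumptions yields -10, -20, -30.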
The WDB structure is set by the open waveform function opnwav
when existing files are opened, or initialized by an application program
and passed to opnwav when a new file is to be created. The
elements of a WDB are:
WDB.format - Binary format of the data in the file. Possible values:
    O8_format - 8-bit offset binary
    OB_format - 12-bit offset binary
    PC_format - 16-bit twos complement
    TC_format - 12-bit twos complement
    GT_format - Generic twos complement (bits per sample in WDB.smplbits)
    GO_format - Generic offset binary (bits per sample in WDB.smplbits)
WDB.file_type - Storage format of the waveform file. Possible values:
    RIFF - A waveform file using the standard IBM/Microsoft Resource
        Interchange File Format (RIFF)
    ASEL - A waveform file using the ASEL internal storage format
WDB.smplbits - Number of bits per sample (1-16). Only needed for the
    formats GT_format and GO_format.
WDB.sample_rate - The sample rate in Hz.
WDB.n_segments - Number of segments defined in the file (only applicable
    to existing waveforms). See wavgst for a description of segment tables.
WDB.n_per_rec - Total channels of data (i.e., samples) per record.
WDB.scale - Together with WDB.bias, allows linear scaling from integer
    sample values to real-world coordinates. Sample values are intended
    to be transformed as:
        real_value = WDB.scale * sample_value + WDB.bias;
WDB.bias - The intercept for the linear transformation of sample values
    to real-world values.
WDB.error - One of the following constants is returned by opnwav if an
    error occurs in opening a file:
    NO_FILE - File could not be found or created
    INVALID_FORMAT - Format not one of those defined
    NO_SEGMENT - Segment not found in file
    INVALID_CHANNEL - Channel number out of range for the channels per record
    INSUF_MEMORY - Memory allocation failure
    INSUF_BUFFER - Predefined buffer too small for record
    INVALID_FILE - File is in neither RIFF nor ASEL format
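The sample formats and the scale/bias transformation listed above can be sketched in C as follows. This is an illustration under stated assumptions, not libcwav code: the three function names are hypothetical, and the conversions use the conventional definitions of offset binary (stored value = signed value + 2^(bits-1)) and twos complement. How libcwav packs samples in a file, and whether it applies such conversions before scaling, is not stated here.

```c
#include <assert.h>

/* Conventional offset binary: the stored value is the signed value
   plus half of the code range, so subtracting 2^(bits-1) recovers it.
   (Assumption: raw holds the sample in its low `bits` bits.) */
int offset_binary_to_signed(unsigned int raw, int bits)
{
    assert(bits >= 1 && bits <= 16);   /* range given for WDB.smplbits */
    return (int)raw - (1 << (bits - 1));
}

/* Conventional twos complement with `bits` bits: sign-extend the
   low `bits` bits of raw into a full int. */
int twos_complement_to_signed(unsigned int raw, int bits)
{
    unsigned int sign, mask;
    assert(bits >= 1 && bits <= 16);
    sign = 1u << (bits - 1);
    mask = (sign << 1) - 1u;
    raw &= mask;
    return (int)(raw ^ sign) - (int)sign;
}

/* The linear transformation from the WDB.scale entry:
   real_value = WDB.scale * sample_value + WDB.bias */
double to_real_value(int sample_value, float scale, float bias)
{
    return (double)scale * sample_value + (double)bias;
}
```

Under these definitions a 12-bit offset binary value of 2048 and a 12-bit twos complement value of 0 both represent zero, and a scale of 0.5 with a bias of 1.0 maps the sample value 100 to 51.0.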
SEE ALSO
Libcwav functions
AUTHOR
Shirley Peters, H.T.Bunnell