DALIGN
dalign - Align a waveform template to the head or tail of another waveform.
SYNOPSIS
dalign [-echo] [-s:short] [-l:long] [-ic:step] [-end|-beg] [-name:sname] [-nsd:noisefloor] template target
DESCRIPTION
dalign matches a linearly time-scaled waveform template to the
head or tail of another waveform file. Leading or trailing silence is
ignored in the target waveform. The template is first compressed in time by
the factor -s:short (a value < 1.0) and compared to the target speech. The
scale factor is then increased in increments of -ic:step, up to the factor
-l:long (a value > 1.0), with the comparison repeated at each step. The time
scale at which the closest match between the template and target waveform is
achieved is used to locate the inner boundary of the corresponding region in
the target waveform. The region from the start of the target waveform to the
inner boundary of the matching region (for -beg templates), or from the inner
boundary to the end of the target waveform (for -end templates), is then
labeled using the -name:sname string value.
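The scale-factor search described above can be sketched in Python. This is only an illustration under stated assumptions, not dalign's actual implementation: frames are reduced to single numbers, the time scaling is nearest-frame resampling, and the distance is a mean absolute difference. The function names scale_factors, stretch, and best_scale are hypothetical.

```python
def scale_factors(short=0.6, long=1.8, step=0.05):
    """Return the scale factors from short to long (inclusive) in steps of step."""
    count = int(round((long - short) / step))
    return [round(short + i * step, 10) for i in range(count + 1)]


def stretch(template, factor):
    """Linearly time-scale a list of frames to about len(template) * factor frames."""
    n_out = max(1, int(round(len(template) * factor)))
    # Map each output frame back to the nearest earlier input frame.
    return [template[min(len(template) - 1, int(i / factor))] for i in range(n_out)]


def best_scale(template, target, short=0.6, long=1.8, step=0.05):
    """Return (factor, distance, length) for the best fit to the head of target."""
    best = None
    for factor in scale_factors(short, long, step):
        scaled = stretch(template, factor)
        if len(scaled) > len(target):
            break  # the stretched template no longer fits inside the target
        # Assumed stand-in metric: mean absolute difference per frame.
        dist = sum(abs(a - b) for a, b in zip(scaled, target)) / len(scaled)
        if best is None or dist < best[1]:
            best = (factor, dist, len(scaled))
    return best
```

In this sketch the returned length plays the role of the inner boundary for a -beg match: frames before it would receive the segment label. With the default values the search visits 25 scale factors.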
This is useful, for example, to label the surrounding carrier context
in speech files composed of a carrier phrase preceding, following, or
surrounding a phrase of interest. Consider a carrier phrase like
"Please record <variable> two times" in which only the <variable> part
is actually wanted. dalign can locate the leading "please
record" or the trailing "two times" and label those portions of the
file, allowing the residual center section to be excised for further
study. To do this, one instance of the talker saying "please record"
must be saved as a template to match to the head of the file, and one
instance of "two times" must be saved as another template for matching
to the tail of the file. Then, to match the head of the file, a
command like:
dalign -beg headtmpl sentence
will align the head template (e.g., "please record") to the sentence,
and a command like:
dalign -end tailtmpl sentence
could be used to align a tail template (e.g., "two times") to the end
of the sentence. Note that when labeling many sentences this way, one
must hand label the first example. Thereafter, the hand-labeled
segments of the first sentence can be used as the templates for
subsequent sentences, as in the following sequence of commands:
edw sent1 <to hand label "head" and/or "tail" segments>
dalign -beg -name:head sent1\$head sent2
dalign -end -name:tail sent1\$tail sent2
dalign -beg -name:head sent1\$head sent3
etc.
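For a large set of sentences, the command sequence above could be generated rather than typed. The following Python sketch only builds the argument lists and does not run anything. The helper name dalign_commands and its parameters are hypothetical; the flags and the file$segment notation are the ones shown above.

```python
def dalign_commands(first, others, do_head=True, do_tail=True):
    """Build dalign argument lists that reuse segments hand-labeled in `first`."""
    commands = []
    for sentence in others:
        if do_head:
            # "first$head" names the hand-labeled head segment of the first file.
            commands.append(["dalign", "-beg", "-name:head", first + "$head", sentence])
        if do_tail:
            # "first$tail" names the hand-labeled tail segment of the first file.
            commands.append(["dalign", "-end", "-name:tail", first + "$tail", sentence])
    return commands
```

Each list can be handed to subprocess.run() as is. Because no shell is involved in that case, the $ in the template name does not need the backslash used in the shell examples.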
OPTIONS
- -echo
- Causes dalign to echo the filename, assigned segment, and a goodness
of fit metric in the form "file$seg - value". These data may be piped
to a log file and skimmed to locate instances in which the goodness of
fit was particularly poor, since those are more likely to be cases
where boundaries were incorrectly assigned. This flag is off by default.
- -s:short
- Scaling factor for the shortest acceptable template match. The
duration of the template is scaled by this factor for the start of the
search for the best match. Default is 0.6.
- -l:long
- Scaling factor for the longest acceptable template match. The
duration of the template is scaled by this factor at the end of the
search for the best match. Default is 1.8.
- -ic:step
- The increment used to step through the scaling factors from shortest
to longest. Default is 0.05.
- -end|-beg
- A flag to indicate whether the template is to be matched to the
beginning or to the end of the target waveform. Only one of these may
be specified at a time. The default is -beg.
- -name:sname
- The name to be assigned to the segment associated with the template
match region in the target file. If no -name is given, dalign will use
-name:head for -beg matches and -name:tail for -end matches.
- -nsd:noisefloor
- To trim silence from the beginning and end of the target file, dalign
computes the standard deviation of the spectral parameter vectors used
for the matching process. It then locates the minimum spectral
parameter frame within the file, assumes that is a silent frame, and
searches for the first (for -beg) or last (for -end) frame in the file
for which the parameter vector is more than <noisefloor> standard
deviations away from the minimum parameter vector. The default is 1.0.
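The silence trimming controlled by -nsd:noisefloor can be illustrated with a short Python sketch. The description above leaves several details open, so the choices here are assumptions made for illustration only: the spread is taken as the RMS Euclidean distance of frames from the mean frame, and the "minimum" frame is taken as the frame with the smallest vector norm. The names distance and first_nonsilent are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two parameter vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def first_nonsilent(frames, noisefloor=1.0, from_end=False):
    """Return the index of the first (or last) frame judged to be non-silent."""
    n = len(frames)
    dims = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    # Assumed spread measure: RMS distance of the frames from the mean frame.
    sd = math.sqrt(sum(distance(f, mean) ** 2 for f in frames) / n)
    # Assumed "minimum frame": the frame with the smallest vector norm.
    quiet = min(frames, key=lambda f: distance(f, [0.0] * dims))
    # Scan forward for -beg style trimming, backward for -end style trimming.
    order = range(n - 1, -1, -1) if from_end else range(n)
    for i in order:
        if distance(frames[i], quiet) > noisefloor * sd:
            return i
    return None  # no frame exceeds the threshold
```

In this sketch a larger noisefloor value raises the threshold, so more low-level frames at the edges of the file are treated as silence.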
SEE ALSO
align (1)
NOTES
If they do not exist in the CWD, dalign creates .zcp files for the
template and target files using a Bark cepstrum analysis (see
zcep(1)). Unfortunately, these files are not compatible with other
.zcp files such as those produced by the zcep(1) program. Care must be
taken to see that any .zcp files present when dalign is run are
actually those produced by dalign, and the files should be deleted
once dalign has successfully been used.
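One way to follow this advice is to record which .zcp files exist before running dalign and afterwards remove only the ones that appeared in the meantime. The Python sketch below assumes the files sit in the working directory with a .zcp extension, as described above; the helper names zcp_files and remove_new_zcp are hypothetical.

```python
from pathlib import Path


def zcp_files(directory="."):
    """Return the set of .zcp file names currently present in directory."""
    return {p.name for p in Path(directory).glob("*.zcp")}


def remove_new_zcp(before, directory="."):
    """Delete .zcp files that appeared since the `before` snapshot was taken."""
    removed = sorted(zcp_files(directory) - set(before))
    for name in removed:
        (Path(directory) / name).unlink()
    return removed
```

If the snapshot taken before the run is not empty, those files deserve a check first, since they may have been written by zcep(1) rather than by dalign.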
AUTHORS
H.T.Bunnell, D. Yarrington