dalign - Align a waveform template to the head or tail of another waveform.


dalign [-echo] [-s:short] [-l:long] [-ic:step] [-end|-beg] [-name:sname] [-nsd:noisefloor] template target


dalign matches a linearly time-scaled waveform template to the head or tail of another waveform file. Leading or trailing silence in the target waveform is ignored. The template is first compressed in time by the factor -s:short (a value < 1.0) and compared to the target speech; the comparison is then repeated at successively longer durations, increasing the scale factor by -ic:step each time, up to the factor -l:long (a value > 1.0). The time scale that gives the closest match between the template and the target waveform is used to locate the inner boundary of the corresponding region in the target waveform. The region from the start of the target waveform to that inner boundary (for -beg templates), or from that inner boundary to the end of the target waveform (for -end templates), is then labeled with the -name:sname string value.
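The search over time scales described above can be sketched as follows. This is a minimal illustration under stated assumptions, not dalign's actual code: frames are plain lists of floats standing in for spectral parameter vectors, the template is resampled by linear interpolation, and the match score is the mean Euclidean distance per frame. The function names are hypothetical, and the target is assumed to have had its silence removed already.

```python
import math


def scale_factors(short=0.6, long=1.8, step=0.05):
    """Return the scale factors from short up to long in increments of step."""
    count = int(math.floor((long - short) / step + 1e-9))
    return [round(short + i * step, 10) for i in range(count + 1)]


def resample(frames, n_out):
    """Linearly interpolate a list of frame vectors to n_out frames."""
    if n_out == 1 or len(frames) == 1:
        return [list(frames[0]) for _ in range(n_out)]
    out = []
    for i in range(n_out):
        pos = i * (len(frames) - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(frames) - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


def best_scale(template, target, short=0.6, long=1.8, step=0.05, from_end=False):
    """Return (scale, boundary_frame, distance) for the closest match, or None."""
    best = None
    for scale in scale_factors(short, long, step):
        n = max(1, int(round(scale * len(template))))
        if n > len(target):
            break  # the scaled template no longer fits inside the target
        region = target[-n:] if from_end else target[:n]
        scaled = resample(template, n)
        # Assumed score: mean Euclidean distance between corresponding frames.
        dist = sum(math.dist(a, b) for a, b in zip(scaled, region)) / n
        if best is None or dist < best[2]:
            boundary = len(target) - n if from_end else n
            best = (scale, boundary, dist)
    return best
```

The returned boundary frame marks the inner edge of the matched region: frames before it belong to the head match, or frames from it onward belong to the tail match when from_end is set.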

This is useful, for example, for labeling the carrier context in speech files composed of a carrier phrase preceding, following, or surrounding a phrase of interest. Consider a carrier phrase like "Please record <variable> two times" in which only the <variable> part is actually wanted. dalign can locate the leading "please record" or the trailing "two times" and label those portions of the file, allowing the remaining center section to be excised for further study. To do this, one instance of the talker saying "please record" must be saved as a template to match to the head of the file, and one instance of "two times" must be saved as another template to match to the tail of the file. A dalign command with the -beg option then aligns the head template (e.g., "please record") to the start of the sentence, and a command with the -end option aligns the tail template (e.g., "two times") to the end of the sentence. Note that when labeling many sentences this way, the first example must be labeled by hand; thereafter, the hand-labeled segments of the first sentence can be used as the templates for subsequent sentences.
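As an illustration of the workflow just described, the sketch below builds dalign command lines for a batch of sentences. The option spellings come from the synopsis; the file names (head.wav, tail.wav, sent02.wav, sent03.wav), the segment names (carrier1, carrier2), and the helper dalign_cmd are hypothetical and stand in for whatever the original examples used.

```python
def dalign_cmd(template, target, end=False, name=None, echo=False):
    """Build a dalign argument list using the option spellings from the synopsis."""
    args = ["dalign"]
    if echo:
        args.append("-echo")
    args.append("-end" if end else "-beg")
    if name is not None:
        args.append("-name:" + name)
    args += [template, target]
    return args


# Hypothetical file names: templates cut from a hand-labeled first sentence,
# applied to the remaining sentences of the same talker.
targets = ["sent02.wav", "sent03.wav"]
commands = []
for target in targets:
    commands.append(dalign_cmd("head.wav", target, name="carrier1"))
    commands.append(dalign_cmd("tail.wav", target, end=True, name="carrier2"))

for cmd in commands:
    print(" ".join(cmd))
```

Each list could be passed to subprocess.run to execute the command; here the commands are only printed so the sketch runs without dalign installed.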


-echo
    Causes dalign to echo the filename, assigned segment, and a goodness-of-fit metric in the form "file$seg - value". These data may be piped to a log file and skimmed to locate instances in which the goodness of fit was particularly poor, since those are more likely to be cases where boundaries were incorrectly assigned. This flag is off by default.

-s:short
    Scaling factor for the shortest acceptable template match. The duration of the template is scaled by this factor at the start of the search for the best match. Default is 0.6.

-l:long
    Scaling factor for the longest acceptable template match. The duration of the template is scaled by this factor at the end of the search for the best match. Default is 1.8.

-ic:step
    The increment used to step through the scaling factors from shortest to longest. Default is 0.05.

-end|-beg
    A flag to indicate whether the template is to be matched to the beginning or to the end of the target waveform. Only one of these may be specified at a time. The default is -beg.

-name:sname
    The name to be assigned to the segment associated with the template match region in the target file. If no -name is given, dalign will use -name:head for -beg matches and -name:tail for -end matches.

-nsd:noisefloor
    To trim silence from the beginning and end of the target file, dalign computes the standard deviation of the spectral parameter vectors used for the matching process. It then locates the minimum spectral parameter frame within the file, assumes that it is a silent frame, and searches for the first (for -beg) or last (for -end) frame in the file whose parameter vector is more than <noisefloor> standard deviations away from the minimum parameter vector. The default is 1.0.
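The silence-trimming rule can be sketched roughly as below. The description leaves several details open, so the sketch makes assumptions that are not confirmed by the source: the standard deviation is taken as a single pooled value (the RMS Euclidean distance of frames from the mean vector), the "minimum" frame is the one with the smallest first coefficient, and "away from" is measured as Euclidean distance. The function name trim_index is hypothetical.

```python
import math


def trim_index(frames, noisefloor=1.0, from_end=False):
    """Return the index of the first (or last) frame judged non-silent, or None."""
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    # Assumption: one pooled SD, the RMS distance of frames from the mean vector.
    sd = math.sqrt(sum(math.dist(f, mean) ** 2 for f in frames) / len(frames))
    # Assumption: the "minimum" frame is the one with the smallest first coefficient.
    quiet = min(frames, key=lambda f: f[0])
    order = range(len(frames) - 1, -1, -1) if from_end else range(len(frames))
    for i in order:
        if math.dist(frames[i], quiet) > noisefloor * sd:
            return i
    return None  # every frame lies within the noise floor


# Toy data: three silent frames, four speech-like frames, two near-silent frames.
frames = [[0.0, 0.0]] * 3 + [[5.0, 1.0]] * 4 + [[0.1, 0.0]] * 2
first_speech = trim_index(frames)
last_speech = trim_index(frames, from_end=True)
```

Raising the noisefloor value makes the test stricter, so more frames at the edges are treated as silence.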


align(1)


If they do not already exist in the current working directory, dalign creates .zcp files for the template and target files using a Bark cepstrum analysis (see zcep(1)). Unfortunately, these files are not compatible with other .zcp files, such as those produced by the zcep(1) program. Care must be taken that any .zcp files present when dalign is run are actually those produced by dalign, and the files should be deleted once dalign has been used successfully.
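One way to follow this advice is to record which .zcp files exist before a run and delete only the new ones afterwards. The sketch below is a hypothetical helper, not part of dalign; it assumes the .zcp files sit directly in the given directory and makes no assumption about how dalign names them.

```python
import glob
import os


def existing_zcp_files(directory="."):
    """List .zcp files already present; dalign would use these instead of creating new ones."""
    return sorted(glob.glob(os.path.join(directory, "*.zcp")))


def remove_new_zcp_files(before, directory="."):
    """Delete .zcp files that appeared since the list `before` was recorded."""
    removed = []
    for path in existing_zcp_files(directory):
        if path not in before:
            os.remove(path)
            removed.append(path)
    return removed
```

A non-empty result from existing_zcp_files before the first run is a cue to check whether those files came from dalign or from another program such as zcep(1).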


H.T.Bunnell, D. Yarrington