Evaluation Specification
Topics
Relevance Judgments
Test corpus of recorded speech
The test corpus is a set of Broadcast News recordings
and transcriptions that must be licensed from the Linguistic Data Consortium
and are subject to usage restrictions.
SOFTWARE
The following Perl 5 scripts may be used to manipulate the SDR textual data:
ctm2srt.pl  Filter to create a Speech Recognizer Transcript (.srt) format file from a SCLITE Speech Recognition Scoring (.ctm) format file.
srt2ctm.pl  Filter to create a SCLITE Speech Recognition Scoring (.ctm) format file from a Speech Recognizer Transcript (.srt) format file.
srt2ltt.pl  Filter to create a Lexical TREC Transcript (.ltt) format file from a Speech Recognizer Transcript (.srt) format file.
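For readers unfamiliar with the .ctm side of these conversions, the following is a minimal sketch of reading one SCLITE .ctm record. It is written in Python for illustration and is not taken from the Perl scripts listed above. It assumes the usual whitespace-separated .ctm layout (file id, channel, start time, duration, word, optional confidence) with ";;" marking comment lines; the names CtmWord and parse_ctm_line are hypothetical.

```python
from typing import NamedTuple, Optional


class CtmWord(NamedTuple):
    """One time-marked word from a .ctm file (hypothetical helper type)."""
    file_id: str
    channel: str
    start: float       # start time in seconds
    duration: float    # duration in seconds
    word: str
    confidence: Optional[float]


def parse_ctm_line(line: str) -> Optional[CtmWord]:
    """Parse one .ctm line; return None for blank lines and ';;' comments."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}: {line!r}")
    # The confidence score is optional in .ctm records.
    confidence = float(fields[5]) if len(fields) > 5 else None
    return CtmWord(fields[0], fields[1], float(fields[2]),
                   float(fields[3]), fields[4], confidence)
```

A converter to another time-marked format would typically derive each word's end time as start + duration. The exact layout of the .srt and .ltt files is defined by the Evaluation Specification and is not reproduced here.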