Minutes of June 2, 2006

Attendees:
  Jon Fiscus
  Yang Lui
  Bhuvana Ramabhadran - IBM
  Steve Lowe, Owen Kimball, Herb Gish
  Charles Wayne
  Chuck Wooters
  James - Dublin (Sorry I missed the name)
  Igor Szoke -
  Mazin Gilbert - AT&T

- Introductions
- Ask for new agenda items
- New issues
  - Log likelihood ratios
    - JF: explained the NIST position: We specified LLRs to simplify system processing, because asking for posterior probabilities would require knowledge of the prior probabilities. The reason for specifying the scores is to permit pooled DET curves to be produced.
    - HG: noted there are other possibilities for generating pooled DET curves and that the requirement will force scores across words to be commensurate.
    - JF reply: This is intended as a model of the application.
    - Action: Change the evalplan description to say a score is required, instead of an LLR, and to explicitly state how the scores will be used, e.g. DET pooling.
  - Probability of False Alarm formula
    - OK: noted that using recording durations to normalize the probability of False Alarms emphasizes the duration of the recording.
    - JF reply: That was intended.
- Facilitating collaboration, e.g., shared phone lattices
  - All agreed it would be good to share resources, as in the SDR model.
  - IS: willing to provide word/phone lattices for some of the material.
  - CW: will check to see if ICSI can share delayed-sum meeting data and/or lattices.
  - Action item: JF will solicit the community for volunteers to share data and post the info on the web site.
- Indexing the full recordings or only the UEM-specified data
  - JF: the subject was discussed at length. A wide range of difficulties surfaced with allowing people to use audio from recordings outside of the selected test material.
  - JF made a proposal to not permit processing the audio outside of the UEM-specified data and no one disagreed.
  - Action item: Modify eval plan to make the use of non-evaluated audio illegal.
- Complex term specifications
  - IS described the question posed in email (5/31/06 Question 2).
  - IS said we should defer this until next year's eval, if there is one.
  - No one objected.
- Open discussion
  - Processing all source types for a language
    - JF brought up the subject for comment.
    - HG noted the meeting data posed a considerable amount of work for them since they haven't been involved in meeting research.
    - OK noted that scoring over all source types, even when a site chooses not to address one of them, may discourage people from participating in a language.
    - CW noted we should optimize participation by not scoring over all source types.
    - Action item: eval plan will explicitly state that systems will not be scored over all source types but rather scored separately by source type. However, systems must still process all the data.
  - Single-word search terms
    - IS asked if "function words" would be potential search terms for single-word search terms. Further, will NIST provide a set of stop words?
    - It was noted that function words will be part of multiple-word search terms, e.g. "hounds of the baskerville".
    - JF: Did not think it was appropriate to formally specify the stop list.
    - The general consensus was to not publish a list.
  - OOV
    - IS asked how Out-Of-Vocabulary tokens would be handled: mandatory LVCSR lexicon?
    - General consensus was to not be restrictive on system development. Instead, add fields to the system output specifying the number of OOV words in the search term.
    - Action item: NIST will add the fields to the STDList file format.
- Next conference call:
  - After eval plan is published, we will determine if a call is needed.
- Other action items:
  - Action item: NIST will add a matrix of source material times for the devset and the evalset.
- Missed topics:
  - Diacritized Arabic: we did not discuss this issue at all.
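
Illustration for the log likelihood ratio and Probability of False Alarm items: the sketch below is not from the call and is not NIST's scoring code. The first function shows why a posterior probability cannot be reported without a prior, which is the reason JF gave for asking for LLRs. The second shows one way a false alarm probability could be normalized by recording duration; it assumes non-target trials are counted as a fixed number of trials per second of speech, which is an assumption here and should be checked against the published eval plan. All function and parameter names are hypothetical.

```python
import math


def posterior_from_llr(llr: float, prior: float) -> float:
    """Convert a natural-log likelihood ratio into a posterior probability.

    The prior is the assumed probability that the term occurs; the minutes
    note that systems would need this value to report posteriors.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Bayes' rule in log-odds form: posterior log-odds = LLR + prior log-odds.
    log_odds = llr + math.log(prior / (1.0 - prior))
    # Numerically stable logistic function (avoids overflow in math.exp).
    if log_odds >= 0.0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)


def p_false_alarm(n_false_alarms: int, n_true: int, speech_seconds: float,
                  trials_per_second: float = 1.0) -> float:
    """Duration-normalized false alarm probability (assumed form).

    Assumption: the number of non-target trials is
    trials_per_second * speech_seconds - n_true.
    """
    non_target_trials = trials_per_second * speech_seconds - n_true
    if non_target_trials <= 0:
        raise ValueError("speech duration too short for the number of true occurrences")
    return n_false_alarms / non_target_trials


if __name__ == "__main__":
    # The same LLR yields different posteriors under different priors.
    for prior in (0.5, 0.01):
        print(prior, posterior_from_llr(0.0, prior))
    # Hypothetical counts: 3 false alarms, 10 true occurrences, 3610 s of speech.
    print(p_false_alarm(3, 10, 3610.0))
```

With an LLR of 0, the posterior simply equals whatever prior is assumed, so two sites using different priors would report different posteriors for identical evidence. Under the assumed false alarm normalization, the denominator grows with recording duration, which matches OK's observation that the formula emphasizes the duration of the recording.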