Things that still need to be done...


  1. IPR negotiations: 4 new Mandarin web sources, 4 new SCOLA sources

  2. Transcription: 19 hrs/wk English, almost 80 hrs/wk non-English
    -- times about 20 weeks = total of almost 2000 hours (!)

  3. Story boundary annotation: can only be done when text is available
    • Closed-captioning available for 26 hrs/wk of English video
    • ~125 hrs/wk across all 6 languages, times 20 weeks = 2500 hours
    • Don't start topic selection till this is done (!)
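
The hour totals in items 2 and 3 can be double-checked with a short script. This is only a rough sketch of the arithmetic: the variable and function names are made up for illustration, "almost 80" is rounded up to 80, and "about 20 weeks" is taken as exactly 20, so the results are upper-end estimates rather than exact figures.

```python
# Rough check of the workload estimates in items 2 and 3.
# All names are illustrative; the inputs are the rounded slide figures.

WEEKS = 20  # "about 20 weeks", assumed to be exactly 20 here

# Item 2: transcription hours per week ("almost 80" rounded up to 80)
transcription_hrs_per_wk = {"English": 19, "non-English": 80}

# Item 3: story boundary annotation, all 6 languages combined ("~125")
boundary_hrs_per_wk = 125


def total_hours(hours_per_week, weeks=WEEKS):
    """Scale a weekly figure up to the whole collection period."""
    return hours_per_week * weeks


transcription_total = total_hours(sum(transcription_hrs_per_wk.values()))
boundary_total = total_hours(boundary_hrs_per_wk)

print("Transcription total (hrs):", transcription_total)
print("Boundary annotation total (hrs):", boundary_total)
```

With these rounded inputs the transcription total comes out just under 2000 hours and the boundary annotation total at 2500 hours, which agrees with the figures quoted in the list.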



graff@ldc.upenn.edu