TREC 2002 Video Track Runs (cont.)
Feature extraction
ID                Priority  Affiliation
CMU.r1            1         Carnegie Mellon Univ.
CMU.r2            2
CMU.r3            3
CLIPS-LIT-GEOD    1         CLIPS_IMAG Lab
CLIPS-LIT-LIMSU   2
DCUFE2002         1         Dublin City Univ.
Eurecom1          1         Eurecom
Fudan_FE_Sys1     1         Fudan University
Fudan_FE_Sys2     2
IBM-1             1         IBM Research
IBM-2             2
MediaMill1        1
MediaMill2        2
MSRA              1         Microsoft Research Asia
TZI_univ_bremen   1         Univ. of Bremen
UMD1              1         Univ. of Maryland
UnivO_MT1         1         Univ. of Oulu
UnivO_MT2         2
Feature definitions
If the feature is true for some frame (sequence) within the shot, then
it is true for the shot.
- Outdoors: Segment contains a recognizably outdoor location, i.e., one
  outside of buildings. Should exclude all scenes that are indoors or
  are close-ups of objects (even if the objects are outdoors).
- Indoors: Segment contains a recognizably indoor location, i.e., inside
  a building. Should exclude all scenes that are outdoors or are
  close-ups of objects (even if the objects are indoors).
- Face: Segment contains at least one human face with the nose, mouth,
  and both eyes visible. Pictures of a face meeting the above conditions
  count.
- People: Segment contains a group of two or more humans, each of whom
  is at least partially visible and is recognizable as a human.
- Cityscape: Segment contains a recognizably city/urban/suburban
  setting.
- Landscape: Segment contains a predominantly natural inland setting,
  i.e., one with little or no evidence of development by humans. For
  example, scenes consisting mostly of plowed/planted fields, pastures,
  or orchards would be excluded. Some buildings, if small features on
  the overall landscape, should be OK. Scenes with bodies of water that
  are clearly inland may be included.
- Text Overlay: Segment contains superimposed text large enough to be
  read.
- Speech: A human voice uttering words is recognizable as such in this
  segment.
- Instrumental Sound: Sound produced by one or more musical instruments
  is recognizable as such in this segment. Percussion instruments are
  included.
- Monologue: Segment contains an event in which a single person is at
  least partially visible and speaks for a long time without
  interruption by another speaker. Pauses are OK if short.
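
The shot-level rule stated at the top of these definitions (a feature is true for the shot if it is true for some frame or frame sequence within it) amounts to a logical OR over per-frame decisions. The following is a minimal Python sketch of that rule only. The function name, the shot IDs, and the per-frame values are hypothetical illustrations, not track data or any participant's method.

```python
from typing import Dict, Iterable, List


def shot_has_feature(frame_decisions: Iterable[bool]) -> bool:
    """Return True if the feature holds for at least one frame of the shot."""
    return any(frame_decisions)


# Hypothetical per-frame detector outputs for three shots (assumed values).
shots: Dict[str, List[bool]] = {
    "shot_1": [False, False, True, False],  # feature seen in one frame
    "shot_2": [False, False, False],        # feature never seen
    "shot_3": [],                           # no frames judged
}

# Apply the rule to every shot.
shot_labels = {
    shot_id: shot_has_feature(frames) for shot_id, frames in shots.items()
}
```

Under this reading a single positive frame is enough to mark the whole shot, and a shot with no judged frames is treated as negative. The second point is a choice made for this sketch, not something the definitions specify.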