Call to TREC 2008


CALL FOR PARTICIPATION

TEXT RETRIEVAL CONFERENCE (TREC)


February 2008 - November 2008


Conducted by:
National Institute of Standards and Technology (NIST)

With support from:
Intelligence Advanced Research Projects Activity (IARPA)


The Text Retrieval Conference (TREC) workshop series encourages research in information retrieval and related applications by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. Now in its seventeenth year, the conference has become the major experimental effort in the field. Participants in the previous TREC conferences have examined a wide variety of retrieval techniques and retrieval environments, including cross-language retrieval, retrieval of web documents, multimedia retrieval, and question answering. Details about TREC can be found at the TREC web site, http://trec.nist.gov.

You are invited to participate in TREC 2008. TREC 2008 will consist of a set of tasks known as "tracks". Each track focuses on a particular subproblem or variant of the retrieval task as described below. Organizations may choose to participate in any or all of the tracks. For most tracks, training and test materials are available from NIST; a few tracks will use special collections that are available from other organizations for a fee.

Dissemination of TREC work and results other than in the (publicly available) conference proceedings is welcomed, but the conditions of participation specifically preclude any advertising claims based on TREC results. All retrieval results submitted to NIST are published in the Proceedings and are archived on the TREC web site. The workshop in November is open only to participating groups that submit retrieval results for at least one track and to selected government personnel from sponsoring agencies.

Schedule:

By February 21
submit application described below to NIST. Returning an application will add you to the active participants' mailing list. On Feb 25, NIST will announce a new password for the "active participants" portion of the TREC web site. Included in this portion of the web site is information regarding the permission forms needed to obtain the TREC document disks.

Beginning March 1
document disks used in some existing TREC collections distributed to participants who have returned the required forms. Please note that no disks will be shipped before March 1.

mid-July--mid-August
results submission deadline for most tracks. (The deadline may need to be even earlier for some tracks, depending on assessor availability. Specific deadlines for each track will be included in the track guidelines, which will be finalized in the spring.)

September 9 (estimated)
speaker proposals due at NIST.

September 30 (estimated)
relevance judgments and individual evaluation scores due back to participants.

Nov 18-21
TREC 2008 conference at NIST in Gaithersburg, Md., USA

Task Description

Below is a brief summary of the tasks. Complete descriptions of tasks performed in previous years are included in the Overview papers in each of the TREC proceedings (in the Publications section of the web site).

The exact definition of the tasks to be performed in each track for TREC 2008 is still being formulated. Track discussion takes place on the track mailing list. To be added to a track mailing list, follow the instructions for contacting that mailing list as given below. For questions about the track, send mail to the track coordinator (or post the question to the track mailing list once you join).

TREC 2008 will have one new track and four continuing tracks. The new track is the relevance feedback track, a track that will systematically explore the factors that affect relevance feedback behavior. The blog, enterprise, legal, and million query tracks will continue in TREC 2008, though the specific tasks in a track may differ from year to year. (Note that the QA track has been moved to the new Text Analysis Conference (TAC); the call for participation in TAC will be sent to the TREC friends mailing list.)

Blog Track

The purpose of the blog track is to explore information seeking behavior in the blogosphere.
Track coordinators: Iadh Ounis, Craig Macdonald, Ian Soboroff, trecblog-organisers (at) dcs.gla.ac.uk
Track web page: http://ir.dcs.gla.ac.uk/wiki/TREC-BLOG
Mailing list: send a mail message to listproc@nist.gov such that the body consists of the line subscribe trec-blog <FirstName> <LastName>

Enterprise Track

The purpose of the enterprise track is to study enterprise search: satisfying a user who is searching the data of an organization to complete some task.
Track coordinators: Nick Craswell, nickcr (at) microsoft.com
   Ian Soboroff, ian.soboroff (at) nist.gov
    Arjen de Vries, arjen (at) acm.org
Track web page: http://www.ins.cwi.nl/projects/trec-ent
Mailing list: send a mail message to listproc@nist.gov such that the body consists of the line subscribe trec-ent <FirstName> <LastName>

Legal Track

The goal of the legal track is to develop search technology that meets the needs of lawyers to engage in effective discovery in digital document collections.
Track coordinators: Jason Baron, jason.baron (at) nara.gov
   Doug Oard, oard (at) umd.edu
Track web page: http://trec-legal.umiacs.umd.edu
Mailing list: contact oard (at) umd.edu to be added to the list.

Million Query Track

The goal of the "million query" track is to test the hypothesis that a test collection built from very many very incompletely judged topics is a better tool than a collection built using traditional TREC pooling.
Track web page: http://ciir.cs.umass.edu/research/million
Track coordinators: James Allan, allan (at) cs.umass.edu
   Jay Aslam, jaa (at) ccs.neu.edu
Mailing list: Follow the instructions given on the track web page to join the email list
   million (at) cs.umass.edu

Relevance Feedback Track

The goal of the relevance feedback track is to provide a framework for exploring the effects of different factors on the success of relevance feedback.
Track coordinators: Chris Buckley, cabuckley (at) sabir.com
   Steve Robertson, ser (at) microsoft.com
Track web page: http://groups.google.com/group/trec-relfeed
Mailing list: Follow the instructions given on the track web page to join the email list

Conference Format

The conference itself will be used as a forum both for presentation of results (including failure analyses and system comparisons), and for more lengthy system presentations describing retrieval techniques used, experiments run using the data, and other issues of interest to researchers in information retrieval. As there is a limited amount of time for these presentations, the TREC program committee will determine which groups are asked to speak and which groups will present in a poster session. Groups that are interested in having a speaking slot during the workshop should submit a 200-300 word abstract in September describing the experiments they performed. The program committee will use these abstracts to select speakers.

As some organizations may not wish to describe their proprietary algorithms, TREC defines two categories of participation.

Category A: Full participation
Participants will be expected to present full details of system algorithms and various experiments run using the data, either in a talk or in a poster session.

Category C: Evaluation only
Participants in this category will be expected to submit results for common scoring and tabulation, and present their results in a poster session. They will not be expected to describe their systems in minute detail, but will be expected to describe the general approach and report on time and effort statistics.

Data

Many of the existing TREC English collections (documents, topics, and relevance judgments) are available for training purposes and may also be used in some of the tracks. Parts of the training collection (Disks 1-3) were assembled from Linguistic Data Consortium (LDC) text, and a signed User Agreement will be required from all participants. The documents are an assorted collection of newspapers, newswires, journals, and technical abstracts. The LDC has collected a more recent set of newswire material called the AQUAINT collection (Disks 6&7); this collection will also be available to TREC participants but is covered by a separate User Agreement. A third Agreement is needed for the remaining disks (4-5).

All documents are typical of those seen in a real-world situation (i.e. there will not be arcane vocabulary, but there may be missing pieces of text or typographical errors). For most tracks, the relevance judgments against which each system's output will be scored will be made by experienced relevance assessors based on the output of all TREC participants using a pooled relevance methodology. See the Overview paper in the TREC-8 proceedings (on the TREC web site) for a detailed discussion of pooling.
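
As a rough illustration of the general idea behind pooling (not NIST's actual procedure), the sketch below forms, for each topic, the union of the top-ranked documents from every submitted run; assessors would then judge only the documents in that pool. The pool depth, the in-memory run layout, and the name build_pools are assumptions made for this example.

```python
from typing import Dict, List, Set

# Hypothetical layout: topic id -> ranked list of document ids for one run.
Run = Dict[str, List[str]]


def build_pools(runs: Dict[str, Run], depth: int) -> Dict[str, Set[str]]:
    """Return, per topic, the union of the top-`depth` documents of every run."""
    pools: Dict[str, Set[str]] = {}
    for run in runs.values():
        for topic, ranking in run.items():
            # Only the first `depth` documents of each ranking enter the pool.
            pools.setdefault(topic, set()).update(ranking[:depth])
    return pools


# Made-up runs from two groups for a single topic.
runs = {
    "groupA": {"401": ["d1", "d2", "d3"]},
    "groupB": {"401": ["d2", "d4", "d5"]},
}
pools = build_pools(runs, depth=2)
print(sorted(pools["401"]))  # ['d1', 'd2', 'd4']
```

Documents that no run ranks within the chosen depth (d3 and d5 here) never enter the pool and so are never shown to an assessor.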

Response format and submission details

Organizations wishing to participate in TREC 2008 should respond to this call for participation by submitting an application. IMPORTANT NOTE: Participants in previous TRECs who will participate in TREC 2008 must submit a new application.

An application consists of the following five parts:

1.  Contact information
  • The full name of the main contact person from your organization.
  • The full name of your organization. If you are not participating as a member of an organization, please specify "self". If you know there is another group from your organization that will also participate in TREC 2008 (for example, two groups from the same university), please give enough qualification in the organization name to distinguish the different groups.
  • An organization/team name (up to 20 characters). This will be used as a unique identifier for your group. You will need to use this identifier on correspondence to NIST (when requesting data or sending permission forms, for example), so remember it. This identifier will also be used to tag your runs when you submit them.
  • The complete physical mailing address of the organization, sufficient for mail sent to that address to be accepted by the post office.
  • Fully qualified phone and fax numbers for the main contact person.
  • Fully qualified, valid email address for the main contact person.
  • Exactly ONE fully qualified, valid email address to use for the TREC 2008 participants' mailing list. Because of the overhead involved in maintaining the mailing list of active participants, only one email address per participating group will be added to the TREC 2008 participants' list. We strongly encourage the use of a local mailing list at your institution that distributes TREC mail internally to project participants so that all involved see the mail sent to this list. TREC is run solely by email, so it is important that this address be valid and that mail sent to the address is read in a timely fashion. You may use the email address of the main contact person as the address for the mailing list, but please give it twice in the application so we are sure of your intentions.

2.  Whether you have participated in TREC before. If so, please indicate the years you participated; otherwise, indicate that you are new to TREC.

3.  A one-paragraph description of your retrieval approach.

4.  Whether you will participate as a Category A or a Category C group.

5.  A list of tracks that you are likely to participate in. This is non-binding, but is helpful to know for planning.


There is no application form as such; a simple text file consisting of this information by number is the application. Please respond using only simple text (i.e., no pdf, word, rtf, postscript, etc. files). We will not process your application to participate in TREC 2008 unless it is complete.
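
Since the application is just numbered plain text, one possible way to assemble it is sketched below. This is only an illustration: the function name build_application, the field labels, and all sample values are invented for this example, and NIST does not prescribe any particular layout beyond giving the information by number.

```python
def build_application(contact, history, approach, category, tracks):
    """Format the five application parts as numbered plain text."""
    # The call limits the organization/team name to 20 characters.
    if len(contact["team_name"]) > 20:
        raise ValueError("organization/team name is limited to 20 characters")
    if category not in ("A", "C"):
        raise ValueError("category must be 'A' or 'C'")
    contact_block = "\n".join(
        f"   {label}: {value}" for label, value in contact.items()
    )
    parts = [
        "Contact information\n" + contact_block,
        history,
        approach,
        f"Category {category}",
        ", ".join(tracks),
    ]
    return "\n\n".join(
        f"{number}. {text}" for number, text in enumerate(parts, start=1)
    )


# Made-up sample values, for illustration only.
application = build_application(
    contact={
        "contact_name": "Jane Doe",
        "organization": "Example University, IR Group",
        "team_name": "ExampleIR",
        "mail_address": "1 Example Road, Exampleville, EX 00000, USA",
        "phone": "+1 555 0100",
        "fax": "+1 555 0101",
        "contact_email": "jane.doe@example.org",
        "list_email": "trec-local@example.org",
    },
    history="New to TREC.",
    approach="A one-paragraph description of the retrieval approach goes here.",
    category="A",
    tracks=["blog", "relevance feedback"],
)
print(application)
```

The resulting string can be pasted into the body of a plain-text email, which keeps the application free of pdf, word, or other attachments.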

All responses should be mailed to Lori Buckland, lori.buckland (at) nist.gov. Any questions about conference participation, response format, etc. should be sent to the general TREC email address, trec@nist.gov.




Last updated: Thursday, 17-Jan-08 10:40:09
Date created: Tuesday, December 18, 2007
trec@nist.gov