Text REtrieval Conference (TREC)
System Description

Organization Name: Macquarie University
Run ID: afrun1
Section 1.0 System Summary and Timing
Section 1.1 System Information
Hardware Model Used for TREC Experiment: PC Intel4 2.8GHz
System Use: DEDICATED
Total Amount of Hard Disk Storage: 80 Gb
Total Amount of RAM: 1 MB
Clock Rate of CPU: 2800 MHz
Section 1.2 System Comparisons
Amount of developmental "Software Engineering": SOME
List of features that are not present in the system, but would have been beneficial to have:
List of features that are present in the system, and impacted its performance, but are not detailed within this form:
Section 2.0 Construction of Indices, Knowledge Bases, and Other Data Structures
Length of the stopword list: words
Type of Stemming: NONE
Controlled Vocabulary:
Term weighting:
  • Additional Comments on term weighting:
Phrase discovery:
  • Kind of phrase:
  • Method used: OTHER
Type of Spelling Correction: NONE
Manually-Indexed Terms:
Proper Noun Identification:
Syntactic Parsing:
Tokenizer:
Word Sense Disambiguation:
Other technique:
Additional comments: Used the preselected documents provided by NIST
Section 3.0 Statistics on Data Structures Built from TREC Text
Section 3.1 First Data Structure
Structure Type: NONE
Type of other data structure used:
Brief description of method using other data structure:
Total storage used: Gb
Total computer time to build: hours
Automatic process:
Manual hours required: hours
Type of manual labor: NONE
Term positions used:
Only single terms used:
Concepts (vs. single terms) represented:
  • Number of concepts represented:
Type of representation:
Auxiliary files used:
  • Type of auxiliary files used:
Additional comments:
Section 3.2 Second Data Structure
Structure Type: NONE
Type of other data structure used:
Brief description of method using other data structure:
Total storage used: Gb
Total computer time to build: hours
Automatic process:
Manual hours required: hours
Type of manual labor: NONE
Term positions used:
Only single terms used:
Concepts (vs. single terms) represented:
  • Number of concepts represented:
Type of representation:
Auxiliary files used:
  • Type of auxiliary files used:
Additional comments:
Section 3.3 Third Data Structure
Structure Type: NONE
Type of other data structure used:
Brief description of method using other data structure:
Total storage used: Gb
Total computer time to build: hours
Automatic process:
Manual hours required: hours
Type of manual labor: NONE
Term positions used:
Only single terms used:
Concepts (vs. single terms) represented:
  • Number of concepts represented:
Type of representation:
Auxiliary files used:
  • Type of auxiliary files used:
Additional comments:
Section 4.0 Data Built from Sources Other than the Input Text
Internally-built Auxiliary File

File type: OTHER
Domain type: DOMAIN SHARED
Total Storage: 0.19 Gb
Number of Concepts Represented: 416 concepts
Type of representation: RULES
Automatic or Manual: AUTOMATIC
  • Total Time to Build: 3 hours
  • Total Time to Modify (if already built): hours
Type of Manual Labor used: NONE
Additional comments: Graph rules were learned automatically from a subset of 111 questions from the TREC 2004 QA track, extended with answer sentences in which the answers were manually annotated.
Externally-built Auxiliary File

File is: OTHER
Total Storage: 0.075 Gb
Number of Concepts Represented: 560 concepts
Type of representation: OTHER
Additional comments: A subset of 111 questions from the TREC 2004 QA track, extended with answer sentences in which the answers were manually annotated. In total there were 560 question/answer pairs.
Section 5.0 Computer Searching
Average computer time to search (per query): CPU seconds
Times broken down by component(s):
Section 5.1 Searching Methods
Vector space model: NO
Probabilistic model: NO
Cluster searching: NO
N-gram matching: NO
Boolean matching: NO
Fuzzy logic: NO
Free text scanning: NO
Neural networks: NO
Conceptual graphic matching: YES
Other: NO
Additional comments:
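
The form records only that conceptual graph matching was used and does not describe how. As a rough illustration of the general idea, and not the submitted system's method, the sketch below treats a graph as a set of labelled edges and scores a candidate sentence by the number of edges it shares with the question. The edge representation and the names `graph_overlap` and `best_sentence` are assumptions made for this example.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical representation: an edge is (source node, relation label, target node).
Edge = Tuple[str, str, str]
Graph = FrozenSet[Edge]


def graph_overlap(question: Graph, sentence: Graph) -> int:
    """Count the labelled edges that appear in both graphs."""
    return len(question & sentence)


def best_sentence(question: Graph, candidates: Dict[str, Graph]) -> str:
    """Return the id of the candidate sharing the most edges with the question."""
    return max(candidates, key=lambda sid: graph_overlap(question, candidates[sid]))


question = frozenset({
    ("discover", "subj", "who"),
    ("discover", "obj", "penicillin"),
})
candidates = {
    "s1": frozenset({
        ("discover", "subj", "fleming"),
        ("discover", "obj", "penicillin"),
    }),
    "s2": frozenset({("born", "subj", "fleming")}),
}
print(best_sentence(question, candidates))
```

Exact edge equality is the simplest possible overlap measure; a real system would likely need to handle partial node matches and the wh-node of the question separately.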
Section 5.2 Factors in Ranking
Term frequency: YES
Inverse document frequency: NO
Other term weights:
Semantic closeness: NO
Position in document: NO
Syntactic clues: YES
Proximity of terms:
Information theoretic weights: NO
Document length: NO
Percentage of query terms which match: YES
N-gram frequency: NO
Word specificity: NO
Word sense frequency: NO
Cluster distance: NO
Other: NO
Additional comments:
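
Two of the factors marked YES above, term frequency and the percentage of query terms which match, can be computed directly from token lists. The sketch below is an illustrative assumption about how such features might be derived, not a description of the submitted system; the function name `ranking_features` is hypothetical, and tokens are compared exactly because the form reports no stemming.

```python
from collections import Counter
from typing import Sequence, Tuple


def ranking_features(query_terms: Sequence[str],
                     sentence_tokens: Sequence[str]) -> Tuple[int, float]:
    """Return (term frequency, fraction of distinct query terms matched).

    Term frequency is the total number of occurrences of distinct query
    terms in the sentence. Matching is exact (no stemming).
    """
    counts = Counter(sentence_tokens)
    unique_query = set(query_terms)
    term_frequency = sum(counts[term] for term in unique_query)
    matched = sum(1 for term in unique_query if counts[term] > 0)
    fraction = matched / len(unique_query) if unique_query else 0.0
    return term_frequency, fraction


query = ["who", "discovered", "penicillin"]
sentence = ["fleming", "discovered", "penicillin", "and",
            "penicillin", "saved", "lives"]
tf, fraction = ranking_features(query, sentence)
print(tf, round(fraction, 2))
```

How the two values would be combined with syntactic clues into a single rank is not stated in the form, so the sketch returns them separately.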
Section 6.0 Query Construction
Section 6.1 Automatically Built Queries for Ad-hoc Tasks
Topic fields used:
Average computer time to build query: CPU seconds
Term weighting (weights based on terms in topics):
Phrase extraction from topics:
Syntactic parsing of topics:
Word sense disambiguation:
Proper noun identification algorithm:
Tokenizer:
  • Patterns which were tokenized:
Expansion of queries using previously constructed data structures:
  • Comment:
Automatic addition of: NONE
Section 6.2 Manually Constructed Queries for Ad-hoc Tasks
Topic fields used:
Average time to build query: minutes
Type of query builder: OTHER
Tool used to build query: NONE
Method used in initial query construction? BOOLEAN CONNECTORS
  • If yes, what was the source of terms?
Total CPU time for all iterations: seconds
Clock time from initial construction of query to completion of final query: minutes
Average number of iterations:
Average number of documents examined per iteration:
Minimum number of iterations:
Maximum number of iterations:
The end of an iteration is determined by:
Automatic term reweighting from relevant documents:
Automatic query expansion from relevant documents:
  • Type of automatic query expansion: ALL TERMS IN \
Other automatic methods:
  • Other automatic methods included:
Manual methods used:
  • Type of manual method used: NONE