Text REtrieval Conference (TREC)
|
Organization Name: Dublin City University | Run ID: DCU99L01 |
Section 1.0 System Summary and Timing |
---|
Section 1.1 System Information |
Hardware Model Used for TREC Experiment: Intel P2 / NT4
System Use: SHARED
Total Amount of Hard Disk Storage: 8 Gb
Total Amount of RAM: 64 MB
Clock Rate of CPU: 350 MHz |
Section 1.2 System Comparisons |
Amount of developmental "Software Engineering": SOME
List of features that are not present in the system, but would have been beneficial to have: our own Content-Based Retrieval Service
List of features that are present in the system, and impacted its performance, but are not detailed within this form: |
Section 2.0 Construction of Indices, Knowledge Bases, and Other Data Structures |
---|
Length of the stopword list: words
Type of Stemming: NONE
Controlled Vocabulary: NO
Term weighting: NO
Phrase discovery: NO
Type of Spelling Correction: NONE
Manually-Indexed Terms: NO
Proper Noun Identification:
Syntactic Parsing:
Tokenizer:
Word Sense Disambiguation:
Other technique: YES
Additional comments: The Content-Based Retrieval Service was a third-party application, and hence some of its internal data structures are unknown... |
Section 3.0 Statistics on Data Structures Built from TREC Text |
---|
Section 3.1 First Data Structure |
Structure Type: OTHER DATA STRUCTURE
Type of other data structure used: Index by third-party application
Brief description of method using other data structure:
Total storage used: 1 Gb
Total computer time to build: 160 hours
Automatic process: YES
Manual hours required: hours
Type of manual labor: NONE
Term positions used:
Only single terms used:
Concepts (vs. single terms) represented:
Type of representation:
Auxiliary files used: YES
Additional comments: The Content-Based Retrieval Service was a third-party application, and hence some of its internal data structures are unknown... |
Section 3.2 Second Data Structure |
Structure Type: NONE
Type of other data structure used:
Brief description of method using other data structure:
Total storage used: Gb
Total computer time to build: hours
Automatic process:
Manual hours required: hours
Type of manual labor: NONE
Term positions used:
Only single terms used:
Concepts (vs. single terms) represented:
Type of representation:
Auxiliary files used:
Additional comments: |
Section 3.3 Third Data Structure |
Structure Type: NONE
Type of other data structure used:
Brief description of method using other data structure:
Total storage used: Gb
Total computer time to build: hours
Automatic process:
Manual hours required: hours
Type of manual labor: NONE
Term positions used:
Only single terms used:
Concepts (vs. single terms) represented:
Type of representation:
Auxiliary files used:
Additional comments: |
Section 4.0 Data Built from Sources Other than the Input Text |
---|
File type: OTHER
Domain type: DOMAIN SHARED
Total Storage: 0.2 Gb
Number of Concepts Represented: concepts
Type of representation: OTHER
Automatic or Manual: AUTOMATIC
Type of Manual Labor used: NONE
Additional comments: This was a Connectivity Server developed from the connectivity source data provided by the Web Track Organisers. |
File is: NONE
Total Storage: Gb
Number of Concepts Represented: concepts
Type of representation: NONE
Additional comments: |
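The Connectivity Server described in Section 4.0 is built from link data, and Section 5.2 names the indegree and outdegree of a file as ranking factors. The following is only a minimal sketch of how such counts could be tabulated. It assumes the connectivity data can be read as (source, target) pairs; the actual format of the Web Track data and the internals of the Connectivity Server are not given in this form. The function name `degree_counts` and the document IDs are hypothetical.

```python
from collections import defaultdict


def degree_counts(links):
    """Return (indegree, outdegree) dicts for an iterable of (source, target) pairs.

    Assumption of this sketch: self-links and repeated links are ignored.
    The form does not say how the real Connectivity Server treats them.
    """
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    seen = set()
    for source, target in links:
        if source == target or (source, target) in seen:
            continue  # skip self-links and duplicate links
        seen.add((source, target))
        outdegree[source] += 1  # one more link leaving the source
        indegree[target] += 1   # one more link arriving at the target
    return dict(indegree), dict(outdegree)


# Hypothetical document IDs, for illustration only.
links = [("docA", "docB"), ("docA", "docC"), ("docB", "docC"), ("docA", "docB")]
indegree, outdegree = degree_counts(links)
```

Documents that never appear as a target have no entry in `indegree`, so a caller would treat a missing key as a count of zero (for example with `indegree.get(doc_id, 0)`).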
Section 5.0 Computer Searching |
---|
Average computer time to search (per query): 17 CPU seconds |
Times broken down by component(s): 15 Content + 2 Linkage |
Section 5.1 Searching Methods |
Vector space model:
Probabilistic model:
Cluster searching:
N-gram matching:
Boolean matching: YES
Fuzzy logic:
Free text scanning:
Neural networks:
Conceptual graphic matching:
Other: YES
Additional comments: The Content-Based Retrieval Service was a third-party application, and hence some of its internal data structures are unknown... |
Section 5.2 Factors in Ranking |
Term frequency:
Inverse document frequency:
Other term weights:
Semantic closeness:
Position in document:
Syntactic clues:
Proximity of terms:
Information theoretic weights:
Document length:
Percentage of query terms which match:
N-gram frequency:
Word specificity:
Word sense frequency:
Cluster distance:
Other: YES
Additional comments: The indegree & outdegree of the file in a Conne |
Disclaimer: Contents of this online document are not necessarily the official views of, nor endorsed by the U.S. Government, the Department of Commerce, or NIST. |