System Summary and Timing

Organization Name: Dublin City University
List of Run ID's: DCU951 DCU952

Construction of Indices, Knowledge Bases, and other Data Structures

Methods Used to build Data Structures
- Length (in words) of the stopword list: 676.
- Controlled Vocabulary? : Yes.
- Stemming Algorithm: WordNet Stemmer.
- Morphological Analysis: Yes.
- Term Weighting: Yes.
- Phrase Discovery? :
  - Kind of Phrase: Co-Locations.
  - Method Used (statistical, syntactic, other): Other.
- Syntactic Parsing? : No.
- Word Sense Disambiguation? : No.
- Spelling Checking (with manual correction)? : No.
- Spelling Correction? : No.
- Proper Noun Identification Algorithm? : No.
- Tokenizer? : No.
- Manually-Indexed Terms? : No.
- Other Techniques for building Data Structures: No.

Statistics on Data Structures built from TREC Text
- Inverted index
  - Total Storage (in MB): 253
  - Total Computer Time to Build (in hours): 4.5
  - Automatic Process? (If not, number of manual hours): Yes.
  - Use of Term Positions? : No.
  - Only Single Terms Used? : No.

Query construction

Automatically Built Queries (Ad-Hoc)
- Topic Fields Used: DESC
- Average Computer Time to Build Query (in cpu seconds): < 1
- Method used in Query Construction
  - Term Weighting (weights based on terms in topics)? : Yes
  - Phrase Extraction from Topics? : Yes
  - Syntactic Parsing of Topics? : No
  - Word Sense Disambiguation? : DCU951 Yes, DCU952 No.
  - Proper Noun Identification Algorithm? : No
  - Tokenizer? : No
  - Heuristic Associations to Add Terms? : No.
  - Expansion of Queries using Previously-Constructed Data Structure? : DCU951 Yes, DCU952 No.
  - Automatic Addition of Boolean Connectors or Proximity Operators? : No.

Searching

Search Times
- Computer Time to Search (Average per Query, in CPU seconds): 10

Machine Searching Methods
- Vector Space Model? : Yes

Factors in Ranking
- Term Frequency? : Yes
- Inverse Document Frequency? : Yes

Machine Information
- Machine Type for TREC Experiment: Sparc 20
- Was the Machine Dedicated or Shared: Shared
- Amount of Hard Disk Storage (in MB): 1200
- Amount of RAM (in MB): 96

System Comparisons
- Amount of "Software Engineering" which went into the Development of the System: 4 Months
- Given appropriate resources
  - Could your system run faster? : Yes
  - By how much (estimate)? : 30% to 40%
- Features the System is Missing that would be beneficial: Index Compression

Significant Areas of System
- Brief Description of features in your system which you feel impact the system and are not answered by above questions: Query Space Processing approach
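
Illustrative sketch (not the DCU implementation): the summary reports a vector space model ranked by term frequency and inverse document frequency, but it does not give the weighting formula. The Python below shows one common way such a ranking can be computed. The function names, the log(N/df) form of the inverse document frequency, the cosine normalization, and the sample texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text, stopwords):
    """Lowercase, split on whitespace, and drop stopwords (hypothetical helper)."""
    return [w for w in text.lower().split() if w not in stopwords]


def build_idf(docs_tokens):
    """Assumed IDF form: log(N / df) over the indexed documents."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))  # count each term once per document
    return {term: math.log(n / count) for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse tf * idf vector; terms unseen in the collection are ignored."""
    tf = Counter(tokens)
    return {term: freq * idf[term] for term, freq in tf.items() if term in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, stopwords=frozenset()):
    """Return (score, doc_index) pairs, best match first."""
    docs_tokens = [tokenize(d, stopwords) for d in docs]
    idf = build_idf(docs_tokens)
    query_vec = tfidf_vector(tokenize(query, stopwords), idf)
    scores = [
        (cosine(query_vec, tfidf_vector(tokens, idf)), i)
        for i, tokens in enumerate(docs_tokens)
    ]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Made-up sample texts, not TREC data.
    docs = [
        "index compression saves disk space",
        "query expansion with wordnet",
        "the vector space model ranks documents",
    ]
    for score, doc_id in rank("vector space", docs, stopwords={"the", "with"}):
        print(doc_id, round(score, 3))
```

In this sketch the query is weighted with the same idf table as the documents, so a term that occurs in fewer documents contributes more to the score than a term that occurs in many.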