System Summary and Timing

Organization Name: University of Neuchatel (Switzerland)
List of Run ID's: UniNE3, UniNE4

Construction of Indices, Knowledge Bases, and other Data Structures

Methods Used to build Data Structures
- Length (in words) of the stopword list: 571 (SMART)
- Stemming Algorithm: yes, Lovins (SMART)
- Term Weighting: yes
- Phrase Discovery? :
- Tokenizer? :

Statistics on Data Structures built from TREC Text
- Inverted index
  - Run ID : UniNE3, UniNE4
  - Total Storage (in MB): 179 (UniNE3), 710 (UniNE4)
  - Total Computer Time to Build (in hours): 2.6 (UniNE3), 10 (UniNE4)
  - Automatic Process? (If not, number of manual hours): yes
  - Only Single Terms Used? : yes
- Clusters
- N-grams, Suffix arrays, Signature Files
- Knowledge Bases
- Use of Manual Labor
- Special Routing Structures
- Other Data Structures built from TREC text

Query construction

Automatically Built Queries (Ad-Hoc)
- Topic Fields Used: desc
- Average Computer Time to Build Query (in cpu seconds): 0.3
- Method used in Query Construction
  - Term Weighting (weights based on terms in topics)? : yes
  - Tokenizer? :
  - Expansion of Queries using Previously-Constructed Data Structure? :

Searching

Search Times
- Run ID : UniNE3, UniNE4
- Computer Time to Search (Average per Query, in CPU seconds): 15.5

Machine Searching Methods
- Vector Space Model? : yes
- Probabilistic Model? : yes

Factors in Ranking
- Term Frequency? : yes
- Inverse Document Frequency? : yes
- Other Term Weights? : normalization of search term weights done by SMART
- Document Length? : yes

Machine Information
- Machine Type for TREC Experiment: SUN SPARCstation 10 model 51
- Was the Machine Dedicated or Shared: shared
- Amount of Hard Disk Storage (in MB): 6,144
- Amount of RAM (in MB): 128
- Clock Rate of CPU (in MHz): 50

System Comparisons
- Amount of "Software Engineering" which went into the Development of the System: 4 months to understand SMART, 6 months to write our additional features in Smalltalk-80
- Given appropriate resources
  - Could your system run faster? : yes, because Smalltalk is interpreted
  - By how much (estimate)? : the improvement factor is unknown
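
Illustration of the "Factors in Ranking" entries: the form lists term frequency, inverse document frequency, and document length as ranking factors, with normalization handled by SMART. The sketch below shows one common way these three factors combine in a vector space model (tf times idf, followed by cosine length normalization). It is an assumption for illustration only, not the weighting scheme actually used for UniNE3 or UniNE4, which the form does not specify; the function names `tfidf_vectors` and `score` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting (not taken from the source): tf * log(N / df),
    then cosine normalization so every document vector has unit length.
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per document

    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        raw = {t: f * math.log(n_docs / df[t]) for t, f in tf.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        # Document-length factor: divide by the Euclidean norm of the vector.
        vectors.append({t: (w / norm if norm else 0.0) for t, w in raw.items()})
    return vectors


def score(query_vec, doc_vec):
    """Inner product of a query vector and a (normalized) document vector."""
    return sum(w * doc_vec.get(t, 0.0) for t, w in query_vec.items())


# Toy collection, invented for the example.
docs = [
    "oil price rise".split(),
    "oil spill cleanup cost".split(),
    "election result".split(),
]
vectors = tfidf_vectors(docs)
query = {"oil": 1.0, "price": 1.0}
ranking = sorted(range(len(docs)), key=lambda i: score(query, vectors[i]), reverse=True)
print(ranking)  # [0, 1, 2]
```

Document 0 ranks first because it matches both query terms, and the rarer term "price" carries a higher idf weight than "oil", which appears in two of the three documents.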