System Summary and Timing

Organization Name: University of Neuchatel (Switzerland)
List of Run ID's: UniNE7, UniNE8, UniNE9, UniNE0

Construction of Indices, Knowledge Bases, and other Data Structures

Methods Used to build Data Structures
- Length (in words) of the stopword list: 571 (SMART)
- Stemming Algorithm: yes, Lovins (SMART)
- Term Weighting: yes
- Phrase Discovery?:
- Tokenizer?:

Statistics on Data Structures built from TREC Text
- Inverted index
  - Run ID: UniNE7, UniNE8, UniNE9, UniNE0
  - Total Storage (in MB):
      179 (collection fusion) for UniNE0 and UniNE9
      3 * 179 (data fusion) for UniNE7 and UniNE8
  - Total Computer Time to Build (in hours):
      2.6 for UniNE0 and UniNE9
      3 * 2.6 for UniNE7 and UniNE8
  - Automatic Process? (If not, number of manual hours): yes
  - Only Single Terms Used?: yes
- Clusters
- N-grams, Suffix arrays, Signature Files
- Knowledge Bases
- Use of Manual Labor
- Special Routing Structures
- Other Data Structures built from TREC text

Query construction

Automatically Built Queries (Ad-Hoc)
- Topic Fields Used:
    Desc only for UniNE7 and UniNE0
    Desc & Narr for UniNE8 and UniNE9
- Average Computer Time to Build Query (in cpu seconds): 0.3
- Method used in Query Construction
  - Term Weighting (weights based on terms in topics)?: yes
  - Tokenizer?:
  - Expansion of Queries using Previously-Constructed Data Structure?:

Searching

Search Times
- Run ID: UniNE7, UniNE8, UniNE9, UniNE0
- Computer Time to Search (Average per Query, in CPU seconds): 15.5

Machine Searching Methods
- Vector Space Model?: yes
- Probabilistic Model?: yes

Factors in Ranking
- Term Frequency?: yes
- Inverse Document Frequency?: yes
- Document Length?: yes

Machine Information
- Machine Type for TREC Experiment: SUN SPARCstation 10 model 51
- Was the Machine Dedicated or Shared: shared
- Amount of Hard Disk Storage (in MB): 6,144 MB
- Amount of RAM (in MB): 128 MB
- Clock Rate of CPU (in MHz): 50 MHz

System Comparisons
- Amount of "Software Engineering" which went into the Development of the System: SMART + part of code in Smalltalk-80
- Given appropriate resources
  - Could your system run faster?: yes, of course (both for SMART and Smalltalk-80)
  - By how much (estimate)?: the improvement factor is unknown
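
Worked totals for the index figures. The form reports storage and build time for the data-fusion runs as products (3 * 179 MB and 3 * 2.6 hours) rather than totals. The sketch below simply carries out that arithmetic. It assumes, which the form does not state outright, that the factor 3 means three separate indexes each costing the same as the single collection-fusion index. The names RUN_INDEX_COUNT and index_cost are hypothetical and are not part of SMART or of the submitted system.

```python
# Figures copied from the form; the interpretation of "3 *" as three
# equally sized indexes is an assumption made for this illustration.
SINGLE_INDEX_MB = 179        # storage of one inverted index, in MB
SINGLE_INDEX_HOURS = 2.6     # build time of one inverted index, in hours
DATA_FUSION_INDEXES = 3      # multiplier given for UniNE7 and UniNE8

# Number of indexes assumed per run (hypothetical helper table).
RUN_INDEX_COUNT = {
    "UniNE0": 1,
    "UniNE9": 1,
    "UniNE7": DATA_FUSION_INDEXES,
    "UniNE8": DATA_FUSION_INDEXES,
}


def index_cost(run_id):
    """Return (storage in MB, build time in hours) for one run."""
    n = RUN_INDEX_COUNT[run_id]
    # Round the hours to one decimal to avoid floating-point noise.
    return n * SINGLE_INDEX_MB, round(n * SINGLE_INDEX_HOURS, 1)


if __name__ == "__main__":
    for run_id in ("UniNE7", "UniNE8", "UniNE9", "UniNE0"):
        mb, hours = index_cost(run_id)
        print(f"{run_id}: {mb} MB, {hours} h")
```

Under that assumption the data-fusion runs (UniNE7, UniNE8) come to 537 MB and 7.8 hours, against 179 MB and 2.6 hours for the collection-fusion runs (UniNE0, UniNE9).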