System Summary and Timing
  Organization Name: U. of California, Berkeley
  List of Run IDs: Brkly9, Brkly10, Brkly11, Brkly12

  Construction of Indices, Knowledge Bases, and other Data Structures 

    Methods Used to build Data Structures  

    - Length (in words) of the stopword list: 592 
    - Controlled Vocabulary? : NO
    - Stemming Algorithm: SMART STEMMER               
      - Morphological Analysis: NO 
    - Term Weighting: YES 
    - Phrase Discovery? : NO
    - Syntactic Parsing? : NO
    - Word Sense Disambiguation? : NO
    - Heuristic Associations (including short definition)? : NO
    - Spelling Checking (with manual correction)? : NO
    - Spelling Correction? : NO
    - Proper Noun Identification Algorithm? : NO
    - Tokenizer? : NO
    - Manually-Indexed Terms? : NO
    - Other Techniques for building Data Structures: NONE

    Statistics on Data Structures built from TREC Text

    - Inverted index            
      - Run ID : Brkly9, Brkly10 
      - Total Storage (in MB): 550 
      - Total Computer Time to Build (in hours): APPROX. 50 
      - Automatic Process? (If not, number of manual hours): YES 
      - Use of Term Positions? : NO 
      - Only Single Terms Used? : YES 
    - Inverted index            
      - Run ID : Brkly11, Brkly12
      - Total Storage (in MB): 250 
      - Total Computer Time to Build (in hours): APPROX. 25 
      - Automatic Process? (If not, number of manual hours): YES 
      - Use of Term Positions? : NO 
      - Only Single Terms Used? : YES 
    - Clusters            
    - N-grams, Suffix arrays, Signature Files            
    - Knowledge Bases             
      - Use of Manual Labor                   
    - Special Routing Structures            
    - Other Data Structures built from TREC text            

  Query construction

    Automatically Built Queries (Ad-Hoc)

    - Topic Fields Used: DESCRIPTION 
    - Average Computer Time to Build Query (in cpu seconds): APPROX. 3 
    - Method used in Query Construction           
      - Term Weighting (weights based on terms in topics)? : YES 
      - Tokenizer? :                  
      - Expansion of Queries using Previously-Constructed Data Structure? :               


    Automatically Built Queries (Routing)

    - Topic Fields Used: DOM, TITLE, DESC, NARR, CON, DEF, NAT, TIME
    - Average Computer Time to Build Query (in cpu seconds): APPROX. 50 
    - Method used in Query Construction           
      - Terms Selected From             
        - Topics: YES 
        - Only Documents with Relevance Judgments: YES 
      - Term Weighting with Weights Based on terms in             
      - Phrase Extraction from             
      - Syntactic Parsing             
      - Word Sense Disambiguation using             
      - Proper Noun Identification Algorithm from             
      - Tokenizer              
      - Heuristic Associations to Add Terms from             
      - Expansion of Queries using Previously-Constructed Data Structure:               
      - Automatic Addition of Boolean connectors or Proximity Operators               
        using information from              

    Manually Constructed Queries (Ad-Hoc)

    - Topic Fields Used: DESCRIPTION 
    - Average Time to Build Query (in Minutes): 25 
    - Type of Query Builder           
    - Tools used to Build Query           
      - Knowledge Base Browser? :                  
      - Other Lexical Tools? :                 
    - Method used in Query Construction           
      - Addition of Terms not Included in Topic? :                
        - Source of Terms: Boolean lookups in parallel collections 

  Searching

    Search Times

      - Run ID : Brkly9, Brkly10 
      - Computer Time to Search (Average per Query, in CPU seconds): APPROX. 30 

    Search Times

      - Run ID : Brkly11, Brkly12 
      - Computer Time to Search (Average per Query, in CPU seconds): APPROX. 50 

    Machine Searching Methods

      - Probabilistic Model? : YES 

    Factors in Ranking

      - Term Frequency? : YES 
      - Inverse Document Frequency? : YES 
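
    The two factors above can be illustrated with a minimal sketch. It is
    not the ranking formula used in the Brkly runs, which this summary does
    not give; it only shows how term frequency and inverse document
    frequency could be combined into a score. The function name `rank`, the
    tf * log(N/df) weighting, and the toy documents are all assumptions
    made for illustration.

```python
import math
from collections import Counter


def rank(query_terms, docs):
    """Return document indices ordered by summed tf * idf over the query terms.

    Illustrative only: a plain tf * log(N / df) weight, not the probabilistic
    model reported in this summary.
    """
    n_docs = len(docs)
    # Term frequency: how often each term occurs in each document.
    tfs = [Counter(doc) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tf in tfs for term in tf)

    scored = []
    for index, tf in enumerate(tfs):
        score = sum(
            tf[term] * math.log(n_docs / df[term])
            for term in query_terms
            if term in tf
        )
        scored.append((score, index))

    # Highest score first; ties are broken by document order.
    return [index for score, index in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical documents, assumed already stopped and stemmed.
docs = [["oil", "spill", "oil"], ["tax", "law"], ["oil", "tax"]]
ranking = rank(["oil", "spill"], docs)
```

    In this toy collection the first document ranks highest because it
    contains both query terms, one of them twice, and "spill" occurs in no
    other document.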

    Machine Information

    - Machine Type for TREC Experiment: SPARC 10, SPARC 20 
    - Was the Machine Dedicated or Shared: SHARED 
    - Amount of Hard Disk Storage (in MB): 5,000 
    - Amount of RAM (in MB): 128 
    - Clock Rate of CPU (in MHz): 50 for SPARC 10, 90 for SPARC 20