System Summary and Timing
  Organization Name: HNC Software Inc.
  List of Run IDs: HNC11, HNC21

  Construction of Indices, Knowledge Bases, and other Data Structures 

    Methods Used to build Data Structures 

    - Length (in words) of the stopword list: 375 
    - Stemming Algorithm:  Lovins              
      - Morphological Analysis:  No 
    - Term Weighting:  Yes 
    -  Phrase Discovery? :              
      - Kind of Phrase:  Yes 
      - Method Used (statistical, syntactic, other): statistical 
    -  Syntactic Parsing? :  No 
    -  Word Sense Disambiguation? : No 
    -  Heuristic Associations (including short definition)? :  No 
    -  Spelling Checking (with manual correction)? : No 
    -  Spelling Correction? :  No 
    -  Proper Noun Identification Algorithm? : No 
    -  Tokenizer? : Yes            
      - Patterns which are tokenized:  No 
    -  Manually-Indexed Terms? :  No 

    Statistics on Data Structures built from TREC Text

    - Inverted index           
    - Clusters           
    - N-grams, Suffix arrays, Signature Files           
    - Knowledge Bases            
      - Use of Manual Labor                  
    - Other Data Structures built from TREC text           
      - Run ID : HNC11 
      - Type of Structure:  Word Context Vectors 
      - Total Storage (in MB):  300 
      - Total Computer Time to Build (in hours): 130 
      - Automatic Process? (If not, number of manual hours): Yes 
    - Other Data Structures built from TREC text           
      - Run ID : HNC21 
      - Type of Structure:  Word Context Vectors 
      - Total Storage (in MB):  300 
      - Total Computer Time to Build (in hours): 130 
      - Automatic Process? (If not, number of manual hours): Yes 

    Data Built from Sources Other than the Input Text

    -  Internally-built Auxiliary File            
      - Use of Manual Labor                   
    -  Externally-built Auxiliary File            

  Query construction

    Automatically Built Queries (Ad-Hoc)

    - Method used in Query Construction          
      - Tokenizer? :                 


      - Expansion of Queries using Previously-Constructed Data Structure? :              

    Automatically Built Queries (Routing)

    - Average Computer Time to Build Query (in cpu seconds): 10-20 
    - Method used in Query Construction          
      - Terms Selected From            
        - Only Documents with Relevance Judgments: Yes 
      - Term Weighting with Weights Based on terms in            
      - Phrase Extraction from            
      - Syntactic Parsing            
      - Word Sense Disambiguation using            
      - Proper Noun Identification Algorithm from            
      - Tokenizer             
      - Heuristic Associations to Add Terms from            
      - Expansion of Queries using Previously-Constructed Data Structure:              
      - Automatic Addition of Boolean connectors or Proximity Operators
        using information from

    Manually Constructed Queries (Ad-Hoc)

    - Type of Query Builder          
    - Tools used to Build Query          
      - Knowledge Base Browser? :                 
      - Other Lexical Tools? :                
    - Method used in Query Construction          
      - Addition of Terms not Included in Topic? :               

    Manually Constructed Queries (Routing)

    - Type of Query Builder          
    - Tools used to Build Query          
      - Knowledge Base Browser? :                 
      - Other Lexical Tools? :               
    - Data Used for Building Query from           
    - Method used in Query Construction          
      - Addition of Terms not Included in Topic? :               

    Interactive Queries

    - Type of Person doing Interaction            
    - Average Time to do Complete Interaction            
    - Methods used in Interaction         
      - Automatic Query Expansion from Relevant Documents? :                 
      - Manual Methods               

  Searching

    Search Times

      - Run ID :  HNC11 
      - Computer Time to Search (Average per Query, in CPU seconds): 20 
      - Component Times :  Context Vector dot product, sorting 
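
    The two search components named above (a dot product between context
    vectors, followed by a sort) can be illustrated with a minimal sketch.
    It is not HNC's implementation: the function names, the
    three-dimensional toy vectors, and the document IDs are all
    hypothetical, and real context vectors would have far more dimensions.

```python
# Illustrative sketch only: names and toy vectors are hypothetical,
# not taken from the HNC system.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_documents(query_vec, doc_vecs):
    """Score every document against the query, then sort by descending score."""
    scores = [(doc_id, dot(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy document context vectors (assumed to be unit length).
docs = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [0.6, 0.8, 0.0],
    "d3": [0.0, 0.0, 1.0],
}
query = [0.8, 0.6, 0.0]

ranking = rank_documents(query, docs)
ranked_ids = [doc_id for doc_id, _ in ranking]
```

    With these toy vectors, d2 scores highest because it points in
    nearly the same direction as the query, and d3 scores zero because
    it is orthogonal to it.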

    Machine Searching Methods

    - Machine Searching Methods       
      - Vector Space Model? :  Yes 


    Machine Information

    - Machine Type for TREC Experiment: Sun Sparc 10 
    - Was the Machine Dedicated or Shared:  Shared 
    - Amount of Hard Disk Storage (in MB):  2000 (2 GB) 
    - Amount of RAM (in MB):  512 
    - Clock Rate of CPU (in MHz): 45 

    System Comparisons 

    - Amount of "Software Engineering" which went into the Development of 
      the System:  3-4 years 
    - Given appropriate resources            
      - Could your system run faster? :  Yes 
      - By how much (estimate)? :  2 or 3 times on faster hardware 

    Significant Areas of System

    - Brief Description of features in your system which you feel impact 
      the system and are not answered by above questions:  The routing 
      method uses an LVQ (Learning Vector Quantization, a neural network) 
      algorithm, applied to the Context Vectors of judged documents, to 
      create the Query Context Vector(s).
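
    As a rough illustration of how an LVQ-style procedure could turn the
    vectors of judged documents into query vectors, the sketch below uses
    the basic LVQ1 update rule. The summary does not say which LVQ variant,
    learning rate, number of prototypes, or initialization HNC used, so
    every one of those choices here is an assumption, as are the function
    names, the labels "rel" and "non", and the two-dimensional toy data.

```python
# Illustrative LVQ1 sketch: variant, parameters, names, and data are all
# assumptions, not details reported for the HNC system.

def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def lvq1_train(prototypes, samples, learning_rate=0.1, epochs=10):
    """Train labeled prototypes with the LVQ1 rule.

    prototypes, samples: lists of (vector, label) pairs.
    The nearest prototype moves toward a sample with the same label
    and away from a sample with a different label.
    """
    protos = [(list(vec), label) for vec, label in prototypes]
    for _ in range(epochs):
        for x, x_label in samples:
            nearest = min(
                range(len(protos)),
                key=lambda i: squared_distance(protos[i][0], x),
            )
            w, w_label = protos[nearest]
            sign = 1.0 if w_label == x_label else -1.0
            updated = [wi + sign * learning_rate * (xi - wi) for wi, xi in zip(w, x)]
            protos[nearest] = (updated, w_label)
    return protos


# Toy context vectors of judged documents ("rel" = relevant, "non" = not).
judged = [
    ([1.0, 0.0], "rel"),
    ([0.9, 0.1], "rel"),
    ([0.0, 1.0], "non"),
    ([0.1, 0.9], "non"),
]
initial = [([0.6, 0.4], "rel"), ([0.4, 0.6], "non")]

trained = lvq1_train(initial, judged)
# In this sketch, the prototypes labeled relevant serve as query vectors.
query_vectors = [vec for vec, label in trained if label == "rel"]
```

    In this sketch the relevant prototype drifts toward the relevant
    documents, so using it as a query vector favors documents whose
    context vectors resemble the judged-relevant ones.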