System Summary and Timing
  Organization Name: Dublin City University
  List of Run ID's: DCU961, DCU962, DCU963, DCU964, DCU965, DCU966, DCU967,
    DCU968, DCU969, DCU96C, DCU96D

  Construction of Indices, Knowledge Bases, and other Data Structures 

    Methods Used to build Data Structures 

    - Length (in words) of the stopword list: 410 
    - Controlled Vocabulary?: No 
    - Stemming Algorithm: Porter's
      - Morphological Analysis: No 
    - Term Weighting: Yes 
    - Phrase Discovery?: Yes
      - Method Used (statistical, syntactic, other): Statistical
    - Syntactic Parsing?: No
    - Word Sense Disambiguation?: No
    - Heuristic Associations (including short definition)?: No
    - Spelling Checking (with manual correction)?: No
    - Spelling Correction?: No
    - Proper Noun Identification Algorithm?: No
    - Tokenizer?: No
      - Patterns which are tokenized: No
    - Manually-Indexed Terms?: No
    -  Other Techniques for building Data Structures: No 
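
The indexing choices listed above (stopword removal, stemming, statistical phrase discovery) can be illustrated with a minimal sketch. It is not the DCU implementation: the stopword set, the `min_docs` threshold, and all function names are hypothetical, the stemming step is omitted, and treating adjacent term pairs as candidate phrases is just one common statistical approach assumed here.

```python
from collections import Counter

# Hypothetical, tiny stopword set; the form reports a 410-word list.
STOPWORDS = {"the", "of", "a", "in", "and", "to", "is"}


def split_words(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def index_terms(text):
    """Drop stopwords. A fuller pipeline would also stem each term here
    (the form names Porter's algorithm); this sketch omits stemming."""
    return [t for t in split_words(text) if t not in STOPWORDS]


def discover_phrases(documents, min_docs=2):
    """Assumed statistical phrase discovery: keep pairs of terms that are
    adjacent after stopword removal and occur in at least `min_docs`
    documents."""
    doc_freq = Counter()
    for text in documents:
        terms = index_terms(text)
        # Count each pair once per document (document frequency).
        doc_freq.update(set(zip(terms, terms[1:])))
    return {pair for pair, df in doc_freq.items() if df >= min_docs}
```

Phrases found this way would be indexed alongside single terms, which is consistent with the "Only Single Terms Used?: No" answers in the index statistics.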

    Statistics on Data Structures built from TREC Text

    - Inverted index           
      - Run ID: DCU968, DCU969, DCU96C, DCU96D 
      - Total Storage (in MB): 98 
      - Total Computer Time to Build (in hours): 7 
      - Automatic Process? (If not, number of manual hours): Yes 
      - Use of Term Positions?: No 
      - Only Single Terms Used?: No 
    - Inverted index           
      - Run ID: DCU961, DCU962, DCU963, DCU964 
      - Total Storage (in MB): 412 
      - Total Computer Time to Build (in hours): 28 
      - Automatic Process? (If not, number of manual hours): Yes 
      - Use of Term Positions?: No 
      - Only Single Terms Used?: No 
    - Inverted index           
      - Run ID: DCU965, DCU966, DCU967 
      - Total Storage (in MB): 78 
      - Total Computer Time to Build (in hours): 6.5 
      - Automatic Process? (If not, number of manual hours): Yes 
      - Use of Term Positions?: No 
      - Only Single Terms Used?: No 
    - Clusters           
    - N-grams, Suffix arrays, Signature Files           
    - Knowledge Bases            
      - Use of Manual Labor                  
    - Special Routing Structures           
    - Other Data Structures built from TREC text           

  Query construction

    Automatically Built Queries (Ad-Hoc)

    - Topic Fields Used: Description 
    - Average Computer Time to Build Query (in CPU seconds): < 1
    - Method used in Query Construction          
      - Term Weighting (weights based on terms in topics)?: Yes 
      - Phrase Extraction from Topics?: Yes 
      - Syntactic Parsing of Topics?: No 
      - Word Sense Disambiguation?: No 
      - Proper Noun Identification Algorithm?: No 
      - Tokenizer?: No                
      - Heuristic Associations to Add Terms?: No 
      - Expansion of Queries using Previously-Constructed Data Structure?: No              
      - Automatic Addition of Boolean Connectors or Proximity Operators?: No 

    Manually Constructed Queries (Ad-Hoc)

    - Topic Fields Used: Title, Description, Narrative 
    - Average Time to Build Query (in Minutes): < 1 
    - Type of Query Builder          
    - Tools used to Build Query          
      - Knowledge Base Browser?:                 
      - Other Lexical Tools?:                
    - Method used in Query Construction          
      - Addition of Terms not Included in Topic?:               

  Searching

    Search Times

    - Run ID: DCU968, DCU969, DCU96C, DCU96D
      - Computer Time to Search (Average per Query, in CPU seconds): 15
    - Run ID: DCU961, DCU962
      - Computer Time to Search (Average per Query, in CPU seconds): 11
    - Run ID: DCU963, DCU964
      - Computer Time to Search (Average per Query, in CPU seconds): 4
    - Run ID: DCU965
      - Computer Time to Search (Average per Query, in CPU seconds): 5
    - Run ID: DCU966
      - Computer Time to Search (Average per Query, in CPU seconds): 3
    - Run ID: DCU967
      - Computer Time to Search (Average per Query, in CPU seconds): 8

    Machine Searching Methods

      - Vector Space Model?: Yes 

    Factors in Ranking

      - Term Frequency?: Yes 
      - Inverse Document Frequency?: Yes 
      - Document Length?: Yes 
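
As an illustration of how the three ranking factors above combine in a vector space model, here is a minimal sketch. The specific weighting formulas (logarithmic term frequency, logarithmic inverse document frequency, cosine length normalization) are assumptions; the form does not state which variants the system used, and the function name `rank` is hypothetical.

```python
import math
from collections import Counter


def rank(query_terms, docs):
    """Rank documents (doc_id -> list of terms) against a query.

    Scores are dot products of tf-idf vectors, divided by the document
    vector length. The query length is constant across documents, so it
    is left out without changing the ranking.
    """
    n = len(docs)
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))
    # Inverse document frequency: rarer terms get larger weights.
    idf = {t: math.log(1 + n / d) for t, d in df.items()}

    def vector(terms):
        tf = Counter(t for t in terms if t in idf)
        return {t: (1 + math.log(f)) * idf[t] for t, f in tf.items()}

    q = vector(query_terms)
    scores = {}
    for doc_id, terms in docs.items():
        d = vector(terms)
        dot = sum(w * d.get(t, 0.0) for t, w in q.items())
        if dot > 0:
            norm = math.sqrt(sum(w * w for w in d.values()))  # document length
            scores[doc_id] = dot / norm
    return sorted(scores, key=scores.get, reverse=True)
```

Documents that share no term with the query receive no score and are left out of the ranking.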

    Machine Information

    - Machine Type for TREC Experiment: SPARCstation 5
    - Was the Machine Dedicated or Shared: Dedicated 
    - Amount of Hard Disk Storage (in MB): 6000 
    - Amount of RAM (in MB): 64 

    System Comparisons 

    - Given appropriate resources            
      - Could your system run faster?: Yes 
      - By how much (estimate)?: 100+% 
    - Features the System is Missing that would be beneficial: More
      sophisticated query-document matching algorithms.

    Significant Areas of System

    - Brief Description of features in your system which you feel impact the
      system and are not answered by above questions: Query and document
      accumulator thresholding techniques, coupled with the modified inverted
      index structure, which together allow effective and efficient retrieval.
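
The form gives no detail on the thresholding techniques, so the following is only a generic sketch of accumulator limiting over an inverted index, not the system's actual method. The `max_accumulators` limit, the postings layout, and the rule of processing the highest-weighted query terms first are all assumptions made for illustration.

```python
def search(query_weights, index, max_accumulators=2):
    """Score documents with a bounded number of accumulators.

    query_weights: term -> query term weight
    index:         term -> list of (doc_id, document term weight) postings
    """
    accumulators = {}
    # Process the highest-weighted query terms first, so the limited
    # accumulators go to documents matching the most important terms.
    for term in sorted(query_weights, key=query_weights.get, reverse=True):
        for doc_id, weight in index.get(term, []):
            contribution = query_weights[term] * weight
            if doc_id in accumulators:
                accumulators[doc_id] += contribution
            elif len(accumulators) < max_accumulators:
                accumulators[doc_id] = contribution
            # Otherwise the threshold is reached: no new accumulator is
            # created, which bounds memory use and sorting cost.
    return sorted(accumulators.items(), key=lambda kv: kv[1], reverse=True)
```

With a limit in place, long postings lists for common terms can only update documents that already hold an accumulator, which is where the saving in search time would come from.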