| Text REtrieval Conference (TREC) | 
| Organization Name: InsightSoft-M | Run ID: insight | 
| Section 1.0 System Summary and Timing | 
|---|
| Section 1.1 System Information | 
| Hardware Model Used for TREC Experiment: Pentium PC; System Use: SHARED; Total Amount of Hard Disk Storage: 45 Gb; Total Amount of RAM: 256 MB; Clock Rate of CPU: 700 MHz | 
| Section 1.2 System Comparisons | 
| Amount of developmental "Software Engineering": NONE; List of features that are not present in the system, but would have been beneficial to have: ; List of features that are present in the system, and impacted its performance, but are not detailed within this form: | 
| Section 2.0 Construction of Indices, Knowledge Bases, and Other Data Structures | 
|---|
| Length of the stopword list: 200 words; Type of Stemming: OTHER; Controlled Vocabulary: NO; Term weighting: YES; Phrase discovery: YES; Type of Spelling Correction: AUTOMATIC CORRECTION; Manually-Indexed Terms: NO; Proper Noun Identification: YES; Syntactic Parsing: NO; Tokenizer: NO; Word Sense Disambiguation: YES; Other technique: YES; Additional comments: If the chosen method yields no satisfactory result, the system automatically falls back to additional methods. | 
| Section 3.0 Statistics on Data Structures Built from TREC Text | 
|---|
| Section 3.1 First Data Structure | 
| Structure Type:NONE Type of other data structure used: Brief description of method using other data structure: Total storage used:Gb Total computer time to build:hours Automatic process: Manual hours required:hours Type of manual labor:NONE Term positions used: Only single terms used: Concepts (vs. single terms) represented: 
 Type of representation: Auxiliary files used: 
 Additional comments: | 
| Section 3.2 Second Data Structure | 
| Structure Type:NONE Type of other data structure used: Brief description of method using other data structure: Total storage used:Gb Total computer time to build:hours Automatic process: Manual hours required:hours Type of manual labor: Term positions used: Only single terms used: Concepts (vs. single terms) represented: 
 Type of representation: Auxiliary files used: 
 Additional comments: | 
| Section 3.3 Third Data Structure | 
| Structure Type: Type of other data structure used: Brief description of method using other data structure: Total storage used:Gb Total computer time to build:hours Automatic process: Manual hours required:hours Type of manual labor: Term positions used: Only single terms used: Concepts (vs. single terms) represented: 
 Type of representation: Auxiliary files used: 
 Additional comments: | 
| Section 4.0 Data Built from Sources Other than the Input Text | 
|---|
| File type: Domain type: Total Storage:Gb Number of Concepts Represented:concepts Type of representation: Automatic or Manual: 
 Type of Manual Labor used: Additional comments: | 
| File is: Total Storage:Gb Number of Concepts Represented:concepts Type of representation: Additional comments: | 
| Section 5.0 Computer Searching | 
|---|
| Average computer time to search (per query): CPU seconds | 
| Times broken down by component(s): | 
| Section 5.1 Searching Methods | 
| Vector space model: Probabilistic model: Cluster searching: N-gram matching: Boolean matching: Fuzzy logic: Free text scanning: Neural networks: Conceptual graphic matching: Other: Additional comments: | 
| Section 5.2 Factors in Ranking | 
| Term frequency: Inverse document frequency: Other term weights: Semantic closeness: Position in document: Syntactic clues: Proximity of terms: Information theoretic weights: Document length: Percentage of query terms which match: N-gram frequency: Word specificity: Word sense frequency: Cluster distance: Other: Additional comments: | 
| Section 6.0 Query Construction | 
|---|
| Section 6.1 Automatically Built Queries for Ad-hoc Tasks | 
| Topic fields used: Average computer time to build query CPU seconds Term weighting (weights based on terms in topics): Phrase extraction from topics: Syntactic parsing of topics: Word sense disambiguation: Proper noun identification algorithm: Tokenizer: Expansion of queries using previously constructed data structures: Automatic addition of: | 
| Section 6.2 Manually Constructed Queries for Ad-hoc Tasks | 
| Topic fields used: Average time to build query? minutes Type of query builder: Tool used to build query: Method used in initial query construction? Total CPU time for all iterations: seconds Clock time from initial construction of query to completion of final query: minutes Average number of iterations: Average number of documents examined per iteration: Minimum number of iterations: Maximum number of iterations: The end of an iteration is determined by: Automatic term reweighting from relevant documents: Automatic query expansion from relevant documents: Other automatic methods: Manual methods used: | 
| Disclaimer: Contents of this online document are not necessarily the official views of, nor endorsed by the U.S. Government, the Department of Commerce, or NIST. |