TREC 2009 Chemical IR Track

The following data are available for research use provided the creators are acknowledged. Please cite: Mihai Lupu, Florina Piroi, Xiangji Huang, Jianhan Zhu, and John Tait. Overview of the TREC 2009 Chemical IR Track. Proceedings of TREC 2009. NIST Special Publication SP 500-278. 2010.


Overview of Data Extract

Detailed description of patent fields

Type Definition Documents for patents

Description of how scientific articles were captured

Type Definition Documents for scientific articles


European Patents

US Patents between 2001 and 2004

US Patents between 2005 and 2006

All other US patents

Royal Society of Chemistry Scientific Articles


15 test topics for the 'Prior Art' task

Technology survey topics (manual topics, aka small topics set)

Technology survey topics readme

Prior art topics (automatic topics, aka large topics set)

During the 2009 campaign we observed that the PA topic set above consists overwhelmingly of granted patent documents (kind B). This may have hurt retrieval, because the references in a search report generally point to applications (kind A documents). We have therefore compiled a new set of topics that also provides the application files. Unfortunately, such a file is not always available; in particular, US patent applications were not published before the year 2000. This is why the archive below contains only 497 additional files instead of 1000. Please feel free to use these files in place of the granted patent files (which are still provided in this archive).
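
The substitution described above can be sketched as a simple lookup with a fallback: use the application file for a topic when one exists, and keep the granted file otherwise. This is only a minimal illustration. The function name, topic identifiers, and file names are hypothetical and do not reflect the actual layout of the archive.

```python
def prefer_applications(granted, applications):
    """Pick one file per topic: the application (kind A) file when one
    is available, otherwise the granted (kind B) file."""
    return {
        topic: applications.get(topic, granted_file)
        for topic, granted_file in granted.items()
    }


# Hypothetical topic identifiers and file names, for illustration only.
granted = {"PA-1": "PA-1_B.xml", "PA-2": "PA-2_B.xml"}
applications = {"PA-2": "PA-2_A.xml"}  # no application file for PA-1

selected = prefer_applications(granted, applications)
```

Topics without an application file keep their granted file, so the full topic set is preserved.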

Relevance Judgments

Prior art task relevance judgments. These judgments were augmented after the 2009 track to include all of the documents in the collection, so TREC 2009 participant results will differ.

