The Session Track released 1021 query sessions for 60 different topics. The top 10 results for the first 100 sessions, taken from the three highest-priority submitted runs, were pooled; the resulting set of URLs was judged against the general topic. Judging was conducted on a six-grade scale: spam (-2), not relevant (0), relevant (1), highly relevant (2), key (3), and navigational (4). Note that topics 1, 11, 12, 38, 41, 43, 45, 49, and 50 received no judgments, as they were not represented in the first 100 sessions. Based on the qrels provided by NIST, we evaluated runs with eight measures: (a) Average Precision (average_precision); (b) Expected Reciprocal Rank (err), as defined by Chapelle et al. at CIKM 2009; (c) ERR@10 (err_at_k); (d) nDCG (ndcg); (e) nDCG@10 (ndcg_at_k); (f) ERR normalised by the maximum ERR per query (nerr); (g) nERR@10 (nerr_at_k); and (h) Precision@10 (precision_at_k). Each run was evaluated over the first 100 sessions. nDCG@10 (ndcg_at_k) is the official measure for the track.
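As a rough illustration of how the graded judgments feed into the cutoff measures, the sketch below computes ERR@10, nERR@10, and nDCG@10 for a single ranked list of grades. It is not the track's evaluation script. Several choices are assumptions made only for this example: spam (-2) is clamped to 0 before any gain is computed, the stopping probability follows the Chapelle et al. form (2^g - 1) / 2^g_max with g_max = 4, and nDCG uses the 2^g - 1 gain with a log2(rank + 1) discount. The function names mirror the measure identifiers above, but their signatures are hypothetical.

```python
import math

MAX_GRADE = 4  # "navigational", the highest grade on the judging scale


def clamp(grade):
    # Assumption for this sketch: spam (-2) contributes like "not relevant" (0).
    return max(grade, 0)


def err_at_k(grades, k=10, max_grade=MAX_GRADE):
    """Expected Reciprocal Rank over the top-k grades of one ranking."""
    err = 0.0
    p_continue = 1.0  # probability the user has not stopped before this rank
    for rank, grade in enumerate(grades[:k], start=1):
        p_stop = (2 ** clamp(grade) - 1) / (2 ** max_grade)
        err += p_continue * p_stop / rank
        p_continue *= 1.0 - p_stop
    return err


def nerr_at_k(grades, judged_grades, k=10):
    """ERR@k divided by the ERR@k of the ideal (grade-descending) ranking."""
    ideal = err_at_k(sorted(judged_grades, reverse=True), k)
    return err_at_k(grades, k) / ideal if ideal > 0 else 0.0


def dcg_at_k(grades, k=10):
    """Discounted cumulative gain with exponential gain and log2 discount."""
    return sum(
        (2 ** clamp(grade) - 1) / math.log2(rank + 1)
        for rank, grade in enumerate(grades[:k], start=1)
    )


def ndcg_at_k(grades, judged_grades, k=10):
    """DCG@k divided by the DCG@k of the ideal (grade-descending) ranking."""
    ideal = dcg_at_k(sorted(judged_grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0
```

Here `grades` is the list of judged grades in the order a run returned the documents, and `judged_grades` is the list of all grades in the qrels for that topic. A topic with no positive judgments scores 0.0 under this sketch; the actual evaluation may handle such topics differently.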