Probabilistic Learning Approaches for Indexing and Retrieval with the TREC-2 Collection

Norbert Fuhr, Ulrich Pfeifer, Christoph Bremkamp, Michael Pollmann
University of Dortmund, Germany

Chris Buckley
Cornell University

Abstract

In this paper, we describe the application of probabilistic models for indexing and retrieval with the TREC-2 collection. This database consists of about a million documents (2 gigabytes of data) and 100 queries (50 routing and 50 adhoc topics). For document indexing, we use a description-oriented approach which exploits relevance feedback data in order to produce a probabilistic indexing with single terms as well as with phrases. For the adhoc queries, we present a new query term weighting method based on a training sample of other queries. For the routing queries, the RPI model is applied, which combines probabilistic indexing with query term weighting based on query-specific feedback data. The experimental results of our approach show very good performance for both types of queries.

1 Introduction

The good TREC-1 results of our group described in [Fuhr & Buckley 93] have confirmed the general concept of probabilistic retrieval as a learning approach. In this paper, we describe some improvements of the indexing and retrieval procedures. For that, we first give a brief outline of the document indexing procedure, which is based on description-oriented indexing in combination with polynomial regression. Section 3 describes query term weighting for adhoc queries, where we have developed a new learning method based on a training sample of other queries and corresponding relevance judgements. In section 4, the construction of the routing queries is presented, which is based on the probabilistic RPI retrieval model for query-specific feedback data. In the final conclusions, we suggest some further improvements of our method.

2 Document indexing

The task of probabilistic document indexing can be described as follows (see [Fuhr & Buckley 91] for more details): Let d_m denote a document, t_i a term, and R the fact that a query-document pair is judged relevant; then P(R|t_i, d_m) denotes the probability that document d_m will be judged relevant w.r.t. an arbitrary query that contains term t_i. Since these weights can hardly be estimated directly, we use the description-oriented indexing approach. Here term-document pairs (t_i, d_m) are mapped onto so-called relevance descriptions x(t_i, d_m). The elements x_j of the relevance description contain values of features of t_i, d_m and their relationship, e.g.

    tf          = within-document frequency (wdf) of t_i in d_m,
    logidf      = log(inverse document frequency of t_i),
    lognumterms = log(number of different terms in d_m),
    imaxtf      = 1/(maximum wdf of a term in d_m),
    is_single   = 1 if the term is a single word, 0 otherwise,
    is_phrase   = 1 if the term is a phrase, 0 otherwise.

(As phrases, we considered all adjacent non-stopwords that occurred at least 25 times in the D1+D2 (training) document set.)
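For illustration, the following sketch (Python; not part of the system used for the runs, and the term statistics shown are made up) assembles such a relevance description for a single term-document pair from the features listed above:

    import math

    def relevance_description(term, doc_wdf, num_docs, doc_freq):
        """Assemble the relevance description x(t, d) for one term-document pair.

        doc_wdf  : dict mapping every term of document d to its within-document
                   frequency (wdf); phrases appear as space-separated word pairs.
        num_docs : number of documents in the collection; doc_freq is the number
                   of documents containing the term (together they give the idf).
        """
        is_phrase = " " in term
        return {
            "tf": doc_wdf.get(term, 0),                  # within-document frequency
            "logidf": math.log(num_docs / doc_freq),     # log(inverse document frequency)
            "lognumterms": math.log(len(doc_wdf)),       # log(# of different terms in d)
            "imaxtf": 1.0 / max(doc_wdf.values()),       # 1/(maximum wdf of a term in d)
            "is_single": 0.0 if is_phrase else 1.0,
            "is_phrase": 1.0 if is_phrase else 0.0,
        }

    # Toy example: a document containing the phrase "air pollution"
    doc = {"air": 2, "pollution": 3, "air pollution": 2, "ozone": 1}
    print(relevance_description("air pollution", doc, num_docs=1_000_000, doc_freq=2_500))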
Based on these relevance descriptions, we estimate the probability P(R|x(t_i, d_m)) that an arbitrary term-document pair having relevance description x will be involved in a relevant query-document relationship. This probability is estimated by a so-called indexing function u(x). Different regression methods or probabilistic classification algorithms can serve as indexing function. For our retrieval runs submitted to TREC-2, we used polynomial regression for developing an indexing function of the form

    u(x) = b^T · v(x),                                            (1)

where the components of v(x) are products of elements of x. The indexing function actually used has the form

    u(x) = b_0
         + b_1 · is_single · tf · logidf · imaxtf
         + b_2 · is_single · imaxtf
         + b_3 · is_single · tf · imaxtf
         + b_4 · lognumterms
         + b_5 · is_phrase · tf · logidf · imaxtf
         + b_6 · is_phrase · tf · imaxtf
         + b_7 · is_phrase · logidf.

The coefficient vector b is computed based on a training sample of query-document pairs with relevance judgements. Since polynomial functions may yield results outside the interval [0,1], such values were mapped onto the corresponding boundaries of this interval. For each phrase occurring in a document, indexing weights for the phrase as well as for its two components (as single words) were computed.

There are two major problems with this approach which we are currently investigating:

1. Which factors should be used for defining the indexing function? We are developing a tool that supports a statistical analysis of single factors for this purpose.

2. What is the best type of indexing function? Previous experiments have suggested that regression methods outperform other probabilistic classification methods. As a reasonable alternative to polynomial regression, logistic regression seems to offer some advantages (see also [Fuhr & Pfeifer 94]). As a major benefit, logistic functions yield only values between 0 and 1, so there is no problem with outliers. We are performing experiments with logistic regression and comparing the results to those based on polynomial regression.
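The evaluation of such an indexing function is mechanically simple; the sketch below (Python) shows the general scheme. The component definitions and coefficient values are illustrative stand-ins only, not the fitted formula or the fitted b from our training sample:

    import math

    def polynomial_indexing_weight(x, b, products):
        """Evaluate u(x) = b^T v(x); results outside [0, 1] are mapped onto the boundaries.

        x        : dict of feature values (tf, logidf, lognumterms, imaxtf, is_single, is_phrase)
        b        : coefficient vector, b[0] belonging to the constant component of v(x)
        products : list of tuples of feature names; each tuple defines one product component of v(x)
        """
        v = [1.0] + [math.prod(x[name] for name in names) for names in products]
        u = sum(bj * vj for bj, vj in zip(b, v))      # inner product b^T v(x)
        return min(1.0, max(0.0, u))                  # clamp to [0, 1]

    # Hypothetical component definitions and coefficients, for illustration only
    PRODUCTS = [
        ("is_single", "tf", "logidf", "imaxtf"),
        ("is_single", "imaxtf"),
        ("is_single", "tf", "imaxtf"),
        ("lognumterms",),
        ("is_phrase", "tf", "logidf", "imaxtf"),
        ("is_phrase", "tf", "imaxtf"),
        ("is_phrase", "logidf"),
    ]
    b = [0.05, 0.4, 0.1, 0.2, 0.02, 0.3, 0.15, 0.1]   # made-up coefficients

    x = {"tf": 3, "logidf": 5.9, "lognumterms": 4.6, "imaxtf": 0.25,
         "is_single": 1, "is_phrase": 0}
    print(polynomial_indexing_weight(x, b, PRODUCTS))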
3 Query term weighting for adhoc queries

3.1 Theoretical background

The basis of our query term weighting scheme for ad-hoc queries is the linear utility-theoretic retrieval function described in [Wong & Yao 89]. Let q_k^T denote the set of terms occurring in the query, and u_im the indexing weight u(x(t_i, d_m)) (with u_im = 0 for terms t_i not occurring in d_m). If c_ik gives the utility of term t_i for the actual query q_k, then the utility of document d_m w.r.t. query q_k can be computed by the retrieval function

    Q(q_k, d_m) = Σ_{t_i ∈ q_k^T} c_ik · u_im.                     (2)

For the estimation of the utility weights c_ik, we applied two different methods.

As a heuristic approach, we used tf weights (the number of occurrences of the term t_i in the query), which had shown good results in the experiments described in [Fuhr & Buckley 91].

As a second method, we applied linear regression to this problem. Based on the concept of polynomial retrieval functions as described in [Fuhr 89b], one can estimate the probability of relevance of q_k w.r.t. d_m by the formula

    P(R|q_k, d_m) ≈ Σ_{t_i ∈ q_k^T} c_ik · u_im.                   (3)

If we had relevance feedback data for the specific query (as is the case for the routing queries), this function could be used directly for regression. For the ad-hoc queries, however, we only have feedback information about other queries. For this reason, we regard query features instead of specific queries. This can be done by considering for each query term the same kind of features as described before in the context of document indexing. Assume that we have a set of features {f_0, f_1, ..., f_l} and that x_ji denotes the value of feature f_j for term t_i. Then we assume that the query term weight c_ik can be estimated by linear regression according to the formula

    c_ik = (1/|q_k^T|) · Σ_{j=0}^{l} a_j · x_ji.                   (4)

Here the factor 1/|q_k^T| serves the purpose of normalization across different queries, since queries with a larger number of terms tend to yield higher retrieval status values with formula 2. The factors a_j are the coefficients that are to be derived by means of regression. Now we have the problem that regression cannot be applied to eqn 4 directly, since we do not observe c_ik values; instead, we observe relevance judgements. This leads us back to the polynomial retrieval function 3, where we substitute eqn 4 for c_ik:

    P(R|q_k, d_m) ≈ Σ_{t_i ∈ q_k^T} (1/|q_k^T|) (Σ_{j=0}^{l} a_j · x_ji) · u_im
                  = Σ_{j=0}^{l} a_j · (1/|q_k^T|) Σ_{t_i ∈ q_k^T} x_ji · u_im
                  = Σ_{j=0}^{l} a_j · y_j                          (5)

with

    y_j = (1/|q_k^T|) Σ_{t_i ∈ q_k^T} x_ji · u_im.                 (6)

Equation 5 shows that we can apply linear regression of the form P(R|q_k, d_m) ≈ a^T · y to a training sample of query-document pairs with relevance judgements in order to determine the coefficients a_j. The values y_j can be computed from the number of query terms, the values of the query term features, and the document indexing weights.

For the experiments, the following parameters were considered as query term features:

    x0 = 1 (constant)
    x1 = tf (within-query frequency)
    x2 = logtf
    x3 = tf · idf
    x4 = is_phrase
    x5 = intitle (= 1 if the term occurs in the query title, = 0 otherwise)

For most of our experiments, we only used the parameter vector x = (x0, ..., x4)^T; the full vector is denoted as x'. Below, we call this query term weighting method reg.

This method is compared with the standard SMART weighting schemes:

    nnn: c_ik = tf
    ntc: c_ik = tf · idf
    lnc: c_ik = 1 + log tf
    ltc: c_ik = (1 + log tf) · idf
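As a compact illustration of the regression setup of equations 4-6, the following sketch (Python with NumPy; the feature definitions follow the list above, while the toy training data and idf values are made up and do not reflect our actual training procedure) computes the y vectors, fits the coefficients a by linear least squares, and derives query term weights according to eqn 4:

    import numpy as np

    def query_features(term, query):
        """Query term features x = (1, tf, logtf, tf*idf, is_phrase); idf values are toy inputs."""
        tf = query["terms"].count(term)
        return np.array([1.0, tf, np.log(tf) if tf > 0 else 0.0,
                         tf * query["idf"][term], 1.0 if " " in term else 0.0])

    def y_vector(query, u):
        """Equation 6: y_j = (1/|q^T|) * sum over the query terms of x_ji * u_im."""
        terms = set(query["terms"])
        return sum(query_features(t, query) * u.get(t, 0.0) for t in terms) / len(terms)

    # Toy training sample: (query, document indexing weights u_im, relevance judgement)
    q1 = {"terms": ["air", "pollution", "air pollution"],
          "idf": {"air": 2.1, "pollution": 4.3, "air pollution": 5.0}}
    training = [
        (q1, {"air": 0.4, "pollution": 0.7, "air pollution": 0.8}, 1),
        (q1, {"air": 0.2}, 0),
        (q1, {"pollution": 0.5, "air pollution": 0.3}, 1),
        (q1, {}, 0),
    ]

    # Regression of equation 5: relevance ~ a^T y, solved by linear least squares
    Y = np.array([y_vector(q, u) for q, u, _ in training])
    r = np.array([float(rel) for _, _, rel in training])
    a, *_ = np.linalg.lstsq(Y, r, rcond=None)

    # Equation 4: query term weight c_ik = (1/|q^T|) * a^T x(t_i)
    c = {t: float(query_features(t, q1) @ a) / len(set(q1["terms"])) for t in set(q1["terms"])}
    print(a, c)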
3.2 Experiments

In order to have three different samples for learning and/or testing purposes, we used the following combinations of query sets and document sets as samples: Q3/D12 was used as training sample for the reg method, and both Q1/D12 and Q2/D3 were used for testing. As evaluation measure, we consider the 11-point average of precision (i.e., the average of the precision values at 0.0, 0.1, ..., 1.0 recall).

First, we considered single words only. Table 1 shows the results of the different query term weighting (QTW) methods.

    QTW   Q1/D12   Q2/D3
    nnn   0.2303     -
    ntc   0.2754     -
    lnc   0.2291   0.2601
    ltc   0.2826   0.2783
    reg   0.2698   0.2678

    Table 1: Global results for single words

First, it should be noted that the ntc and ltc methods perform better than nnn and lnc. This finding is somewhat different from the results presented in [Fuhr & Buckley 91], where the nnn weighting scheme gave us better results than the ntc method for lsp indexing. However, in the earlier experiments we used only fairly small databases, and the queries also were much shorter than in the TREC collection. These facts may account for the different results.

In a second series of experiments, we varied the sample size and the set of features of the regression method (table 2). Besides using every document from the learning sample, we also considered only every 100th and every 1000th document from the database, as well as only those documents for which explicit relevance judgements were available. As the results show almost no differences, it seems to be sufficient to use only a small portion of the database as training sample in order to save computation time. The additional consideration of the occurrence of a term in the query title (runs 5 and 6, using the full feature vector x') also did not affect the results, so query titles seem to be not very significant. It is an open question whether other parts of the queries are more significant, so that their consideration as an additional feature would affect retrieval quality.

    run   learning sample     features   Q1/D12   Q2/D3
    1     every doc.          x          0.2698   0.2678
    2     every 100th doc.    x          0.2700   0.2678
    3     every 1000th doc.   x          0.2662     -
    4     judged docs only    x          0.2635   0.2677
    5     every doc.          x'         0.2654     -
    6     every 100th doc.    x'         0.2677     -

    Table 2: Variations of the reg learning sample and the query features

The coefficients computed by the regression process for the second series of experiments are shown in table 3. It is obvious that the coefficients depend heavily on the choice of the training sample, so it is quite surprising that retrieval quality is not affected by this factor. The only coefficient which does not change its sign throughout all the runs is the one for the tf·idf factor; this seems to confirm the power of this factor. The other factors can be regarded as being only minor modifications of the tf·idf query term weight.

    factor      run 1    run 2   run 3    run 4   run 5   run 6
    constant    -3.05    -2.96    1.04   -20.80    2.47   -1.71
    tf          -7.54   -10.67   -2.40    14.58   -4.32   14.11
    logtf       -9.01    -5.69   -5.08   -81.58   -1.48   -0.70
    tf·idf       5.84     7.00    1.63     8.72    1.81    7.70
    is_phrase   -8.06   -19.23   -2.73    19.81   -2.97   20.39
    intitle        -        -       -        -    -1.70    3.44

    Table 3: Coefficients of query regression

Overall, it must be noted that the regression method does not yield an improvement over the ntc and ltc methods. This seems surprising, since the regression is based on the same factors which also go into the ntc and ltc formulas. However, a possible explanation could be the fact that the regression method tries to minimize the quadratic error for all the documents in the learning sample, whereas our evaluation measure considers at most the top-ranking 1000 documents for each query; so regression might perform well for most of the documents from the database, but not for the top of the ranking list. There is some indication for this explanation, since regression always yields slightly better results at the high recall end.

As described before, in our indexing process we consider phrases in addition to single words. This leads to the problem that when a phrase occurs in a document, we index the phrase in addition to the two single words forming the phrase. As a heuristic method for overcoming this problem, we introduced a factor for downweighting query term weights of phrases. That is, the actual query term weight of a phrase is c'_ik = α · c_ik, where c_ik is the result of the regression process. In order to derive a value for α, we performed a number of test runs with varying values (see table 4). Obviously, weighting factors between 0.1 and 0.3 gave the best results. For the official runs, we chose α = 0.15.

    α       result
    0.00    0.3199
    0.10    0.3707
    0.15    0.3734
    0.20    0.3700
    0.25    0.3656
    0.30    0.3610
    0.50    0.3451
    1.00    0.3147

    Table 4: Effect of downweighting of phrases (sample Q2/D12)

In table 5, this method is compared with the ltc formula, where we also chose the weighting factor for phrases which gave the best results. One can see that with the sample Q2/D3, the differences between the methods are smaller than on sample Q1/D12, but still ltc seems to perform slightly better.

    QTW   α      Q1/D12   Q2/D3
    ltc   0.15   0.3192   0.3131
    ltc   0.2    0.3220   0.3056
    reg   0.15   0.3080   0.3062

    Table 5: Results for single words and phrases

Finally, we investigated another method for coping with phrases. For that, let us assume that we have binary query term weights only. Now, as an example, the single words t1 and t2 form a phrase t3. For a query with phrase t3 (and thus also with t1 and t2), a document d_m containing the phrase would yield u_1m + u_2m + u_3m as value of the retrieval function, where the weights u_im are computed by the lsp method described before. In order to avoid the effect of counting the single words in addition to the phrase, we modified the original phrase weight as follows:

    u'_3m = u_3m - u_1m - u_2m

and stored this value as phrase weight. Queries with only the single words t1 or t2 are not affected by this modification. For the query with phrase t3, however, the retrieval function now yields the value u_1m + u_2m + u'_3m = u_3m, which is what we would like to get.

Table 6 shows the corresponding results (α = 0 means that only single words are considered). In contrast to what we expected, we do not get an improvement over single words only when phrases are considered fully. The result for the ntc method shows that phrases should still be downweighted. Possibly, there may be an improvement with this method if we used binary query term weights, but it is clear that other query term weighting methods mostly give better results.

    QTW   α      result
    reg   0.00   0.2724
    reg   1.00   0.2596
    ntc   0.00   0.2754
    ntc   0.15   0.3110
    ntc   1.00   0.2524

    Table 6: Results for the subtraction method (sample Q1/D12)
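With a few hypothetical lsp indexing weights, the effect of this subtraction can be traced directly (binary query term weights assumed, as above; the weights are made up for the example):

    # Hypothetical lsp indexing weights for a document containing the phrase "air pollution"
    u = {"air": 0.25, "pollution": 0.5, "air pollution": 0.875}

    # Stored phrase weight after the modification u'_3m = u_3m - u_1m - u_2m
    stored = dict(u)
    stored["air pollution"] = u["air pollution"] - u["air"] - u["pollution"]

    def score(query_terms, weights):
        """Retrieval function of eqn 2 with binary query term weights."""
        return sum(weights.get(t, 0.0) for t in query_terms)

    # A query containing the phrase (and therefore also its two components)
    # again scores u_3m = 0.875, while single-word queries are unaffected.
    print(score(["air", "pollution", "air pollution"], stored))   # 0.875
    print(score(["air"], stored))                                 # 0.25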
3.3 Official runs

As document indexing method, we applied the description-oriented approach as described in section 2. In order to estimate the coefficients of the indexing function, we used the training sample Q12/D12, i.e. the query sets Q1 and Q2 in combination with the documents from D1 and D2.

Two runs with different query term weights were submitted. Run dortL2 is based on the nnn method, i.e. tf weights. Run dortQ2 uses reg query term weights. For performing the regression, we used the query sets Q1 and Q2 and a sample of 400,000 documents from D1.

Table 7 shows the results for the two runs (numbers in parentheses denote figures close to the best/worst results). As expected, dortQ2 yields better results than dortL2. The recall-precision curves (see figure 1) show that there is an improvement throughout the whole recall range. For precision average and precision at 1000 documents retrieved, run dortQ2 performs very well, while precision at 100 documents retrieved is less good. This confirms our interpretation from above, saying that regression optimizes the overall performance, but not necessarily the retrieval quality when only the top-ranking documents are considered. With regard to the moderate results for the reg query term weighting method, the good performance of dortQ2 obviously stems from the quality of our document indexing method.

    run                       dortL2     dortQ2
    query term weighting      nnn        reg
    average precision:
    Prec. Avg.                0.3151     0.3340
    query-wise comparison with median:
    Prec. Avg.                37:2       45:4
    Prec. @ 100 docs          35:11      34:7
    Prec. @ 1000 docs         37:10      45:2
    Best/worst results:
    Prec. Avg.                3/0        3(1)/0
    Prec. @ 100 docs          3(2)/1     4(1)/0(2)
    Prec. @ 1000 docs         6(1)/0     9(1)/0
    dortL2 vs. dortQ2:
    Prec. Avg.                21:29
    Prec. @ 100 docs          22:24
    Prec. @ 1000 docs         17:29

    Table 7: Results for adhoc queries

    [Figure 1: Recall-precision curves of the ad-hoc runs dortL2 and dortQ2 (precision vs. recall)]
4 Query term weighting for routing queries

4.1 Theoretical background

For the routing queries, the retrieval-with-probabilistic-indexing (RPI) model described in [Fuhr 89a] was applied. The corresponding retrieval function is based on the following parameters:

    u_im     indexing weight of term t_i in document d_m,
    D_k^R    set of documents judged relevant for query q_k,
    p_ik     expectation of the indexing weight of term t_i in D_k^R,
    D_k^N    set of documents judged nonrelevant for query q_k,
    r_ik     expectation of the indexing weight of t_i in D_k^N.

The parameters p_ik and r_ik can be estimated based on relevance feedback data as follows:

    p_ik = (1/|D_k^R|) · Σ_{d_m ∈ D_k^R} u_im,
    r_ik = (1/|D_k^N|) · Σ_{d_m ∈ D_k^N} u_im.

Then the query term weight is computed by the formula

    c_ik = p_ik · (1 - r_ik) / (r_ik · (1 - p_ik)) - 1,

and the RPI retrieval function yields

    Q(q_k, d_m) = Σ_{t_i ∈ q_k^T} log(c_ik · u_im + 1).            (7)

4.2 Experiments

In principle, the RPI formula can be applied with or without query expansion. For our experiments in TREC-1, we did not use any query expansion. The final results showed that this was reasonable, mainly with respect to the small amount of relevance feedback data available. In contrast, for TREC-2 there were about 2000 relevance judgements per query, so there was clearly enough training data for applying query expansion methods. As basic criterion for selecting the expansion terms, we considered the number of relevant documents in which a term occurs, which gave us a ranking of candidates; document indexing weights were considered for tie-breaking. Then we varied the number of terms which are added to the original query.

In a first series of experiments, we considered single words only. We used Q2/D1 (lsp document indexing) as training sample and Q2/D2 (ltc indexing) as test sample. As can be seen from table 8, query expansion clearly improves retrieval quality, but only for a limited number of expansion terms. For larger numbers, we get worse results. This effect seems to be due to parameter estimation problems.

    expansion terms   result
      0               0.2909
     10               0.3047
     30               0.3035
     50               0.3002
    100               0.2832

    Table 8: Effect of number of expansion terms

In a second series of experiments, we looked at the combination of single words and phrases. These experiments were performed as retrospective runs, with Q2/D12 as training sample and Q2/D2 as test sample (both with ltc document indexing). For the number of expansion terms, we treated single words and phrases separately. Furthermore, similar to the adhoc runs, we used an additional factor for downweighting the query term weights of phrases. The different parameter combinations tested and the corresponding results are given in table 9. Obviously, phrases as expansion terms gave no improvement, so we decided to use only single words as expansion terms (but the phrases from the original query are still used for retrieval). Furthermore, the retrieval quality reaches its optimum at about 20 expansion terms.

    phrase weight   expansion single words   expansion phrases   result
    0.5              0                        0                  0.3476
    0.5             20                        0                  0.3713
    1.0             20                        0                  0.3730
    0.5             20                       10                  0.3728
    1.0             20                       10                  0.3605
    0.5             30                       10                  0.3729
    1.0             30                       10                  0.3626

    Table 9: Query expansion with phrases
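To summarize the weighting scheme used in these experiments, the sketch below (Python) estimates p_ik and r_ik from feedback documents, computes the RPI query term weights, and scores a document with the retrieval function of section 4.1. The feedback data, the smoothing constant, and the omission of the expansion-term selection step are assumptions made for the example, not details taken from our system:

    import math

    def rpi_query_weights(query_terms, rel_docs, nonrel_docs):
        """RPI query term weights c_ik = p_ik(1 - r_ik) / (r_ik(1 - p_ik)) - 1,
        with p_ik / r_ik estimated as mean indexing weights in the relevant /
        nonrelevant feedback documents (section 4.1)."""
        c = {}
        for t in query_terms:
            p = sum(d.get(t, 0.0) for d in rel_docs) / len(rel_docs)
            r = sum(d.get(t, 0.0) for d in nonrel_docs) / len(nonrel_docs)
            # Small clamping constant to avoid division by zero (not specified in the paper)
            p, r = min(max(p, 1e-4), 1 - 1e-4), min(max(r, 1e-4), 1 - 1e-4)
            c[t] = p * (1 - r) / (r * (1 - p)) - 1
        return c

    def rpi_score(c, doc):
        """RPI retrieval function Q(q, d) = sum over query terms of log(c_ik * u_im + 1)."""
        return sum(math.log(c_ik * doc.get(t, 0.0) + 1.0) for t, c_ik in c.items())

    # Toy feedback data: documents given as term -> indexing weight mappings
    rel = [{"ozone": 0.9, "layer": 0.7}, {"ozone": 0.6, "depletion": 0.8}]
    nonrel = [{"layer": 0.4}, {"ozone": 0.1}]

    c = rpi_query_weights(["ozone", "layer"], rel, nonrel)
    print(c, rpi_score(c, {"ozone": 0.8, "layer": 0.2}))

In an actual routing run, the query term set would first be enlarged by the expansion terms (ranked by the number of relevant documents containing them) before the weights are computed.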
4.3 Official runs

Two different runs were submitted for the routing queries, both based on the RPI model. Run dortP1 uses the same document indexing function as for the adhoc queries. Query terms were weighted according to the RPI formula. In addition, each query was expanded by 20 single words. Phrases were not downweighted. Run dortV1 is based on ltc document indexing. Here no query expansion took place.

    run                       dortV1       dortP1
    document indexing         ltc          lsp
    query expansion           none         20 terms
    average precision:
    Prec. Avg.                0.3516       0.3800
    query-wise comparison with median:
    Prec. Avg.                38:10        46:4
    Prec. @ 100 docs          31:11        40:5
    Prec. @ 1000 docs         32:9         37:7
    Best/worst results:
    Prec. Avg.                1/0          4(2)/0
    Prec. @ 100 docs          3(3)/1(1)    7(5)/1(1)
    Prec. @ 1000 docs         6(2)/0(1)    10(2)/0(1)
    dortV1 vs. dortP1:
    Prec. Avg.                10:39
    Prec. @ 100 docs          9:27
    Prec. @ 1000 docs         7:33

    Table 10: Results for routing queries

Table 10 shows the results for the two runs. The recall-precision curves are given in figure 2. Again, the results confirm our expectation that lsp indexing and query expansion yield better results.

    [Figure 2: Recall-precision curves of the routing runs dortV1 and dortP1 (precision vs. recall)]

5 Conclusions and outlook

The experiments described in this paper have shown that probabilistic learning approaches can be applied successfully to different types of indexing and retrieval. For the ad-hoc queries, there seems to be still room for further improvement in the low recall range. In order to increase precision, a passage-wise comparison of query and document text should be performed. For this purpose, polynomial retrieval functions could be applied. In the case of the routing queries, we first have to investigate methods for parameter estimation in combination with query expansion. However, with the large number of feedback documents given for this task, other types of retrieval models may be more suitable, e.g. query-specific polynomial retrieval functions. Finally, it should be emphasized that we still use rather simple forms of text analysis. Since our methods are flexible enough to work with more sophisticated analysis procedures, this combination seems to be a prospective area of research.

A Operational details of runs

A.1 Basic Algorithms

The algorithm A to find the coefficient vector a for the ad-hoc query term weights can be given as follows:

Algorithm A
1 For each query-document pair (q_k, d_m) ∈ (Q1 ∪ Q2) × D_s, with D_s being a sample from (D1 ∪ D2), do
  1.1 determine the relevance value r_km of the document d_m with respect to the query q_k.
  1.2 For each term t_i occurring in q_k do
    1.2.1 determine the feature vector x_i and the indexing weight u_im of the term t_i w.r.t. document d_m.
  1.3 For each feature j of the feature vectors, compute the value of y_j by looping over the terms of the query.
  1.4 Add the vector y and the relevance value r_km to the least squares matrix.
2 Solve the least squares matrix to find the coefficient vector a.

The algorithm B to find the coefficient vector b for the document indexing is sketched here:

Algorithm B
1 Index D1 ∪ D2 (the learning document set) and Q1 ∪ Q2 (the learning query set).
2 For each document d ∈ D1 ∪ D2
  2.1 For each q ∈ Q1 ∪ Q2
    2.1.1 Determine the relevance value r of d to q.
    2.1.2 For each term t in common between q^T (set of query terms) and d^T (set of document terms)
      2.1.2.1 Find the values of the elements of the relevance description involved in this run and add these values plus the relevance information to the least squares matrix being constructed.
3 Solve the least squares matrix to find the coefficient vector b.

The algorithm C to index a document set D can now be given as:

Algorithm C
1 For each document d ∈ D
  1.1 For each term t ∈ d^T
    1.1.1 Find the values of the relevance description x(t, d) involved in this run.
    1.1.2 Give t the weight b^T · v(x(t, d)).
  1.2 Add d to the inverted file.
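Read procedurally, algorithm C amounts to the following sketch (Python). The indexing function u and the relevance description routine are assumed to be available, e.g. along the lines of the sketches in section 2; the stand-in components in the usage example are toy substitutes:

    def index_collection(docs, relevance_description, u):
        """Algorithm C as a sketch: weight every term of every document with the
        indexing function u and collect the weights in an inverted file
        (term -> list of (doc_id, weight) postings)."""
        inverted = {}
        for doc_id, doc in docs.items():
            for term in doc:                              # d^T: the terms of document d
                x = relevance_description(term, doc)      # relevance description x(t, d)
                weight = u(x)                             # u(x) = b^T v(x), clamped to [0, 1]
                inverted.setdefault(term, []).append((doc_id, weight))
        return inverted

    # Toy usage with stand-in components
    docs = {"d1": {"ozone": 3, "layer": 1}, "d2": {"layer": 2}}
    toy_description = lambda term, doc: {"tf": doc[term]}
    toy_u = lambda x: min(1.0, 0.2 * x["tf"])
    print(index_collection(docs, toy_description, toy_u))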
A.2 Ad-hoc runs

Algorithm D is used for indexing and retrieval for the ad-hoc runs. Steps numbered with a trailing "A" apply only to run dortQ2, steps with a trailing "B" only to run dortL2.

Algorithm D
1  Run algorithm B to determine the coefficient vector b for document indexing.
1A Run algorithm A to determine the coefficient vector a for query indexing.
2  Call algorithm C for document set D1 ∪ D2.
3  For each query q_k ∈ Q3 (the test query set) do
  3.1 For each term t_i occurring in q_k do
    3.1.1A Determine the feature vector x_ik and compute the query term weight c_ik from it and the coefficient vector a (cf. eqn 4).
    3.1.1B Weight t_i w.r.t. q_k with tf weights (nnn variant).
    Phrases were downweighted by multiplying their weights with α = 0.15.
  3.2 Run an inner-product inverted-file similarity match of q_k against the inverted file formed in step 2, retrieving the top 1000 documents.

A.3 Routing runs

Algorithm E is used for indexing and retrieval for the routing runs. Steps numbered with a trailing "A" apply only to run dortP1, steps with a trailing "B" only to run dortV1.

Algorithm E
1A Index query set Q2 and document set D1 ∪ D2 with tf·idf weights.
1B Index query set Q2 and document set D1 ∪ D2 by calling algorithm C.
2  For each query q ∈ Q2
  2.1 For each term t ∈ q^T (set of query terms)
    2.1.1 Reweight term t using the RPI relevance weighting formula and the relevance information supplied.
3A Index document set D3 by calling algorithm C.
3B Index document set D3 with tf·idf weights. Note that the collection frequency information used was derived from occurrences in D1 ∪ D2 only (in actual routing, the collection frequencies within D3 would not be known).
4  Run the reweighted queries of Q2 (step 2) against the inverted file (step 3), returning the top 1000 documents for each query.

References

Fuhr, N.; Buckley, C. (1991). A Probabilistic Learning Approach for Document Indexing. ACM Transactions on Information Systems 9(3), pages 223-248.

Fuhr, N.; Buckley, C. (1993). Optimizing Document Indexing and Search Term Weighting Based on Probabilistic Models. In: Harman, D. (ed.): The First Text REtrieval Conference (TREC-1), pages 89-100. National Institute of Standards and Technology Special Publication 500-207, Gaithersburg, Md. 20899.

Fuhr, N.; Pfeifer, U. (1994). Probabilistic Information Retrieval as Combination of Abstraction, Inductive Learning and Probabilistic Assumptions. ACM Transactions on Information Systems 12(1).

Fuhr, N. (1989a). Models for Retrieval with Probabilistic Indexing. Information Processing and Management 25(1), pages 55-72.

Fuhr, N. (1989b). Optimum Polynomial Retrieval Functions Based on the Probability Ranking Principle. ACM Transactions on Information Systems 7(3), pages 183-204.

Wong, S.; Yao, Y. (1989). A Probability Distribution Model for Information Retrieval. Information Processing and Management 25(1), pages 39-53.