Generative Query Expansion
Introduction
In most query expansion (QE) studies, evaluation rests solely on search results. This project proposes a second way[^1] of measuring QE performance: counting how many of a model's expansion terms overlap with a set of ideal expansion terms, where the ideal terms are trained with a high-performing supervised QE model. After expansion, we used Okapi BM25 for ranking (searching).
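The proposed overlap measure can be sketched as a simple set intersection; the function and term names below are illustrative, not the project's actual code:

```python
def expansion_overlap(candidate_terms, ideal_terms):
    """Fraction of candidate expansion terms that appear in the ideal set."""
    candidates = set(candidate_terms)
    ideal = set(ideal_terms)
    if not candidates:
        return 0.0
    return len(candidates & ideal) / len(candidates)

# toy example: 2 of the 3 candidates are in the ideal set
ideal = {"bank", "loan", "interest", "mortgage"}
candidates = ["loan", "interest", "river"]
print(round(expansion_overlap(candidates, ideal), 3))  # 0.667
```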
Approaches
(i) Find expansion candidates with a GQE model that searches for terms maximizing the probability of co-occurrence.
(ii) Examine whether maximizing the probability of co-occurring terms is the right objective.
(iii) Measure the cohesiveness of co-occurring terms.
Finding Ideal Expansion Terms
We first need ideal expansion terms as a criterion for evaluating the model's output. A supervised learner can find expansion terms that perform well in search, yielding high P@10 and MAP. The ideal expansion terms trained on AP88/89 perform as follows:
Expansion | p@5 | p@10 | MAP@10 | MAP@100 | MAP@1k | Recall@1k | Recall@10k |
---|---|---|---|---|---|---|---|
+0 terms | 0.41 | 0.371 | 0.28 | < 0.25 | < 0.25 | 0.662 | 0.861 |
+400 | 0.906 | 0.827 | 0.817 | 0.556 | 0.546 | 0.8578 | 0.9642 |
+2000 | 0.98 | 0.959 | 0.9832 | 0.8734 | 0.7935 | 0.8825 | 0.9495 |
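The metrics in the table can be computed per query and then averaged; a minimal sketch of p@k and AP@k (MAP@k is AP@k averaged over queries), with illustrative doc IDs:

```python
def precision_at_k(ranked_docs, relevant, k):
    """p@k: fraction of the top-k ranked docs that are relevant."""
    return sum(1 for d in ranked_docs[:k] if d in relevant) / k

def average_precision_at_k(ranked_docs, relevant, k):
    """AP@k for one query; MAP@k averages this over all queries."""
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for i, d in enumerate(ranked_docs[:k], start=1):
        if d in relevant:
            hits += 1
            score += hits / i  # precision at each relevant hit
    return score / min(len(relevant), k)

ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d5"}
print(precision_at_k(ranked, relevant, 5))  # 0.6
```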
In early work, the goal was to find 400 expansion terms. Later, we found that 400 was not the upper bound: 2000+ terms could be used for expansion.
Early Work
The first approach assumes that 400 expansion terms are a sufficient quantity for QE. The top-100 terms that maximize the co-occurrence probability account for most of the MAP gain, so finding these top-100 terms is important. GQE finds the top 100 as below:
The above graph is a query-level measurement showing how many of the expansion terms found are among the top-100 ideal expansion terms.
The larger model finds more ideal expansion terms, but it also produces more false-positive expansion candidates.
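One simple way to score candidates by co-occurrence with the query is to estimate P(candidate | query term) from document-level counts. This is a sketch under that assumption, not the GQE model itself; all names and the toy corpus are illustrative:

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(docs):
    """Count document-level term occurrences and term-pair co-occurrences."""
    pair_counts, term_counts = Counter(), Counter()
    for doc in docs:
        terms = set(doc)
        term_counts.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair_counts[(a, b)] += 1
    return pair_counts, term_counts

def cooccurrence_score(candidate, query_terms, pair_counts, term_counts):
    """Sum of estimated P(candidate | q) over the query terms q."""
    score = 0.0
    for q in query_terms:
        pair = tuple(sorted((candidate, q)))
        if term_counts[q]:
            score += pair_counts[pair] / term_counts[q]
    return score

docs = [["loan", "bank", "interest"],
        ["bank", "river"],
        ["loan", "interest", "mortgage"]]
pairs, terms = cooccurrence_counts(docs)
# rank candidate expansion terms for the query {"loan"}
cands = ["interest", "river", "mortgage"]
ranked = sorted(cands, key=lambda c: -cooccurrence_score(c, ["loan"], pairs, terms))
print(ranked)  # ['interest', 'mortgage', 'river']
```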
Quality vs Quantity
The early work tries to find terms that maximize the co-occurrence probability, focusing on the top 100 terms. During the process, we found that the number of expandable terms can exceed 1000, which gives a much better result, as we see below:
Expansion | p@5 | p@10 | MAP@10 | MAP@100 | MAP@1k |
---|---|---|---|---|---|
+400 | 0.906 | 0.827 | 0.817 | 0.556 | 0.546 |
+2000 | 0.98 | 0.959 | 0.9832 | 0.8734 | 0.7935 |
Let's use a Monte Carlo method to check whether quantity matters more than quality: we randomly pick n terms from the ideal expansion term pool obtained from supervised learning, and see how it performs below:
Metric | Max Cooc 400 | Monte Carlo @400 | Monte Carlo @500 | Monte Carlo @600 | Monte Carlo @700 | Monte Carlo @800 | Monte Carlo @900 | Monte Carlo @1000
---|---|---|---|---|---|---|---|---
MAP@100 | 0.556 | 0.568 | 0.614 | 0.653 | 0.682 | 0.704 | 0.725 | 0.742
Max Cooc uses 400 expansion terms chosen by maximizing term co-occurrence probability; it is no better than QE with 400 terms picked at random from the ideal expansion pool (Monte Carlo @400). Since random picks beat the co-occurrence probability maximization method, it is better to focus on finding terms quantitatively.
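The Monte Carlo sampling step can be sketched as below. In the experiment each sample would be evaluated with BM25 and the MAP scores averaged; this sketch only shows the random draws, and the pool contents are illustrative:

```python
import random

def monte_carlo_expansion(ideal_pool, n, trials=100, seed=0):
    """Draw n terms uniformly at random from the ideal pool, over several trials."""
    rng = random.Random(seed)  # fixed seed so trials are reproducible
    return [rng.sample(ideal_pool, n) for _ in range(trials)]

pool = [f"term{i}" for i in range(2000)]  # stand-in for the ideal expansion pool
samples = monte_carlo_expansion(pool, 400, trials=10)
print(len(samples), len(samples[0]))  # 10 400
```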
Recent Work
Finding the top-100 ideal expansion terms was not the right direction: 400 ideal expansion terms turned out to be only a local maximum in performance. The quantity of expansion terms mattered more than their quality, where quality means how strongly the ideal expansion terms maximize the co-occurrence probability of the expansion terms.
GQE seems to capture a lot of ideal expansion terms:
GQE first n terms | 0-50 | 0-100 | 0-200 | 0-300 | 0-400 | 0-Max
---|---|---|---|---|---|---
Avg ideal terms found | 41 | 82 | 161 | 236 | 304 | 1156
Precision | 0.82 | 0.82 | 0.805 | 0.7867 | 0.76 | n/a
GQE finds these ideal expansion terms with fairly high precision, but its error rate is not yet low enough. With appropriate terms injected, however, GQE's expansion becomes useful: the first feedback loop yields enough true-positive terms to lower GQE's error rate. With one round of feedback plus GQE expansion, the search results improve as below:
Metric | No GQE | GQE+50 | GQE+100 | GQE+150 | GQE+200 | GQE+250 | GQE+300 | GQE+350
---|---|---|---|---|---|---|---|---
p@5 | 0.41 | 0.496 | 0.506 | 0.516 | 0.522 | 0.526 | 0.534 | 0.514
p@10 | 0.371 | 0.391 | 0.404 | 0.411 | 0.417 | 0.428 | 0.42 | 0.415
GQE+n denotes queries expanded with n GQE terms plus one round of feedback. Injecting true-positive terms from feedback reduces GQE's error rate and improves search performance.
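A minimal sketch of how such an expanded query could be assembled, assuming the feedback round supplies a list of true-positive terms; the function and all term lists are hypothetical:

```python
def expand_with_feedback(query_terms, gqe_candidates, feedback_terms, n):
    """Build an expanded query: original terms, then feedback true positives,
    then up to n GQE candidate terms not already present."""
    expanded = list(query_terms)
    seen = set(expanded)
    # inject true-positive terms found in the first feedback round
    for t in feedback_terms:
        if t not in seen:
            expanded.append(t)
            seen.add(t)
    # add up to n GQE expansion terms, skipping duplicates
    added = 0
    for t in gqe_candidates:
        if added == n:
            break
        if t not in seen:
            expanded.append(t)
            seen.add(t)
            added += 1
    return expanded

q = ["oil", "price"]
feedback = ["opec", "barrel"]
gqe = ["crude", "opec", "petroleum", "export"]
print(expand_with_feedback(q, gqe, feedback, 2))
# ['oil', 'price', 'opec', 'barrel', 'crude', 'petroleum']
```

The expanded term list would then be handed to BM25 for ranking, as in the rest of the project.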
Conclusion
So far, we've found that GQE captures a large portion of the ideal expansion terms and is useful when combined with a first round of feedback. If we can increase the precision of GQE's candidate selection (i.e., filter out false-positive terms), query expansion will improve even more, as GQE's candidate terms move closer to the ideal expansion terms.
Acknowledgement
This project is advised by Prof. ChengXiang Zhai.
[^1]: This project is currently in progress.