Michael Bendersky

Experimental Data and Annotations

Some of the published work relies on annotated data that is not readily available from traditional sources such as TREC. When possible, I will publish this data here to promote the reproducibility of our research.

WIT: Wikipedia-based Image Text Dataset

GWikiMatch (Long-Form Document Matching)

Query Representation and Understanding Dataset (QRU-1)

Syntactic Annotation of Search Queries

Finding Text Reuse on the Web

Discovering Key Concepts in Verbose Queries