Thursday, July 22

Microsoft Releases Learning to Rank Datasets

Microsoft Research announced that it is releasing new MS LTR datasets.
It is releasing two large-scale datasets for research on learning to rank: MSLR-WEB30K, with more than 30,000 queries, and MSLR-WEB10K, a random sample of it with 10,000 queries.

136 features have been extracted for each query-url pair.
The relevance judgments come from a retired labeling set of a commercial web search engine. What makes this quite interesting is that the features themselves have been released. You can see the feature list.
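Each line of the data files has the form `<relevance> qid:<query id> 1:<value> ... 136:<value>`, one line per query-url pair. As a rough sketch of how such a line could be read, here is a small Python parser. The function name `parse_ltr_line` and the shortened sample line are made up for illustration and are not part of the release; real lines carry all 136 features.

```python
from typing import Dict, Tuple


def parse_ltr_line(line: str) -> Tuple[int, str, Dict[int, float]]:
    """Parse one '<relevance> qid:<id> <index>:<value> ...' line.

    Hypothetical helper for illustration; not part of the dataset release.
    """
    # Drop an optional trailing '#' comment, then split on whitespace.
    tokens = line.split("#", 1)[0].split()
    if len(tokens) < 2 or not tokens[1].startswith("qid:"):
        raise ValueError("expected '<relevance> qid:<id> ...'")

    relevance = int(tokens[0])            # first token: relevance level
    qid = tokens[1].split(":", 1)[1]      # second token: query ID

    # Remaining tokens are feature index:value pairs.
    features: Dict[int, float] = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)

    return relevance, qid, features


# Shortened, made-up sample line (only three of the 136 features).
relevance, qid, features = parse_ltr_line("0 qid:1 1:3 2:0 136:0")
```

Grouping the parsed rows by `qid` gives the per-query lists that learning-to-rank methods typically train on.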

See also the Y! LTR datasets.

1 comment:

  1. About the MSLR-WEB10K dataset: each line is given as

    0 qid:1 1:3 ... 136:0

    Now I can understand that here 0 is the relevance level, qid is the query ID, and the rest is the feature vector.

    But which one is the URL ID?

    Can you please tell me? :D
