We release two large-scale datasets for research on learning to rank: MSLR-WEB30K, with more than 30,000 queries, and MSLR-WEB10K, a random sample of it with 10,000 queries. The dataset is retired. What makes this quite interesting is that the features have been released. You can see the feature list.
136 features have been extracted for each query-url pair.
See also the Y! LTR datasets.
About the MSLR-WEB10K dataset: each line is given as
0 qid:1 1:3 ... 136:0
Now I can understand that here 0 is the relevance level, qid is the query ID, and the rest are the feature values.
But which one is the URL ID?
Can you please tell me? :D
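To make the layout of that line concrete, here is a minimal Python sketch that splits one row into its parts. It assumes only what the sample line shows: a leading relevance label, a `qid:<id>` token, then `index:value` feature pairs. The names `LtrRow` and `parse_ltr_line` are made up for illustration, and the input is a shortened stand-in for a real row, which would carry all 136 pairs. Under that reading, the line has no separate URL field; each row stands for one query-url pair.

```python
from typing import Dict, NamedTuple


class LtrRow(NamedTuple):
    """One query-url pair (hypothetical container, not part of any library)."""
    relevance: int
    qid: str
    features: Dict[int, float]


def parse_ltr_line(line: str) -> LtrRow:
    """Parse '<label> qid:<id> <index>:<value> ...' into its three parts."""
    tokens = line.split()
    relevance = int(tokens[0])

    # The second token is expected to look like 'qid:1'.
    key, _, qid = tokens[1].partition(":")
    if key != "qid":
        raise ValueError(f"expected a 'qid:<id>' token, got {tokens[1]!r}")

    # Every remaining token is an 'index:value' feature pair.
    features: Dict[int, float] = {}
    for token in tokens[2:]:
        index, _, value = token.partition(":")
        features[int(index)] = float(value)

    return LtrRow(relevance, qid, features)


# Shortened example: only the two feature pairs visible in the sample line.
row = parse_ltr_line("0 qid:1 1:3 136:0")
print(row.relevance, row.qid, row.features[1])  # prints: 0 1 3.0
```

Rows that share the same `qid` can then be grouped to recover the list of candidate URLs for one query.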