Neural word and entity embeddings for ad hoc retrieval

Source of Publication

Information Processing and Management


© 2018 Elsevier Ltd Learning low dimensional dense representations of the vocabularies of a corpus, known as neural embeddings, has gained much attention in the information retrieval community. While there have been several successful attempts at integrating embeddings within the ad hoc document retrieval task, yet, no systematic study has been reported that explores the various aspects of neural embeddings and how they impact retrieval performance. In this paper, we perform a methodical study on how neural embeddings influence the ad hoc document retrieval task. More specifically, we systematically explore the following research questions: (i) do methods solely based on neural embeddings perform competitively with state of the art retrieval methods with and without interpolation? (ii) are there any statistically significant difference between the performance of retrieval models when based on word embeddings compared to when knowledge graph entity embeddings are used? and (iii) is there significant difference between using locally trained neural embeddings compared to when globally trained neural embeddings are used? We examine these three research questions across both hard and all queries. Our study finds that word embeddings do not show competitive performance to any of the baselines. In contrast, entity embeddings show competitive performance to the baselines and when interpolated, outperform the best baselines for both hard and soft queries.

Document Type


First Page


Last Page


Publication Date