IDF revisited A simple new derivation within the Robertson-Sparck Jones probabilistic model.pdf


arXiv: [] 8 May 2007

IDF Revisited: A Simple New Derivation within the Robertson-Spärck Jones Probabilistic Model

Lillian Lee
Dept. of Computer Science, Cornell University
Ithaca, NY 14853-7501 USA
.edu/home/llee
******@

ABSTRACT

There have been a number of prior attempts to theoretically justify the effectiveness of the inverse document frequency (IDF). Those that take as their starting point Robertson and Spärck Jones’s probabilistic model are based on strong or complex assumptions. We show that a more intuitively plausible assumption suffices. Moreover, the new assumption, while conceptually very simple, provides a solution to an estimation problem that had been deemed intractable by Robertson and Walker (1997).

Categories and Subject Descriptors: [Information Search and Retrieval]: Retrieval models

General Terms: Theory, Algorithms

Keywords: inverse document frequency, IDF, probabilistic model, term weighting

1. INTRODUCTION

The inverse document frequency (IDF) [12] has been “incorporated in (probably) all information retrieval systems” ([6], pg. 77). Attempts to theoretically explain its empirical successes abound ([2, 14, 1, 11, 5, 8, 4, 3], inter alia). Our focus here is on explanations based on Robertson and Spärck Jones’s probabilistic-model (RSJ-PM) paradigm of information retrieval [10], not because of any prejudice against other paradigms, but
