Dealing with Diversity in Mining and Query Processing
Jeffrey Xu Yu (于旭)
Department of Systems Engineering and Engineering Management
The Chinese University of Hong Kong
Books on Networks
Social and Economic Networks by Matthew O. Jackson
Social Network Data Analysis by Charu C. Aggarwal
Exploratory Social Network Analysis with Pajek by Wouter de Nooy, Andrej Mrvar, and Vladimir Batagelj
Networks, Crowds, and Markets: Reasoning about a Highly Connected World by David Easley and Jon Kleinberg
Networks: An Introduction by M. E. J. Newman
Some Online Courses
Mining of Massive Datasets (Anand Rajaraman and Jeff Ullman) /~ullman/
Networks, Crowds, and Markets: Reasoning about a Highly Connected World (David Easley and Jon Kleinberg) .edu/home/works-book
Topics in Data Management & Mining – Networks (Laks V.S. Lakshmanan) /~laks/534l/
Stanford Large Network Dataset Collection
Social networks
Communication networks
Citation networks
Web graphs
Product co-purchasing networks
Internet peer-to-peer networks
Road networks
Autonomous systems
Signed networks
Wikipedia networks and metadata
Twitter and Memetracker
Graph Database http://en./wiki/Graph_database
Pregel: Google’s internal graph processing platform
Trinity: Microsoft Research Asia
Neo4j: commercial graph database
…
Diversified Ranking
Why diversified ranking?
Information requirements diversity
Query incomplete
Problem Statement
For query-dependent diversified ranking, the goal is to find K nodes in a graph that are relevant to the query node and dissimilar to each other.
For query-independent diversified ranking, the goal is to find K prestigious nodes in a graph that are dissimilar to each other.
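To make the query-dependent statement concrete, here is a minimal sketch in Python. It is not one of the methods surveyed in these slides. It rests on two illustrative assumptions: relevance is measured by a random walk with restart at the query node, and two nodes count as "dissimilar" when they are not adjacent. The function names, the damping value, and the example graph are all hypothetical.

```python
def personalized_pagerank(adj, query, damping=0.85, iters=100):
    """Random walk with restart at `query`, computed by power iteration.

    `adj` maps each node to a list of neighbours (assumed undirected,
    i.e. every edge is listed in both directions).
    """
    nodes = list(adj)
    score = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            nbrs = adj[u]
            if nbrs:
                share = damping * score[u] / len(nbrs)
                for v in nbrs:
                    nxt[v] += share
            else:
                # Dangling node: return its mass to the query node.
                nxt[query] += damping * score[u]
        nxt[query] += 1.0 - damping  # restart probability
        score = nxt
    return score


def diversified_top_k(adj, query, k):
    """Greedy selection: take nodes in order of relevance, skipping any
    node adjacent to one already chosen (assumed dissimilarity rule)."""
    score = personalized_pagerank(adj, query)
    ranked = sorted((v for v in adj if v != query),
                    key=lambda v: (-score[v], str(v)))
    chosen, blocked = [], set()
    for v in ranked:
        if len(chosen) == k:
            break
        if v in blocked:
            continue
        chosen.append(v)
        blocked.update(adj[v])  # neighbours of v are now too similar
    return chosen


# Hypothetical toy graph: a and b are tightly linked, c leads to d.
example_graph = {
    "q": ["a", "b", "c"],
    "a": ["q", "b"],
    "b": ["q", "a"],
    "c": ["q", "d"],
    "d": ["c"],
}

if __name__ == "__main__":
    print(diversified_top_k(example_graph, "q", 2))
```

The query-independent variant could be sketched the same way by replacing the restart at the query node with a uniform restart over all nodes, so that the scores reflect global prestige rather than relevance to one node.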
Main applications
Ranking nodes in social networks, ranking papers, etc.
Challenges
Diversity measures
No widely accepted diversity measures on graphs in the literature.
Scalability
Most existing methods do not scale to large graphs.
Lack of intuitive interpretation.
Existing Methods
Grasshopper [Zhu, et al., HLT-NAACL’07]
ManiRank [Zhu, et al., WWW’11]
DivRank [Mei, et al., KDD’10]
DRAGON [Tong, et al.,