Outline
- Introduction
-
Social networks, information networks,
citation networks, actor-networks,
transportation networks,
biological networks.
Search Engines,
Recommender Systems
- Web content: words, images and hyperlinks
-
words, images, hyperlinks
Information Retrivel: Query and search,
Stop words, normalization, standardization
Efficient query processing, inverted index files
Bag of words model/ vector space model, Language model
Word occurrence distribution: multinomial distribution
Classification /categorization
kNN, Naive Bayes, Centroid method
*Bipartite graph/association graph
*Tripartite graph: users, items, tags
- Graphs and Basic network properties
-
Nodes, edges, paths, graph representation, connected components
-
directed graphs, hyperlinks
-
centrality
- Markov Chains and Random Walk on a graph
-
Random walks, transition probability,
stationary distribution
Albert Einstein show the existance of molecules
by random walks (Browning Motion)
before microscope was invented to see them
Random walks on directed graphs, sinks and sources
PageRank: random walk on directed graph with random jumps.
Reading:
"Random walks and electric networks (1984)"
[PDF]
- World Wide Web: hyperlink structure
-
Web is a directed graph. Diameter of Web is 19.
[In contrast, for a undirected graph, the diameter is usually 6].
Power law distributions of in-degrees.
Webpage ranking --- a key technology
co-citation, co-reference, hubs and authorites, HITS ranking
Web graph as an expected degree sequence random graph
Reading:
-
"Towards A Qualitive Search Engine: Hyperlink Vector Voting(1997)"
[PDF]
-
"The diameter of the world wide web (1999)"
[Link]
-
"PageRank"
[Link]
-
"Graph structure in the web (1999)"
[Link]
-
"The Web as a graph: measurements, models, and methods (1999)"
PDF
-
"Link Analysis: Hubs and Authorities on the World Wide Web (2001)"
[PDF]
-
"PageRank, HITS and a Unified Framework for Link Analysis (2002)"
[PDF]
- Social network strutures: Strength of Weak Ties
-
connection strength, depth of friendship,
reciprocity, homophily
Transitivity,
Transitive closure,
Transitive reduction,
measure of transitivity,
Strong links, weak links, local and global bridges.
Weak ties are the key links between
tightly-knit communities
"Who talks to Who" experiments
Delete strong edges cause communities to shrink;
Delete weak edges cause communities to split;
Strengths between Facebook users: (1) Reciprocal,
(2) one-way communication,
(3) maintained relationship
Twitter: short messages, (1) All friends, (2) targeted friends
Structural holes: organization cohesion, social conformity
Fringe nodes: access to different information, innovation fusion,
information gate keeper.
- Positive and negative relations
-
Structual balance and compatibilty. Basic partition theorem.
Weaker structual balance and compatibilty. Relation to clutering.
- Review of network centrality, clutering coefficents
-
Degree centrality, betweenness centrality, closeness centrality,
eigenvector centrality.
How to compute edge betweeness
- Models for social networks: Erdos-Renyi Random Graph
-
Degree distribution,
random variables, computing 2-chains and triangles
connectivity: chain formation, giant component, clique formation
thresholding functions
Branching process: 1st neighbor, 2nd neighbor, etc.
condition for giant component
Algorithm to generate ER graphs
- Models for social networks: Fixed/expected Degree Sequence Random Graph
-
A flexible model for networks
probabilty model: P(x_ij = 1) = d_i d_j / 2E.
compute many network quantities such as
number of triangles, common neighbors, etc.
Chenoff bounds on degrees
neighborhood growth,
condition for giant component
Algorithm to generate FDS random graphs
Reading:
-
"Connected components in random graphs with given expected degree sequences
(2002)" [PDF]
- Models for social networks: Small World graphs
-
Milgram's experiments.
Six degree of separation. From students to President.
Networks with small diameter.
Local connections and global connections.
Random rewiring of local connections to reduce network diameter.
Navigating the small world networks:
Global optimal path vs.
local information-based path.
Algorithm to generate Small World random graphs: 1D and 2D graphs
- Information Cascade and Herding
-
Game theory: irrational behavior could be in fact rational
Private information and public decisions
Details of the experiment on
red-majority vs blue-majority determination.
Bayes rule.
Cascade can be wrong; Cascade can happen based on little
true information; Cascade are fragile.
Sequencial decision making vs indepdent decision making.
Discret decision vs continuous/probabilis decision.
Problems in discussion and decision making in meetings
-
Power law and Rich-get-richer Phenomena
-
Norma distribution vs Zipf distribution
Power law, Pareto distribution,
A Website devoted to Zipf Distribution
Reading:
-
"Random texts exhibit Zipf's-law-like word frequency distribution",
W. Li (1992), IEEE Transactions on Information Theory.
-
"Power laws, Pareto distributions and Zipf's law", M. E. J. Newman,
Contemporary Physics 46, 323-351 (2005)).
-
"Universality of Rank-Ordering Distributions in the Arts and Sciences",
PLoS ONE. 2009; 4(3): e4791
PDF
Network growth models: preferential attachment and copy model
Unpredictability of rich-get-richer effects.
Effects of showing sales ranking.
Long tails: positive effects on internet economy
- Diffusion and Propagation on Networks
-
Diffusion in networks. Game theory model.
Clusters block diffusion.
Percolation
Disease Propagation on the network: SIR and SIS Models
- Finding communities on a network
-
Graph clustering
- Recommender Systems
-
Content-based Filtering
Collaborative Filtering