Node Embeddings <3
Overall Goals
Strengths & Limitations
Conditions to prefer a method
Is there a best method?
Ideas, Nice to Haves
Performance on Adversarial Attacks
Generate through P(A|Z) -> how to model P(A|Z)
Bernoulli
Sigmoid: sigma(z_i^T z_j + b)
Distance based: exp(-gamma ||z_i - z_j||)
Poisson: 1 - exp(-z_i^T z_j)
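The decoders above can be sketched in a few lines; a minimal numpy sketch (function names are my own), assuming Z holds one embedding vector per row:

```python
import numpy as np

def sigmoid_decoder(Z, b=0.0):
    # P(A_ij = 1 | Z) = sigmoid(z_i^T z_j + b), computed for all pairs at once
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T + b)))

def distance_decoder(Z, gamma=1.0):
    # P(A_ij = 1 | Z) = exp(-gamma * ||z_i - z_j||)
    dist = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    return np.exp(-gamma * dist)

Z = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
P = sigmoid_decoder(Z)
```

Either decoder yields a matrix of Bernoulli edge probabilities; nearby embeddings (rows 0 and 1) get a higher edge probability than distant ones (rows 0 and 2).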
- How to embed the graph
Symmetric
Non-Symmetric
Categorical
SVD-Based
SBM
- Optimization Procedure
Full-Batch
Negative Sampling
Noise-Contrastive Estimation
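Negative sampling, the cheapest of the three procedures, can be sketched as follows; a hedged numpy version (names my own), assuming edges are given as index pairs:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(Z, pos_edges, k=5):
    # For each observed edge (i, j): push sigma(z_i^T z_j) towards 1, and for
    # k randomly sampled "negative" nodes m push sigma(z_i^T z_m) towards 0,
    # avoiding the full-batch sum over all node pairs.
    n = Z.shape[0]
    loss = 0.0
    for i, j in pos_edges:
        loss -= np.log(sigma(Z[i] @ Z[j]))
        negatives = rng.integers(0, n, size=k)
        loss -= np.log(sigma(-(Z[negatives] @ Z[i]))).sum()
    return loss

Z = rng.normal(size=(4, 2))
loss = negative_sampling_loss(Z, [(0, 1), (2, 3)])
```

Sampling negatives from the (possibly distorted) degree distribution, as in word2vec, is the usual refinement.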
Applications
Unified Pipeline -> Z_init
Literature
- How to embed the graph
- How to model P(A|Z)
- Optimization Procedure
"Learning Embedding with a generative model" [1]
Factorization Based
Locally Linear Embedding (LLE) [1]
Laplacian Eigenmaps [1], [2]
Cauchy Graph Embedding [1]
Structure Preserving Embedding (SPE) [1]
Graph Factorization (GF) [1], [2]
GraRep [1],[2]
HOPE [1],[2]
Random Walk Based
DeepWalk [1], [2]
Node2Vec [1], [2]
Hierarchical Representation Learning for Networks (HARP) [1], [2]
Walklets [1]
Deep Learning Based
Structural Deep Network Embedding (SDNE)
Deep Neural Networks for Learning Graph Representations (DNGR)
Graph Convolutional Networks (GCN)
Variational Graph Auto-Encoders (VGAE)
Other
LINE
- Application
Network Compression
Visualization
Clustering
Link Prediction
Node Classification
Datasets
SYN SBM
KARATE
BLOGCATALOG
YOUTUBE
HEP-TH
ASTRO-PH
Explore Non-Linear Models [1]
Study Evolution of Networks [1]
Interpretable Non-Linear (Deep Learning based) models [1], [2]
Earliest methods: distance decoder DEC(z_i, z_j) = ||z_i - z_j||^2, with loss L = SUM_{(i,j)} DEC(z_i, z_j) * s_G(v_i, v_j)
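This Laplacian-eigenmaps-style loss (squared-distance decoder weighted by a proximity measure s_G) can be sketched as:

```python
import numpy as np

def eigenmap_loss(Z, S):
    # L = sum_{i,j} ||z_i - z_j||^2 * s_G(v_i, v_j),
    # with S holding the pairwise proximities s_G
    sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=-1)
    return float((sq_dists * S).sum())

Z = np.array([[0.0], [1.0], [2.0]])
S = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path-graph adjacency
```

On this toy path graph, only the two adjacent pairs contribute, each counted in both directions.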
HOPE, GF, GraRep are inner-product based: DEC(z_i, z_j) = z_i^T z_j
- HOPE: general proximity measure s_G
- GF: proximity s_G(v_i, v_j) = A_ij
- GraRep: proximity A_ij, A_ij^2, ..., A_ij^k
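For these inner-product methods, a common direct-encoding recipe is to factorize the (symmetric) proximity matrix; a minimal sketch via truncated SVD (for GraRep the input would be a power, or log-transform, of the transition matrix rather than raw A):

```python
import numpy as np

def svd_embedding(S, d):
    # Factorize a symmetric proximity matrix S ~ Z Z^T via truncated SVD,
    # keeping the top-d singular directions scaled by sqrt of singular values.
    U, s, _ = np.linalg.svd(S)
    return U[:, :d] * np.sqrt(s[:d])

S = np.array([[2.0, 1.0], [1.0, 2.0]])
Z = svd_embedding(S, 2)
```

With d equal to the rank, Z Z^T reconstructs S exactly for a PSD matrix; smaller d gives the best low-rank approximation.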
DeepWalk: direct encoding; inner-product decoder; loss approximated with hierarchical softmax to compute the normalization factor, using a binary-tree structure for acceleration
node2vec: direct encoding; inner-product decoder; loss approximated with negative sampling
LINE: two encoder-decoder objectives that optimize "first-order" and "second-order" graph proximity, respectively
uses a sigmoid-based decoder
- proximity: s_G(v_i, v_j) = A_ij
- proximity: two-hop adjacency neighborhoods
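The random-walk corpus these methods train on can be sketched as below (uniform walks as in DeepWalk; node2vec would bias the neighbour choice via its return/in-out parameters p and q):

```python
import random

def random_walks(adj, walk_len=5, walks_per_node=2, seed=0):
    # Uniform random walks over an adjacency-list graph; the resulting walk
    # "sentences" are fed to a skip-gram model exactly like text in word2vec.
    rng = random.Random(seed)
    walks = []
    for start in adj:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_len and adj[walk[-1]]:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks

adj = {0: [1], 1: [0, 2], 2: [1]}
walks = random_walks(adj)
```

Every consecutive pair in a walk is a real edge, so co-occurrence statistics of the corpus reflect graph proximity.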
Convolutional Approaches [2]
Scalability [2]
Higher Order [2]
Dynamic, Temporal Graphs [2]
Discover, not pre decide Candidate Subgraphs [2]
graph preprocessing: collapse related nodes into supernodes, then apply another random-walk-based method on the coarsened graph
With the quadratic penalty in Laplacian Eigenmaps, preserving dissimilarity is emphasized over similarity. Use |Y_i - Y_j|^2 / (|Y_i - Y_j|^2 + sigma^2) instead, which shifts the emphasis to similar rather than dissimilar nodes.
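The effect of this Cauchy-style penalty is easy to see numerically (a one-function sketch of the formula in the note):

```python
def cauchy_penalty(dist_sq, sigma=1.0):
    # d^2 / (d^2 + sigma^2): behaves like the quadratic penalty for small
    # distances but saturates at 1, so very dissimilar pairs stop dominating
    # the loss and similar nodes drive the embedding.
    return dist_sq / (dist_sq + sigma ** 2)
```

For sigma = 1 the penalty is 0 at distance 0, 0.5 at squared distance 1, and approaches 1 for large distances.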
Every node is a linear combination of its neighbors
Obtain the embedding by minimizing phi(Y) = SUM_i |Y_i - SUM_j W_ij Y_j|^2, where Y_i ≈ SUM_j W_ij Y_j
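The LLE reconstruction objective can be sketched directly from the formula (W rows holding each node's neighbour weights):

```python
import numpy as np

def lle_objective(Y, W):
    # phi(Y) = sum_i || Y_i - sum_j W_ij Y_j ||^2: the residual of
    # reconstructing each node from the weighted combination of its neighbours.
    R = Y - W @ Y
    return float((R ** 2).sum())

Y = np.array([[0.0], [1.0], [2.0]])
W = np.array([[0, 1, 0], [0.5, 0, 0.5], [0, 1, 0]], dtype=float)
```

Here the middle node is reconstructed exactly by averaging its two neighbours; only the endpoints contribute residual.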
Perfectly reconstruct the graph: use a PSD kernel matrix K and a connectivity algorithm G; introduce slack variables for noise handling
Combines the idea of explicit modelling (factorization-based) with random walks: modifies DeepWalk by skipping nodes, performed at different skip lengths (analogous to factorizing A^k in GraRep); the results are used for training as in DeepWalk
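The node-skipping step is the whole trick; a minimal sketch of extracting Walklets-style training pairs from a walk:

```python
def walklet_pairs(walk, k):
    # Keep only co-occurrence pairs exactly k steps apart in the walk;
    # training skip-gram on these pairs is analogous to factorizing A^k,
    # as GraRep does explicitly.
    return [(walk[i], walk[i + k]) for i in range(len(walk) - k)]
```

k = 1 recovers ordinary adjacent pairs; larger k captures coarser, multi-scale proximity.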
Semi-Supervised Classification
use multiple neighbours by taking 2, 1, 0 everywhere and use the loss function from the Stanford paper
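A single GCN propagation step with the common symmetric normalization (a minimal numpy sketch of the layer, not the full semi-supervised training setup):

```python
import numpy as np

def gcn_layer(A, H, W):
    # H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W): average each node's features
    # with its neighbours' (self-loops included via A + I), then apply a
    # learned linear map followed by ReLU.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ H @ W, 0.0)

A = np.array([[0.0, 1.0], [1.0, 0.0]])
out = gcn_layer(A, np.eye(2), np.eye(2))
```

Stacking k such layers mixes information from k-hop neighbourhoods, which is what distinguishes convolutional approaches from direct-encoding methods.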
Outlook
Synthetic Data
Effect of Regularization
Effect of stochastic optimization