TLDR;
- Real-world scenarios entail complex relationships between multiple entities.
- Hypergraphs capture more than pairwise relationships, encoding complex relationships.
- You can construrct a hypergraph for non-graphical data; define your nodes and hyperedges using various similarity-based approaches.
Figure: Constructing your own hypergraph
Beyond Pairs: Exploring the Power of Graph Learning and Hypergraphs
Many real-world scenarios involve complex relationships that might not sufficiently represented by pair-wise associations. Graph learning offers a powerful framework to understand and leverage the interconnectedness of data.
Graphs are a way to both store and encode relationships. They allow us to move beyond individual data points and consider the intricate web of connections that define a system. A graph consists of:
-
Vertices (or nodes): These represent entities or objects. In the context of images, a vertex could be a pixel, a region of interest, or even an entire image.
-
Edges (or links): These define the relationships or connections between pairs of vertices. An edge might represent the spatial proximity of pixels, the similarity between images, or interactions between proteins.
Learning with Hypergraphs
Graph learning tasks are diverse, including classifying individual nodes, predicting links between nodes, grouping similar nodes, or even understanding the structure of entire graphs. Many deep learning models on graphs ultimately learn node representations (embeddings) by considering the graphβs structure and the features of connected nodes.
Stepping Beyond Pairwise Connections: While graphs effectively model pairwise relationships, many systems exhibit dependencies that involve more than just two entities simultaneously. Imagine a research collaboration involving multiple scientists, or a biological pathway where several proteins interact at once. Standard graphs fall short in capturing these higher-order correlations. This is where hypergraphs come into play.
A hypergraph is a generalization of a traditional graph where an edge (now called a hyperedge) can connect two or more vertices. This allows for a much richer representation of complex associations. Think of a hyperedge as a group of related entities, capturing the idea that their interaction is not just a series of pairwise relationships, but a collective one.
Figure: Formal representation of a hypergraph
Hypergraphs are often represented using an incidence matrix, where rows correspond to vertices and columns represent hyperedges. An entry in the matrix indicates whether a particular vertex belongs to a specific hyperedge. This is demonstrated in the figure above.
Hypergraph Estimation: Since data isnβt always naturally structured as a hypergraph, various hypergraph estimation methods exist to infer these higher-order relationships from data. These methods can be based on the similarity between entities (e.g., k-nearest neighbors, k-means clustering), attributes of the entities, or even hop-aggregates in an existing graph. This is illustrated in the very first figure above. From this estimation exercise, the hypergraph incidence matrix is constructed.
Formal Definition: Formally, a hypergraph πΊ = (π ,πΈ,π»,π ,π)
consists of a set of vertices π = {π£1,β¦,π£π}
and a set of hyperedges πΈ = {π1,β¦,ππ}
depicting the relationships between the vertices. The associated incidence matrix π» β βπΓπ
and hyperedge weights π β βπΓπ
encode relational structure, while the vertex features π β βπΓπ
capture content information. The incidence matrix π»
is a binary matrix and the strengths of the hyperedge connections are represented in the diagonal matrix π
. The feature matrix π
holds the input features of each vertex as a one-dimensional vector π(π£π) = π₯π β βπ
.
Laplacian of a Hypergraph: The Laplacian of graphs is a critical operator in graph signal processing and forms the basis of most graph neural networks taking on different normalization forms in some models and being an important research topic for enhancing the prediction and computational performance of deep learning models for graph data. It is a critical component of learning with graphs. Will delve into it in the next post.