
[TPAMI 2022] HGNN+: General Hypergraph Neural Networks


Paper link: HGNN+: General Hypergraph Neural Networks | IEEE Journals & Magazine | IEEE Xplore

The English was typed entirely by hand, summarizing and paraphrasing the original paper. Unavoidable spelling and grammar mistakes may appear; if you spot any, corrections in the comments are welcome! This post is closer to reading notes, so take it with a grain of salt.

Contents

1. Takeaways

2. Section-by-Section Reading

2.1. Abstract

2.2. Introduction

2.3. Related Work

2.3.1. Graph Neural Networks

2.3.2. Hypergraph Learning

2.4. Preliminaries of Hypergraphs

2.5. The Framework of Hypergraph Neural Network HGNN+

2.5.1. Hypergraph Modeling

2.5.2. Hypergraph Convolution

2.6. Discussions

2.6.1. Hypergraph vs. Graph

2.6.2. HGNN/HGNN+ vs. GNN

2.6.3. HGNN vs. HGNN+

2.7. Experiments and Discussions

2.7.1. Experimental Settings

2.7.2. Vertex Classification on the Data With Graph Structure

2.7.3. Vertex Classification on the Data Without Graph Structure

2.7.4. Vertex Classification on the Data With Hypergraph Structure

2.7.5. Visualization

2.8. THU-DeepHypergraph: An Open Toolbox of the HGNN+ Framework

2.9. Conclusion

3. Reference

1. Takeaways

(1)The classics are still worth a read

2. Section-by-Section Reading

2.1. Abstract

        ①Limitation of GNNs: feature extraction on multi-modal/multi-type data

        ②Limitation of the existing HGNN: weights are shared across modalities/types

        ③So, they propose HGNN+ and the THU-DeepHypergraph toolbox

2.2. Introduction

        ①High-order relationships:

        ②Framework of HGNN+:

        ③The conception of graph and hypergraph:

        ④Hyperedge groups: pairwise edges, attributes, k-hop neighbors, and feature-space neighbors

        ⑤Advantages of HGNN+: not only adaptive aggregation of multiple hyperedge groups, but also extension to directed hypergraphs

2.3. Related Work

2.3.1. Graph Neural Networks

        ①Lists spectral-based and spatial-based GNNs

2.3.2. Hypergraph Learning

        ①Lists the development of hypergraph learning and hypergraph neural networks

2.4. Preliminaries of Hypergraphs

        ①Notations:

For a hypergraph \mathcal{G}=(\mathcal{V},\mathcal{E},\mathbf{W}), the |\mathcal{V}|\times|\mathcal{E}| incidence matrix \mathbf{H} can be written as:

\mathbf{H}(v,e)=\left\{ \begin{array} {lr}1, & \mathrm{if~}v\in e \\ 0, & \mathrm{otherwise} \end{array}\right.

and the initial vertex features are \mathbf{X}^0=\{x_1^0,x_2^0,\cdots,x_N^0\},x_i^0\in\mathbb{R}^{C_0}, where C_0 denotes the feature dimension
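A minimal NumPy sketch of this construction (the function name build_incidence is mine, not from the paper's toolbox):

import numpy as np

def build_incidence(num_vertices, hyperedges):
    # H(v, e) = 1 if vertex v belongs to hyperedge e, 0 otherwise
    H = np.zeros((num_vertices, len(hyperedges)))
    for e, verts in enumerate(hyperedges):
        for v in verts:
            H[v, e] = 1.0
    return H

# toy hypergraph: 5 vertices, 2 hyperedges
H = build_incidence(5, [{0, 1, 2}, {2, 3, 4}])  # shape (5, 2)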

        ②Optimization goal of the node classification task:

\arg\min_f\left\{\mathcal{R}_{emp}(f)+\Omega(f)\right\}

where \Omega(f) is a regularizer on the hypergraph, \mathcal{R}_{emp}(f) denotes the supervised empirical loss, and f(\cdot) is the classification function

        ③Regularizer \Omega(f) can be represented by:

\Omega(f)=\frac{1}{2}\sum_{e\in\mathcal{E}}\sum_{\{u,v\}\in\mathcal{V}}\frac{w(e)\mathbf{H}(u,e)\mathbf{H}(v,e)}{\delta(e)}\left(\frac{f(u)}{\sqrt{d(u)}}-\frac{f(v)}{\sqrt{d(v)}}\right)^{2}

2.5. The Framework of Hypergraph Neural Network HGNN+

2.5.1. Hypergraph Modeling

(1)Hyperedge Group Generation

        ①When the data correlation comes with a graph structure:

(a top) pairwise edge

Similar to traditional graph:

\mathcal{E}_{\mathrm{pair}}=\{\{v_i,v_j\}\mid(v_i,v_j)\in\mathcal{E}_s\}

(a bottom) k-hop neighbor

The k-hop neighbor is:

N_{\mathrm{hop}_k}(v)=\{u\mid\mathbf{A}_{uv}^k\neq0,\ u\in\mathcal{V}_s\}, where k\in[2,n_v]. The hyperedge group can be represented by:

\mathcal{E}_{\mathrm{hop}_k}=\{N_{\mathrm{hop}_k}(v)\mid v\in\mathcal{V}\}
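A small sketch of this group generation, assuming a dense 0/1 adjacency matrix A (the helper name khop_hyperedges is hypothetical):

import numpy as np

def khop_hyperedges(A, k):
    # one hyperedge per vertex v: N_hop_k(v) = {u | (A^k)[u, v] != 0}
    Ak = np.linalg.matrix_power(A, k)
    return [set(np.flatnonzero(Ak[:, v])) for v in range(A.shape[0])]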

        ②When the data correlation is without graph structure:

(b top) attributes

Hyperedges are constructed from vertices sharing the same attributes (geo-location, time, and other specific information). (For example, if every vertex is a paper, survey papers share one hyperedge, research papers share another, workshop papers yet another, and so on.) The construction of the hyperedge group:

\mathcal{E}_\mathrm{attribute}=\{N_\mathrm{att}(a)\mid a\in\mathcal{A}\}

(b bottom) feature

k-nearest neighbors in the feature space (or all neighbors within distance d):

\begin{cases} \mathcal{E}_{\mathrm{feature}}^{\mathrm{KNN}_k}=\{N_{\mathrm{KNN}_k}(v)\mid v\in\mathcal{V}\} \\ \mathcal{E}_{\mathrm{feature}}^{\mathrm{distance}_d}=\{N_{\mathrm{dis}_d}(v)\mid v\in\mathcal{V}\} \end{cases}
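A brute-force sketch of the KNN variant (O(n^2) pairwise distances; the helper name is mine):

import numpy as np

def knn_hyperedges(X, k):
    # one hyperedge per vertex: the vertex plus its k nearest feature-space neighbors
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise Euclidean distances
    return [set(np.argsort(dist[v])[:k + 1]) for v in range(X.shape[0])]  # v itself sorts first (distance 0)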

(2)Combination of Hyperedge Groups

        ①Coequal fusion of each hyperedge group:

\mathbf{H}=\mathbf{H}_1\|\mathbf{H}_2\|\cdots\|\mathbf{H}_K

        ②Adaptive fusion:

\left\{ \begin{array} {l}\mathbf{w}_{k}=\operatorname{copy}(\operatorname{sigmoid}(w_{k}),M_{k}) \\ \mathbf{W}=\operatorname{diag}(\mathbf{w}_{1}^{1},\cdots,\mathbf{w}_{1}^{M_{1}},\cdots,\mathbf{w}_{K}^{1},\cdots,\mathbf{w}_{K}^{M_{K}}) \\ \mathbf{H}=\mathbf{H}_{1}\|\mathbf{H}_{2}\|\cdots\|\mathbf{H}_{K} \end{array}\right.

where w_{k}\in\mathbb{R} denotes a trainable parameter shared by all hyperedges inside hyperedge group k, \operatorname{copy}(a,b) returns a vector of size b filled with the value a, and \mathbf{W}\in\mathbb{R}^{M\times M} denotes the diagonal hyperedge weight matrix
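A PyTorch sketch of the adaptive fusion above (the class name AdaptiveFusion is mine; the paper's toolbox may implement it differently):

import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, num_groups):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(num_groups))  # one trainable scalar per hyperedge group

    def forward(self, H_groups):  # list of |V| x M_k incidence matrices
        H = torch.cat(H_groups, dim=1)  # H = H_1 || H_2 || ... || H_K
        w = torch.cat([torch.sigmoid(wk).repeat(Hk.shape[1])  # copy(sigmoid(w_k), M_k)
                       for wk, Hk in zip(self.w, H_groups)])
        return H, torch.diag(w)  # W: M x M diagonal hyperedge weight matrix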

2.5.2. Hypergraph Convolution

(1)Spectral Convolution on Hypergraph

        ①The spectral convolution of signal \mathbf{x} and filter \mathbf{g} can be denoted as:

\mathbf{g}\star\mathbf{x}=\mathbf{\Phi}((\mathbf{\Phi}^\top\mathbf{g})\odot(\mathbf{\Phi}^\top\mathbf{x}))=\mathbf{\Phi}g(\mathbf{\Lambda})\mathbf{\Phi}^\top\mathbf{x}

where \odot denotes the element-wise Hadamard product, g(\boldsymbol{\Lambda})=\mathrm{diag}(g(\lambda_1),\ldots,g(\lambda_n))

        ②Further simplify it with a Chebyshev polynomial expansion:

\mathbf{g}\star\mathbf{x}\approx\sum_{k=0}^K\theta_kT_k(\tilde{\mathbf{\Delta}})\mathbf{x}

        ③Limit the Chebyshev expansion to K=1:

\mathbf{g}\star\mathbf{x}\approx\theta_0\mathbf{x}-\theta_1\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{D}_v^{-1/2}\mathbf{x}

        ④Reduce the parameters to avoid overfitting and simplify the convolution:

\begin{cases} \theta_1=-\frac{1}{2}\theta \\ \theta_0=\frac{1}{2}\theta\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{D}_v^{-1/2} \end{cases}

\begin{gathered} \mathbf{g}\star\mathbf{x}\approx\frac{1}{2}\theta\mathbf{D}_v^{-1/2}\mathbf{H}(\mathbf{W}+\mathbf{I})\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{D}_v^{-1/2}\mathbf{x} \\ \approx\theta\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{D}_v^{-1/2}\mathbf{x} \end{gathered}
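As a sanity check, the final filter maps to a few lines of PyTorch (a sketch assuming ReLU for the nonlinearity; hgnn_conv is my own name):

import torch

def hgnn_conv(X, H, w, Theta):
    # sigma(D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Theta)
    W = torch.diag(w)
    Dv_inv_sqrt = torch.diag((H @ w) ** -0.5)  # d(v) = sum_e w(e) H(v, e)
    De_inv = torch.diag(H.sum(dim=0) ** -1.0)  # delta(e) = sum_v H(v, e)
    return torch.relu(Dv_inv_sqrt @ H @ W @ De_inv @ H.T @ Dv_inv_sqrt @ X @ Theta)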

(2)General Spatial Convolution on Hypergraph

        ①Inter-Neighbor Relation:

N=\{(v,e)\mid\mathbf{H}(v,e)=1,\ v\in\mathcal{V}\ \mathrm{and}\ e\in\mathcal{E}\}

and the vertex inter-neighbor set of hyperedge is defined as:

\mathcal{N}_v(e)=\{v\mid vNe,\ v\in\mathcal{V}\ \mathrm{and}\ e\in\mathcal{E}\}

the hyperedge inter-neighbor set of vertex is defined as:

\mathcal{N}_e(v)=\{e\mid vNe,\ v\in\mathcal{V}\ \mathrm{and}\ e\in\mathcal{E}\}

        ②Spatial hypergraph convolution (messages go from vertices to hyperedges and back):

\left\{ \begin{array}{l} m_\beta^t=\sum_{\alpha\in\mathcal{N}_v(\beta)}M_v^t(x_\alpha^t) \\ y_\beta^t=U_e^t(w_\beta,m_\beta^t) \\ m_\alpha^{t+1}=\sum_{\beta\in\mathcal{N}_e(\alpha)}M_e^t(x_\alpha^t,y_\beta^t) \\ x_\alpha^{t+1}=U_v^t(x_\alpha^t,m_\alpha^{t+1}) \end{array} \right.

where m_\beta^t is the message of hyperedge \beta\in\mathcal{E}, y_\beta^t is the feature of hyperedge \beta, an element of the hyperedge feature set Y^t=\{y_1^t,y_2^t,\cdots,y_M^t\},y_i^t\in\mathbb{R}^{C_t}, in layer t, and M_v^t(\cdot),U_e^t(\cdot),M_e^t(\cdot),U_v^t(\cdot) are the vertex message function, hyperedge update function, hyperedge message function, and vertex update function of the t-th layer

(3)HGNN+ Convolution Layer Configurations

        ①Functions in HGNN+:

\begin{cases} M_v^t(x_\alpha^t)=\frac{x_\alpha^t}{|\mathcal{N}_v(\beta)|} \\ U_e^t(w_\beta,m_\beta^t)=w_\beta\cdot m_\beta^t \\ M_e^t(x_\alpha^t,y_\beta^t)=\frac{y_\beta^t}{|\mathcal{N}_e(\alpha)|} \\ U_v^t(x_\alpha^t,m_\alpha^{t+1})=\sigma(m_\alpha^{t+1}\cdot\Theta^t) \end{cases}

        ②Hypergraph convolution in matrix form:

\mathbf{X}^{t+1}=\sigma(\mathbf{D}_v^{-1}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{X}^t\mathbf{\Theta}^t)
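A PyTorch sketch of this layer that makes the two-stage message passing explicit (the class name HGNNPlusConv and ReLU for \sigma are my assumptions):

import torch
import torch.nn as nn

class HGNNPlusConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, out_dim, bias=False)  # Theta^t

    def forward(self, X, H, w):  # X: |V| x C, H: |V| x |E|, w: |E| hyperedge weights
        De_inv = torch.diag(1.0 / H.sum(dim=0))  # 1/delta(e)
        Dv_inv = torch.diag(1.0 / (H @ w))       # 1/d(v)
        Y = De_inv @ H.T @ X                     # stage 1: vertex -> hyperedge messages
        Z = Dv_inv @ H @ torch.diag(w) @ Y       # stage 2: hyperedge -> vertex messages
        return torch.relu(self.theta(Z))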

2.6. Discussions

2.6.1. Hypergraph vs. Graph

        ①"from the random walks’ aspect, a hypergraph with edge-independent vertex weights is equivalent to a weighted graph, and a hypergraph with edge-dependent vertex weights cannot be reduced to a weighted graph."

        ②For hypergraphs whose vertex weights do not depend on the edges (edge-independent), \mathbf{H}\in\{0,1\}^{|\mathcal{V}|\times|\mathcal{E}|} and \mathcal{G}_{\mathrm{in}}=\{\mathcal{V},\mathcal{E},\mathbf{W}\}.

        ③For edge-dependent hypergraphs, a vertex-weight matrix \mathbf{R}\in\mathbb{R}^{|\mathcal{V}|\times|\mathcal{E}|} is added and \mathcal{G}_{\mathrm{de}}=\{\mathcal{V},\mathcal{E},\mathbf{W},\gamma\}. The vertex degree d(v) and hyperedge degree \delta(e) can be represented by:

\begin{cases} d(v)=\sum_{\beta\in\mathcal{N}_e(v)}w(\beta) \\ \delta(e)=\sum_{\alpha\in\mathcal{N}_v(e)}\gamma_e(\alpha) \end{cases}

        ④The paper then uses random walks and Markov chains to prove ①; I am no math expert, so I will not transcribe the proof here

2.6.2. HGNN/HGNN+ vs. GNN

        ①For a simple hypergraph (i.e., a traditional graph, where every hyperedge connects exactly two vertices):

\begin{cases} \mathbf{H}\mathbf{H}^\top=\mathbf{A}+\mathbf{D} \\ \mathbf{D}_e^{-1}=\frac{1}{2}\mathbf{I} \\ \mathbf{W}=\mathbf{I} \end{cases}

the HGNN convolution becomes:

\begin{aligned} \mathbf{X}^{t+1} & =\sigma(\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{D}_v^{-1/2}\mathbf{X}^t\boldsymbol{\Theta}^t) \\ & =\sigma\left(\mathbf{D}_v^{-1/2}\mathbf{H}\left(\frac{1}{2}\mathbf{I}\right)\mathbf{H}^\top\mathbf{D}_v^{-1/2}\mathbf{X}^t\boldsymbol{\Theta}^t\right) \\ & =\sigma\left(\frac{1}{2}\mathbf{D}^{-1/2}(\mathbf{A}+\mathbf{D})\mathbf{D}^{-1/2}\mathbf{X}^t\boldsymbol{\Theta}^t\right) \\ & =\sigma\left(\frac{1}{2}(\mathbf{I}+\mathbf{D}^{-1/2}\mathbf{A}\mathbf{D}^{-1/2})\mathbf{X}^t\boldsymbol{\Theta}^t\right) \\ & =\sigma(\mathbf{D}^{-1/2}\hat{\mathbf{A}}\mathbf{D}^{-1/2}\mathbf{X}^t\hat{\boldsymbol{\Theta}}^t) \end{aligned}

        ②Uses rooted subtrees to draw the analogy between a 2-uniform hypergraph and a graph:

2.6.3. HGNN vs. HGNN+

        ①Difference between the two convolutions:

\begin{cases} \mathbf{X}^{t+1}\xleftarrow{\mathrm{HGNNConv}}\sigma(\underline{\mathbf{D}_v^{-1/2}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^\top\mathbf{D}_v^{-1/2}}\,\mathbf{X}^t\mathbf{\Theta}^t) \\ \mathbf{X}^{t+1}\xleftarrow{\mathrm{HGNNConv}^+}\sigma(\underline{\mathbf{D}_v^{-1}\mathbf{H}\mathbf{W}\mathbf{D}_e^{-1}\mathbf{H}^\top}\,\mathbf{X}^t\mathbf{\Theta}^t) \end{cases}

The upper one is symmetric message passing (on an undirected hypergraph), while the lower one is asymmetric message passing (also on an undirected hypergraph)

        ②Definition of a directed hypergraph:

\hat{\mathbf{H}}(v,e)=\begin{cases} -1 & \mathrm{if\ }v\in T(e) \\ 1 & \mathrm{if\ }v\in S(e) \\ 0 & \mathrm{otherwise} \end{cases}

where T(e) and S(e) are the sets of target and source vertices of hyperedge e

        ③Message passing on the directed hypergraph:

\mathbf{X}^{t+1}=\sigma(\mathbf{D}_t^{-1}\hat{\mathbf{H}}_t\mathbf{W}\mathbf{D}_s^{-1}\hat{\mathbf{H}}_s^\top\mathbf{X}^t\mathbf{\Theta}^t)

where

\begin{cases} \mathbf{D}_s=\operatorname{diag}(\operatorname{col\_sum}(\hat{\mathbf{H}}_s)) \\ \mathbf{D}_t=\operatorname{diag}(\operatorname{row\_sum}(\hat{\mathbf{H}}_t)) \end{cases}
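A NumPy sketch of the directed layer under my reading of the formula (the helper name is hypothetical; it assumes every hyperedge has at least one source and every vertex at least one incoming hyperedge, and ReLU for sigma):

import numpy as np

def directed_hconv(X, H_hat, w, Theta):
    # H_hat(v, e) = 1 if v in S(e), -1 if v in T(e), per the definition above
    Hs = (H_hat == 1).astype(float)         # source incidence H_hat_s
    Ht = (H_hat == -1).astype(float)        # target incidence H_hat_t
    Ds_inv = np.diag(1.0 / Hs.sum(axis=0))  # average over each edge's source vertices
    Dt_inv = np.diag(1.0 / Ht.sum(axis=1))  # average over each vertex's incoming edges
    return np.maximum(Dt_inv @ Ht @ np.diag(w) @ Ds_inv @ Hs.T @ X @ Theta, 0.0)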

2.7. Experiments and Discussions

2.7.1. Experimental Settings

        ①Learning rate: 0.01 for the 3 citation-network datasets and 0.001 for the 2 social-media-network datasets

        ②Dropout: 0.5

        ③Optimizer: Adam with 0.0005 weight decay

        ④Loss: CE:

\mathcal{L}=-\sum_{v\in V}\sum_{c=1}^CY_{vc}\log(O_{vc})

2.7.2. Vertex Classification on the Data With Graph Structure

        ①Statistics of the datasets:

        ②Performance on the 5 datasets with 5 training samples per category:

        ③Performance on the 5 datasets with 10 training samples per category:

        ④Comparison of the effectiveness of different hyperedge groups with HGNN+:

        ⑤Comparison of HGNN and HGNN+:

        ⑥Convolution strategy comparison:

2.7.3. Vertex Classification on the Data Without Graph Structure

        ①3D object datasets: ModelNet40 and NTU

        ②Performance:

2.7.4. Vertex Classification on the Data With Hypergraph Structure

        ①Datasets: Cooking-200 and MovieLens2k-v2

        ②Performance:

2.7.5. Visualization

        ①t-SNE visualization on Cooking-200:

2.8. THU-DeepHypergraph: An Open Toolbox of the HGNN+ Framework

        ~

2.9. Conclusion

        ~

3. Reference

@ARTICLE{9795251,
  author={Gao, Yue and Feng, Yifan and Ji, Shuyi and Ji, Rongrong},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={HGNN+: General Hypergraph Neural Networks}, 
  year={2023},
  volume={45},
  number={3},
  pages={3181-3199},
  keywords={Correlation;Convolution;Data models;Task analysis;Social networking (online);Mathematical models;Representation learning;Hypergraph;classification;hypergraph convolution;representation learning},
  doi={10.1109/TPAMI.2022.3182052}}
