Graph Convolution

The graph convolution algorithm is the core of the shape recognition in graph. It is based on the same idea as image convolution. But instead of convoluting on the pixels, the convolution apply on the nodes (and optionaly on the edges) of the graph. There exists many type of convolution but the most successful on HEP data is the message passing neural network (MPNN). This concept is defined in [12].

The principle is simple. Each node and edges are valuated with features. The process is iterative. At each iteration, new features are calculated from the old feature of each nodes its edges and its neighbor features. The global equation of the graph convolution can written as

x_i^{t+1}=\gamma_{\theta_{\gamma}}(x_i^{t},\underset{j \in N(i)}{\square} \phi_{\theta_{\phi}}(x_i^{t},x_j^{t},e_{i,j}))

  • The superset t or t+1 is the iteration number
  • i or j are the node id
  • x_i^t is the feature vector of the node i at iteration t
  • e_{ij} is the feature vector of the edge between node i and node j
  • \phi is the message function. It performs a combination of the features from two nodes and the edge features. It is parametrized by a vector of parameters \theta_{\phi}. These parameters are learned by retro-propagation. The function is usually implemented as a shallow MLP.
  • \square is the aggregator. It performs the mix from the different messages. As the neighbors in a graph are not sorted and their number can vary, this aggregation operator should be normalized and commutative. Typical function to be used there are max or mean.
  • \gamma is the update function. It performs a mix between the old value and the aggregation of the messages. It could be as simple as an addition or could be implemented as a shallow MLP. Its parameters, denoted as \theta_{\gamma} are learned by retro-propagation.
  • Note that x_i can be a vector of feature and that the size can be different for each iteration (t).

The application of the convolution can be achived sequentially or in parallel.

For each node i (//):
    a=0
    for neighbors j of i:
        a=aggreg(a,phi(x[i],x[j],e[i][j]))
     x[i]=gamma(x[i],a)