sgmatch.models

Graph Matching Base Class

class sgmatch.models.matcher.graphMatcher(av: Namespace)[source]

A wrapper class for all the graph similarity / matching models implemented in the library.

Parameters:

av (Namespace) – Namespace object containing the arguments to be passed to the models

Returns:

The initialized model selected by the user through the 'model_name' attribute of av
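
A minimal usage sketch: hyperparameters are collected in a single Namespace and the model is selected by name. The field names and the 'simgnn' model-name string below are assumptions, so consult the library for the exact accepted values.

    from argparse import Namespace

    from sgmatch.models.matcher import graphMatcher

    # All model hyperparameters travel through one Namespace; 'model_name'
    # selects which model class graphMatcher initializes. The field names
    # below are illustrative assumptions, not a documented schema.
    av = Namespace(
        model_name="simgnn",   # assumed model-name string
        input_dim=16,
        ntn_slices=16,
        filters=[64, 32, 16],
    )
    model = graphMatcher(av)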

Graph Matching Networks

class sgmatch.models.GMN.GMNEmbed(node_feature_dim: int, enc_node_hidden_sizes: List[int], prop_node_hidden_sizes: List[int], prop_message_hidden_sizes: List[int], aggr_gate_hidden_sizes: List[int], aggr_mlp_hidden_sizes: List[int], edge_feature_dim: Optional[int] = None, enc_edge_hidden_sizes: Optional[List[int]] = None, message_net_init_scale: float = 0.1, node_update_type: str = 'residual', use_reverse_direction: bool = True, reverse_dir_param_different: bool = True, layer_norm: bool = False)[source]

End-to-end implementation of Graph Matching Networks - Embed from the “Graph Matching Networks for Learning the Similarity of Graph Structured Objects” paper.

Parameters:
  • node_feature_dim (int) – Input dimension of node feature embedding vectors.

  • enc_node_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the encoder MLP for transforming the node features.

  • prop_node_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the node update MLP f_node. node_feature_dim is appended as the size of the final linear layer to maintain node embedding dimensionality.

  • prop_message_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the message computation MLP f_message. Note that the message vector dimensionality (prop_message_hidden_sizes[-1]) may not be equal to node_feature_dim.

  • aggr_gate_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the gating network in the graph aggregator.

  • aggr_mlp_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the aggregator MLP that maps gated node embeddings to the graph-level embedding.

  • edge_feature_dim (int, Optional) – Input dimension of edge feature embedding vectors. (default: None)

  • enc_edge_hidden_sizes ([int], Optional) – Number of hidden neurons in each linear layer of the encoder MLP for transforming the edge features. (default: None)

  • message_net_init_scale (float) – Initialisation scale for the message net output vectors. (default: 0.1)

  • node_update_type (str) – Type of update applied to node feature vectors ("GRU" or "MLP" or "residual"). (default: 'residual')

  • use_reverse_direction (bool) – Flag for whether or not to use reverse-direction messages for node updates in the propagation step. (default: True)

  • reverse_dir_param_different (bool) – Flag for whether or not the message computation parameters should be different for the forward and reverse directions. (default: True)

  • layer_norm (bool) – Flag for applying layer normalization in the propagation step. (default: False)
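
An instantiation sketch using the documented constructor; the hidden-size choices are arbitrary examples, not recommended values.

    from sgmatch.models.GMN import GMNEmbed

    model = GMNEmbed(
        node_feature_dim=32,
        enc_node_hidden_sizes=[32],       # node encoder MLP
        prop_node_hidden_sizes=[64, 32],  # node update MLP (f_node)
        prop_message_hidden_sizes=[64],   # message computation MLP
        aggr_gate_hidden_sizes=[64],      # gating network of the aggregator
        aggr_mlp_hidden_sizes=[64],       # aggregator MLP -> graph embedding
        node_update_type="residual",
        layer_norm=False,
    )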

class sgmatch.models.GMN.GMNMatch(node_feature_dim: int, enc_node_hidden_sizes: List[int], prop_node_hidden_sizes: List[int], prop_message_hidden_sizes: List[int], aggr_gate_hidden_sizes: List[int], aggr_mlp_hidden_sizes: List[int], edge_feature_dim: Optional[int] = None, enc_edge_hidden_sizes: Optional[List[int]] = None, message_net_init_scale: float = 0.1, node_update_type: str = 'residual', use_reverse_direction: bool = True, reverse_dir_param_different: bool = True, attention_sim_metric: str = 'euclidean', layer_norm: bool = False)[source]

End-to-end implementation of Graph Matching Networks - Match from the “Graph Matching Networks for Learning the Similarity of Graph Structured Objects” paper.

TODO: Provide description of implementation and differences from paper if any

Parameters:
  • node_feature_dim (int) – Input dimension of node feature embedding vectors.

  • enc_node_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the encoder MLP for transforming the node features.

  • prop_node_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the node update MLP f_node. node_feature_dim is appended as the size of the final linear layer to maintain node embedding dimensionality.

  • prop_message_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the message computation MLP f_message. Note that the message vector dimensionality (prop_message_hidden_sizes[-1]) may not be equal to node_feature_dim.

  • aggr_gate_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the gating network in the graph aggregator.

  • aggr_mlp_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the aggregator MLP that maps gated node embeddings to the graph-level embedding.

  • edge_feature_dim (int, Optional) – Input dimension of edge feature embedding vectors. (default: None)

  • enc_edge_hidden_sizes ([int], Optional) – Number of hidden neurons in each linear layer of the encoder MLP for transforming the edge features. (default: None)

  • message_net_init_scale (float) – Initialisation scale for the message net output vectors. (default: 0.1)

  • node_update_type (str) – Type of update applied to node feature vectors ("GRU" or "MLP" or "residual"). (default: 'residual')

  • use_reverse_direction (bool) – Flag for whether or not to use reverse-direction messages for node updates in the propagation step. (default: True)

  • reverse_dir_param_different (bool) – Flag for whether or not the message computation parameters should be different for the forward and reverse directions. (default: True)

  • attention_sim_metric (str) – Similarity metric used to compute the cross-graph attention coefficients. (default: 'euclidean')

  • layer_norm (bool) – Flag for applying layer normalization in the propagation step. (default: False)
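
An instantiation sketch mirroring the GMNEmbed example above; GMNMatch additionally takes the similarity metric used by its cross-graph attention module. Hidden sizes are arbitrary examples.

    from sgmatch.models.GMN import GMNMatch

    model = GMNMatch(
        node_feature_dim=32,
        enc_node_hidden_sizes=[32],
        prop_node_hidden_sizes=[64, 32],
        prop_message_hidden_sizes=[64],
        aggr_gate_hidden_sizes=[64],
        aggr_mlp_hidden_sizes=[64],
        attention_sim_metric="euclidean",  # metric for cross-graph attention
    )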

GraphSim

class sgmatch.models.GraphSim.GraphSim(input_dim: int, gnn: str = 'GCN', gnn_filters: List[int] = [64, 32, 16], conv_filters: Optional[ModuleList] = None, mlp_neurons: List[int] = [32, 16, 8, 4, 1], padding_correction: bool = True, resize_dim: int = 10, resize_mode='bilinear', gnn_activation: str = 'relu', mlp_activation: str = 'relu', gnn_dropout_p: float = 0.5, activation_slope: Optional[float] = 0.1)[source]

End-to-end implementation of GraphSim from the “Learning-based Efficient Graph Similarity Computation via Multi-Scale Convolutional Set Matching” paper.

NOTE: The model assumes that the node features of the input graphs are ordered according to a breadth-first search (BFS) traversal of the graph.
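
A sketch of the kind of preprocessing the note above refers to, reordering node features by BFS using networkx. The helper is illustrative and not part of sgmatch; it assumes a connected, undirected graph.

    import networkx as nx
    import torch

    def bfs_reorder(x: torch.Tensor, edge_index: torch.Tensor, root: int = 0):
        # Illustrative helper (not part of sgmatch); assumes a connected graph.
        g = nx.Graph(edge_index.t().tolist())
        order = [root] + [v for _, v in nx.bfs_edges(g, root)]
        perm = torch.tensor(order, dtype=torch.long)  # new position -> old index
        inv = torch.empty_like(perm)
        inv[perm] = torch.arange(perm.numel())        # old index -> new position
        # Return BFS-ordered features and consistently relabeled edges.
        return x[perm], inv[edge_index]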

TODO: Provide description of implementation and differences from paper if any

Parameters:
  • input_dim (int) – Input dimension of node feature vectors.

  • gnn (str, optional) – Type of Graph Neural Network to use to embed the node features ("Neuro-PNA" or "PNA" or "GCN" or "GAT" or "SAGE" or "GIN" or "graph" or "gated"). (default: 'GCN')

  • gnn_filters ([int], optional) – Number of hidden neurons in each layer of the GNN for embedding input node features. (default: [64,32,16])

  • conv_filters (torch.nn.ModuleList, optional) – List of convolution filters, one applied to each similarity matrix generated from each GNN pass. (default: None)

  • mlp_neurons ([int], optional) – Number of hidden neurons in each layer of the decoder MLP. (default: [32,16,8,4,1])

  • padding_correction (bool, optional) – Flag for whether to apply the padding correction described in the paper, compensating for the zero-padding introduced by batching graphs together. (default: True)

  • resize_dim (int, optional) – Dimension to resize the similarity image matrices to. (default: 10)

  • resize_mode (str, optional) – Interpolation method used to resize the similarity images ('nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area' | 'nearest-exact'). (default: 'bilinear')

  • gnn_activation (str, optional) – Activation to be used in the GNN layers. (default: 'relu')

  • mlp_activation (str, optional) – Activation to be used in the MLP decoder layers. (default: 'relu')

  • gnn_dropout_p (float, optional) – Dropout probability for the GNN layers. (default: 0.5)

  • activation_slope (float, optional) – Slope of the negative part in case of "leaky_relu" activation. (default: 0.1)
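
An instantiation sketch; whether each conv_filters entry must be a single layer or may be a torch.nn.Sequential block is an assumption here.

    import torch.nn as nn

    from sgmatch.models.GraphSim import GraphSim

    # One small CNN per similarity image (one image per GNN layer, so the
    # list length matches len(gnn_filters)).
    conv_filters = nn.ModuleList([
        nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        for _ in range(3)
    ])

    model = GraphSim(
        input_dim=16,
        gnn="GCN",
        gnn_filters=[64, 32, 16],
        conv_filters=conv_filters,
        mlp_neurons=[32, 16, 8, 4, 1],
        resize_dim=10,
    )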

class sgmatch.models.GraphSim.GraphSim_v2(input_dim: int, conv_kernel_sizes, conv_in_channels, conv_out_channels, conv_stride, maxpool_kernel_sizes, maxpool_stride, cnn_dropout_p=0.2, gnn: str = 'GCN', gnn_filters: List[int] = [64, 32, 16], mlp_neurons: List[int] = [32, 16, 8, 4, 1], padding_correction: bool = True, resize_dim: int = 10, resize_mode='bilinear', gnn_activation: str = 'relu', mlp_activation: str = 'relu', gnn_dropout_p: float = 0.5, activation_slope: Optional[float] = 0.1)[source]

A more efficient implementation of GraphSim from the “Learning-based Efficient Graph Similarity Computation via Multi-Scale Convolutional Set Matching” paper.

Uses PyTorch's grouped convolution layer to speed up the embedding of hierarchical similarity image matrices by parallelizing computations. Prefer this variant over GraphSim when the convolutional network architecture is the same for all similarity image matrices.
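
The speed-up relies on the groups argument of torch.nn.Conv2d: with groups=k, a single layer applies k independent filter banks to k stacked inputs in one kernel launch. A minimal sketch of the idea (not sgmatch code):

    import torch
    import torch.nn as nn

    k = 3                                   # similarity images, one per GNN scale
    sim_images = torch.randn(8, k, 10, 10)  # (batch, k, H, W), images stacked as channels
    # groups=k gives each image its own independent bank of 4 output filters,
    # computed in one call instead of k separate Conv2d modules.
    grouped_conv = nn.Conv2d(in_channels=k, out_channels=4 * k,
                             kernel_size=3, padding=1, groups=k)
    out = grouped_conv(sim_images)          # shape: (8, 12, 10, 10)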

TODO: Provide description of implementation and differences from paper if any and update argument description

Parameters:
  • input_dim (int) – Input dimension of node feature vectors.

  • gnn (str, optional) – Type of Graph Neural Network to use to embed the node features ("Neuro-PNA" or "PNA" or "GCN" or "GAT" or "SAGE" or "GIN" or "graph" or "gated"). (default: 'GCN')

  • gnn_filters ([int], optional) – Number of hidden neurons in each layer of the GNN for embedding input node features. (default: [64,32,16])

  • conv_kernel_sizes – Kernel size of each convolution layer applied to the similarity images.

  • conv_in_channels – Number of input channels of each convolution layer.

  • conv_out_channels – Number of output channels of each convolution layer.

  • conv_stride – Stride of each convolution layer.

  • maxpool_kernel_sizes – Kernel size of each max-pooling layer following the convolution layers.

  • maxpool_stride – Stride of each max-pooling layer.

  • cnn_dropout_p (float, optional) – Dropout probability applied in the grouped convolutional network. (default: 0.2)

  • mlp_neurons ([int], optional) – Number of hidden neurons in each layer of the decoder MLP. (default: [32,16,8,4,1])

  • padding_correction (bool, optional) – Flag for whether to apply the padding correction described in the paper, compensating for the zero-padding introduced by batching graphs together. (default: True)

  • resize_dim (int, optional) – Dimension to resize the similarity image matrices to. (default: 10)

  • resize_mode (str, optional) – Interpolation method used to resize the similarity images ('nearest' | 'linear' | 'bilinear' | 'bicubic' | 'trilinear' | 'area' | 'nearest-exact'). (default: 'bilinear')

  • gnn_activation (str, optional) – Activation to be used in the GNN layers. (default: 'relu')

  • mlp_activation (str, optional) – Activation to be used in the MLP decoder layers. (default: 'relu')

  • gnn_dropout_p (float, optional) – Dropout probability for the GNN layers. (default: 0.5)

  • activation_slope (float, optional) – Slope of the negative part in case of "leaky_relu" activation. (default: 0.1)
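
An instantiation sketch under the assumption that the CNN hyperparameters are given as per-layer lists; the exact expected container types are not documented here.

    from sgmatch.models.GraphSim import GraphSim_v2

    model = GraphSim_v2(
        input_dim=16,
        conv_kernel_sizes=[3, 3],      # per-layer values below are assumptions
        conv_in_channels=[1, 8],
        conv_out_channels=[8, 16],
        conv_stride=[1, 1],
        maxpool_kernel_sizes=[2, 2],
        maxpool_stride=[2, 2],
        cnn_dropout_p=0.2,
        gnn="GCN",
        gnn_filters=[64, 32, 16],
    )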

NeuroMatch

class sgmatch.models.NeuroMatch.SkipLastGNN(input_dim: int, hidden_dim: int, output_dim: int, num_layers: int, conv_type: str = 'Neuro-PNA', dropout: float = 0.0, skip: str = 'learnable')[source]

End-to-end implementation of NeuroMatch from the “Neural Subgraph Matching” paper.

TODO: Provide argument description

Parameters:
  • input_dim (int) – Input dimension of node feature vectors.

  • hidden_dim (int) – Dimension of the hidden node embeddings in each GNN layer.

  • output_dim (int) – Output dimension of the node embedding vectors.

  • num_layers (int) – Number of GNN layers in the encoder.

  • conv_type (str, optional) – Type of Graph Neural Network to encode input features ("Neuro-PNA" or "PNA" or "GCN" or "GAT" or "SAGE" or "GIN" or "graph" or "gated"). (default: "Neuro-PNA")

  • dropout (float, optional) – Dropout probability to prevent overfitting (default: 0.0)

  • skip (str, optional) – Type of skip connection applied across GNN layers. (default: "learnable")
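
An instantiation sketch using the documented constructor; the dimension choices are arbitrary examples.

    from sgmatch.models.NeuroMatch import SkipLastGNN

    model = SkipLastGNN(
        input_dim=16,
        hidden_dim=64,
        output_dim=64,
        num_layers=4,
        conv_type="Neuro-PNA",
        skip="learnable",
    )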

SimGNN

class sgmatch.models.SimGNN.SimGNN(input_dim: int, ntn_slices: int = 16, filters: list = [64, 32, 16], mlp_neurons: List[int] = [32, 16, 8, 4], hist_bins: int = 16, conv: str = 'GCN', activation: str = 'tanh', activation_slope: Optional[float] = None, include_histogram: bool = True)[source]

End-to-end implementation of SimGNN from the “SimGNN: A Neural Network Approach to Fast Graph Similarity Computation” paper.

TODO: Provide description of implementation and differences from paper if any

Parameters:
  • input_dim (int) – Input dimension of node feature embedding vectors.

  • ntn_slices (int, optional) – Hyperparameter for the number of tensor slices in the Neural Tensor Network. In this domain, it denotes the number of interaction (similarity) scores produced by the model for each graph embedding pair. (default: 16)

  • filters ([int], optional) – Number of filters per convolutional layer in the graph convolutional encoder model. (default: [64, 32, 16])

  • mlp_neurons ([int], optional) – Number of hidden neurons in each linear layer of the MLP for reducing the dimensionality of the concatenated output of the neural tensor network and histogram features. Note that the final scoring weight tensor of size [mlp_neurons[-1], 1] is kept separate from the MLP, therefore specifying only the hidden layer sizes will suffice. (default: [32,16,8,4])

  • hist_bins (int, optional) – Hyperparameter controlling the number of bins in the node ordering histogram scheme. (default: 16)

  • conv (str, optional) – Type of graph convolutional architecture to be used for encoding ('GCN' or 'SAGE' or 'GAT') (default: 'GCN')

  • activation (str, optional) – Type of activation used in Attention and NTN modules. ('sigmoid' or 'relu' or 'leaky_relu' or 'tanh') (default: 'tanh')

  • activation_slope (float, optional) – Slope of function for leaky_relu activation. (default: None)

  • include_histogram (bool, optional) – Flag for including Strategy Two: Nodewise comparison from SimGNN. (default: True)
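
A sketch that builds SimGNN for a toy graph pair. The constructor follows the documented signature; the forward-call convention shown in the comment is an assumption, so consult the library for the exact interface.

    import torch
    from torch_geometric.data import Data

    from sgmatch.models.SimGNN import SimGNN

    model = SimGNN(input_dim=16, ntn_slices=16, filters=[64, 32, 16],
                   mlp_neurons=[32, 16, 8, 4], hist_bins=16, conv="GCN")

    # Two toy graphs with 16-dimensional node features.
    g1 = Data(x=torch.randn(5, 16),
              edge_index=torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]]))
    g2 = Data(x=torch.randn(4, 16),
              edge_index=torch.tensor([[0, 1, 2], [1, 2, 3]]))
    # Assumed call convention (verify against the library):
    # score = model(g1.x, g2.x, g1.edge_index, g2.edge_index)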

ISONET

class sgmatch.models.ISONET.ISONET(node_feature_dim: int, enc_node_hidden_sizes: List[int], prop_node_hidden_sizes: List[int], prop_message_hidden_sizes: List[int], edge_feature_dim: Optional[int] = None, enc_edge_hidden_sizes: Optional[List[int]] = None, message_net_init_scale: float = 0.1, node_update_type: str = 'GRU', use_reverse_direction: bool = True, reverse_dir_param_different: bool = True, layer_norm: bool = False, lrl_hidden_sizes: List[int] = [16, 16], temp: float = 0.1, eps: float = 1e-20, noise_factor: float = 1, gs_num_iters: int = 20)[source]

End-to-end implementation of the ISONET model from the “Interpretable Neural Subgraph Matching for Graph Retrieval” paper.

Parameters:
  • node_feature_dim (int) – Input dimension of node feature embedding vectors.

  • enc_node_hidden_sizes ([int]) – Number of hidden neurons in each linear layer for transforming the node features.

  • prop_node_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the node update MLP f_node. node_feature_dim is appended as the size of the final linear layer to maintain node embedding dimensionality.

  • prop_message_hidden_sizes ([int]) – Number of hidden neurons in each linear layer of the message computation MLP f_message. Note that the message vector dimensionality (prop_message_hidden_sizes[-1]) may not be equal to node_feature_dim.

  • edge_feature_dim (int, optional) – Input dimension of edge feature embedding vectors. (default: None)

  • enc_edge_hidden_sizes ([int], optional) – Number of hidden neurons in each linear layer for transforming the edge features. (default: None)

  • message_net_init_scale (float, optional) – Initialisation scale for the message net output vectors. (default: 0.1)

  • node_update_type (str, optional) – Type of update applied to node feature vectors ("GRU" or "MLP" or "residual"). (default: "GRU")

  • use_reverse_direction (bool, optional) – Flag for whether or not to use reverse message aggregation in the propagation step. (default: True)

  • reverse_dir_param_different (bool, optional) – Flag for whether or not the message computation parameters should be different for the forward and reverse directions in the propagation step. (default: True)

  • layer_norm (bool, optional) – Flag for applying layer normalization in the propagation step. (default: False)

  • lrl_hidden_sizes ([int], optional) – List containing the hidden sizes of the LRL network that transforms the edge features of the input graphs. (default: [16,16])

  • temp (float, optional) – Temperature parameter in the Gumbel-Sinkhorn Network. (default: 0.1)

  • eps (float, optional) – Small value for numerical stability and precision in the Gumbel-Sinkhorn Network. (default: 1e-20)

  • noise_factor (float, optional) – Parameter which controls the magnitude of the effect of sampled Gumbel Noise. (default: 1)

  • gs_num_iters (int, optional) – Number of iterations of Sinkhorn row and column scaling (in practice, as few as 20 iterations are needed to achieve decent convergence for N~100). (default: 20)
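
An instantiation sketch using the documented constructor and its defaults for the Gumbel-Sinkhorn network; hidden sizes are arbitrary examples.

    from sgmatch.models.ISONET import ISONET

    model = ISONET(
        node_feature_dim=16,
        enc_node_hidden_sizes=[32],
        prop_node_hidden_sizes=[64, 32],
        prop_message_hidden_sizes=[64],
        node_update_type="GRU",
        lrl_hidden_sizes=[16, 16],  # LRL network over edge features
        temp=0.1,                   # Gumbel-Sinkhorn temperature
        gs_num_iters=20,            # Sinkhorn scaling iterations
    )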