g3d_layer
Graphormer layer with scalar distance features as attention bias.
Implements the G3D layer as described in [M-OFDFT], based on [Graphormer].
- class G3DLayer(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
The G3D layer as described in [M-OFDFT].
out_channels are computed by dividing the input dimension by the number of heads. Based on TransformerConv as implemented in torch_geometric.nn.conv.transformer_conv.
- __init__(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
Initialize the G3DLayer.
- Parameters:
in_channels (int) – Size of each input sample.
heads (int, optional) – Number of multi-head-attentions. (default: 32)
edge_dim (int) – Edge feature dimensionality (in case there are any). Edge features are added to the attention weights before applying the soft(arg)max. (default: 1)
dropout (float, optional) – Dropout probability of the MLP. Defaults to 0.0.
attention_weight_dropout (float, optional) – Dropout probability of the attention weights. Defaults to 0.0.
mlp_hidden_dim (int, optional) – Hidden dimensionality of the MLP. If None, defaults to in_channels.
mlp_activation (torch.nn.Module, optional) – Activation function of the MLP. Defaults to torch.nn.GELU().
activation_dropout (float, optional) – Dropout probability of the activation function. Defaults to 0.0.
**kwargs (optional) – Additional arguments of torch_geometric.nn.conv.MessagePassing.
- Raises:
ValueError – If the number of heads does not divide the number of input channels.
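A minimal construction and call sketch (the import path g3d_layer and the edge_attr shape are assumptions; adapt them to your package layout):

```python
import torch
from g3d_layer import G3DLayer  # assumed import path

# in_channels must be divisible by heads, otherwise __init__ raises ValueError.
layer = G3DLayer(in_channels=128, heads=32, edge_dim=1, dropout=0.1)

num_nodes, num_edges = 6, 12
x = torch.randn(num_nodes, 128)                           # node features
edge_index = torch.randint(0, num_nodes, (2, num_edges))  # edge indices
edge_attr = torch.randn(num_edges, 1)                     # one scalar distance feature per edge (shape is an assumption)
batch = torch.zeros(num_nodes, dtype=torch.long)          # all nodes belong to graph 0

out = layer(x, edge_index, batch, edge_attr=edge_attr)    # -> (num_nodes, 128)
```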
- __setstate__(state: dict) → None[source]
This method is called during unpickling.
If ‘cutoff_start’ is missing (as would be the case with an older checkpoint), it will be added with a default value.
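A sketch of how such a backfill typically looks (the default value 0.0 matches the constructor's cutoff_start default; this is illustrative, not the exact implementation):

```python
import torch

class ExampleLayer(torch.nn.Module):
    def __setstate__(self, state: dict) -> None:
        # Checkpoints written before 'cutoff_start' existed lack this key;
        # insert the constructor default before restoring the state.
        state.setdefault("cutoff_start", 0.0)
        super().__setstate__(state)
```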
- compute_attention(query_i: Tensor, key_j: Tensor, edge_attr: Tensor, index: Tensor, ptr: Tensor | None, size_i: int | None)[source]
Compute the attention weights.
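Conceptually this is scaled dot-product attention with the edge features added as a bias before a per-destination softmax, as described for edge_dim above. A rough sketch of that computation (not the class's exact code):

```python
import torch
from torch_geometric.utils import softmax

def attention_sketch(query_i, key_j, edge_attr, index, ptr, size_i):
    # query_i, key_j: (E, heads, channels_per_head)
    d = query_i.size(-1)
    scores = (query_i * key_j).sum(dim=-1) / d ** 0.5  # (E, heads)
    # Additive bias from the edge features (assumed broadcastable to (E, heads)).
    scores = scores + edge_attr
    # Softmax over all edges that share the same destination node.
    return softmax(scores, index, ptr, size_i)
```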
- forward(x: Tensor, edge_index: Tensor | SparseTensor, batch: Tensor, edge_attr: Tensor | None = None, length=None) → Tensor[source]
Runs the forward pass of the module.
The forward pass is defined as:
x = MHAtt(LN(input) + edge_attr) + input
output = x + MLP(LN(x))
- Parameters:
x (torch.Tensor) – The input node features.
edge_index (torch.Tensor) – The edge indices.
batch (torch.Tensor) – Batch assigning each node to a specific graph.
edge_attr (torch.Tensor, optional) – The edge features. (default: None)
- Returns:
The output node features.
- Return type:
torch.Tensor
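Written out, the two residual sub-blocks from the formula above look roughly like this (a structural sketch, not the actual implementation; ln1, ln2, mha, and mlp stand in for the layer's norm, attention, and MLP sub-modules):

```python
def forward_sketch(x, edge_attr, ln1, ln2, mha, mlp):
    # First residual block: pre-norm multi-head attention,
    # with edge_attr biasing the attention weights.
    h = mha(ln1(x), edge_attr) + x
    # Second residual block: pre-norm feed-forward MLP.
    return h + mlp(ln2(h))
```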
- message(query_i: Tensor, key_j: Tensor, value_j: Tensor, edge_attr: Tensor, index: Tensor, ptr: Tensor | None, size_i: int | None, length: Tensor | None = None) → Tensor[source]
Message function of the G3D layer. Computes the attention weights of each edge, with the corresponding edge_attr added.
- Parameters:
query_i – query edge tensor of shape (E, heads, channels_per_head)
key_j – key edge tensor of shape (E, heads, channels_per_head)
value_j – value edge tensor of shape (E, heads, channels_per_head)
edge_attr – edge features
index – the destination node indices of the edges
ptr – pointers indicating where each graph in a batch starts and ends
size_i – the dimension over which the softmax normalizes
length – the lengths of the edges, ordered as in edge_index
- Returns:
The per-edge messages (attention-weighted values).
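The cutoff and cutoff_start constructor arguments, together with length here, suggest a distance-based envelope applied per edge. A common form such an envelope can take is a cosine switch (an assumption for illustration, not necessarily what this layer uses):

```python
import torch

def cosine_envelope(length: torch.Tensor, cutoff_start: float, cutoff: float) -> torch.Tensor:
    # 1 below cutoff_start, smooth cosine decay to 0 at cutoff, 0 beyond.
    t = ((length - cutoff_start) / (cutoff - cutoff_start)).clamp(0.0, 1.0)
    return 0.5 * (1.0 + torch.cos(torch.pi * t))
```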
- class G3DLayerMul(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
G3D layer with multiplicative attention bias.
- class G3DLayerMulSilu(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
G3D layer with multiplicative attention bias and SiLU activation function.
- class G3DLayerSilu(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
G3D layer with SiLU activation function.