g3d_layer
Graphormer layer with scalar distance features as attention bias.
Implements the G3D layer as described in [M-OFDFT], based on [Graphormer].
- class G3DLayer(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
The G3D layer as described in [M-OFDFT].
out_channels are computed by dividing the input dimension by the number of heads. Based on TransformerConv as implemented in torch_geometric.nn.conv.transformer_conv.
- __init__(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
Initialize the G3DLayer.
- Parameters:
in_channels (int) – Size of each input sample.
heads (int, optional) – Number of multi-head-attentions. (default: 32)
edge_dim (int) – Edge feature dimensionality (in case there are any). Edge features are added to the attention weights before applying the soft(arg)max. (default: 1)
dropout (float, optional) – Dropout probability of the MLP. Defaults to 0.0.
attention_weight_dropout (float, optional) – Dropout probability of the attention weights. Defaults to 0.0.
mlp_hidden_dim (int, optional) – Hidden dimensionality of the MLP. If None, defaults to in_channels.
mlp_activation (torch.nn.Module, optional) – Activation function of the MLP. Defaults to torch.nn.GELU().
activation_dropout (float, optional) – Dropout probability of the activation function. Defaults to 0.0.
**kwargs (optional) – Additional arguments of torch_geometric.nn.conv.MessagePassing.
- Raises:
ValueError – If the number of heads does not divide the number of input channels.
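A minimal construction and call sketch (the import path g3d_layer and the edge_attr shape are assumptions; adapt them to your package layout):

```python
import torch
from g3d_layer import G3DLayer  # assumed import path

# in_channels must be divisible by heads, otherwise __init__ raises ValueError.
layer = G3DLayer(in_channels=128, heads=32, edge_dim=1, dropout=0.1)

num_nodes, num_edges = 6, 12
x = torch.randn(num_nodes, 128)                           # node features
edge_index = torch.randint(0, num_nodes, (2, num_edges))  # edge indices
edge_attr = torch.randn(num_edges, 1)                     # one scalar distance feature per edge (shape is an assumption)
batch = torch.zeros(num_nodes, dtype=torch.long)          # all nodes belong to graph 0

out = layer(x, edge_index, batch, edge_attr=edge_attr)    # -> (num_nodes, 128)
```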
- __setstate__(state: dict) → None[source]
This method is called during unpickling.
If ‘cutoff_start’ is missing (as would be the case with an older checkpoint), it will be added with a default value.
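A sketch of how such a backfill typically looks (the default value 0.0 matches the constructor's cutoff_start default; this is illustrative, not the exact implementation):

```python
import torch

class ExampleLayer(torch.nn.Module):
    def __setstate__(self, state: dict) -> None:
        # Checkpoints written before 'cutoff_start' existed lack this key;
        # insert the constructor default before restoring the state.
        state.setdefault("cutoff_start", 0.0)
        super().__setstate__(state)
```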
- compute_attention(query_i: Tensor, key_j: Tensor, edge_attr: Tensor, index: Tensor, ptr: Tensor | None, size_i: int | None)[source]
Compute the attention weights.
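Conceptually this is scaled dot-product attention with the edge features added as a bias before a per-destination softmax, as described for edge_dim above. A rough sketch of that computation (not the class's exact code):

```python
import torch
from torch_geometric.utils import softmax

def attention_sketch(query_i, key_j, edge_attr, index, ptr, size_i):
    # query_i, key_j: (E, heads, channels_per_head)
    d = query_i.size(-1)
    scores = (query_i * key_j).sum(dim=-1) / d ** 0.5  # (E, heads)
    # Additive bias from the edge features (assumed broadcastable to (E, heads)).
    scores = scores + edge_attr
    # Softmax over all edges that share the same destination node.
    return softmax(scores, index, ptr, size_i)
```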
- forward(x: Tensor, edge_index: Tensor | SparseTensor, batch: Tensor, edge_attr: Tensor | None = None, length=None) → Tensor[source]
Runs the forward pass of the module.
The forward pass is defined as:
x = MHAtt(LN(input) + edge_attr) + input
output = x + MLP(LN(x))
- Parameters:
x (torch.Tensor) – The input node features.
edge_index (torch.Tensor) – The edge indices.
batch (torch.Tensor) – Batch assigning each node to a specific graph.
edge_attr (torch.Tensor, optional) – The edge features. (default: None)
- Returns:
The output node features.
- Return type:
torch.Tensor
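Written out, the two residual sub-blocks from the formula above look roughly like this (a structural sketch, not the actual implementation; ln1, ln2, mha, and mlp stand in for the layer's norm, attention, and MLP sub-modules):

```python
def forward_sketch(x, edge_attr, ln1, ln2, mha, mlp):
    # First residual block: pre-norm multi-head attention,
    # with edge_attr biasing the attention weights.
    h = mha(ln1(x), edge_attr) + x
    # Second residual block: pre-norm feed-forward MLP.
    return h + mlp(ln2(h))
```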
- message(query_i: Tensor, key_j: Tensor, value_j: Tensor, edge_attr: Tensor, index: Tensor, ptr: Tensor | None, size_i: int | None, length: Tensor | None = None) → Tensor[source]
Message function of the G3D layer. Computes the attention weights of each edge, with the corresponding edge_attr added.
- Parameters:
query_i – query edge tensor of shape (E, heads, channels_per_head)
key_j – key edge tensor of shape (E, heads, channels_per_head)
value_j – value edge tensor of shape (E, heads, channels_per_head)
edge_attr – edge features
index – the destination node indices of the edges
ptr – pointers indicating where each graph in a batch starts and ends
size_i – the dimension over which the softmax normalizes
length – the lengths of the edges, ordered as in edge_index
- Returns:
The per-edge messages (attention-weighted values).
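The cutoff and cutoff_start constructor arguments, together with length here, suggest a distance-based envelope applied per edge. A common form such an envelope can take is a cosine switch (an assumption for illustration, not necessarily what this layer uses):

```python
import torch

def cosine_envelope(length: torch.Tensor, cutoff_start: float, cutoff: float) -> torch.Tensor:
    # 1 below cutoff_start, smooth cosine decay to 0 at cutoff, 0 beyond.
    t = ((length - cutoff_start) / (cutoff - cutoff_start)).clamp(0.0, 1.0)
    return 0.5 * (1.0 + torch.cos(torch.pi * t))
```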
- class G3DLayerMul(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
G3D layer with multiplicative attention bias.
- class G3DLayerMulSilu(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
G3D layer with multiplicative attention bias and SiLU activation function.
- class G3DLayerSilu(in_channels: int, heads: int = 32, edge_dim: int = 1, dropout: float = 0.0, attention_weight_dropout: float = 0.0, mlp_hidden_dim: int | None = None, mlp_activation: ~torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.GELU'>, mlp_norm_layer: ~torch.nn.modules.module.Module = None, norm_layer_class: ~torch.nn.modules.module.Module = <class 'torch_geometric.nn.norm.layer_norm.LayerNorm'>, activation_dropout: float = 0.0, cutoff: float | None = None, cutoff_start: float = 0.0, **kwargs)[source]
G3D layer with SiLU activation function.