Multi-layer Graph Attention Network Model (Veličković et al. 2018)
Stacks multiple GAT layers with multi-head attention.
Usage
model_gat(
  in_features,
  hidden_dims,
  out_features,
  heads = 8,
  out_heads = 1,
  activation = nnf_elu,
  out_activation = NULL,
  dropout = 0.6,
  att_dropout = 0.6,
  negative_slope = 0.2
)

Arguments
- in_features
Integer. Number of input features per node
- hidden_dims
Integer vector. Dimensions of hidden layers (length = L)
- out_features
Integer. Number of output features (typically 1 for regression)
- heads
Integer. Number of attention heads for hidden layers. Default: 8
- out_heads
Integer. Number of attention heads for output layer. Default: 1
- activation
Function. Activation for hidden layers. Default: torch::nnf_elu
- out_activation
Function or NULL. Activation for output layer. Default: NULL
- dropout
Numeric. Dropout rate (0-1) applied to node features between layers. Default: 0.6
- att_dropout
Numeric. Dropout rate for attention coefficients. Default: 0.6
- negative_slope
Numeric. Negative slope for LeakyReLU in attention. Default: 0.2
- x
Tensor of shape n_nodes x in_features. Node feature matrix (dense or sparse)
- adj
Sparse torch tensor of shape n_nodes x n_nodes. Adjacency matrix defining graph structure. Must be a sparse COO tensor.
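Since adj must be a sparse COO tensor, here is a minimal sketch of preparing one with torch for R (the 4-node cycle graph and the 14 input features are illustrative, chosen to match the examples below):

library(torch)

n_nodes <- 4
adj_dense <- torch_zeros(n_nodes, n_nodes)
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 1))  # 1-based node pairs
for (e in seq_len(nrow(edges))) {
  i <- edges[e, 1]; j <- edges[e, 2]
  adj_dense[i, j] <- 1
  adj_dense[j, i] <- 1  # mirror each edge: undirected graph
}
adj <- adj_dense$to_sparse()   # convert to the sparse COO layout required here
x <- torch_randn(n_nodes, 14)  # dense node feature matrix, n_nodes x in_features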
Details
Architecture:
- L hidden GAT layers with configurable activation
- 1 output GAT layer with optional output activation
- Total layers = length(hidden_dims) + 1

Each layer uses multi-head attention to learn importance weights for neighbor aggregation. Hidden layers typically concatenate their attention heads, so each hidden layer's effective output width is hidden_dims[i] * heads; the output layer averages its heads instead, leaving out_features columns.
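As a shape check, a minimal usage sketch (assuming the model is called as model(x, adj), per the Arguments above, with x and adj as built in the sketch under Arguments):

model <- model_gat(
  in_features = 14,
  hidden_dims = c(8, 8),
  out_features = 1,
  heads = 8,
  out_heads = 1
)
# Hidden layers concatenate heads: width 8 * 8 = 64 after each hidden layer;
# the single-head output layer averages, leaving n_nodes x out_features.
out <- model(x, adj)
out$shape  # 4 x 1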
References
Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., & Bengio, Y. (2018). Graph Attention Networks. International Conference on Learning Representations. doi:10.48550/arXiv.1710.10903
Examples
if (FALSE) { # \dontrun{
# Binary classification with 8-head attention
model <- model_gat(14, c(8, 8), 1, out_activation = nnf_sigmoid)

# Multi-class with 4 heads
model <- model_gat(
  14,
  c(16, 16),
  3,
  heads = 4,
  out_activation = function(x) nnf_softmax(x, dim = -1)
)

# Regression with custom dropout
model <- model_gat(14, c(32, 32), 1, dropout = 0.5, att_dropout = 0.5)
} # }