high-performance {tabnet} profvis on CPU
This issue aims to improve tabnet performance by using common profiling tools to understand where optimization effort should go.
proposed performance script
The goal here is to have the largest batch available to run on a CPU, in order to favor time spent in compute over time spent in data movement.
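To make the batch reasoning concrete, here is a minimal sketch of the arithmetic for the batch settings used in the script in this issue. It rests on two assumptions: that `train-0.1m.csv` holds 100,000 rows (replace `n_rows` with `nrow(d_train)` to check), and that each batch is split into `ceiling(rows / virtual_batch_size)` virtual batches. All variable names here are illustrative.

```r
# Assumption: train-0.1m.csv holds 100,000 rows; use nrow(d_train) in practice.
n_rows <- 100000
batch_size <- 1024^2          # 1,048,576, as in the script
virtual_batch_size <- 262144  # as in the script

# Number of batches the data loader has to move per epoch.
batches_per_epoch <- ceiling(n_rows / batch_size)

# A batch cannot hold more rows than the dataset has.
rows_in_batch <- min(n_rows, batch_size)

# Assumption: each batch is chunked into ceiling(rows / virtual_batch_size) pieces.
virtual_batches_per_batch <- ceiling(rows_in_batch / virtual_batch_size)

cat("batches per epoch:", batches_per_epoch, "\n")
cat("virtual batches per batch:", virtual_batches_per_batch, "\n")
```

Under these assumptions both values are 1: the whole training set fits in a single batch, so an epoch is dominated by compute rather than by batch assembly.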
library(tabnet)
# use local caching
d_train <- data.table::fread(pins::pin("https://s3.amazonaws.com/benchm-ml--main/train-0.1m.csv"), stringsAsFactors=TRUE)
d_test <- data.table::fread(pins::pin("https://s3.amazonaws.com/benchm-ml--main/test.csv"))
## align cat. values (factors)
d_train_test <- rbind(d_train, d_test)
n1 <- nrow(d_train)
n2 <- nrow(d_test)
d_train <- d_train_test[1:n1,]
d_test <- d_train_test[(n1+1):(n1+n2),]
system.time({
  md <- tabnet_fit(dep_delayed_15min ~ ., d_train, device = "cpu",
                   epochs = 5, batch_size = 1024^2,
                   virtual_batch_size = 262144, verbose = TRUE)
})
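One possible way to produce the flame graph for the result table is to wrap the workload in `profvis::profvis()` and save the widget to HTML. This is a sketch, not the procedure used in this issue: `run_workload()` is a hypothetical stand-in so the example runs without tabnet or the benchmark data, and the output file name is illustrative. To profile the real run, the `tabnet_fit()` call from the script would go inside the `profvis({ ... })` block instead.

```r
library(profvis)

# Hypothetical stand-in workload: it only needs to run long enough for the
# sampling profiler to collect samples. Replace with the tabnet_fit() call.
run_workload <- function(n_iter = 20) {
  for (i in seq_len(n_iter)) {
    m <- matrix(rnorm(1e6), nrow = 1000)
    crossprod(m)
  }
  invisible(NULL)
}

# Collect the profile; the result is an htmlwidget showing the flame graph.
p <- profvis({
  run_workload()
})

# Save the flame graph so it can be attached or linked from the table.
# selfcontained = FALSE writes a companion "_files" directory next to the HTML.
out_file <- file.path(tempdir(), "tabnet-cpu-profile.html")
htmlwidgets::saveWidget(p, out_file, selfcontained = FALSE)
```

Saving the widget rather than taking a screenshot keeps the flame graph interactive, which helps when comparing the Linux, Windows and macOS runs.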
proposed result table
CPU Linux
| Actual CPU profile | Expected CPU profile | Actual profvis flame graph |
|---|---|---|
| ![]() | ![]() | ![]() |
Profvis data

CPU Windows
| Actual CPU profile | Expected CPU profile | Actual profvis flame graph |
|---|---|---|
Profvis data
CPU macOS
| Actual CPU profile | Expected CPU profile | Actual profvis flame graph |
|---|---|---|


