Bayesian regularization support in DALEX
In some social science fields large datasets do not exist, and researchers must make decisions from a small number of samples (the p >> n problem). It is good to see support for this in R (the tfprobability and brnn packages). Does the DALEX team have any thoughts or comments on this?
@asheetal the size of the data should not matter for the implemented XAI techniques (neither local nor global), but let's try. Do you have any trained models for tests?
In a recent experiment with p >> n, what I did was as follows:
create a p x l array of ranks (p = number of predictors, l = number of runs, 1000 below)
for (i in 1:1000) {
    randomize the seed
    build a keras model
    generate the variable importance ranking with DALEX
    for each predictor, append its rank from DALEX to that predictor's list
}
sort the predictors by how many times each was ranked first, then second, and so on
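The loop above can be sketched in Python as follows. This is only an illustration of the bookkeeping, not the code used in the experiment: `importance_ranks` is a hypothetical stand-in for "seed, fit a keras model, ask DALEX for variable importance", and it returns ranks derived from made-up noisy scores. The predictor names and the noise level are assumptions as well.

```python
import random
from collections import Counter


def importance_ranks(predictors, seed):
    """Hypothetical stand-in for one run: seed, fit a model, rank predictors.

    Assumption: scores come from a made-up signal plus Gaussian noise,
    not from a real model or from DALEX.
    """
    rng = random.Random(seed)
    scores = {
        name: (len(predictors) - i) + rng.gauss(0, 2.0)
        for i, name in enumerate(predictors)
    }
    # Rank 1 is the most important predictor in this run.
    ordered = sorted(predictors, key=lambda name: scores[name], reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def collect_ranks(predictors, n_runs):
    """Append each predictor's rank from every run to its own list."""
    ranks = {name: [] for name in predictors}
    for seed in range(n_runs):
        for name, rank in importance_ranks(predictors, seed).items():
            ranks[name].append(rank)
    return ranks


def sort_by_rank_counts(ranks):
    """Order predictors by number of 1st places, then 2nd places, and so on."""
    n = len(ranks)

    def key(name):
        counts = Counter(ranks[name])
        # Negate the counts so that more top ranks sort first.
        return tuple(-counts[r] for r in range(1, n + 1))

    return sorted(ranks, key=key)


predictors = ["x1", "x2", "x3", "x4", "x5"]  # hypothetical names
ranks = collect_ranks(predictors, n_runs=1000)
final_order = sort_by_rank_counts(ranks)
print(final_order)
```

Each entry of `ranks` is the raw material for a per-predictor histogram, so a single run (`n_runs=1`) would keep only one draw from that distribution.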
It indeed helped. The final ranking was a histogram of ranks for each predictor. Had I run it only once (l = 1), I would have gotten completely inaccurate results.
Forgot to add: the problem is not within DALEX. The problem is the model itself. For p >> n, the model must be Bayesian (probabilistic). DALEX would therefore have to work in conjunction with models such as those from tfprobability, so that variable importance is no longer a single rank but a probabilistic range of ranks. The researcher can then decide how to infer the rank: median, max, min, or overlap between ranges.
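As a rough illustration of reading a range of ranks, the sketch below summarizes each predictor's ranks by minimum, median, and maximum, and checks whether two ranges overlap. The rank lists are made-up numbers, and the helper names `summarize_ranks` and `ranges_overlap` are hypothetical. With a Bayesian model the lists would come from posterior samples rather than from the hand-written values used here.

```python
from statistics import median


def summarize_ranks(ranks):
    """Reduce each predictor's list of ranks to min, median, and max."""
    return {
        name: {"min": min(values), "median": median(values), "max": max(values)}
        for name, values in ranks.items()
    }


def ranges_overlap(a, b):
    """True if the [min, max] rank intervals of two predictors intersect."""
    return a["min"] <= b["max"] and b["min"] <= a["max"]


# Made-up ranks for three hypothetical predictors over four runs.
ranks = {
    "x1": [1, 1, 2, 1],
    "x2": [2, 3, 1, 2],
    "x3": [3, 2, 3, 3],
}
summary = summarize_ranks(ranks)
for name, stats in summary.items():
    print(name, stats)
print("x1 and x3 overlap:", ranges_overlap(summary["x1"], summary["x3"]))
```

Overlapping ranges suggest that the data cannot separate two predictors, which is a different conclusion from the strict ordering a single run would imply.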