Sebastian Franz
It would be nice to have a tutorial on how to use custom embedders with biotrainer. This way, new protein language models can be used directly in biotrainer without having to...
The ppi interaction mode is not yet compatible with all protocols. `sequence_to_class` has been tested thoroughly. Other per-sequence protocols should work as well. However, for per-residue tasks (`residue_to_class`), changes...
Once the cross_validation PR is merged, parameter search for nested cross validation will be enabled. It would be nice to extend this behaviour to hold_out cross validation as well. A...
As a researcher, it would be nice to have an automatic random baseline as a comparison for every run. This could be included in the final test metrics: `test set...
The LightAttention model used for the residues_to_class protocol uses BatchNorm1D. However, a batch size of 1 is not possible with BatchNorm1D. Because a batch size of 1 is an...
Currently, the config file is loaded first (but not yet completely sanity checked; for example, biotrainer does not check whether the input files actually exist, so embeddings might be...
Implement residue to value protocol: ```text residue_to_value --> Predict a value V for each residue encoded in D dimensions in a sequence of length L. Input BxLxD --> output BxLxV...
Many machine learning researchers use different platforms to visualize their parameters, model output, and training. At the moment, we only support TensorBoard. It would be possible to...
Currently, embeddings must be pre-calculated before the training process starts. For some users, it might be beneficial to calculate the embeddings on the fly, especially if they only require low...
Currently, if a user has already run biotrainer and pre-computed embeddings therefore exist, these embeddings are not re-computed when the sequences.fasta file has changed. This might be a...
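One possible way to detect a changed sequences.fasta file is to store a checksum of the file alongside the embeddings and compare it on the next run. The following is only a minimal sketch of that idea, not biotrainer's actual implementation: the function names `file_sha256`, `embeddings_are_stale`, and `store_checksum`, as well as the idea of a separate checksum file, are hypothetical and chosen for illustration.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 64 KiB chunks so large FASTA files are not loaded at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def embeddings_are_stale(fasta_path: Path, checksum_path: Path) -> bool:
    """Return True if no checksum was stored or the FASTA content changed."""
    # Hypothetical convention: the checksum file is written after embedding.
    if not checksum_path.exists():
        return True
    return checksum_path.read_text().strip() != file_sha256(fasta_path)


def store_checksum(fasta_path: Path, checksum_path: Path) -> None:
    """Record the current FASTA checksum once embeddings have been computed."""
    checksum_path.write_text(file_sha256(fasta_path))
```

In such a setup, `embeddings_are_stale` would be checked before reusing existing embeddings, and `store_checksum` would be called after a successful embedding run. Hashing the file content rather than comparing modification times avoids re-computation when the file was merely copied or touched.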