Elaborate a test scenario coherent with fertility context
- [x] Elaborate the scenario
- [x] Identify a public dataset of choice for running this experimental scenario
- [ ] Adapt library for working with this public dataset (in particular the NN architecture)

Hello! Here is a proposed scenario:
4 nodes, each containing text and images for every cycle. We can assume 5 images per cycle:

- Node 1: 5000 cycles, so 25000 images (48% of the 'complete' database)
- Node 2: 900 cycles, so 4500 images (9%)
- Node 3: 3000 cycles, so 15000 images (29%)
- Node 4: 1500 cycles, so 7500 images (14%)

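One cycle can't be in two nodes. So when we set up this scenario, we need to be careful about how we split the database into the four nodes: the split has to be done per cycle, keeping all the images of a cycle together, rather than per image.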
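
To make the splitting constraint concrete, here is a minimal sketch of a per-cycle split, assuming each image carries a cycle identifier. The helper `split_cycles`, the node names, and the placeholder IDs are all hypothetical and not part of the library; only the cycle counts and the 5 images per cycle come from the scenario above.

```python
import random

# Cycles per node, taken from the proposed scenario.
CYCLES_PER_NODE = {"node_1": 5000, "node_2": 900, "node_3": 3000, "node_4": 1500}
IMAGES_PER_CYCLE = 5  # assumption stated in the scenario


def split_cycles(cycle_ids, cycles_per_node, seed=0):
    """Assign each cycle ID to exactly one node (hypothetical helper)."""
    needed = sum(cycles_per_node.values())
    unique_ids = sorted(set(cycle_ids))  # deduplicate so a cycle is counted once
    if len(unique_ids) < needed:
        raise ValueError(f"need {needed} cycles, got {len(unique_ids)}")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(unique_ids)
    assignment, start = {}, 0
    for node, count in cycles_per_node.items():
        # Consecutive, non-overlapping slices: no cycle lands in two nodes.
        assignment[node] = unique_ids[start:start + count]
        start += count
    return assignment


# Placeholder cycle IDs; a real dataset would provide its own identifiers.
cycle_ids = [f"cycle_{i:05d}" for i in range(10400)]
nodes = split_cycles(cycle_ids, CYCLES_PER_NODE)

total_images = sum(len(c) for c in nodes.values()) * IMAGES_PER_CYCLE
for node, cycles in nodes.items():
    n_images = len(cycles) * IMAGES_PER_CYCLE
    print(f"{node}: {len(cycles)} cycles, {n_images} images ({n_images / total_images:.0%})")
```

Each image would then go to the node that owns its cycle, so the image counts follow from the cycle counts. If the public dataset has a variable number of images per cycle, the per-node image shares will drift from the percentages above.
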
We could use this database, since it contains both text and images. It is the dataset used in this paper, which @Thomas-Galtier shared. I'm not sure they use the metadata in their model, though. We could also use only the images.
What do you think?
@celinejacques I am recycling this issue and linking it to your PyTorch PR!
Hi guys @celinejacques @jeromechambost, I'm doing some sorting in the old issues. Is this one still of interest to you, and is it realistic?