Elaborate a test scenario coherent with fertility context

Open bowni opened this issue 5 years ago • 3 comments

  • [x] Elaborate the scenario

  • [x] Identify a public dataset of choice for running this experimental scenario

  • [ ] Adapt library for working with this public dataset (in particular the NN architecture)

bowni avatar Mar 26 '20 08:03 bowni

Hello! Here is a proposed scenario:

4 nodes containing text and images for each cycle. We can consider 5 images per cycle.

  • Node 1: 5000 cycles, so 25000 images (48% of the 'complete' database)

  • Node 2: 900 cycles, so 4500 images (9%)

  • Node 3: 3000 cycles, so 15000 images (29%)

  • Node 4: 1500 cycles, so 7500 images (14%)
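As a quick sanity check of the figures above, here is a minimal Python sketch that recomputes the image counts and shares from the cycle counts. The names `CYCLES_PER_NODE`, `IMAGES_PER_CYCLE` and `node_summary` are illustrative only and not part of the library.

```python
# Cycle counts per node, taken from the proposal above.
CYCLES_PER_NODE = {"node_1": 5000, "node_2": 900, "node_3": 3000, "node_4": 1500}
# Assumption stated in the proposal: 5 images per cycle.
IMAGES_PER_CYCLE = 5


def node_summary(cycles_per_node, images_per_cycle):
    """Return the image count and share of the whole database for each node."""
    total_cycles = sum(cycles_per_node.values())
    return {
        name: {
            "images": n_cycles * images_per_cycle,
            "share": n_cycles / total_cycles,
        }
        for name, n_cycles in cycles_per_node.items()
    }


summary = node_summary(CYCLES_PER_NODE, IMAGES_PER_CYCLE)
for name, row in summary.items():
    print(f"{name}: {row['images']} images ({row['share']:.0%})")
```

Because every cycle has the same number of images, the share is the same whether it is computed on cycles or on images.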

A cycle can't be in two nodes. So when we set up this scenario, we need to be careful about how we split the database to create the four nodes.
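One possible way to respect this constraint is to split at the cycle level first and only then distribute the images. The sketch below assumes each sample is a `(cycle_id, image)` pair; the function names, the toy data and the file names are hypothetical and are not taken from the library or the dataset.

```python
import random


def assign_cycles_to_nodes(cycle_ids, cycles_per_node, seed=0):
    """Map each distinct cycle ID to exactly one node."""
    unique_ids = sorted(set(cycle_ids))
    if sum(cycles_per_node.values()) > len(unique_ids):
        raise ValueError("not enough distinct cycles for the requested node sizes")
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    assignment, start = {}, 0
    for node, n_cycles in cycles_per_node.items():
        # Each node receives its own contiguous slice of the shuffled IDs,
        # so no cycle can end up in two nodes.
        for cycle_id in unique_ids[start:start + n_cycles]:
            assignment[cycle_id] = node
        start += n_cycles
    return assignment


def split_samples(samples, assignment):
    """Group (cycle_id, image) samples by the node that owns their cycle."""
    nodes = {}
    for cycle_id, image in samples:
        node = assignment.get(cycle_id)
        if node is not None:  # cycles not assigned to any node are left out
            nodes.setdefault(node, []).append((cycle_id, image))
    return nodes


# Toy data: 10 cycles with 5 placeholder image names each.
samples = [(c, f"cycle{c}_img{i}.png") for c in range(10) for i in range(5)]
toy_sizes = {"node_1": 5, "node_2": 1, "node_3": 3, "node_4": 1}
assignment = assign_cycles_to_nodes([c for c, _ in samples], toy_sizes)
nodes = split_samples(samples, assignment)
```

Splitting on cycle IDs rather than on individual images keeps all images of a cycle together, which is the property the scenario needs.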

We could use this database, since it contains both text and images. It is the dataset used in this paper, which @Thomas-Galtier shared, but I'm not sure they use the metadata in their model. We could also use only the images.

What do you think?

celinejacques avatar Mar 27 '20 16:03 celinejacques

@celinejacques I am recycling this issue and linking it to your PyTorch PR!

bowni avatar Nov 06 '20 10:11 bowni

Hi @celinejacques @jeromechambost, I'm sorting through the old issues. Is this one still of interest to you, and is it realistic?

bowni avatar Aug 31 '21 09:08 bowni