Ting Chen

65 comments by Ting Chen

For augmentation, we find random crop and flip sufficient; adding color distortion or blur yields very similar results. This is likely due to the augmentation already used in pretraining.

hidden1 and hidden2 are both of shape (bsz, dim), so corresponding rows of hidden1 and hidden2 form the positive pairs.
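A minimal NumPy sketch of this pairing (the array names and sizes here are illustrative, not the repo's actual tensors): row i of hidden1 and row i of hidden2 come from two augmented views of the same image, so after L2 normalization the diagonal of their similarity matrix holds the positive logits.

```python
import numpy as np

rng = np.random.default_rng(0)
bsz, dim = 4, 8
hidden1 = rng.normal(size=(bsz, dim))
# Same images seen through a slightly different view (small perturbation).
hidden2 = hidden1 + 0.01 * rng.normal(size=(bsz, dim))

# L2-normalize so dot products are cosine similarities.
hidden1 /= np.linalg.norm(hidden1, axis=1, keepdims=True)
hidden2 /= np.linalg.norm(hidden2, axis=1, keepdims=True)

sim = hidden1 @ hidden2.T   # (bsz, bsz) similarity matrix
positives = np.diag(sim)    # sim[i, i] pairs row i with row i
best = np.argmax(sim, axis=1)  # each row's best match is its own index
```

Everything off the diagonal of `sim` serves as a negative for that row.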

No, they are trained with standard ResNet settings (e.g. random crop augmentation, no color augmentation or Gaussian blur). With the extra augmentations and a longer training schedule, you could expect...

Offsetting/masking the identical-pair similarity computation, so that each augmented example is compared with the other augmented view of the same original example, and with augmented views of all other examples in...

We haven't ablated this choice empirically, but it seems obvious to exclude the cosine similarity of an example with itself, since that logit is always a constant 1.
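A hedged sketch of the masking described above (NumPy stand-in, illustrative names): when view 1 is compared against the concatenation [view 2, view 1], the entry comparing an example with itself is cosine(x, x) = 1 by construction, so it is pushed to a large negative value before the softmax.

```python
import numpy as np

rng = np.random.default_rng(1)
bsz, dim = 4, 8
h1 = rng.normal(size=(bsz, dim)); h1 /= np.linalg.norm(h1, axis=1, keepdims=True)
h2 = rng.normal(size=(bsz, dim)); h2 /= np.linalg.norm(h2, axis=1, keepdims=True)

logits_12 = h1 @ h2.T                 # view1 vs view2: diagonal holds the positives
logits_11 = h1 @ h1.T                 # view1 vs view1: diagonal is identically 1
mask = np.eye(bsz, dtype=bool)
logits_11[mask] = -1e9                # mask the constant self-similarity

logits = np.concatenate([logits_12, logits_11], axis=1)  # (bsz, 2*bsz)
labels = np.arange(bsz)               # positive for row i is column i
```

The masked entries contribute essentially zero probability after softmax, so the loss only sees the one positive and the 2*bsz - 2 genuine negatives per row.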

Using more positive examples would converge faster (in iterations), but at the cost of more compute per iteration. SwAV has a trick (multi-crop) for doing this more efficiently.

It should in principle train on multiple GPUs (using the TF2 code with the mirrored strategy), but we may not have tested it.

Are you using [TF2 implementation](https://github.com/google-research/simclr/tree/master/tf2)? It should print logging information as it runs.

Creating a dummy label should suffice; labels aren't used anyway when pretraining the backbone network.
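For instance, a placeholder label array can be zipped with the images when building a dataset (a minimal sketch with made-up shapes; the contrastive loss never reads the labels):

```python
import numpy as np

# A fake batch of images; only the two augmented views matter for pretraining.
images = np.zeros((16, 224, 224, 3), dtype=np.float32)

# Constant dummy labels: never consumed by the contrastive objective.
dummy_labels = np.zeros(len(images), dtype=np.int64)

dataset = list(zip(images, dummy_labels))
```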

These are checkpoints; you can specify `FLAGS.checkpoint=/path/to/directory` to load a checkpoint for fine-tuning.