Barry (Xuanyi) Dong

Results: 65 comments of Barry (Xuanyi) Dong

The same thing happens when I rewrite this model in PyTorch. However, I train models with RGB images; does this affect the training much?

I made a [commit](https://github.com/D-X-Y/AutoDL-Projects/commit/8d0799dfb168d4410d71c889207b95b17d2ea511) to run an ablation study of this issue.

TODO: Run `python ./exps/NATS-algos/search-cell.py --dataset cifar10 --data_path $TORCH_HOME/cifar.python --algo gdas_v1 --rand_seed 777` and `python ./exps/NATS-algos/search-cell.py --dataset cifar10 --data_path $TORCH_HOME/cifar.python --algo gdas --rand_seed 777` to see the performance difference.

Unfortunately, I'm not familiar with ONNX.

Thanks for pointing out these questions. (1) (k+1)k/2 arises because, for the k-th node, there are (k+1) preceding nodes, and selecting two of them gives C(k+1, 2) = (k+1)k/2 possibilities. 2 input nodes...
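A quick way to sanity-check the count above (a minimal sketch of my own; the indexing, with the 2 cell inputs plus the k-1 earlier intermediate nodes preceding the k-th node, is how I read the comment, not code from the repo):

```python
from math import comb

# For the k-th intermediate node there are (k + 1) preceding nodes
# (the 2 cell inputs plus the k - 1 earlier intermediate nodes),
# and the node selects 2 of them as its inputs.
for k in range(1, 5):
    n_preceding = k + 1
    assert comb(n_preceding, 2) == (k + 1) * k // 2
    print(k, comb(n_preceding, 2))  # k=1 -> 1, k=2 -> 3, k=3 -> 6, k=4 -> 10
```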

Because each cell also takes the outputs of the two previous cells as inputs, the first node in a cell has the preceding nodes `[last-cell-outputs, second-last-cell-outputs]`...

The `last-cell-outputs` is the output of the green box `c_{k-1}`. The `first-node-outputs` is the output of the blue box `0`.
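For intuition, here is a toy sketch of that wiring (my own simplified illustration, not the actual class in the repo; the real search space places a mixture of candidate operations on each edge and concatenates the intermediate nodes):

```python
import torch
import torch.nn as nn

class ToyCell(nn.Module):
    """Toy DARTS/GDAS-style cell: the cell receives the outputs of the two
    previous cells, and every intermediate node sees all earlier states."""

    def __init__(self, channels: int, num_nodes: int = 4):
        super().__init__()
        self.num_nodes = num_nodes
        # Placeholder edge operation; the real search space mixes candidate ops.
        self.edge_op = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, s_prev_prev: torch.Tensor, s_prev: torch.Tensor) -> torch.Tensor:
        # The first intermediate node's preceding nodes are exactly
        # [second-last-cell-outputs, last-cell-outputs].
        states = [s_prev_prev, s_prev]
        for _ in range(self.num_nodes):
            new_state = sum(self.edge_op(s) for s in states)
            states.append(new_state)
        # Combine the intermediate nodes (summed here for simplicity).
        return sum(states[2:])
```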

You could have a look at our code: https://github.com/D-X-Y/AutoDL-Projects/blob/main/lib/models/cell_searchs/search_model_gdas.py#L89

> @D-X-Y
>
> in your [code](https://github.com/D-X-Y/AutoDL-Projects/blob/main/xautodl/models/cell_searchs/search_model_gdas.py#L121), would you be able to describe how the logic of `hardwts = one_h - probs.detach() + probs` is used in the forward search...

> For the question on `hardwts`, see the note section inside https://pytorch.org/docs/stable/nn.functional.html#gumbel-softmax
>
> ```
> The main trick for hard is to do y_hard - y_soft.detach() + y_soft...
> ```
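In case it helps, here is a minimal stand-alone sketch of that straight-through trick (my own illustration, not the repo's code): the forward value equals the hard one-hot, while gradients flow through the soft probabilities.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(1, 5, requires_grad=True)
probs = F.softmax(logits, dim=-1)                         # soft weights (differentiable)
index = probs.argmax(dim=-1, keepdim=True)
one_h = torch.zeros_like(probs).scatter_(-1, index, 1.0)  # hard one-hot (non-differentiable)

# Forward: one_h - probs.detach() + probs == one_h numerically,
# so only the selected operation contributes to the output.
# Backward: the detached terms carry no gradient, so gradients flow through `probs`.
hardwts = one_h - probs.detach() + probs

(hardwts * torch.arange(5.0)).sum().backward()
print(hardwts)      # one-hot values
print(logits.grad)  # non-zero gradients via the soft probabilities
```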