Adam
Hi, thanks for the awesome tool. I see that there is a `set_style` method for setting the style of a specific column by index. Is there a way to do...
Hi! I noticed that in `make-wsj-test.sh` and `make-brown-test.sh` we try to `zcat` `props`, `null`, and `ne` files from `test.wsj`. However, in the `extract_test_from_ptb.sh` and `extract_test_from_brown.sh` scripts, none of these...
Throwing this WIP up to store all vocabularies in a single embedding matrix shared between source, target, and features. This will fix the current pointer-generator issues when we have disjoint...
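For concreteness, here is a minimal sketch, not the actual yoyodyne implementation, of what a shared embedding matrix could look like; the index layout (specials, then shared source/target symbols, then features) and all names are assumptions for illustration only.

```python
# Hypothetical sketch: one embedding matrix over a combined index space so
# that source, target, and feature symbols never collide.
import torch
from torch import nn

PAD, UNK, BOS, EOS = 0, 1, 2, 3
SPECIALS = 4

source_target_symbols = ["a", "b", "c", "d"]  # hypothetical shared symbols
feature_symbols = ["[+past]", "[+pl]"]        # hypothetical feature symbols

index = {sym: i + SPECIALS for i, sym in enumerate(source_target_symbols)}
feature_offset = SPECIALS + len(source_target_symbols)
index.update({sym: i + feature_offset for i, sym in enumerate(feature_symbols)})

vocab_size = SPECIALS + len(source_target_symbols) + len(feature_symbols)
embedding = nn.Embedding(vocab_size, 64, padding_idx=PAD)

# Because source and target share one index space, a pointer-generator can
# scatter copy probabilities from source positions directly into the output
# distribution without remapping indices.
source = torch.tensor([[index["a"], index["c"], index["[+past]"]]])
print(embedding(source).shape)  # torch.Size([1, 3, 64])
```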
It would be convenient to allow the encoder [output_size](https://github.com/CUNY-CL/yoyodyne/blob/master/yoyodyne/models/modules/lstm.py#L99) to be different from the TransformerDecoder embedding size. To illustrate the issue, consider the code snippet below: ```python import torch...
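The original snippet is truncated above, so here is a hedged reconstruction of the mismatch it presumably demonstrates; the sizes and the `bridge` projection are my own illustrative assumptions, not yoyodyne code.

```python
# Illustrative sketch: a memory tensor whose last dimension differs from the
# decoder's d_model is rejected by nn.TransformerDecoder.
import torch
from torch import nn

d_model = 128               # TransformerDecoder embedding size
encoder_output_size = 256   # e.g. a bidirectional LSTM with hidden_size=128

layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=2)

tgt = torch.randn(1, 5, d_model)
memory = torch.randn(1, 7, encoder_output_size)

try:
    decoder(tgt, memory)  # fails: memory's last dim must equal d_model
except RuntimeError as err:
    print(err)

# One possible workaround is projecting the encoder output down to d_model.
bridge = nn.Linear(encoder_output_size, d_model)
out = decoder(tgt, bridge(memory))
print(out.shape)  # torch.Size([1, 5, 128])
```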
Currently we create a vocabulary from all items in all datapaths passed to the training script. However, we may want to study how models perform when presented with unknown symbols. In...
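As a minimal sketch of the alternative, assuming a simple dict-based index (the function and variable names below are hypothetical, not yoyodyne's API), the vocabulary would be built from the training data only, with unseen symbols falling back to an UNK index:

```python
# Hypothetical sketch: index only the training symbols; map anything unseen
# at dev/test time to UNK.
UNK = "<UNK>"

def build_index(train_symbols):
    """Builds a symbol-to-index map from training symbols only."""
    index = {UNK: 0}
    for sym in train_symbols:
        index.setdefault(sym, len(index))
    return index

def encode(symbols, index):
    """Maps symbols to indices, falling back to UNK for unknown symbols."""
    unk = index[UNK]
    return [index.get(sym, unk) for sym in symbols]

index = build_index(["a", "b", "c"])
print(encode(["a", "z", "c"], index))  # [1, 0, 3] -- "z" maps to UNK
```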
For models where the features are concatenated to the source string, we now handle this in the collator. We simply add the `source_token` vocabulary length to each feature index in...
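A minimal sketch of that offsetting step, with hypothetical names and shapes (not the actual collator code), might look like this:

```python
# Hypothetical sketch: shift feature indices past the source vocabulary so
# the concatenated sequence uses disjoint index ranges.
import torch

def collate_with_features(source_ids, feature_ids, source_vocab_size):
    """Concatenates feature indices, shifted past the source vocabulary,
    onto the source indices."""
    shifted = feature_ids + source_vocab_size
    return torch.cat([source_ids, shifted], dim=-1)

source = torch.tensor([[5, 9, 2]])   # indices into the source vocabulary
features = torch.tensor([[0, 3]])    # indices into the feature vocabulary
print(collate_with_features(source, features, source_vocab_size=100))
# tensor([[  5,   9,   2, 100, 103]])
```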
Transformer inference (i.e., with no teacher forcing) is slow. In practice, I think people typically implement some kind of caching so that at each timestep we do not need to...
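The post is truncated, but as a hedged illustration of the caching idea, here is a simplified sketch that caches previously decoded steps so self-attention is only computed for the newest query position; a real key/value cache would also store the projected keys and values per layer rather than re-projecting the cached inputs.

```python
# Illustrative sketch of inference-time caching for one self-attention layer;
# not the yoyodyne decoder.
import torch
from torch import nn

class CachedSelfAttention(nn.Module):
    def __init__(self, d_model: int, nhead: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)

    def forward(self, new_step, cache=None):
        # new_step: (batch, 1, d_model), the newest decoded position.
        # cache: all previously decoded positions, reused instead of being
        # re-run through the decoder at every timestep.
        cache = new_step if cache is None else torch.cat([cache, new_step], dim=1)
        # Only the newest position acts as the query; the cache supplies keys
        # and values for everything decoded so far.
        out, _ = self.attn(new_step, cache, cache, need_weights=False)
        return out, cache

layer = CachedSelfAttention(d_model=64, nhead=4)
cache = None
for _ in range(5):  # greedy decoding loop over 5 timesteps
    step = torch.randn(1, 1, 64)
    out, cache = layer(step, cache)
print(out.shape, cache.shape)  # torch.Size([1, 1, 64]) torch.Size([1, 5, 64])
```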
With the decoupling of encoders and decoders, we have added a `Linear` encoder, which seems to just embed the inputs and pass them along. We should also add a `SelfAttention`...
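Under my own assumptions about the module interface (the class name, constructor arguments, and mask convention below are hypothetical), a self-attention encoder could look like the Linear encoder plus one multi-head self-attention layer that mixes information across positions:

```python
# Hypothetical sketch of a SelfAttention encoder: embed the inputs, then mix
# positions with a single multi-head self-attention layer.
import torch
from torch import nn

class SelfAttentionEncoder(nn.Module):
    def __init__(self, vocab_size: int, embedding_size: int, nhead: int = 4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.attn = nn.MultiheadAttention(embedding_size, nhead, batch_first=True)

    def forward(self, source, padding_mask=None):
        # source: (batch, seq_len) integer indices;
        # padding_mask: (batch, seq_len) bool with True at PAD positions.
        embedded = self.embedding(source)
        out, _ = self.attn(
            embedded, embedded, embedded,
            key_padding_mask=padding_mask,
            need_weights=False,
        )
        return out

encoder = SelfAttentionEncoder(vocab_size=50, embedding_size=64)
src = torch.randint(1, 50, (2, 7))
print(encoder(src).shape)  # torch.Size([2, 7, 64])
```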
I was just thinking: although our pointer-generator implementation(s) take care to encode features separately so that they are not used in the attention distribution for the pointer probabilities, I think...