Adam
Hi, thanks for the awesome tool. I see that there is a `set_style` method for setting the style of a specific column by index. Is there a way to do...
Hi! I noticed that in `make-wsj-test.sh` and `make-brown-test.sh` we try to `zcat` `props`, `null`, and `ne` files from `test.wsj`. However, in the `extract_test_from_ptb.sh` and `extract_test_from_brown.sh` scripts, none of these...
Throwing this WIP up to store all vocabularies in a single embedding matrix shared between source, target, and features. This will fix the current pointer-generator issues when we have disjoint...
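For concreteness, here is a minimal sketch, not the actual yoyodyne implementation, of what a shared embedding matrix could look like; the index layout (specials, then shared source/target symbols, then features) and all names are assumptions for illustration only.

```python
# Hypothetical sketch: one embedding matrix over a combined index space so
# that source, target, and feature symbols never collide.
import torch
from torch import nn

PAD, UNK, BOS, EOS = 0, 1, 2, 3
SPECIALS = 4

source_target_symbols = ["a", "b", "c", "d"]  # hypothetical shared symbols
feature_symbols = ["[+past]", "[+pl]"]        # hypothetical feature symbols

index = {sym: i + SPECIALS for i, sym in enumerate(source_target_symbols)}
feature_offset = SPECIALS + len(source_target_symbols)
index.update({sym: i + feature_offset for i, sym in enumerate(feature_symbols)})

vocab_size = SPECIALS + len(source_target_symbols) + len(feature_symbols)
embedding = nn.Embedding(vocab_size, 64, padding_idx=PAD)

# Because source and target share one index space, a pointer-generator can
# scatter copy probabilities from source positions directly into the output
# distribution without remapping indices.
source = torch.tensor([[index["a"], index["c"], index["[+past]"]]])
print(embedding(source).shape)  # torch.Size([1, 3, 64])
```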
It would be convenient to allow the encoder [output_size](https://github.com/CUNY-CL/yoyodyne/blob/master/yoyodyne/models/modules/lstm.py#L99) to be different from the TransformerDecoder embedding size. To illustrate the issue, consider the code snippet below: ```python import torch...
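The original snippet is truncated above, so here is a hedged reconstruction of the mismatch it presumably demonstrates; the sizes and the `bridge` projection are my own illustrative assumptions, not yoyodyne code.

```python
# Illustrative sketch: a memory tensor whose last dimension differs from the
# decoder's d_model is rejected by nn.TransformerDecoder.
import torch
from torch import nn

d_model = 128               # TransformerDecoder embedding size
encoder_output_size = 256   # e.g. a bidirectional LSTM with hidden_size=128

layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(layer, num_layers=2)

tgt = torch.randn(1, 5, d_model)
memory = torch.randn(1, 7, encoder_output_size)

try:
    decoder(tgt, memory)  # fails: memory's last dim must equal d_model
except RuntimeError as err:
    print(err)

# One possible workaround is projecting the encoder output down to d_model.
bridge = nn.Linear(encoder_output_size, d_model)
out = decoder(tgt, bridge(memory))
print(out.shape)  # torch.Size([1, 5, 128])
```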
Currently we create a vocabulary from all items in all datapaths passed to the training script. However, we may want to study how models perform when presented with unknown symbols. In...
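As a minimal sketch of the alternative, assuming a simple dict-based index (the function and variable names below are hypothetical, not yoyodyne's API), the vocabulary would be built from the training data only, with unseen symbols falling back to an UNK index:

```python
# Hypothetical sketch: index only the training symbols; map anything unseen
# at dev/test time to UNK.
UNK = "<UNK>"

def build_index(train_symbols):
    """Builds a symbol-to-index map from training symbols only."""
    index = {UNK: 0}
    for sym in train_symbols:
        index.setdefault(sym, len(index))
    return index

def encode(symbols, index):
    """Maps symbols to indices, falling back to UNK for unknown symbols."""
    unk = index[UNK]
    return [index.get(sym, unk) for sym in symbols]

index = build_index(["a", "b", "c"])
print(encode(["a", "z", "c"], index))  # [1, 0, 3] -- "z" maps to UNK
```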
For models where the features are concatenated to the source string, we now handle this in the collator. We simply add the `source_token` vocabulary length to each feature index in...
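A minimal sketch of that offsetting step, with hypothetical names and shapes (not the actual collator code), might look like this:

```python
# Hypothetical sketch: shift feature indices past the source vocabulary so
# the concatenated sequence uses disjoint index ranges.
import torch

def collate_with_features(source_ids, feature_ids, source_vocab_size):
    """Concatenates feature indices, shifted past the source vocabulary,
    onto the source indices."""
    shifted = feature_ids + source_vocab_size
    return torch.cat([source_ids, shifted], dim=-1)

source = torch.tensor([[5, 9, 2]])   # indices into the source vocabulary
features = torch.tensor([[0, 3]])    # indices into the feature vocabulary
print(collate_with_features(source, features, source_vocab_size=100))
# tensor([[  5,   9,   2, 100, 103]])
```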
Transformer inference (i.e., with no teacher forcing) is slow. In practice, I think people typically implement some kind of caching so that at each timestep we do not need to...
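The post is truncated, but as a hedged illustration of the caching idea, here is a simplified sketch that caches previously decoded steps so self-attention is only computed for the newest query position; a real key/value cache would also store the projected keys and values per layer rather than re-projecting the cached inputs.

```python
# Illustrative sketch of inference-time caching for one self-attention layer;
# not the yoyodyne decoder.
import torch
from torch import nn

class CachedSelfAttention(nn.Module):
    def __init__(self, d_model: int, nhead: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)

    def forward(self, new_step, cache=None):
        # new_step: (batch, 1, d_model), the newest decoded position.
        # cache: all previously decoded positions, reused instead of being
        # re-run through the decoder at every timestep.
        cache = new_step if cache is None else torch.cat([cache, new_step], dim=1)
        # Only the newest position acts as the query; the cache supplies keys
        # and values for everything decoded so far.
        out, _ = self.attn(new_step, cache, cache, need_weights=False)
        return out, cache

layer = CachedSelfAttention(d_model=64, nhead=4)
cache = None
for _ in range(5):  # greedy decoding loop over 5 timesteps
    step = torch.randn(1, 1, 64)
    out, cache = layer(step, cache)
print(out.shape, cache.shape)  # torch.Size([1, 1, 64]) torch.Size([1, 5, 64])
```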
With the decoupling of encoders and decoders, we have added a `Linear` encoder, which seems to just embed the inputs and pass them along. We should also add a `SelfAttention`...
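Under my own assumptions about the module interface (the class name, constructor arguments, and mask convention below are hypothetical), a self-attention encoder could look like the Linear encoder plus one multi-head self-attention layer that mixes information across positions:

```python
# Hypothetical sketch of a SelfAttention encoder: embed the inputs, then mix
# positions with a single multi-head self-attention layer.
import torch
from torch import nn

class SelfAttentionEncoder(nn.Module):
    def __init__(self, vocab_size: int, embedding_size: int, nhead: int = 4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
        self.attn = nn.MultiheadAttention(embedding_size, nhead, batch_first=True)

    def forward(self, source, padding_mask=None):
        # source: (batch, seq_len) integer indices;
        # padding_mask: (batch, seq_len) bool with True at PAD positions.
        embedded = self.embedding(source)
        out, _ = self.attn(
            embedded, embedded, embedded,
            key_padding_mask=padding_mask,
            need_weights=False,
        )
        return out

encoder = SelfAttentionEncoder(vocab_size=50, embedding_size=64)
src = torch.randint(1, 50, (2, 7))
print(encoder(src).shape)  # torch.Size([2, 7, 64])
```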
I was just thinking: although our pointer-generator implementation(s) take care to encode features separately so that they are not used in the attention distribution for the pointer probabilities, I think...