Zhiting Hu
Just came across the awesome list. [Texar](https://github.com/asyml/texar), a general-purpose text generation toolkit, has also implemented BERT [here](https://github.com/asyml/texar/tree/master/examples/bert) for classification, and it can be combined with Texar's other modules for text generation applications.
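For illustration, a minimal sketch of what such a classification setup might look like. This is not the linked example itself; it assumes the Texar-PyTorch `BERTClassifier` API (the TF module is analogous), and the batch contents are dummy data:

```python
import torch
import texar.torch as tx

# Illustrative sketch, assuming Texar-PyTorch's `BERTClassifier` API.
classifier = tx.modules.BERTClassifier(
    pretrained_model_name="bert-base-uncased",
    hparams={"num_classes": 2},
)

# Dummy batch: 4 sequences of 16 token ids (30522 = BERT-base vocab size).
input_ids = torch.randint(0, 30522, (4, 16))
lengths = torch.full((4,), 16, dtype=torch.long)

# Returns per-example class logits and predicted labels.
logits, preds = classifier(inputs=input_ids, sequence_length=lengths)
```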
https://github.com/asyml/texar/blob/bd2dbe42d27a86b3ec1cc048ac843aa6f255d8fa/texar/tf/modules/pretrained/xlnet.py#L175 Here, `"model name"` is confusing. Should it be "model variable name"? The same applies to the other pretrained modules, and to Texar-PyTorch (if applicable).
SentencePieceTokenizer should also be able to take a vocab directly. Currently it appears to accept only a vocab file, as shown [here](https://texar-pytorch.readthedocs.io/en/latest/code/data.html#texar.torch.data.SentencePieceTokenizer.default_hparams) (see the sketch below).
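A sketch of the current usage versus the proposed one. The `vocab_file` hparam is taken from the linked `default_hparams`; the `vocab` hparam is hypothetical and not part of the current API:

```python
from texar.torch.data import SentencePieceTokenizer

# Current behavior per the linked default_hparams: a trained
# SentencePiece model file is required on disk.
tokenizer = SentencePieceTokenizer(hparams={"vocab_file": "spm.model"})

# Proposed: accept an in-memory vocab directly.
# (`vocab` is a hypothetical hparam suggested here, not an existing one.)
# tokenizer = SentencePieceTokenizer(hparams={"vocab": {"<unk>": 0, "hello": 1}})
```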
- [ ] `TransformerDecoder.forward`: where does `self.training` come from? https://github.com/asyml/texar-pytorch/blob/d17d502b50da1d95cb70435ed21c6603370ce76d/texar/torch/modules/decoders/transformer_decoders.py#L448-L449 (see the sketch after this list)
- [ ] All arguments should state their types **explicitly** in the docstring. E.g., what is the type of `infer_mode`? ...
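On the first item: `self.training` is inherited from `torch.nn.Module`, which Texar-PyTorch modules ultimately subclass. A minimal PyTorch sketch (using a stand-in module, not Texar's decoder):

```python
import torch.nn as nn

# Every `nn.Module` carries a boolean `training` flag,
# toggled by `.train()` / `.eval()`.
decoder = nn.TransformerDecoderLayer(d_model=8, nhead=2)  # stand-in module
print(decoder.training)  # True: modules start in training mode
decoder.eval()
print(decoder.training)  # False after switching to evaluation mode
```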
The section is titled "Data Loaders": https://texar-pytorch.readthedocs.io/en/latest/code/data.html#data-loaders Would "Datasets" be better? Or does "Data Loaders" better fit the PyTorch convention? @huzecong @AvinashBukkittu
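For context, a minimal sketch of the PyTorch convention in question: a `Dataset` defines example access, while a `DataLoader` wraps it for batching and shuffling, so the better title depends on which role Texar's classes play:

```python
from torch.utils.data import Dataset, DataLoader

# PyTorch convention: `Dataset` defines the examples;
# `DataLoader` handles batching/shuffling on top of it.
class ToyDataset(Dataset):
    def __init__(self):
        self.data = list(range(10))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]

loader = DataLoader(ToyDataset(), batch_size=4, shuffle=True)
for batch in loader:
    print(batch)  # tensors of up to 4 examples each
```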
The list is awesome. It'd be great if you could add Texar to it :) Texar is an open-source toolkit aiming to support a broad set of machine...