Ken Tsui

Results 15 comments of Ken Tsui

I am interested in contributing to this project. Is the goal to make the dummy data more flexible? How about just reading it from a JSON file, whose path is configurable in...

@bitplane Yeah, that's what I am thinking as well, so that it's easier to manage and can be decoupled from main.py. I will propose this data structure: backend/test_data/ -...
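The idea in the two comments above (dummy data kept in a JSON file, with a configurable path, decoupled from main.py) can be sketched roughly as follows. This is only an illustration: the environment variable `DUMMY_DATA_PATH`, the default file name `dummy_data.json`, and the function `load_dummy_data` are hypothetical and not taken from the project.

```python
import json
import os
from pathlib import Path

# Hypothetical names: neither the environment variable nor the default
# file name comes from the project; they are placeholders for illustration.
ENV_VAR = "DUMMY_DATA_PATH"
DEFAULT_PATH = "dummy_data.json"


def load_dummy_data(path=None):
    """Load dummy records from a JSON file whose location is configurable.

    Resolution order: explicit argument, then the environment variable,
    then the default file name.
    """
    resolved = Path(path or os.environ.get(ENV_VAR, DEFAULT_PATH))
    with resolved.open(encoding="utf-8") as f:
        return json.load(f)
```

With something like this, the main module would only call `load_dummy_data()` and the test data could be swapped without touching code.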

> I've only been able to find the full wikihowAll.csv in one location that seems to require manually downloading, I'm not sure if there's some reason for it not being...

That's what my notebook can generate now. I am still fine-tuning the formatting/cleaning and the prompt types, and have yet to add more templates. Prompt types so far (feel free to suggest more): -...

To extend further, the pipeline can be applied to each dataset individually and to all datasets as one aggregate. Functionalities: Score: - leverage multiple reward models (our own, others in...

I have thought more about it, and started writing an interface and a quick implementation. Please let me know if you have any comments. I am going to propose a...

@pruksmhc Thanks for your question! Yes, the FilterPipeline can filter everything that we score in the ScorerPipeline based on absolute statistics and relative statistics. So it's flexible enough to include...
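To make the distinction between absolute and relative statistics concrete, here is a minimal sketch. It is not the FilterPipeline interface from the proposal; the function `filter_samples` and its parameters `min_score` and `top_fraction` are hypothetical names chosen for illustration. It assumes higher scores are better.

```python
def filter_samples(samples, scores, min_score=None, top_fraction=None):
    """Keep samples that pass an absolute and/or a relative criterion.

    min_score: absolute statistic, keep samples with score >= min_score.
    top_fraction: relative statistic, keep only the best-scoring fraction
        of the whole dataset (e.g. 0.5 keeps the top half).
    Both names are hypothetical, not part of any existing pipeline.
    """
    if len(samples) != len(scores):
        raise ValueError("samples and scores must have the same length")

    keep = set(range(len(samples)))

    if min_score is not None:
        # Absolute filter: depends only on each sample's own score.
        keep &= {i for i, s in enumerate(scores) if s >= min_score}

    if top_fraction is not None:
        # Relative filter: depends on the rank within the dataset.
        n_keep = int(len(scores) * top_fraction)
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        keep &= set(ranked[:n_keep])

    # Preserve the original sample order in the output.
    return [samples[i] for i in sorted(keep)]
```

The absolute threshold gives the same decision for a sample regardless of which dataset it sits in, while the relative cut changes with the score distribution, which matters when the same filter is applied per dataset and to the aggregate.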

For `3. Evaluation`: the [repo](https://github.com/HLTCHKUST/chatgpt-evaluation) can serve as a benchmark with 23 datasets, which they have tested against ChatGPT. Looks like a good framework and baseline for us to start with,...

Also added a POC I had done: [REALM encoded wikipedia data](https://github.com/kenhktsui/open-information-retrieval)

Adding one more consideration here: there are (at least) three ways of incorporating retrieval into an LLM, with different degrees of coupling. 1. Embedding used for retrieval is trained jointly with...