User study bugfix

Open Ar4l opened this issue 1 year ago • 1 comments

Fixes/updates the following server-side components:

[x] Upgrade CUDA drivers 11.4 -> 12.2 and NVIDIA to gpgpu (not part of this PR code-wise, but it was necessary).
[x] Use vLLM for request batching and PagedAttention. Engines run at 0.9 fractional GPU utilisation with 20 GB of swap space.
[x] Add StarCoder2-3b as a backend model, replacing CodeGPT and UniXCoder.
- [x] Load in float16.
- [ ] Ensure infilling-mode works correctly (see hf thread)
[ ] Why do we store ground truths only for accepted completions?
[x] Store v1 user requests under data/user_uuid/json_uuid.json, to avoid counting all invocations on every request. However, this raises two issues:
- [ ] Previous data needs to be converted from user_uuid-json_uuid.json to user_uuid/json_uuid.json; this can be done with a simple replacement command on the server.
- [ ] Mark's data analysis scripts may need to be updated to follow this convention (the first thing mine do is sort the data into this user/json structure to make local processing manageable).
[x] Fix User Study passthrough filter; I forgot to save before amending my last commit on the aral_user_study branch.
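
For reference, a minimal sketch of the data conversion mentioned above. It is not part of this PR; the `migrate` function and the regex are hypothetical, and it assumes both ids are canonical 36-character UUIDs. Since such UUIDs contain hyphens themselves, the flat file name cannot simply be split on the first "-".

```python
import re
import shutil
from pathlib import Path

# Assumption: user_uuid and json_uuid are canonical UUIDs (8-4-4-4-12 hex digits).
UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
FLAT_NAME = re.compile(rf"^({UUID})-({UUID})\.json$")


def migrate(data_dir: Path) -> int:
    """Move data_dir/<user>-<json>.json to data_dir/<user>/<json>.json.

    Returns the number of files moved. Anything that does not match the
    flat naming scheme (including already-migrated user folders) is skipped.
    """
    moved = 0
    # sorted() materialises the listing before any directories are created.
    for path in sorted(data_dir.iterdir()):
        match = FLAT_NAME.match(path.name)
        if match is None or not path.is_file():
            continue
        user_uuid, json_uuid = match.groups()
        user_dir = data_dir / user_uuid
        user_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(user_dir / f"{json_uuid}.json"))
        moved += 1
    return moved
```

Running `migrate(Path("data"))` a second time should be a no-op, so it is safe to re-run if the first pass is interrupted.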

Client side (vsc):

[x] Fix shown_times being used before it is declared.

Mar 20 '24 12:03 Ar4l

You can also cache the counts per user for the current runtime; that way the user's folder is only globbed once. This can be done since there's only one instance of the app running on the backend, so we don't run into the split-brain problem. The counts can then be incremented in memory when a new completion is done. Ideally, all this data should actually be stored in a proper database instead of a flat file system, but I am not sure if that's a transition it is worthwhile to make at this stage.
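
Roughly what I have in mind, as a sketch rather than a drop-in implementation. The class and method names (`UserRequestCounts`, `get`, `record`) are made up for illustration, and it assumes the data/user_uuid/json_uuid.json layout from this PR and a single backend process.

```python
from pathlib import Path


class UserRequestCounts:
    """Per-user request counts, cached in memory for the current runtime."""

    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)
        self._counts = {}  # user_uuid -> number of stored requests

    def _glob_count(self, user_uuid):
        user_dir = self.data_dir / user_uuid
        if not user_dir.is_dir():
            return 0
        return sum(1 for _ in user_dir.glob("*.json"))

    def get(self, user_uuid):
        """Return the cached count, globbing the user's folder only once."""
        if user_uuid not in self._counts:
            self._counts[user_uuid] = self._glob_count(user_uuid)
        return self._counts[user_uuid]

    def record(self, user_uuid):
        """Call AFTER a new request file has been written for this user."""
        if user_uuid in self._counts:
            self._counts[user_uuid] += 1
        else:
            # First access: the glob already sees the file that was just written.
            self._counts[user_uuid] = self._glob_count(user_uuid)
        return self._counts[user_uuid]
```

If requests are handled from multiple threads, the dictionary updates would need a lock; with more than one process the cache would go stale, which is where a proper database comes in.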

Mar 20 '24 12:03 FrankHeijden