User study bugfix
Fixes/updates the following server-side components:
- [x] Upgrade CUDA drivers `11.4 -> 12.2` and NVIDIA to gpgpu (not actually part of this PR code-wise; but was necessary)
- [x] Use `vLLM` for batching requests and Paged Attention. Engines at `0.9` fractional GPU utilisation; `20GB` swap space.
- [x] Add `StarCoder2-3b` as a backend model, replacing `CodeGPT` and `UniXCoder`.
  - [x] Load in `float16`.
  - [ ] Ensure infilling-mode works correctly (see hf thread)
- [ ] Why do we store ground truths only for accepted completions?
- [x] Store `v1` user requests under `data/user_uuid/json_uuid.json`, to avoid counting all invocations on every request. However, this brings two issues:
  - [ ] Need to convert previous data from `user_uuid-json_uuid.json` to `user_uuid/json_uuid.json`; but this can be done with a simple replacement command on the server.
  - [ ] Mark's data analysis scripts may need to be updated to follow this convention. (the first thing mine do is sort the data into this `user/json` structure to make processing locally manageable).
- [x] Fix User Study passthrough filter; I forgot to save before amending my last commit on the `aral_user_study` branch.
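
For the pending conversion of old data, a minimal sketch of what the move could look like in Python. It is not part of this PR. It assumes both IDs are canonical 36-character UUIDs (which is what makes the flat name splittable despite the hyphens inside each ID) and that the old files sit directly inside the data directory; `migrate_flat_files` is a hypothetical helper name.

```python
import re
from pathlib import Path

# Assumption: user and request IDs are canonical 8-4-4-4-12 hex UUIDs.
UUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
FLAT_NAME = re.compile(rf"^({UUID})-({UUID})\.json$")


def migrate_flat_files(data_dir):
    """Move data_dir/<user>-<json>.json to data_dir/<user>/<json>.json.

    Returns the number of files moved. Files that do not match the
    flat naming pattern are left untouched.
    """
    data_dir = Path(data_dir)
    moved = 0
    # sorted() materialises the listing before we start creating folders.
    for path in sorted(data_dir.iterdir()):
        if not path.is_file():
            continue
        match = FLAT_NAME.match(path.name)
        if match is None:
            continue
        user_uuid, json_uuid = match.groups()
        target_dir = data_dir / user_uuid
        target_dir.mkdir(exist_ok=True)
        path.rename(target_dir / f"{json_uuid}.json")
        moved += 1
    return moved
```

Running it on a copy of `data/` first would be the safer option, since the rename is done in place.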
Client side (vsc):
- [x] Fix `shown_times` being used before it is declared.
You can also cache the counts per user for the current runtime; that way the user's folder is only globbed once. This can be done since there's only one instance of the app running on the backend, so we don't run into the split-brain problem. The counts can then be incremented in memory when a new completion is done. Ideally, all this data should actually be stored in a proper database instead of a flat file system, but I am not sure if that's a transition it is worthwhile to make at this stage.
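
A rough sketch of that caching idea, assuming the `data/user_uuid/json_uuid.json` layout from this PR and a single backend process. `CompletionCounter` and its method names are hypothetical, not existing code in the repo.

```python
import threading
from pathlib import Path


class CompletionCounter:
    """Per-user completion counts, kept in memory for the current runtime.

    Each user's folder is globbed at most once; later completions only
    bump the in-memory value. This is only valid while a single app
    instance is running, as is assumed here.
    """

    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)
        self._counts = {}
        self._lock = threading.Lock()

    def _load(self, user_uuid):
        # Seed the count from disk the first time this user is seen.
        if user_uuid not in self._counts:
            user_dir = self.data_dir / user_uuid
            if user_dir.is_dir():
                self._counts[user_uuid] = sum(1 for _ in user_dir.glob("*.json"))
            else:
                self._counts[user_uuid] = 0
        return self._counts[user_uuid]

    def count(self, user_uuid):
        with self._lock:
            return self._load(user_uuid)

    def increment(self, user_uuid):
        # Call this after a new completion has been stored.
        with self._lock:
            self._counts[user_uuid] = self._load(user_uuid) + 1
            return self._counts[user_uuid]
```

The lock only guards against concurrent requests within the one process; it does nothing if a second instance is ever started, which is where a real database would become the better option.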