Generate history conversation filenames in Chinese properly.
Describe the changes you have made:
Generating history conversation filenames has long been broken for Chinese and other languages that do not put spaces between words. If users type their first request in such a language, the saved history conversation JSON file is named something like `__March_28_2024_19-59-01.json` almost every time. The cause is the old way of building the first part of the filename: `self.messages[0]["content"][:25].split(" ")[:-1]`. When the first 25 characters of the user's first input contain no " " (space), the split returns a single element, `[:-1]` drops it, and the name prefix ends up blank. I made a small patch to fix this. If the user's first input is in Chinese, history conversation files are now named like `这是一句中文__March_28_2024_19-59-01.json`.
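A minimal sketch of the failure and of one possible fix, for illustration only. The helper names `old_prefix` and `patched_prefix` are hypothetical, joining the words with `_` is an assumption, and the fallback shown here is not necessarily what the patch itself does.

```python
def old_prefix(first_message: str) -> str:
    # The expression quoted above; joining the words with "_" is an assumption.
    return "_".join(first_message[:25].split(" ")[:-1])


def patched_prefix(first_message: str, max_chars: int = 25) -> str:
    # Hypothetical fix: only drop the last (possibly truncated) word
    # when the text actually contains a space to split on.
    head = first_message[:max_chars]
    words = head.split(" ")
    if len(words) >= 2:
        return "_".join(words[:-1])
    # No spaces (e.g. Chinese input): keep the truncated text as-is.
    return head


print(repr(old_prefix("这是一句中文")))      # blank prefix
print(repr(patched_prefix("这是一句中文")))  # the text itself is kept
```

This sketch does not strip characters that are invalid in filenames (such as `/`), which a real implementation would also need to handle.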
Reference any relevant issues (e.g. "Fixes #000"):
Pre-Submission Checklist (optional but appreciated):
- [x] I have included relevant documentation updates (stored in /docs)
- [x] I have read docs/CONTRIBUTING.md
- [x] I have read docs/ROADMAP.md
OS Tests (optional but appreciated):
- [ ] Tested on Windows
- [x] Tested on MacOS
- [x] Tested on Linux
Would it be a good idea to have the LLM summarize the first turn of the conversation and use that summary as the filename automatically?
Great catch @Steve235lab. I think having the LLM summarize the first turn is tough because it uses an LLM call, which folks should be super aware of. Let's think about it in the future if we move into a more advanced UI like in @Notnaton's PR: https://github.com/OpenInterpreter/open-interpreter/pull/976
@KillianLucas OK, I see. What if we make it a configurable, optional feature that is turned off by default? Maybe I will implement this later. It won't be a huge improvement anyway, because a filename cannot be long enough to hold much information, and different file systems support different maximum filename lengths. But there's a solution: use a UUID or hash value as the filename of each conversation JSON file, and add one more file that stores a key-value mapping from UUID to conversation metadata. That way we can store more detailed information about history conversations.
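A rough sketch of the UUID-plus-index idea, under stated assumptions: the function name `save_conversation`, the `index.json` filename, and the metadata fields (`title`, `created`) are all hypothetical and not part of the current codebase.

```python
import json
import uuid
from datetime import datetime
from pathlib import Path


def save_conversation(messages, directory, title=None):
    """Store messages as <uuid>.json and record metadata in index.json."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)

    # The filename carries no information, so its length is fixed and safe.
    conversation_id = uuid.uuid4().hex
    conversation_path = directory / f"{conversation_id}.json"
    conversation_path.write_text(
        json.dumps(messages, ensure_ascii=False), encoding="utf-8"
    )

    # The index maps each UUID to arbitrary-length metadata.
    index_path = directory / "index.json"
    if index_path.exists():
        index = json.loads(index_path.read_text(encoding="utf-8"))
    else:
        index = {}
    index[conversation_id] = {
        "title": title or messages[0]["content"],
        "created": datetime.now().isoformat(timespec="seconds"),
    }
    index_path.write_text(
        json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return conversation_id
```

Because the title lives in the index rather than the filename, it can be a full first message or an optional LLM summary without hitting filesystem limits.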