What is the context window?

So 4096 tokens.
Any experience in using it for more than 4096 tokens? any idea when checkpoints trained on more than 1 trillion tokens will be ready?
If you have a conversation with it and, let's say, the 'story' exceeds 4096 tokens, then the agent will start forgetting the beginning of the story. Think of the context length as a window representing the model's memory: as you keep having a discussion with the bot, the window slides forward, so the earliest messages are no longer part of the input, and then it slides further, and so on.
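A minimal sketch of that sliding-window idea (the function name and the toy window size are made up for illustration; real chat frontends typically truncate at message boundaries rather than raw token positions):

```python
CONTEXT_WINDOW = 4096  # model's maximum context length, in tokens

def build_input(history_tokens, window=CONTEXT_WINDOW):
    """Keep only the most recent `window` tokens; anything older is dropped."""
    return history_tokens[-window:]

# Toy example with a tiny window of 8 "tokens":
history = list(range(12))              # tokens 0..11 accumulated over the chat
visible = build_input(history, window=8)
print(visible)                         # [4, 5, 6, 7, 8, 9, 10, 11]
```

The first 4 tokens have slid out of the window, which is exactly why the model "forgets" the start of a long conversation.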