edsl
For each inference service, add methods to learn tokens-per-minute and requests-per-minute limits
We need to be able to adjust TPM/RPM limits dynamically.
If that is too hard, these should be user settings on coop.
@zer0dss This naturally goes w/ price look-ups.
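
A rough sketch of what the per-service method plus the user-settings fallback could look like. Everything named here (`RateLimits`, `InferenceService`, `fetch_limits`, `get_limits`, `HeaderBackedService`, the default numbers) is hypothetical and not the current edsl interface. The header names are the ones OpenAI documents for its rate-limit response headers; other services may use different names or expose nothing, in which case `fetch_limits` would return `None`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimits:
    rpm: float  # requests per minute
    tpm: float  # tokens per minute


class InferenceService(ABC):
    """Hypothetical base class, for illustration only."""

    # Placeholder values, used only when nothing better is known.
    default_limits = RateLimits(rpm=100, tpm=100_000)

    def __init__(self, user_limits: Optional[RateLimits] = None):
        # Limits the user configured by hand (e.g. as an account setting).
        self.user_limits = user_limits

    @abstractmethod
    def fetch_limits(self) -> Optional[RateLimits]:
        """Ask the service for its limits; return None if it cannot say."""

    def get_limits(self) -> RateLimits:
        # Prefer what the service reports, then user settings, then defaults.
        fetched = self.fetch_limits()
        if fetched is not None:
            return fetched
        return self.user_limits or self.default_limits


def limits_from_headers(headers: Mapping[str, str]) -> Optional[RateLimits]:
    """Parse limits from response headers (assumes lowercase keys)."""
    try:
        return RateLimits(
            rpm=float(headers["x-ratelimit-limit-requests"]),
            tpm=float(headers["x-ratelimit-limit-tokens"]),
        )
    except (KeyError, ValueError):
        return None


class HeaderBackedService(InferenceService):
    """Example service that learns limits from its most recent response."""

    def __init__(self, last_headers: Mapping[str, str],
                 user_limits: Optional[RateLimits] = None):
        super().__init__(user_limits)
        self.last_headers = last_headers

    def fetch_limits(self) -> Optional[RateLimits]:
        return limits_from_headers(self.last_headers)
```

Because `get_limits` re-reads the headers on every call, limits would track whatever the service last reported, which is one way to get the dynamic adjustment. The same per-service hook could plausibly sit next to price look-ups.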