Correctly validating new real-world OpenAI API Key format, relaxing negative tests
Why are these changes needed?
Using a real-world OpenAI API key, which successfully works with the OpenAI hosted models, I encountered a warning message that the API key was not valid. This red-herring lead to slow debugging, as the issue was in fact a space character in an agent name rather than anything to do with the API key.
Approach
I've removed two negative tests. Writing a validation implementation that passes these tests is possible, but in my opinion it is preferable to have a simple validation function that doesn't cover all possibilities. It seems like 99.9% of the time, a user would copy/paste their API Key, and the updated function should catch common positive and negative tests.
Additionally, in my opinion, false positives that log the warning message incorrectly are as bad as false negatives which would not indicate the API Key when it is not working.
Reviewers
@wrfly and @AaronWard PTAL TY!
Related issue number
n/a See https://github.com/microsoft/autogen/pull/3078 for last changes
Checks
- [X] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
- [X] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [X] I've made sure all auto checks have passed.
⚠️ GitGuardian has uncovered 6 secrets following the scan of your pull request.
Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.
Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard. Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.
🔎 Detected hardcoded secrets in your pull request
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 12853598 | Triggered | Generic High Entropy Secret | f5522a9163974d8ccd5c60eb9df7f0473d3c8211 | test/oai/test_utils.py | View secret |
| - | - | Generic High Entropy Secret | f5522a9163974d8ccd5c60eb9df7f0473d3c8211 | test/oai/test_utils.py | View secret |
| - | - | Generic High Entropy Secret | f5522a9163974d8ccd5c60eb9df7f0473d3c8211 | test/oai/test_utils.py | View secret |
| 12853600 | Triggered | Generic High Entropy Secret | f5522a9163974d8ccd5c60eb9df7f0473d3c8211 | test/oai/test_utils.py | View secret |
| 12853601 | Triggered | Generic High Entropy Secret | f5522a9163974d8ccd5c60eb9df7f0473d3c8211 | test/oai/test_utils.py | View secret |
| 12853602 | Triggered | Generic High Entropy Secret | f5522a9163974d8ccd5c60eb9df7f0473d3c8211 | test/oai/test_utils.py | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
@microsoft-github-policy-service agree
This PR would normally fix this issue #3345. I'm facing the same problem and can't wait for this PR to be merged 👍 .
Thanks for connecting to that issue @AcePeed.
@sonichi Can you assign me as a reviewer?
Also facing same issue a valid API key format is out of the hands of this library and OpenAI returns an error when one is valid.
The updated regex still rejects my valid OpenAI api_key. It would need to be adjusted to allow for multiple underscores, possibly like ^sk-(?!.--)([a-zA-Z0-9]+(?:-[a-zA-Z0-9_]+))$
@buddycat can you give a test-case assertion for your key?
Haha, FYI, I tried to get autogen agent to fix this. I tweaked the code execution to track changes in git. You can view the iterations here: https://github.githistory.xyz/nstielau/autogen_iterations/blob/main/ai_regex_quiz__temp_0.95.py
It couldn't solve for the current test cases with my new test case :/
Here's a test-case that should be true:
sk-proj-111111111122222aaaaaaaa_7777hhhhh_111111111122222aaaaaaa_99900000fggfffhgg
Superseded by #3569