Improve ability to properly handle failures and retries for API calls that are not idempotent
Describe the solution you'd like
Many Reddit API calls can be repeated safely. Removing a post, for example, is more or less idempotent: you might get multiple log entries and a few updated timestamps, but nothing bad happens. Some API calls are not so friendly, particularly those that create content, such as submitting a post, making a comment, or sending a message.
The problem is that when you call something like subreddit.submit(...), there is currently no way to find out how many times the underlying Reddit API call was made, what intermediate errors occurred, and so on. Most of the time a call to submit creates one post or produces one failure. Sometimes, though, it creates multiple posts. This can happen when a retry follows a failure that wasn't actually a failure, or when a request times out even though it succeeded. Sometimes the failure was real and the retry was needed, but the failed attempt was a "partial success" that may need to be deleted. (I've seen multiple examples of PRAW creating two posts where the first shows up only in the user history and/or via search but isn't indexed in /new.)
Two ideas for how this could be improved:
- Add an option that causes PRAW to raise an exception once a call has finished if the underlying API call was repeated along the way. This would allow running check/recovery logic suited to the particular situation (e.g., removing any extra submissions found via the user history, /new, or /search). Something like this:
    reddit.raise_exception_after_retries = True  # proposed option, does not exist yet
    try:
        submission = subreddit.submit(...)
    except prawcore.exceptions.retry as e:  # proposed exception, does not exist yet
        # check/recovery logic to make sure we haven't made multiple submissions
        ...
    except Exception as e:
        # other error handling
        ...
    reddit.raise_exception_after_retries = False
The exception object would ideally also include some information about what happened: the number of retries, any errors returned prior to the call finally succeeding, etc.
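To make the kind of information I have in mind concrete, here is a rough sketch written outside of PRAW. RetriedCallError and call_and_report_retries are hypothetical names invented for this illustration; nothing like them exists in PRAW or prawcore today, and the real retry logic lives inside prawcore, so this only shows what such an exception could carry.

```python
class RetriedCallError(Exception):
    """Hypothetical: the call finished, but only after one or more retries."""

    def __init__(self, result, attempts, errors):
        super().__init__(
            f"call succeeded on attempt {attempts} "
            f"after {len(errors)} intermediate error(s)"
        )
        self.result = result        # whatever the final, successful call returned
        self.attempts = attempts    # total number of times the call was made
        self.errors = list(errors)  # exceptions raised by the earlier attempts


def call_and_report_retries(func, max_attempts=3, retry_on=(Exception,)):
    """Call func(), retrying on failure, and report when a retry was needed.

    Returns the result directly if the first attempt succeeds. If a later
    attempt succeeds, raises RetriedCallError so the caller can run its own
    duplicate checks. If every attempt fails, the last error is re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    errors = []
    for attempt in range(1, max_attempts + 1):
        try:
            result = func()
        except retry_on as exc:
            errors.append(exc)
            if attempt == max_attempts:
                raise
            continue
        if errors:
            raise RetriedCallError(result, attempt, errors)
        return result
```

A caller would catch RetriedCallError, read e.result to get the submission that was finally created, and use e.errors to decide whether an earlier attempt might have left a duplicate behind.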
- Add an option that disables retries completely, so that every error, timeout, etc. is raised on the first attempt. This would allow writing your own retry logic instead. Something like this:
    reddit.retries = False  # proposed option, does not exist yet
    for attempt in range(3):
        try:
            submission = subreddit.submit(...)
            break
        except Exception as e:
            # Call a check function to see whether the post was created or
            # half-created, so a partial post can be deleted before trying again.
            # submission_exists is a user-written helper, not part of PRAW.
            check, result = submission_exists(title=mytitle, url=myurl)
            if check:
                if result == "good":
                    break
                else:
                    check.delete()
    reddit.retries = True
The logic would probably be more complex than that, of course.
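For what it's worth, a check function along those lines might look roughly like the sketch below. It is only an illustration: the signature differs from the call above because the two listings are passed in explicitly (for example from reddit.user.me().submissions.new(limit=25) and subreddit.new(limit=25)), the "good"/"partial"/"missing" labels are my own, and it assumes that matching on title and url is enough to identify the post, which may not hold if Reddit normalizes the URL.

```python
def submission_exists(title, url, user_history, subreddit_new):
    """Look for a submission we may have just created.

    user_history and subreddit_new are iterables of objects with id, title
    and url attributes (PRAW Submission objects have all three).

    Returns (submission, status):
      - (submission, "good")    found in the user history and indexed in /new
      - (submission, "partial") found in the user history only
      - (None, "missing")       not found at all
    """
    # Collect the ids currently visible in the subreddit's /new listing.
    new_ids = {item.id for item in subreddit_new}
    for item in user_history:
        if item.title == title and item.url == url:
            status = "good" if item.id in new_ids else "partial"
            return item, status
    return None, "missing"
```

Because it takes plain iterables, it can be exercised with stand-in objects without touching the Reddit API.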
Describe alternatives you've considered
I believe the only way to do this right now would be to write a logging filter or to monkey patch PRAW or prawcore. Neither seems like a good solution.
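For reference, the logging approach would be something like the sketch below. RetryCounter and attach_retry_counter are hypothetical names, and the sketch assumes that prawcore reports retries on the "prawcore" logger with messages that start with "Retrying"; that is an assumption about the log format that should be checked against the installed version, which is part of why this feels fragile.

```python
import logging


class RetryCounter(logging.Handler):
    """Collect log messages that look like retry notices."""

    def __init__(self, marker="Retrying"):
        super().__init__(level=logging.DEBUG)
        self.marker = marker  # assumed prefix of retry log messages
        self.messages = []

    def emit(self, record):
        message = record.getMessage()
        if message.startswith(self.marker):
            self.messages.append(message)

    @property
    def count(self):
        return len(self.messages)

    def reset(self):
        self.messages.clear()


def attach_retry_counter(logger_name="prawcore"):
    """Attach a RetryCounter to the named logger and return it."""
    counter = RetryCounter()
    logging.getLogger(logger_name).addHandler(counter)
    return counter
```

The idea would be to call counter.reset() before subreddit.submit(...) and look at counter.count afterwards. It depends entirely on the wording of log messages and says nothing about timeouts that are never logged as retries.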
Additional context
Duplicate submissions and comments are the worst. That is all.
I really like the simplicity of the disable_retries option and would love to see a PR that introduces that capability for specific actions.
This issue is stale because it has been open for 20 days with no activity. Remove the Stale label or comment or this will be closed in 10 days.
This issue was closed because it has been stale for 10 days with no activity.