Office365-REST-Python-Client icon indicating copy to clipboard operation
Office365-REST-Python-Client copied to clipboard

Is there a way to patch GraphClient to include timeout, retry and backoff?

Open dekiesel opened this issue 1 year ago • 3 comments

I am using GraphClient to connecto to Sharepoint and Planner. I check a long list of local files and compare them to files in SharePoint. If the files don't exist I create them in SharePoint and create a task in planner.

The issue I am facing is that this oftentimes timeouts, even in my unit tests. Is there away to patch the underlying requests object so that I can add timeout, retry and backoff?

The process is supposed to run unattended and these constant timeout issues are hanging the process indefinitely.

Thanks!

dekiesel avatar May 27 '24 07:05 dekiesel

I monkey patched the library to support timeout and retry.

def outer_execute_request_direct(self, request):
    retry = 5
    for i in range(retry):
        try:
            # response = self.inner_execute_request_direct(request)
            response = inner_execute_request_direct(self, request)
            break
        except requests.exceptions.ReadTimeout as rt:
            if i < retry - 1:
                continue
            else:
                raise rt
    return response


def inner_execute_request_direct(self, request):
    # type: (RequestOptions) -> requests.Response
    timeout = 60
    """Execute the client request"""
    self.beforeExecute.notify(request)
    if request.method == HttpMethod.Post:
        if request.is_bytes or request.is_file:
            response = requests.post(
                url=request.url,
                headers=request.headers,
                data=request.data,
                auth=request.auth,
                verify=request.verify,
                proxies=request.proxies,
                timeout=timeout,
            )
        else:
            response = requests.post(
                url=request.url,
                headers=request.headers,
                json=request.data,
                auth=request.auth,
                verify=request.verify,
                proxies=request.proxies,
                timeout=timeout,
            )
    elif request.method == HttpMethod.Patch:
        response = requests.patch(
            url=request.url,
            headers=request.headers,
            json=request.data,
            auth=request.auth,
            verify=request.verify,
            proxies=request.proxies,
            timeout=timeout,
        )
    elif request.method == HttpMethod.Delete:
        response = requests.delete(
            url=request.url,
            headers=request.headers,
            auth=request.auth,
            verify=request.verify,
            proxies=request.proxies,
            timeout=timeout,
        )
    elif request.method == HttpMethod.Put:
        response = requests.put(
            url=request.url,
            data=request.data,
            headers=request.headers,
            auth=request.auth,
            verify=request.verify,
            proxies=request.proxies,
            timeout=timeout,
        )
    else:
        response = requests.get(
            url=request.url,
            headers=request.headers,
            auth=request.auth,
            verify=request.verify,
            stream=request.stream,
            proxies=request.proxies,
            timeout=timeout,
        )
    return response




ClientRequest.execute_request_direct = outer_execute_request_direct

Now you can instantiate your client object.

With this my unittests don't fail if SPO doesn't answer.

The drawback is that the values are hardcoded. It would be very nice if this library supported this (and maybe even backoff) ootb.

dekiesel avatar May 27 '24 14:05 dekiesel

I wasn't happy with my results from yesterday so I improved upon the code by adding requests.Session.

tl;dr: sessions seem to perform much better (see analysis at the end).

class EnhancedSession(requests.Session):
    def __init__(self, timeout=60):
        self.timeout = timeout
        return super().__init__()

    def request(self, method, url, **kwargs):
        if "timeout" not in kwargs:
            kwargs["timeout"] = self.timeout
        return super().request(method, url, **kwargs)

EnhancedSession adds a timeout of 60 seconds to every session-call (can be overwritten at time of calling .get, .post ...).

I then subclassed ClientRequest to add an EnhancedSession and a Retry object that retries 5 times with a backoff_factor of 0.1.

The Retry object is then mounted to the session.

class MyClientRequest(ClientRequest):
    def __init__(
        self,
    ):
        self.session = EnhancedSession()
        retries = Retry(
            total=5,
            backoff_factor=0.1,
        )
        self.session.mount("http://", HTTPAdapter(max_retries=retries))
        self.session.mount("https://", HTTPAdapter(max_retries=retries))
        return super().__init__()

execute_request_direct is basically the same function as the one ClientRequest has, but this one uses the session object to make calls.

    def execute_request_direct(self, request):
        """Execute the client request"""
        self.beforeExecute.notify(request)  # type: ignore
        if request.method == HttpMethod.Post:
            if request.is_bytes or request.is_file:
                response = self.session.post(
                    url=request.url,
                    headers=request.headers,
                    data=request.data,
                    auth=request.auth,
                    verify=request.verify,
                    proxies=request.proxies,
                )
            else:
                response = self.session.post(
                    url=request.url,
                    headers=request.headers,
                    json=request.data,
                    auth=request.auth,
                    verify=request.verify,
                    proxies=request.proxies,
                )
        elif request.method == HttpMethod.Patch:
            response = self.session.patch(
                url=request.url,
                headers=request.headers,
                json=request.data,
                auth=request.auth,
                verify=request.verify,
                proxies=request.proxies,
            )
        elif request.method == HttpMethod.Delete:
            response = self.session.delete(
                url=request.url,
                headers=request.headers,
                auth=request.auth,
                verify=request.verify,
                proxies=request.proxies,
            )
        elif request.method == HttpMethod.Put:
            response = self.session.put(
                url=request.url,
                data=request.data,
                headers=request.headers,
                auth=request.auth,
                verify=request.verify,
                proxies=request.proxies,
            )
        else:
            response = self.session.get(
                url=request.url,
                headers=request.headers,
                auth=request.auth,
                verify=request.verify,
                stream=request.stream,
                proxies=request.proxies,
            )
        return response

The last step before instantiating client is monkey patching ClientRequest:

ClientRequest = MyClientRequest

I ran my tests with yesterdays code five times (runtimes in seconds, rounded up):

# runtime (no sessions)
1 109
2 91
3 91
4 89
5 90
--- ---
Average 94

Then, I did the same with the code using sessions:

# runtime (sessions)
1 49
2 45
3 45
4 46
5 48
--- ---
Average 47

All in all an improvement of 50%. As these are very short unit tests I expect the improvement in busy applications to be even bigger due to the reduced overhead of creating the connection and logging in.

Let me know of you would like me to submit a PR.

dekiesel avatar May 28 '24 07:05 dekiesel

Another option to get timeouts in would be to add it to RequestOptions, if I understand the code correctly.

At the moment my code is unusable without timeouts as SPO intermittently stops answering and office365 waits for answers indefinitely.

I still think it would be beneficial to use sessions though.

dekiesel avatar May 29 '24 08:05 dekiesel