ElasticPress Some feedback and ideas around the dummy request in Autosuggest.php

Hey folks! I was reviewing the code in Autosuggest.php recently and a few things stood out to me, which I'd like to share with you below. Please note that I do not have deep knowledge of the ElasticPress codebase so some of the points may be quite off, but either way I think a few of them will be useful.

The primary purpose of the generate_search_query() method seems to be able to obtain a JSON payload with a ploceholder, which can later be modified and sent to an ES endpoint via JavaScript. I understand the need for a dummy new WP_Query to be able to generate that JSON payload, but what I don't understand is the fact that it's being followed-through in intercept_search_request with an actual wp_remote_request(). We already have the $this->autosuggest_query set, so why are we doing a remote request, then caching that request (with the stripped response) and never using it anywhere?
A call to wp_remote_request() may result in a WP_Error object, and the caching techniques in intercept_search_request() may result in that error object being stored in Redis/Memcached indefinitely, because a WP_Error is not equal to false. Luckily though, it doesn't seem to be used anyway, as mentioned earlier.
In that same intercept_search_request() method, the cache key for transients is different from a cache key to be used with a persistent cache backend. Not sure why that's the case, nor why we have a dedicated ep_autosuggest group to potentially hold a single item for a fake dummy request that's not even needed.
There's some logic inside WordPress transients that already defaults to using object cache functions instead of the DB if a persistent backend is available, so the whole split on wp_using_ext_object_cache() is really unnecessary, but we do have to settle on a cache key.
The way persistent object cache is handled in delete_cached_query() is a bit questionable. Most persistent backends, even ones extending WP_Object_Cache will not do anything if their local ->$cache is touched directly. There is no uniform way to delete an entire group unfortunately, as many backends do not even have a concept of a "group", however, if we use a single cache key, just like with transients, we will not have to delete any groups.
The remove_filter( 'ep_intercept_remote_request', '__return_true' ); in generate_search_query() may remove an explicit __return_true set by something else (at the default priority) earlier. If we want to add a filter and remove it soon after, it's best to avoid such global callbacks. Perhaps an anonymous $return_true function, or a unique method, similar to return_empty_posts would work better.

I'll be happy to elaborate on any of these problems, so let me know if you have any questions or ideas.

Jul 12 '22 12:07 kovshenin

Hey @kovshenin, thanks for taking the time to leave your comments here!

As this code is a result of several different iterations, I had to do some investigation to answer some of your questions but here are some of the reasons why things are like they are.

As autosuggest queries run completely unauthorized from the final user's browser, that request (authenticated) is fired to make the ES server aware of what a "normal" query would look like. The hosting service could use that request as a template, for example.
Although I get your point, the objective here is simply to avoid sending to the ES hosting service the same query. As the cache should be erased if a potential change in the autosuggest query is introduced (the delete_cached_query method), the request would be fired again then. (more on that method in #5.)
I found the reason for that in this PR: https://github.com/10up/ElasticPress/pull/1565. FWIW, in some specific scenarios, more complex sites could have different autosuggest queries.
Yup, if we decide to use a single key we can use a simple call to set_transient.
It is questionable indeed and deserves to be revisited as soon as possible. Hopefully deleting a group will be easier in the near future.
Noted, thanks

That all said, we probably need to revisit most of this implementation and potentially deprecate at least a part of it. I'll bring this to our next internal sync and leave any updates right after. Thanks again!

Jul 15 '22 15:07 felipeelia

We are going to send the authenticated request during syncs and after saving the weighting dashboard.

Feb 27 '23 16:02 anjulahettige