fix(web): enhanced timer for prediction algorithm
Hopefully addresses some aspects of #6985.
This PR is primarily designed to address some CI unit-testing stability issues. Since our browser-based unit testing is handled via an external service, our unit tests may well be run alongside others in a manner that causes context switching. We've had issues arise from context switching before - see #5467 / #5479 - so it's quite possible that some of our prediction instability stems from the same underlying problem; a user often does have background threads and processes running on their mobile device, after all.
Given the similarity between this and the intermittent failures where predictive text produces no results for input that otherwise reliably produces suggestions, if my suspicions are correct, this may actually help mitigate some aspects of #6985.
The inner class ExecutionTimer here is an attempt to determine the predictive-text search's active execution time (that is, "actual time on the processor") by detecting likely context-switching time intervals and ignoring them. If the longest interval observed at any point is at least twice the combined length of the next two longest intervals, we'll call it an outlier. Zero-length intervals will also be ignored, since otherwise even a 1ms interval would automatically be an outlier.
- [1, 1, 4]: the '4' is considered an outlier - it's at least twice the combined length of the two 1s ((1 + 1) * 2 = 4).
- [1, 2, 4]: the '4' isn't - (1 + 2) * 2 = 6 > 4, after all.
Note that excluded outliers aren't considered when determining future outliers. The expectation/assumption is that each measured loop iteration should take a consistent amount of time... unless affected by context switching. To facilitate this, I've set things up so that the timer (as used) avoids tracking time spent in the search's yield state.
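For illustration, here's a minimal sketch of the outlier-filtering idea in TypeScript. This is not the actual ExecutionTimer code - the class name, the `add()`/`activeTimeMs` members, and the `measureSearch()` helper are all hypothetical:

```ts
// Sketch only: accumulates measured interval durations while discarding
// likely context-switching outliers, per the rule described above.
class IntervalOutlierFilter {
  // Durations (in ms) of accepted intervals; rejected outliers are removed.
  private intervals: number[] = [];
  // Sum of accepted intervals - approximates "actual time on the processor".
  private totalMs = 0;

  add(durationMs: number): void {
    if (durationMs <= 0) {
      // Zero-length intervals are ignored; otherwise even a 1ms interval
      // would automatically look like an outlier.
      return;
    }

    this.intervals.push(durationMs);
    this.totalMs += durationMs;

    // Check the three longest accepted intervals so far.
    const sorted = [...this.intervals].sort((a, b) => b - a);
    if (sorted.length >= 3) {
      const [longest, second, third] = sorted;
      // Outlier rule: the longest interval is at least twice the combined
      // length of the next two - e.g. [1, 1, 4] flags the 4, while
      // [1, 2, 4] does not, since (1 + 2) * 2 = 6 > 4.
      if (longest >= 2 * (second + third)) {
        // Treat it as context-switching time: drop it from the total and
        // never consider it in future outlier checks.
        this.totalMs -= longest;
        this.intervals.splice(this.intervals.indexOf(longest), 1);
      }
    }
  }

  get activeTimeMs(): number {
    return this.totalMs;
  }
}

// Usage sketch, assuming a step-based search loop:
function measureSearch(steps: Array<() => void>): number {
  const timer = new IntervalOutlierFilter();
  let lastMark = performance.now();
  for (const step of steps) {
    step();
    const now = performance.now();
    timer.add(now - lastMark);
    // Restarting the measurement here - and likewise after each yield -
    // keeps the search's yield state out of the measured intervals.
    lastMark = now;
  }
  return timer.activeTimeMs;
}
```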
@keymanapp-test-bot skip
I think we should be fine without user testing... though I could see an argument for a mild bit of user testing for predictive text in the mobile apps if someone wanted to push for it.
User Test Results
Test specification and instructions
User tests are not required
Test Artifacts
- Android
- Developer
- iOS
- Keyboards
- Web
- Windows
I've re-run the lm-worker tests multiple times now - the unit test addressed by these changes hasn't failed yet in at least 5 attempts.
...not that this prevented BrowserStack from failing to run against one of the configured test-browser configurations for some of those attempts. 😞
The issue noted in this comment (https://github.com/keymanapp/keyman/pull/7036#issuecomment-1207594363) on the base branch also occurs here, including for the LMLayer's in-browser tests.
> I think we should be fine without user testing... though I could see an argument for a mild bit of user testing for predictive text in the mobile apps if someone wanted to push for it.
It's a functional change internally. I'd like to see some user tests.
I've got no idea what sort of behaviour we are going to see on mobile devices for these scenarios. Sounds like some testing on low-end devices is seriously needed.
What did you have in mind for this? Do we have access to suitable devices for such tests? I assume that we'd need physical devices to test this; after all, this PR's changes are aimed at handling OS behavior that'd be very difficult to emulate directly.
Or... is that why it's been given approval - for a 'test in the wild' on alpha? Not the sort of thing we'd usually do, so unless explicitly stated otherwise, I'll avoid merging for now.
> What did you have in mind for this? Do we have access to suitable devices for such tests? I assume that we'd need physical devices to test this; after all, this PR's changes are aimed at handling OS behavior that'd be very difficult to emulate directly.
Yeah, I am struggling with this. We do have a couple of old devices in the Keyman office there, don't we?
> Or... is that why it's been given approval - for a 'test in the wild' on alpha? Not the sort of thing we'd usually do, so unless explicitly stated otherwise, I'll avoid merging for now.
I'm not sure what else we can do here. So if possible, test on those old devices in the office there. And then go ahead and merge.
(Of course, the alternative is to avoid timing-based approaches altogether and go with a 'number of iterations' approach, where we determine a suitable iteration count based on empirical testing of what would be a typically acceptable response time on some old devices. Then, as devices get faster over time, we can gradually raise the threshold; see the sketch below.)
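A rough sketch of what that alternative might look like, purely for illustration - `MAX_SEARCH_ITERATIONS`, its value, and `stepFn` are all hypothetical, with the real budget to be derived from the empirical device testing described above:

```ts
// Hypothetical iteration budget; the real value would come from empirical
// testing of acceptable response times on old, low-end devices.
const MAX_SEARCH_ITERATIONS = 10000;

// Runs search steps until the search finishes or the budget is exhausted.
// stepFn returns true while more work remains.
function runSearchWithBudget(stepFn: () => boolean): void {
  for (let i = 0; i < MAX_SEARCH_ITERATIONS; i++) {
    if (!stepFn()) {
      return; // search completed within the budget
    }
  }
  // Budget exhausted: stop here and surface the best results found so far.
}
```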
Changes in this pull request will be available for download in Keyman version 16.0.60-alpha