[BUG]: 2 second timeout on robot init is occasionally hit in tests

Open auscompgeek opened this issue 1 year ago • 3 comments

Problem description

We very occasionally hit an AssertionError here:

https://github.com/robotpy/pyfrc/blob/bc9ecadb8186e23f2a5240581d2c8d47e8e919c7/pyfrc/test_support/controller.py#L73-L75

So I am now dutifully following the instruction in that code comment.

We most recently hit this in CI. All our robot tests passed, except for the 1 test_all_autonomous that failed (the test runner then subsequently deadlocked): https://github.com/thedropbears/pycrescendo/actions/runs/8290744044/job/22689297401?pr=195#step:5:136

I can't see anything obviously wrong with my team's code.

(I've somehow never replicated this on my own machines; only on team members' laptops, and now CI.)

Operating System

Windows, MacOS, Linux

Installed Python Packages

╭──────────────────────────┬────────────┬──────────╮
│ name                     │ version    │ location │
├──────────────────────────┼────────────┼──────────┤
│ attrs                    │ 23.2.0     │          │
│ bcrypt                   │ 4.1.2      │          │
│ cffi                     │ 1.16.0     │          │
│ cryptography             │ 41.0.7     │          │
│ hypothesis               │ 6.97.4     │          │
│ iniconfig                │ 2.0.0      │          │
│ mypy                     │ 1.8.0      │          │
│ mypy-extensions          │ 1.0.0      │          │
│ numpy                    │ 1.26.3     │          │
│ packaging                │ 23.2       │          │
│ paramiko                 │ 3.4.0      │          │
│ phoenix6                 │ 24.2.0     │          │
│ photonlibpy              │ 2024.2.6   │          │
│ Pint                     │ 0.23       │          │
│ pip                      │ 23.3.1     │          │
│ pluggy                   │ 1.3.0      │          │
│ pycparser                │ 2.21       │          │
│ pyfrc                    │ 2024.0.1   │          │
│ PyNaCl                   │ 1.5.0      │          │
│ pynetconsole             │ 2.0.4      │          │
│ pyntcore                 │ 2024.3.1.0 │          │
│ pytest                   │ 8.0.0      │          │
│ pytest-integration       │ 0.2.3      │          │
│ pytest-reraise           │ 2.1.2      │          │
│ robotpy                  │ 2024.3.1.0 │          │
│ robotpy-apriltag         │ 2024.3.1.0 │          │
│ robotpy-cli              │ 2024.0.0   │          │
│ robotpy-hal              │ 2024.3.1.0 │          │
│ robotpy-halsim-gui       │ 2024.3.1.0 │          │
│ robotpy-installer        │ 2024.2.2   │          │
│ robotpy-navx             │ 2024.1.0   │          │
│ robotpy-rev              │ 2024.2.1   │          │
│ robotpy-wpilib-utilities │ 2024.0.0   │          │
│ robotpy-wpimath          │ 2024.3.1.0 │          │
│ robotpy-wpinet           │ 2024.3.1.0 │          │
│ robotpy-wpiutil          │ 2024.3.1.0 │          │
│ setuptools               │ 69.0.3     │          │
│ sortedcontainers         │ 2.4.0      │          │
│ tomli                    │ 2.0.1      │          │
│ tomlkit                  │ 0.12.3     │          │
│ typing_extensions        │ 4.9.0      │          │
│ wpilib                   │ 2024.3.1.0 │          │
╰──────────────────────────┴────────────┴──────────╯

Reproducible example code

No response

auscompgeek, Mar 15 '24 10:03

Extending.

  1. Also the exit timeout
  2. Very frequently while we are debugging.

Strictly speaking, unless the thread timing is tied to some realtime process priority, you can't be guaranteed that sim robot code runs within any particular time frame. It's just probability and whether or not the OS thinks it has something more important to do.
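
To make the point concrete, here is a minimal sketch of the general pattern (not pyfrc's actual code): a test waits on a background thread with a fixed wall-clock timeout. The names `wait_for_start` and `fake_robot_init` are hypothetical, and the `time.sleep` call stands in for either slow robot init or the OS scheduling the thread late.

```python
import threading
import time


def wait_for_start(start_delay: float, timeout: float) -> bool:
    """Return True if the simulated robot thread signals 'started' within `timeout` seconds."""
    started = threading.Event()

    def fake_robot_init() -> None:
        # Stand-in for slow robot init, or for a thread the OS schedules late.
        time.sleep(start_delay)
        started.set()

    thread = threading.Thread(target=fake_robot_init, daemon=True)
    thread.start()
    ok = started.wait(timeout)  # False if the timeout expires first
    thread.join()  # let the worker finish so nothing leaks between calls
    return ok


if __name__ == "__main__":
    print(wait_for_start(start_delay=0.0, timeout=2.0))
    print(wait_for_start(start_delay=0.5, timeout=0.05))
```

Any check of the form `assert wait_for_start(...)` passes or fails depending only on whether the worker happened to run before the deadline, which is why a loaded CI runner or a debugger pause can trip it even when the robot code itself is fine.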

gerth2, Jan 17 '25 04:01

I agree that you can't guarantee that the robot exits in any particular time frame, but if it's not exiting quickly then that's some issue that needs to be resolved -- though, I will concede that the issue is likely in a vendor dependency because nobody other than Python actually tests robot teardown.

If we adopted a variation of https://github.com/robotpy/pyfrc/pull/236, then I think it would just kill the remote robot process directly, and this timeout wouldn't matter anymore.
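
As a rough illustration of why a separate process sidesteps the exit timeout, the sketch below runs code in a child interpreter and kills it if it outlives a deadline. This is a generic standard-library example, not what PR #236 implements; `run_with_hard_timeout` is a hypothetical helper name.

```python
import subprocess
import sys


def run_with_hard_timeout(code: str, timeout: float) -> tuple[bool, int]:
    """Run `code` in a child Python process and kill it if it outlives `timeout`.

    Returns (killed, returncode).
    """
    proc = subprocess.Popen([sys.executable, "-c", code])
    try:
        proc.wait(timeout=timeout)
        return False, proc.returncode
    except subprocess.TimeoutExpired:
        # The child did not exit in time: terminate it forcibly instead of
        # waiting for a clean teardown that may never finish.
        proc.kill()
        proc.wait()  # reap the child so the return code is populated
        return True, proc.returncode


if __name__ == "__main__":
    killed, returncode = run_with_hard_timeout("import time; time.sleep(30)", 0.5)
    print(killed, returncode)
```

Unlike a thread, a child process can be terminated from outside, so a hung teardown costs at most the timeout and cannot deadlock the test runner.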

virtuald, Jan 17 '25 15:01

Please try isolated mode introduced in pyfrc 2025.1.0 to see if this goes away.

virtuald, Feb 12 '25 05:02