datajoint-python icon indicating copy to clipboard operation
datajoint-python copied to clipboard

Pytest `test_datetime_serialization_speed` occasionally fails

Open ethho opened this issue 2 years ago • 6 comments

Bug Report

Description

The pytest tests/test_blob.py::test_datetime_serialization_speed fails non-deterministically.

Reproducibility

Include:

  • OS: Ubuntu 23.04
  • CPU: Intel i7-7600U (4) @ 3.900GHz
  • Mem: 16GB
  • Python: 3.8.17
  • MySQL Version: 8.0
  • MySQL Deployment Strategy (local-native | local-docker | remote): local-docker (docker://datajoint/mysql:8.0)
  • DataJoint Version: 0.14.1 commit 03db252aaf956e6cd71e08241eb16b3de43a1d10

Steps to Reproduce

In a terminal window in the VSCode DevContainer on dev-tests branch (03db252aaf956e6cd71e08241eb16b3de43a1d10), issue for i in {1..10}; do pytest tests/test_blob.py::test_datetime_serialization_speed; done to run the test 10 times. On my laptop, the test fails once out of 10 times. Since this issue seems hardware-dependent, it might be impossible to reproduce on some systems.

Error Stack

Console Logs
$ for i in {1..10}; do pytest tests/test_blob.py::test_datetime_seria
lization_speed; done
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.57s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.73s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.72s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.75s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py F                                                                                                                [100%]

================================================================ FAILURES =================================================================
____________________________________________________ test_datetime_serialization_speed ____________________________________________________

    def test_datetime_serialization_speed():
        # If this fails that means for some reason deserializing/serializing
        # np arrays of np.datetime64 types is now slower than regular arrays of datetime
    
        optimized_exe_time = timeit.timeit(
            setup="myarr=pack(np.array([np.datetime64('2022-10-13 03:03:13') for _ in range(0, 10000)]))",
            stmt="unpack(myarr)",
            number=10,
            globals=globals(),
        )
        print(f"np time {optimized_exe_time}")
        baseline_exe_time = timeit.timeit(
            setup="myarr2=pack(np.array([datetime(2022,10,13,3,3,13) for _ in range (0, 10000)]))",
            stmt="unpack(myarr2)",
            number=10,
            globals=globals(),
        )
        print(f"python time {baseline_exe_time}")
    
>       assert optimized_exe_time * 900 < baseline_exe_time
E       assert (0.00273081300110789 * 900) < 2.3093061309991754

tests/test_blob.py:233: AssertionError
---------------------------------------------------------- Captured stdout call -----------------------------------------------------------
np time 0.00273081300110789
python time 2.3093061309991754
========================================================= short test summary info =========================================================
FAILED tests/test_blob.py::test_datetime_serialization_speed - assert (0.00273081300110789 * 900) < 2.3093061309991754
============================================================ 1 failed in 2.75s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.91s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 2.70s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 3.70s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 3.91s ============================================================
=========================================================== test session starts ===========================================================
platform linux -- Python 3.8.17, pytest-7.4.3, pluggy-1.3.0
rootdir: /workspaces/ethho-datajoint-python
plugins: cov-4.1.0, Faker-20.1.0
collected 1 item                                                                                                                          

tests/test_blob.py .                                                                                                                [100%]

============================================================ 1 passed in 3.09s ============================================================

Expected Behavior

The pytest tests/test_blob.py::test_datetime_serialization_speed passes every time without failing, regardless of hardware.

Screenshots

N/A

Additional Research and Context

This test fails often (~10% of invocations) when it is run on my laptop (2017 ThinkPad X1 Carbon) in a DevContainer. The test fails very infrequently (~1% of invocations) when run in CI; one example is on this CI run in #1133. I suspect that a slow or busy CPU causes packing of np.datetime64 arrays to happen unexpectedly slowly:

https://github.com/ethho/datajoint-python/blob/e755b9763d950137718fb8447967ceef5172f57d/tests/test_blob.py#L218-L223

    optimized_exe_time = timeit.timeit(
        setup="myarr=pack(np.array([np.datetime64('2022-10-13 03:03:13') for _ in range(0, 10000)]))",
        stmt="unpack(myarr)",
        number=10,
        globals=globals(),
    )

Additional Context

As of 12/18/2023, this test doesn't exist on branch master. This issue was discovered during development of #1142.

ethho avatar Dec 18 '23 22:12 ethho

I don't think the execution time is the right thing to test since it's not deterministic. The test should verify correctness and validity. Should we remove the time-based tests? What are the best practices?

dimitri-yatsenko avatar Sep 16 '24 19:09 dimitri-yatsenko

This is not a bug since this is only an inadequate test.

dimitri-yatsenko avatar Sep 16 '24 20:09 dimitri-yatsenko

Should we remove the time-based tests? What are the best practices?

I say we loosen the testing constraints (I think you already did this on master branch), or remove the test entirely. Tests will sometimes share physical hardware (e.g. in GitHub Actions CI runners), so it's not safe to assume that a test has access to >=1 CPU without process switching for the entire duration of the test.

ethho avatar Sep 19 '24 18:09 ethho

I would remove this test altogether. The tests should be for correctness and not for timing since it cannot be guaranteed.

dimitri-yatsenko avatar Sep 23 '24 19:09 dimitri-yatsenko

Closing due to inactivity (older than 365 days), please reopen if it's still applicable.

drewyangdev avatar Mar 14 '25 03:03 drewyangdev

This issue is stale because it has been open for 45 days with no activity.

github-actions[bot] avatar Apr 29 '25 02:04 github-actions[bot]