pymodbus icon indicating copy to clipboard operation
pymodbus copied to clipboard

Previous response from modbus device returned as current response when using sync TCP over network with lag

Open thormick opened this issue 5 years ago • 11 comments

Versions

  • Python: 3.8.2
  • OS: Ubuntu 20.04.1 LTS
  • Pymodbus: 2.4.0
  • Modbus Hardware (if used): Hardware independent, verified with local testing against dummy sync server

Pymodbus Specific

  • Server: tcp - sync
  • Client: tcp - sync

Description

TCP sync requests over a laggy network connection against different ranges of registers eventually give the timed-out result of one register range performed previously as the reading of the other register range.

Because the transaction manager and socket framer ignore the transaction ID of the response (my client's app uses synchronized TCP requests) and just returns the response that has come in last, pymodbus can under certain circumstances silently return the result of the previous request made as if it was the result to the one being currently made. This is Bad for my client, whose application makes sequential modbus requests to different ranges of registers of a single unit and needs to communicate over networks with sporadic periods of high lag (problematic VPN connections, mobile networks, etc...).

A step-by-step description of what happens, where an application is set up to read two or more separate ranges of registers in sequential fashion against a single device, and there is network lag. What will happen sooner or later is:

  1. Read against register range 1 fails due to timeout. ModbusTransactionManager._recv raises InvalidMessageReceivedException. This is caught in ModbusTransactionManager._transact, which closes the connection, and as such the data will not cause an issue when it does arrive. ModbusTransactionManager.execute registers this as that the device didn't respond in a timely fashion, and adds it to the _no_response_devices list. My client's application sees that there is no valid response, logs an error and sleeps a few seconds.
  2. Read against register range 1. pymodbus sees the connection is closed and opens it. There is another timeout, but because the device this time is already on the _no_response_devices, there is no exception from ModbusTransactionManager._recv, even if it fails to read any data before the timeout, and the connection is not closed. ModbusTransactionManager.execute just returns an exception with "No Response received from the remote unit/Unable to decode response". My client's application sees that there is no valid response from this register range, logs an error and sleeps a few seconds.
  3. Read against register range 1. This time the response to this request, again, does not arrive before the timeout. But: The connection has been left open and the response from previous request does arrive in time (or arrived while the application was sleeping for some seconds, more likely). The transaction ID is one less than it should be, but this is ignored by pymodbus, and it's a result from the right range of registers, just delayed whatever duration the main loop is configured to sleep for. ModbusTransactionManager.execute removes the device from the _no_response_devices list. My client's application parses this as normal and stores it to processes it and pass it on when all reads are done.
  4. Read against register range 2 is attempted. Again there is a delay and the response to this does not arrive before the timeout, but the previous one has already arrived. But this is a response to the previous request against the other register range, and still it's just parsed and processed as normal data from register range 2. The TID is still one less than it should be and it might even be longer than expected, but all of this is ignored. Hilarity ensures, ints are interpreted as floats and the wrong scaling is applied by my client's application and inserted into their database, chaos propagates down the line, alarm messages are sent indicating that more power has been drawn from an outhouse light bulb in fifteen minutes than what the Eurasian continent normally uses in a year, my client's client's boss calls my client's boss and complains that there is junk in reports, human sacrifice, cats and dogs living together, etc, etc...
  5. This keeps on for as long as the network lag is over the timeout threshold. Once it falls below that two responses arrive at the same time, the first of which is ignored (addTransaction is invoked with the same TID twice, overwriting the first), and so everything seems to magically fix itself.

The case where a late-returned response is shorter than expected is also Not Good, but in this case the application will hit a "IndexError: list index out of range" before it passes on junk.

Code and Logs

To test this I I've used a local dummy server based on synchronous server on port 5020, that just returns constants:

    values = [1, 0, 2, 0, 3, 0, 4, 0 ...etc, etc... 63, 0, 64, 0]
    msdb = ModbusSequentialDataBlock(0x00, values)

msdb is then used for each initializer of the ModbusSlaveContext.

Then I add lag to port 5020:

sudo tc qdisc add dev lo root handle 1: prio priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sudo tc qdisc add dev lo parent 1:2 handle 20: netem delay 550ms 500ms 25%
sudo tc filter add dev lo parent 1:0 protocol ip u32 match ip sport 5020 0xffff flowid 1:2

To highlight the issue I've added the following lines to DictTransactionManager.addTransaction just before the last line:

        if tid != request.transaction_id:
            _logger.warning("Adding transaction with incorrect TID!!!! "
                            f"Using TID {tid}, response TID: {request.transaction_id}")
        if tid in self.transactions:
            _logger.warning(f"Replacing transaction {tid}")

Then I configure the client's application with timeout=0.75, to output to console instead of a database, and use one of their configurations that does three register ranges against a single modbus device. I start the application, which creates a modbus client and tries to do three read_holding_registers calls against different register ranges in a loop. Eventually this causes one of the ranges to be returned as another range.

A log of a run like this, edited to remove most logging by my client's application, (mostly debug-type messages that they use to make sure they've got scaling set right on their side and such). The first run is successful, the next two fail, and the last one returns junk. My client uses the term "group" to describe preconfigured ranges of registers:

2020-10-06 14:03:59,325 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:03:59,326 - DEBUG - pymodbus.transaction - Running transaction 9
2020-10-06 14:03:59,326 - DEBUG - pymodbus.transaction - SEND: 0x0 0x9 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x0 0x0 0x28
2020-10-06 14:03:59,326 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:00,078 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:00,078 - DEBUG - pymodbus.transaction - RECV: 0x0 0x9 0x0 0x0 0x0 0x53 0x1 0x3 0x50 0x0 0x0 0x0 0x2 0x0 0x0 0x0 0x3 0x0 0x0 0x0 0x4 0x0 0x0 0x0 0x5 0x0 0x0 0x0 0x6 0x0 0x0 0x0 0x7 0x0 0x0 0x0 0x8 0x0 0x0 0x0 0x9 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0xb 0x0 0x0 0x0 0xc 0x0 0x0 0x0 0xd 0x0 0x0 0x0 0xe 0x0 0x0 0x0 0xf 0x0 0x0 0x0 0x10 0x0 0x0 0x0 0x11 0x0 0x0 0x0 0x12 0x0 0x0 0x0 0x13 0x0 0x0 0x0 0x14 0x0 0x0 0x0 0x15
2020-10-06 14:04:00,078 - DEBUG - pymodbus.transaction - Adding transaction 9
2020-10-06 14:04:00,079 - DEBUG - pymodbus.transaction - Getting transaction 9
2020-10-06 14:04:00,079 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:00,079 - INFO - util.ms_input - Read register with values [0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21]
2020-10-06 14:04:00,086 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:00,086 - DEBUG - pymodbus.transaction - Running transaction 10
2020-10-06 14:04:00,086 - DEBUG - pymodbus.transaction - SEND: 0x0 0xa 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x28 0x0 0x28
2020-10-06 14:04:00,086 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:00,200 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:00,200 - DEBUG - pymodbus.transaction - RECV: 0x0 0xa 0x0 0x0 0x0 0x53 0x1 0x3 0x50 0x0 0x0 0x0 0x16 0x0 0x0 0x0 0x17 0x0 0x0 0x0 0x18 0x0 0x0 0x0 0x19 0x0 0x0 0x0 0x1a 0x0 0x0 0x0 0x1b 0x0 0x0 0x0 0x1c 0x0 0x0 0x0 0x1d 0x0 0x0 0x0 0x1e 0x0 0x0 0x0 0x1f 0x0 0x0 0x0 0x20 0x0 0x0 0x0 0x21 0x0 0x0 0x0 0x22 0x0 0x0 0x0 0x23 0x0 0x0 0x0 0x24 0x0 0x0 0x0 0x25 0x0 0x0 0x0 0x26 0x0 0x0 0x0 0x27 0x0 0x0 0x0 0x28 0x0 0x0 0x0 0x29
2020-10-06 14:04:00,200 - DEBUG - pymodbus.transaction - Adding transaction 10
2020-10-06 14:04:00,200 - DEBUG - pymodbus.transaction - Getting transaction 10
2020-10-06 14:04:00,200 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:00,200 - INFO - util.ms_input - Read register with values [0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, 30, 0, 31, 0, 32, 0, 33, 0, 34, 0, 35, 0, 36, 0, 37, 0, 38, 0, 39, 0, 40, 0, 41]
2020-10-06 14:04:00,205 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:00,205 - DEBUG - pymodbus.transaction - Running transaction 11
2020-10-06 14:04:00,205 - DEBUG - pymodbus.transaction - SEND: 0x0 0xb 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x50 0x0 0x10
2020-10-06 14:04:00,205 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:00,659 - WARNING - util.gapit_modbus_client - <MODBUS_READ_DELAY> Modbus device read took 0.4535 seconds, returned 8 bytes of 8 expected
2020-10-06 14:04:00,659 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:00,660 - DEBUG - pymodbus.transaction - RECV: 0x0 0xb 0x0 0x0 0x0 0x23 0x1 0x3 0x20 0x0 0x0 0x0 0x2a 0x0 0x0 0x0 0x2b 0x0 0x0 0x0 0x2c 0x0 0x0 0x0 0x2d 0x0 0x0 0x0 0x2e 0x0 0x0 0x0 0x2f 0x0 0x0 0x0 0x30 0x0 0x0 0x0 0x31
2020-10-06 14:04:00,660 - DEBUG - pymodbus.transaction - Adding transaction 11
2020-10-06 14:04:00,660 - DEBUG - pymodbus.transaction - Getting transaction 11
2020-10-06 14:04:00,661 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:00,661 - INFO - util.ms_input - Read register with values [0, 42, 0, 43, 0, 44, 0, 45, 0, 46, 0, 47, 0, 48, 0, 49]
-------------------------Reading-------------------------------
Slave id: 1 with label label1
Tstamp: 0
Group name: Gapit-03-07 - Phase values
0x0 - Voltage phase L1 = 0.2 [V]
0x2 - Current L1 = 0.003 [A]
0x4 - Active power L1 = 4.0 [W]
0x6 - Reactive power L1 = 5.0 [var]
0x8 - Power factor L1 = 0.06 [PF]
0x0A - Voltage phase L2 = 0.7 [V]
0x0C - Current L2 = 0.008 [A]
0x0E - Active power L2 = 9.0 [W]
0x10 - Reactive power L2 = 10.0 [var]
0x12 - Power factor L2 = 0.11 [PF]
0x14 - Voltage phase L3 = 1.2 [V]
0x16 - Current L3 = 0.013 [A]
0x18 - Active power L3 = 14.0 [W]
0x1A - Reactive power L3 = 15.0 [var]
0x1C - Power factor L3 = 0.16 [PF]
0x1E - 3 phase - Active power = 17.0 [W]
0x20 - 3 phase - Inductive power = 18.0 [varL]
0x22 - 3 phase - Capacitive power = 19.0 [varC]
0x24 - 3 phase - Cos fi = 0.2 [Cos fi]
0x26 - 3 phase - Power factor = 0.21 [PF]
----
Tstamp: 0
Group name: Gapit-03-07 - Total values
0x28 - Frequency = 2.2 [Hz]
0x2A - Voltage line L1-L2 = 2.3 [V]
0x2C - Voltage line L2-L3 = 2.4 [V]
0x2E - Voltage line L3-L1 = 2.5 [V]
0x30 - % THD V L1 = 2.6 [%]
0x32 - % THD V L2 = 2.7 [%]
0x34 - % THD V L3 = 2.8 [%]
0x36 - % THD A L1 = 2.9 [%]
0x38 - % THD A L2 = 3.0 [%]
0x3A - % THD A L3 = 3.1 [%]
0x3C - 3 phase - Active energy = 32.0 [Wh]
0x3E - 3 phase - Inductive reactive energy = 33.0 [varLh]
0x40 - 3 phase - Capacitive reactive energy = 34.0 [varCh]
0x42 - 3 phase - Apparent power = 35.0 [VA]
0x44 - Maximum demand = 36.0 [w/VA/mA]
0x46 - Three-phase current (average) = 0.037 [A]
0x48 - Neutral current = 0.038 [A]
0x4A - Apparent power L1 = 39.0 [VA]
0x4C - Apparent power L2 = 40.0 [VA]
0x4E - Apparent power L3 = 41.0 [VA]
----
Tstamp: 0
Group name: Gapit-03-07 - Energy
0x50 - Temperature = 4.2 [A]
0x52 - Maximum demand A2 = 0.043 [A]
0x54 - Maximum demand A3 = 0.044 [A]
0x56 - 3 phase - Apparent energy = 45.0 [Vah ]
0x58 - 3 phase - Active energy generated = 46.0 [Wh]
0x5A - 3 phase - Inductive energy generated = 47.0 [varLh]
0x5C - 3 phase - Capacitive energy generated = 48.0 [varCh]
0x5E - 3 phase - Apparent energy generated = 49.0 [Vah]
----
2020-10-06 14:04:00,719 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:00,719 - DEBUG - pymodbus.transaction - Running transaction 12
2020-10-06 14:04:00,719 - DEBUG - pymodbus.transaction - SEND: 0x0 0xc 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x0 0x0 0x28
2020-10-06 14:04:00,720 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:01,017 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:01,017 - DEBUG - pymodbus.transaction - RECV: 0x0 0xc 0x0 0x0 0x0 0x53 0x1 0x3 0x50 0x0 0x0 0x0 0x2 0x0 0x0 0x0 0x3 0x0 0x0 0x0 0x4 0x0 0x0 0x0 0x5 0x0 0x0 0x0 0x6 0x0 0x0 0x0 0x7 0x0 0x0 0x0 0x8 0x0 0x0 0x0 0x9 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0xb 0x0 0x0 0x0 0xc 0x0 0x0 0x0 0xd 0x0 0x0 0x0 0xe 0x0 0x0 0x0 0xf 0x0 0x0 0x0 0x10 0x0 0x0 0x0 0x11 0x0 0x0 0x0 0x12 0x0 0x0 0x0 0x13 0x0 0x0 0x0 0x14 0x0 0x0 0x0 0x15
2020-10-06 14:04:01,018 - DEBUG - pymodbus.transaction - Adding transaction 12
2020-10-06 14:04:01,018 - DEBUG - pymodbus.transaction - Getting transaction 12
2020-10-06 14:04:01,018 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:01,018 - INFO - util.ms_input - Read register with values [0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21]
2020-10-06 14:04:01,035 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:01,035 - DEBUG - pymodbus.transaction - Running transaction 13
2020-10-06 14:04:01,035 - DEBUG - pymodbus.transaction - SEND: 0x0 0xd 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x28 0x0 0x28
2020-10-06 14:04:01,035 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:01,787 - DEBUG - pymodbus.transaction - Transaction failed. (Modbus Error: [Invalid Message] Incomplete message received, expected at least 8 bytes (0 received))
2020-10-06 14:04:01,787 - DEBUG - pymodbus.transaction - Getting transaction 13
2020-10-06 14:04:01,787 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:01,788 - ERROR - util.ms_input - IO error reading register with start decimal 40 and length 40 from unit with ID 1, is_socket_open: False, msg: Modbus Error: [Input/Output] Modbus Error: [Invalid Message] Incomplete message received, expected at least 8 bytes (0 received)
2020-10-06 14:04:01,788 - WARNING - __main__ - Register read failed for Gapit-03-07 - Total values
2020-10-06 14:04:01,788 - WARNING - __main__ - Setting value to f for addresses ['0x0', '0x2', '0x4', '0x6', '0x8', '0x0A', '0x0C', '0x0E', '0x10', '0x12', '0x14', '0x16', '0x18', '0x1A', '0x1C', '0x1E', '0x20', '0x22', '0x24', '0x26']
2020-10-06 14:04:01,788 - WARNING - __main__ - Setting value to f for addresses ['0x28', '0x2A', '0x2C', '0x2E', '0x30', '0x32', '0x34', '0x36', '0x38', '0x3A', '0x3C', '0x3E', '0x40', '0x42', '0x44', '0x46', '0x48', '0x4A', '0x4C', '0x4E']
2020-10-06 14:04:01,788 - WARNING - __main__ - Setting value to f for addresses ['0x50', '0x52', '0x54', '0x56', '0x58', '0x5A', '0x5C', '0x5E']
-------------------------Reading-------------------------------
Slave id: 1 with label label2
Tstamp: 0
Group name: Gapit-03-07 - Phase values
0x0 - Voltage phase L1 = f [V]
0x2 - Current L1 = f [A]
0x4 - Active power L1 = f [W]
0x6 - Reactive power L1 = f [var]
0x8 - Power factor L1 = f [PF]
0x0A - Voltage phase L2 = f [V]
0x0C - Current L2 = f [A]
0x0E - Active power L2 = f [W]
0x10 - Reactive power L2 = f [var]
0x12 - Power factor L2 = f [PF]
0x14 - Voltage phase L3 = f [V]
0x16 - Current L3 = f [A]
0x18 - Active power L3 = f [W]
0x1A - Reactive power L3 = f [var]
0x1C - Power factor L3 = f [PF]
0x1E - 3 phase - Active power = f [W]
0x20 - 3 phase - Inductive power = f [varL]
0x22 - 3 phase - Capacitive power = f [varC]
0x24 - 3 phase - Cos fi = f [Cos fi]
0x26 - 3 phase - Power factor = f [PF]
----
Tstamp: 0
Group name: Gapit-03-07 - Total values
0x28 - Frequency = f [Hz]
0x2A - Voltage line L1-L2 = f [V]
0x2C - Voltage line L2-L3 = f [V]
0x2E - Voltage line L3-L1 = f [V]
0x30 - % THD V L1 = f [%]
0x32 - % THD V L2 = f [%]
0x34 - % THD V L3 = f [%]
0x36 - % THD A L1 = f [%]
0x38 - % THD A L2 = f [%]
0x3A - % THD A L3 = f [%]
0x3C - 3 phase - Active energy = f [Wh]
0x3E - 3 phase - Inductive reactive energy = f [varLh]
0x40 - 3 phase - Capacitive reactive energy = f [varCh]
0x42 - 3 phase - Apparent power = f [VA]
0x44 - Maximum demand = f [w/VA/mA]
0x46 - Three-phase current (average) = f [A]
0x48 - Neutral current = f [A]
0x4A - Apparent power L1 = f [VA]
0x4C - Apparent power L2 = f [VA]
0x4E - Apparent power L3 = f [VA]
----
Tstamp: 0
Group name: Gapit-03-07 - Energy
0x50 - Temperature = f [A]
0x52 - Maximum demand A2 = f [A]
0x54 - Maximum demand A3 = f [A]
0x56 - 3 phase - Apparent energy = f [Vah ]
0x58 - 3 phase - Active energy generated = f [Wh]
0x5A - 3 phase - Inductive energy generated = f [varLh]
0x5C - 3 phase - Capacitive energy generated = f [varCh]
0x5E - 3 phase - Apparent energy generated = f [Vah]
----
2020-10-06 14:04:03,524 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:03,524 - DEBUG - pymodbus.transaction - Running transaction 14
2020-10-06 14:04:03,525 - DEBUG - pymodbus.transaction - SEND: 0x0 0xe 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x0 0x0 0x28
2020-10-06 14:04:03,525 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:04,276 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:04,276 - DEBUG - pymodbus.transaction - RECV:
2020-10-06 14:04:04,277 - DEBUG - pymodbus.transaction - Getting transaction 14
2020-10-06 14:04:04,277 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:04,277 - ERROR - util.ms_input - IO error reading register with start decimal 0 and length 40 from unit with ID 1, is_socket_open: True, msg: Modbus Error: [Input/Output] No Response received from the remote unit/Unable to decode response
2020-10-06 14:04:04,277 - WARNING - __main__ - Register read failed for Gapit-03-07 - Phase values
2020-10-06 14:04:04,277 - WARNING - __main__ - Setting value to f for addresses ['0x0', '0x2', '0x4', '0x6', '0x8', '0x0A', '0x0C', '0x0E', '0x10', '0x12', '0x14', '0x16', '0x18', '0x1A', '0x1C', '0x1E', '0x20', '0x22', '0x24', '0x26']
2020-10-06 14:04:04,277 - WARNING - __main__ - Setting value to f for addresses ['0x28', '0x2A', '0x2C', '0x2E', '0x30', '0x32', '0x34', '0x36', '0x38', '0x3A', '0x3C', '0x3E', '0x40', '0x42', '0x44', '0x46', '0x48', '0x4A', '0x4C', '0x4E']
2020-10-06 14:04:04,277 - WARNING - __main__ - Setting value to f for addresses ['0x50', '0x52', '0x54', '0x56', '0x58', '0x5A', '0x5C', '0x5E']
-------------------------Reading-------------------------------
Slave id: 1 with label label1
Tstamp: 0
Group name: Gapit-03-07 - Phase values
0x0 - Voltage phase L1 = f [V]
0x2 - Current L1 = f [A]
0x4 - Active power L1 = f [W]
0x6 - Reactive power L1 = f [var]
0x8 - Power factor L1 = f [PF]
0x0A - Voltage phase L2 = f [V]
0x0C - Current L2 = f [A]
0x0E - Active power L2 = f [W]
0x10 - Reactive power L2 = f [var]
0x12 - Power factor L2 = f [PF]
0x14 - Voltage phase L3 = f [V]
0x16 - Current L3 = f [A]
0x18 - Active power L3 = f [W]
0x1A - Reactive power L3 = f [var]
0x1C - Power factor L3 = f [PF]
0x1E - 3 phase - Active power = f [W]
0x20 - 3 phase - Inductive power = f [varL]
0x22 - 3 phase - Capacitive power = f [varC]
0x24 - 3 phase - Cos fi = f [Cos fi]
0x26 - 3 phase - Power factor = f [PF]
----
Tstamp: 0
Group name: Gapit-03-07 - Total values
0x28 - Frequency = f [Hz]
0x2A - Voltage line L1-L2 = f [V]
0x2C - Voltage line L2-L3 = f [V]
0x2E - Voltage line L3-L1 = f [V]
0x30 - % THD V L1 = f [%]
0x32 - % THD V L2 = f [%]
0x34 - % THD V L3 = f [%]
0x36 - % THD A L1 = f [%]
0x38 - % THD A L2 = f [%]
0x3A - % THD A L3 = f [%]
0x3C - 3 phase - Active energy = f [Wh]
0x3E - 3 phase - Inductive reactive energy = f [varLh]
0x40 - 3 phase - Capacitive reactive energy = f [varCh]
0x42 - 3 phase - Apparent power = f [VA]
0x44 - Maximum demand = f [w/VA/mA]
0x46 - Three-phase current (average) = f [A]
0x48 - Neutral current = f [A]
0x4A - Apparent power L1 = f [VA]
0x4C - Apparent power L2 = f [VA]
0x4E - Apparent power L3 = f [VA]
----
Tstamp: 0
Group name: Gapit-03-07 - Energy
0x50 - Temperature = f [A]
0x52 - Maximum demand A2 = f [A]
0x54 - Maximum demand A3 = f [A]
0x56 - 3 phase - Apparent energy = f [Vah ]
0x58 - 3 phase - Active energy generated = f [Wh]
0x5A - 3 phase - Inductive energy generated = f [varLh]
0x5C - 3 phase - Capacitive energy generated = f [varCh]
0x5E - 3 phase - Apparent energy generated = f [Vah]
----
2020-10-06 14:04:04,328 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:04,328 - DEBUG - pymodbus.transaction - Running transaction 15
2020-10-06 14:04:04,329 - DEBUG - pymodbus.transaction - SEND: 0x0 0xf 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x0 0x0 0x28
2020-10-06 14:04:04,329 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:05,080 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:05,080 - DEBUG - pymodbus.transaction - RECV: 0x0 0xe 0x0 0x0 0x0 0x53 0x1 0x3 0x50 0x0 0x0 0x0 0x2 0x0 0x0 0x0 0x3 0x0 0x0 0x0 0x4 0x0 0x0 0x0 0x5 0x0 0x0 0x0 0x6 0x0 0x0 0x0 0x7 0x0 0x0 0x0 0x8 0x0 0x0 0x0 0x9 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0xb 0x0 0x0 0x0 0xc 0x0 0x0 0x0 0xd 0x0 0x0 0x0 0xe 0x0 0x0 0x0 0xf 0x0 0x0 0x0 0x10 0x0 0x0 0x0 0x11 0x0 0x0 0x0 0x12 0x0 0x0 0x0 0x13 0x0 0x0 0x0 0x14 0x0 0x0 0x0 0x15
2020-10-06 14:04:05,080 - DEBUG - pymodbus.transaction - Adding transaction 15
2020-10-06 14:04:05,080 - WARNING - pymodbus.transaction - Adding transaction with incorrect TID!!!! Using TID 15, response TID: 14
2020-10-06 14:04:05,081 - DEBUG - pymodbus.transaction - Getting transaction 15
2020-10-06 14:04:05,081 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:05,081 - INFO - util.ms_input - Read register with values [0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21]
2020-10-06 14:04:05,088 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:05,089 - DEBUG - pymodbus.transaction - Running transaction 16
2020-10-06 14:04:05,089 - DEBUG - pymodbus.transaction - SEND: 0x0 0x10 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x28 0x0 0x28
2020-10-06 14:04:05,089 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:05,219 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:05,220 - DEBUG - pymodbus.transaction - RECV: 0x0 0xf 0x0 0x0 0x0 0x53 0x1 0x3 0x50 0x0 0x0 0x0 0x2 0x0 0x0 0x0 0x3 0x0 0x0 0x0 0x4 0x0 0x0 0x0 0x5 0x0 0x0 0x0 0x6 0x0 0x0 0x0 0x7 0x0 0x0 0x0 0x8 0x0 0x0 0x0 0x9 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0xb 0x0 0x0 0x0 0xc 0x0 0x0 0x0 0xd 0x0 0x0 0x0 0xe 0x0 0x0 0x0 0xf 0x0 0x0 0x0 0x10 0x0 0x0 0x0 0x11 0x0 0x0 0x0 0x12 0x0 0x0 0x0 0x13 0x0 0x0 0x0 0x14 0x0 0x0 0x0 0x15
2020-10-06 14:04:05,220 - DEBUG - pymodbus.transaction - Adding transaction 16
2020-10-06 14:04:05,220 - WARNING - pymodbus.transaction - Adding transaction with incorrect TID!!!! Using TID 16, response TID: 15
2020-10-06 14:04:05,220 - DEBUG - pymodbus.transaction - Getting transaction 16
2020-10-06 14:04:05,220 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:05,221 - INFO - util.ms_input - Read register with values [0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21]
2020-10-06 14:04:05,234 - DEBUG - pymodbus.transaction - Current transaction state - TRANSACTION_COMPLETE
2020-10-06 14:04:05,234 - DEBUG - pymodbus.transaction - Running transaction 17
2020-10-06 14:04:05,234 - DEBUG - pymodbus.transaction - SEND: 0x0 0x11 0x0 0x0 0x0 0x6 0x1 0x3 0x0 0x50 0x0 0x10
2020-10-06 14:04:05,234 - DEBUG - pymodbus.transaction - Changing transaction state from 'SENDING' to 'WAITING FOR REPLY'
2020-10-06 14:04:05,771 - DEBUG - pymodbus.transaction - Changing transaction state from 'WAITING FOR REPLY' to 'PROCESSING REPLY'
2020-10-06 14:04:05,771 - DEBUG - pymodbus.transaction - RECV: 0x0 0x10 0x0 0x0 0x0 0x53 0x1 0x3 0x50 0x0 0x0 0x0 0x16 0x0 0x0 0x0 0x17 0x0 0x0 0x0 0x18 0x0 0x0 0x0 0x19 0x0 0x0 0x0 0x1a 0x0 0x0 0x0 0x1b 0x0 0x0 0x0 0x1c 0x0 0x0 0x0 0x1d 0x0 0x0 0x0 0x1e 0x0 0x0 0x0 0x1f 0x0 0x0 0x0 0x20 0x0 0x0 0x0 0x21 0x0 0x0 0x0 0x22 0x0 0x0 0x0 0x23 0x0 0x0 0x0 0x24 0x0 0x0 0x0 0x25 0x0 0x0 0x0 0x26 0x0 0x0 0x0 0x27 0x0 0x0 0x0 0x28 0x0 0x0 0x0 0x29
2020-10-06 14:04:05,772 - DEBUG - pymodbus.transaction - Adding transaction 17
2020-10-06 14:04:05,772 - WARNING - pymodbus.transaction - Adding transaction with incorrect TID!!!! Using TID 17, response TID: 16
2020-10-06 14:04:05,772 - DEBUG - pymodbus.transaction - Getting transaction 17
2020-10-06 14:04:05,772 - DEBUG - pymodbus.transaction - Changing transaction state from 'PROCESSING REPLY' to 'TRANSACTION_COMPLETE'
2020-10-06 14:04:05,772 - INFO - util.ms_input - Read register with values [0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, 30, 0, 31, 0, 32, 0, 33, 0, 34, 0, 35, 0, 36, 0, 37, 0, 38, 0, 39, 0, 40, 0, 41]
-------------------------Reading-------------------------------
Slave id: 1 with label label2
Tstamp: 0
Group name: Gapit-03-07 - Phase values
0x0 - Voltage phase L1 = 0.2 [V]
0x2 - Current L1 = 0.003 [A]
0x4 - Active power L1 = 4.0 [W]
0x6 - Reactive power L1 = 5.0 [var]
0x8 - Power factor L1 = 0.06 [PF]
0x0A - Voltage phase L2 = 0.7 [V]
0x0C - Current L2 = 0.008 [A]
0x0E - Active power L2 = 9.0 [W]
0x10 - Reactive power L2 = 10.0 [var]
0x12 - Power factor L2 = 0.11 [PF]
0x14 - Voltage phase L3 = 1.2 [V]
0x16 - Current L3 = 0.013 [A]
0x18 - Active power L3 = 14.0 [W]
0x1A - Reactive power L3 = 15.0 [var]
0x1C - Power factor L3 = 0.16 [PF]
0x1E - 3 phase - Active power = 17.0 [W]
0x20 - 3 phase - Inductive power = 18.0 [varL]
0x22 - 3 phase - Capacitive power = 19.0 [varC]
0x24 - 3 phase - Cos fi = 0.2 [Cos fi]
0x26 - 3 phase - Power factor = 0.21 [PF]
----
Tstamp: 0
Group name: Gapit-03-07 - Total values
0x28 - Frequency = 0.2 [Hz]
0x2A - Voltage line L1-L2 = 0.3 [V]
0x2C - Voltage line L2-L3 = 0.4 [V]
0x2E - Voltage line L3-L1 = 0.5 [V]
0x30 - % THD V L1 = 0.6 [%]
0x32 - % THD V L2 = 0.7 [%]
0x34 - % THD V L3 = 0.8 [%]
0x36 - % THD A L1 = 0.9 [%]
0x38 - % THD A L2 = 1.0 [%]
0x3A - % THD A L3 = 1.1 [%]
0x3C - 3 phase - Active energy = 12.0 [Wh]
0x3E - 3 phase - Inductive reactive energy = 13.0 [varLh]
0x40 - 3 phase - Capacitive reactive energy = 14.0 [varCh]
0x42 - 3 phase - Apparent power = 15.0 [VA]
0x44 - Maximum demand = 16.0 [w/VA/mA]
0x46 - Three-phase current (average) = 0.017 [A]
0x48 - Neutral current = 0.018 [A]
0x4A - Apparent power L1 = 19.0 [VA]
0x4C - Apparent power L2 = 20.0 [VA]
0x4E - Apparent power L3 = 21.0 [VA]
----
Tstamp: 0
Group name: Gapit-03-07 - Energy
0x50 - Temperature = 2.2 [A]
0x52 - Maximum demand A2 = 0.023 [A]
0x54 - Maximum demand A3 = 0.024 [A]
0x56 - 3 phase - Apparent energy = 25.0 [Vah ]
0x58 - 3 phase - Active energy generated = 26.0 [Wh]
0x5A - 3 phase - Inductive energy generated = 27.0 [varLh]
0x5C - 3 phase - Capacitive energy generated = 28.0 [varCh]
0x5E - 3 phase - Apparent energy generated = 29.0 [Vah]
----

As one can see here, the values for "Group name: Gapit-03-07 - Phase values" are read into "Group name: Gapit-03-07 - Total values". My client has other more complex modbus devices and setups that are more problematic than this too.

Ideas for actions:

  • Change the last bit of ModbusSocketFramer._process to:

              callback(result, tid=result.transaction_id)
    

    But I see this triggers the following in ModbusTransactionManager.execute:

                      response = self.getTransaction(request.transaction_id)
                      if not response:
                          if len(self.transactions):
                              response = self.getTransaction(tid=0)
                          else:
    

    Which I interpret as meaning that if there is no response to the current TID, but there are other active transactions, don't get that other active transaction but get some different transaction with TID zero, which doesn't exist and in effect will give None? I assume this is to handle something else than sync TCP.

  • Ignore responses when requested TID and response TID is different: Change ModbusSocketFramer.processIncomingPacket to take a optional transaction ID (or a range/list; or the whole request, not just the unit ID of it), and also ignore if it doesn't match, (like it checks the unit ID today).

  • Always close the connection if there is no response at all before timeout, not just only if the device is not on the _no_response_devices list.

  • Log something when a transaction response is overwritten in DictTransactionManager.addTransaction.

On my side I've added a check that the returned response has the right size in the client, but it would be nice if I could rely on pymodbus not returning responses with the wrong length. I'd add a check on the transaction ID there but that's not exposed.

I'm also discussing a switch to using async TCP with my client.

thormick avatar Oct 06 '20 14:10 thormick

Quoting myself:

I'd add a check on the transaction ID there but that's not exposed.

Even though the TID used for the request isn't returned as part of the response object, I noted that the ModbusTcpClient of course has the transaction instance with the last TID used readable there, that I can check against. The client's app is a quite single threaded and synchronous affair, so there shouldn't be any issue with race conditions.

My client's code now checks the response to read_holding_registers/read_input_registers this way:

            if response.transaction_id != modbusTcpClient.transaction.tid:
                raise Exception(f"pymodbus returned a response with transaction ID {response.transaction_id} but the "
                                f"highest transaction manager transaction ID is "
                                f"{modbusTcpClient.transaction.tid}. This could happen because of a pymodbus "
                                "bug triggered by network lag; increase timeout.")
            elif len(response.registers) != register_length:
                raise Exception(f"pymodbus returned a response of {len(response.registers)} word length "
                                f"to a request of {register_length} words. Values: {response.registers}")

Local testing suggests that this resolves the issue, but I've yet to try it against actual modbus hardware in production.

thormick avatar Oct 12 '20 14:10 thormick

I am observing similar problems. Checking the transaction IDs gives me:

AssertionError('
    pymodbus returned a response with transaction ID 16292 but the highest transaction manager 
    transaction ID is 1. 
    This could happen because of a pymodbus bug triggered by network lag; increase timeout.
)

I did not do the analysis as you did, i just wondered why sometimes the length of the returned response does not fit to the requested length.

mahadi avatar Oct 16 '20 12:10 mahadi

Had a few minutes to peek at this today. I can reproduce this in 1.5.0 8f3fc71035ef0a23dca1214cef67bdd67d5ff83e, but not in 1.4.0 4ed1f1a9750bc931c100ed24f5cc0fc032ab5f4d

thormick avatar Oct 26 '20 14:10 thormick

I've toyed around with minimal fix to this in this repository (along with some other changes), but it's tricky to do a proper fix. I've gotten the "Always close the connection..." alternative I've outlined done in this commit, and it seems to improve the issue. But I've only tested this with a local dummy server, and have no idea yet if it resolves the issue sufficiently with real hardware on a real problematic network.

thormick avatar Oct 27 '20 15:10 thormick

You are writing that the transaction management currently is a mess. So this sounds like a redesign is needed?

mahadi avatar Oct 27 '20 19:10 mahadi

@thormick thanks for the effort, I will take a look when I get some time.

dhoomakethu avatar Oct 28 '20 07:10 dhoomakethu

@mahadi Well not a full redesign, the basic design seems sound. There just seems to be a lot of code for handling special cases that is just jammed sideways into the execute, _transact and _recv methods, which could be beneficial to refactor. This makes it difficult to cut through it to address a specific issue.

There's a few other issues that clutter the source, for instance I question the validity of having isinstance if-trees instead of having the framers provide this information. I'm aware it's a legitimate style in special cases, but if moving it into the framers is a better approach I'd be happy to refactor this too.

That said, I'm still looking at this so I might come up with a non-invasive and more complete fix.

thormick avatar Oct 29 '20 11:10 thormick

I've tweaked my patch a bit, and it shouldn't close the socket more than reasonable now. Available here (note that branch contains all the tweaks I've done for Gapit). I haven't verified that there aren't still weird corner case issues if there is packet fragmentation and only the first part the message arrives before the timeout.

As an aside, AFAICT the dict transaction manager isn't actually used as advertised by the sync clients. With how it only uses the current TID that it itself generates, and the socket framer doesn't use the response TID, the DTM only uses one slot and then reads and empties that slot. So it just becomes an awkward way to pass the response from one part of the execute method and a couple of lines down. The sync part of the transaction manager is over-engineered anyhow, but it's tricky to see how it could be improved without moving things into the framers, which is also used by the async code too and as such needs to be handled with care.

Either way, without a more complete check that the response is what the caller wants this is still very fragile and sensitive to future changes in the code, so I would recommend that one just avoids the sync clients in favor of the async, even with my fixes.

And to clarify a few points regarding this for people looking at this and wondering whether it applies to them. This bug report and my fix only pertains when:

  • Using synchronized TCP (Might affect synchronized UDP, fix might improve the condition but shouldn't fix it; I've only tested ModbusTcpClient anyhow)
  • There are first at least two failed attempts to do a read from a modbus unit
  • One gets a response which is a correct response to one of those earlier failed read attempts
  • The issue goes away if the timeout is set high enough

If not all of this is the case, the issue you're facing is something else than what this bug report and fix addresses.

thormick avatar Nov 06 '20 14:11 thormick

Refer #560 . This is one of the identified bug and a fix is under testing.

dhoomakethu avatar Nov 06 '20 14:11 dhoomakethu

Oops, gave the wrong link to the repo, should be here. Updated my previous message.

thormick avatar Nov 06 '20 15:11 thormick

I still observe this issue on TCP sync client if transmission is interrupted by KeyboardInterrupt. I think this is caused by the reconnect code not catching this exception. Maybe consider making try catch clause more indiscriminate?

mabl avatar Jan 10 '22 16:01 mabl

Previous response are no longer returned on current dev.

janiversen avatar Aug 27 '22 09:08 janiversen