Python ParseFromString not throwing DecodeError for some errors on Linux

Open illbebach opened this issue 5 years ago • 2 comments

What version of protobuf and what language are you using?

Version: protobuf 3.13.0
Language: Python 3.8.5

What operating system (Linux, Windows, ...) and version?

Status  Operating System
ok      Windows 10 Enterprise, version 1909, build 18363.1198
fails   Ubuntu 20.04.1 LTS 5.4.0-53-generic
fails   Kali Linux 4.4.0-18362-Microsoft running under WSL 4.19.128 on Windows 10

What runtime / compiler are you using (e.g., python version or gcc version)

Python 3.8.5

What did you do? Steps to reproduce the behavior:

  1. Save the program supplied below as test.py
  2. Run python test.py
  3. See error

Commentary

On Linux, ParseFromString does not consistently raise a DecodeError exception for input parsing errors, so the Python test program cannot detect a parsing error.

When run from Windows 10, using cmd or git bash, ParseFromString raises the DecodeError exception as expected.

What did you expect to see

trying with data, length 10
DecodeError: Truncated string.

trying with data, length 100
DecodeError: Truncated message.

trying with data, length 1000
DecodeError: Field number 0 is illegal.

What did you see instead?

trying with data, length 10
DecodeError: Error parsing message

trying with data, length 100
DecodeError: Error parsing message

trying with data, length 1000
test.py:32: RuntimeWarning: Unexpected end-group tag: Not all data was converted
  fds.ParseFromString(_data)
no error reported
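
The two sets of messages above differ in style ("Truncated string." versus the generic "Error parsing message"), which may mean the two platforms are running different protobuf backends. That is an assumption, not something confirmed in this issue. A minimal sketch for checking which backend is active, using the internal module google.protobuf.internal.api_implementation (internal, so it may change between releases); the helper name protobuf_backend is made up for illustration:

```python
def protobuf_backend():
    """Return the name of the active protobuf backend, or a note if unavailable."""
    try:
        # Internal module: not a stable public API.
        from google.protobuf.internal import api_implementation
    except ImportError:
        return "protobuf not installed"
    # Typically reports a short name such as 'python' or 'cpp'.
    return api_implementation.Type()


print("protobuf backend:", protobuf_backend())
```

Running this on each machine from the table would show whether the platforms that fail and the one that works report the same backend.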

Anything else we should know about your project / environment

The binary data supplied in the test program below is not important; it is included only to demonstrate the error. Likewise, the exact error (Field number 0 is illegal vs. RuntimeWarning) is not important, only that an exception is always raised to the caller.

I hope the Linux implementation can be fixed so that:

  1. a DecodeError is raised for all parsing errors in ParseFromString
  2. no message is printed to stderr

Possibly related issues

  • #5744
  • #2281

Source code for test.py

#!/usr/bin/env python3
import google.protobuf.descriptor_pb2 as pb2
import google.protobuf.message as pbm


blob = b'\x1a\x1fgoogle/protobuf/timestamp.proto"\x87\x02\n\x06Person\x12\x0c\n\x04name' + \
    b'\x18\x01 \x01(\t\x12\n\n\x02id\x18\x02 \x01(\x05\x12\r\n\x05email\x18\x03 \x01(\t\x12,' + \
    b'\n\x06phones\x18\x04 \x03(\x0b2\x1c.tutorial.Person.PhoneNumber\x120\n\x0clast_updated' + \
    b'\x18\x05 \x01(\x0b2\x1a.google.protobuf.Timestamp\x1aG\n\x0bPhoneNumber\x12\x0e\n\x06n' + \
    b'umber\x18\x01 \x01(\t\x12(\n\x04type\x18\x02 \x01(\x0e2\x1a.tutorial.Person.PhoneType"' + \
    b'+\n\tPhoneType\x12\n\n\x06MOBILE\x10\x00\x12\x08\n\x04HOME\x10\x01\x12\x08\n\x04WORK\x10' + \
    b'\x02"/\n\x0bAddressBook\x12 \n\x06people\x18\x01 \x03(\x0b2\x10.tutorial.PersonBP\n\x14' + \
    b'com.example.tutorialB\x11AddressBookProtos\xaa\x02$Google.Protobuf.Examples.AddressBookb' + \
    b'\x06proto3\x00addressbook.proto\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00' + \
    b'\x00\x00\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00CHECK failed: GetArena() == nullptr:' + \
    b' \x00\x00\x00tutorial.Person.PhoneNumber.number\x00\x00\x00\x00\x00\x00CHECK failed: (&' + \
    b'from) != (this): \x00tutorial.Person.name\x00tutorial.Person.email\x00\x00\x00\x00\xc0' + \
    b'\x97\xff\xffx\x95\xff\xff\xf9\x95\xff\xff=\x96\xff\xff\xbe\x96\xff\xff_\x97\xff\xffCHECK' + \
    b' failed: (n) >= (0): \x00\x00\x00\x00\x00\x00CHECK failed: (index) >= (0): \x00\x00CHECK' + \
    b' failed: (index) < (current_size_): \x00\x00\x00\x00\x00\x00\x00CHECK failed: (&other)' + \
    b' != (this): \x00\x00\x00\x00\x00\x00N8tutorial11AddressBookE\x00\x00\x00\x00\x00\x00\x00' + \
    b'\x00N8tutorial6PersonE\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' + \
    b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00N8tutorial18Person_PhoneNumberE' + \
    b'\x00\x01\x1b\x03;d\x0b\x00\x00k\x01\x00\x00`g\xff\xff\xb0\x0b\x00\x00\xf0k\xff\xff\xd8' + \
    b'\x0b\x00\x00\x00l\xff\xff\x80\x0b\x00\x00\xe5l\xff\xff\xb0\x0f\x00\x00\x19p\xff\xff\xd8' + \
    b'\x0f\x00\x00\xe0r\xff\xff \x12\x00\x00)s\xff\xff@\x12\x00\x00>'

for _data in [blob[:10], blob[:100], blob]:
    print('\ntrying with data, length {}'.format(len(_data)))
    try:
        fds = pb2.FileDescriptorProto()
        fds.ParseFromString(_data)
        print('no error reported')

    except pbm.DecodeError as exp:
        print('DecodeError: {}'.format(exp))

    except Exception:  # not a bare except, so KeyboardInterrupt still propagates
        print('** unexpected error')

illbebach avatar Nov 13 '20 20:11 illbebach

Hi, I'm facing the same behavior. Any updates?

scruper avatar Aug 30 '21 06:08 scruper

Hello! I am having a very similar issue. I depend on that exception to check whether a message was parsed correctly, but I only get that runtime warning message. Were you able to solve it?

tlifschitz avatar Oct 24 '22 18:10 tlifschitz

Using version 3.18.1 and encountering the same RuntimeWarning: Unexpected end-group tag: Not all data was converted issue here. I expected it to be a DecodeError. I believe it comes from this line: https://github.com/protocolbuffers/protobuf/blob/b141bf9b1ea19272fde0cfbbdc7888d3f16526ac/python/google/protobuf/pyext/message.cc#L1928

Can someone from Google try prioritizing this? :P

tabletenniser avatar Nov 28 '22 18:11 tabletenniser

Using version 3.18.1 and encountering the same RuntimeWarning: Unexpected end-group tag: Not all data was converted issue here. Expecting it to be a decodeError. I believe it comes from this line:

https://github.com/protocolbuffers/protobuf/blob/b141bf9b1ea19272fde0cfbbdc7888d3f16526ac/python/google/protobuf/pyext/message.cc#L1928

? Can someone from Google try prioritizing b/27494216 ? :P

Any update? Did you find a solution?

itzikban avatar Jan 22 '23 14:01 itzikban

Nope, I ended up adding some code on our side to treat this RuntimeWarning as an error...

tabletenniser avatar Jan 23 '23 08:01 tabletenniser
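
The comment above does not show its code, so the following is only a sketch of one way such a workaround could look. It records warnings emitted during the parse call and raises if any of them is a RuntimeWarning. The names parse_strict, StrictDecodeError and fake_parse are hypothetical; fake_parse stands in for ParseFromString so the example runs without protobuf installed. Recording the warning, rather than installing an "error" filter, is a deliberate choice here: it does not depend on how the extension module reacts when a warning is escalated to an exception.

```python
import warnings


class StrictDecodeError(Exception):
    """Hypothetical stand-in for google.protobuf.message.DecodeError."""


def parse_strict(parse, data):
    """Call parse(data); raise if it emitted a RuntimeWarning instead of failing."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, no deduplication
        result = parse(data)
    for w in caught:
        if issubclass(w.category, RuntimeWarning):
            raise StrictDecodeError(str(w.message))
    # Re-emit unrelated warnings so they are not silently swallowed.
    for w in caught:
        warnings.warn_explicit(w.message, w.category, w.filename, w.lineno)
    return result


def fake_parse(data):
    """Illustrative parser that warns instead of raising on bad input."""
    if data.endswith(b"\x00"):
        warnings.warn(
            "Unexpected end-group tag: Not all data was converted",
            RuntimeWarning,
        )
    return len(data)


try:
    parse_strict(fake_parse, b"bad\x00")
except StrictDecodeError as exc:
    print("DecodeError: {}".format(exc))
```

With protobuf installed, the equivalent call would presumably be parse_strict(fds.ParseFromString, _data), raising pbm.DecodeError in place of the stand-in exception.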

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please add a comment.

This issue is labeled inactive because the last activity was over 90 days ago.

github-actions[bot] avatar Feb 26 '24 10:02 github-actions[bot]

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please reopen it.

This issue was closed and archived because there has been no new activity in the 14 days since the inactive label was added.

github-actions[bot] avatar Mar 15 '24 10:03 github-actions[bot]