
unpack throws exception

Open Benjamin-Shengming opened this issue 6 years ago • 26 comments

I have tried this pkcs11 lib on Ubuntu 19.04 with Python 3.7.3, running against a vendor HSM (EngageBlack), and everything works well.

However, when I use it on a Linux From Scratch system with the same HSM, I get the following error.

bash-4.3# python3
Python 3.7.4 (default, Sep 24 2019, 10:51:40) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkcs11
>>> lib = pkcs11.lib("/usr/lib/libbvpkcs.so")
>>> token = lib.get_token()
>>> session = token.open()
>>> key = session.generate_key(pkcs11.KeyType.AES, 256)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pkcs11/_pkcs11.pyx", line 409, in pkcs11._pkcs11.Session.generate_key
  File "pkcs11/_pkcs11.pyx", line 564, in pkcs11._pkcs11.Object._make
  File "pkcs11/_pkcs11.pyx", line 615, in pkcs11._pkcs11.Object.__getitem__
  File "pkcs11/_utils.pyx", line 35, in pkcs11._pkcs11._unpack_attributes
  File "/usr/lib/python3.7/site-packages/python_pkcs11-0.5.0-py3.7-linux-x86_64.egg/pkcs11/defaults.py", line 129, in <lambda>
    lambda v: type_(unpack(v)))
  File "/usr/lib/python3.7/site-packages/python_pkcs11-0.5.0-py3.7-linux-x86_64.egg/pkcs11/defaults.py", line 114, in <lambda>
    _ulong = (Struct('L').pack, lambda v: Struct('L').unpack(v)[0])
struct.error: unpack requires a buffer of 8 bytes

Are there any other dependencies I need to take care of? Thanks.

Benjamin-Shengming avatar Sep 24 '19 23:09 Benjamin-Shengming
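The failure above can be reproduced in isolation. This is a sketch only: on an LP64 Linux build the native `'L'` struct format is 8 bytes, so unpacking a 4-byte attribute value raises the same `struct.error`; the buffer contents are illustrative, not taken from the HSM.

```python
from struct import Struct, error

# Native unsigned long; 8 bytes on a typical 64-bit Linux build.
s = Struct('L')
print('native L size:', s.size)

try:
    # A 4-byte CKA_CLASS-style value, as the HSM apparently returned it.
    s.unpack(b'\x04\x00\x00\x00')
except error as e:
    print('struct.error:', e)  # unpack requires a buffer of 8 bytes
```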

What architecture are you on?

danni avatar Sep 24 '19 23:09 danni

virtualbox linux 64 bits

bash-4.3# uname -m
x86_64

bash-4.3# lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                3
On-line CPU(s) list:   0-2
Thread(s) per core:    1
Core(s) per socket:    3
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Stepping:              3
CPU MHz:               3292.396
BogoMIPS:              6584.79
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              6144K
NUMA node0 CPU(s):     0-2


Benjamin-Shengming avatar Sep 25 '19 22:09 Benjamin-Shengming

What I'm guessing is that one of the #defines we use to set the sizes of things in pkcs11.h on your LFS system is missing. And so it's ending up as the wrong size. You might have to print out the contents of the #define environment the shared object is being compiled with.

danni avatar Sep 27 '19 06:09 danni
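One way to sanity-check what those headers resolve to on a given machine is `ctypes`, since the PKCS#11 v2.x headers typedef `CK_ULONG` as `unsigned long int`. A sketch, not part of python-pkcs11:

```python
import ctypes
from struct import calcsize

# Sizes of the C types behind common PKCS#11 typedefs on this platform.
# python-pkcs11 0.5.0 packs CK_ULONG with the native 'L' struct format,
# so calcsize('L') should agree with sizeof(unsigned long).
for name, ctype in [('CK_BYTE (unsigned char)', ctypes.c_ubyte),
                    ('CK_ULONG (unsigned long int)', ctypes.c_ulong),
                    ('CK_VOID_PTR (void *)', ctypes.c_void_p)]:
    print(name, ctypes.sizeof(ctype))

print("struct format 'L':", calcsize('L'))
```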

Do you mean these defines?

#define CK_PTR            *
#define CK_DEFINE_FUNCTION(returnType, name) returnType name
#define CK_DECLARE_FUNCTION(returnType, name) returnType name
#define CK_DECLARE_FUNCTION_POINTER(returnType, name) returnType (* name)
#define CK_CALLBACK_FUNCTION(returnType, name) returnType (* name)

#ifndef NULL_PTR
#define NULL_PTR          0
#endif

I see that cryptoki.h has been added to the master branch, but it is not in release 0.5.0.

Or some other defines used in Cython?

Benjamin-Shengming avatar Sep 28 '19 09:09 Benjamin-Shengming

The defines in 0.5.0 were in setup.py; in master this has been changed. Does it work better with master?

danni avatar Oct 03 '19 00:10 danni

Tried the master branch; I still get the same error:

Python 3.7.4 (default, Oct  3 2019, 04:30:14) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkcs11
>>> lib = pkcs11.lib("/opt/bvhsm/usr/lib64/libbvpkcs.so")
>>> token = lib.get_token()
>>> session = token.open()
>>> from pkcs11 import KeyType, Attribute
>>> 
>>> key = session.generate_key(KeyType.AES, 256, template={
...     Attribute.SENSITIVE: False,
...     Attribute.EXTRACTABLE: True,
... })
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "pkcs11/_pkcs11.pyx", line 433, in pkcs11._pkcs11.Session.generate_key
  File "pkcs11/_pkcs11.pyx", line 586, in pkcs11._pkcs11.Object._make
  File "pkcs11/_pkcs11.pyx", line 637, in pkcs11._pkcs11.Object.__getitem__
  File "pkcs11/_utils.pyx", line 35, in pkcs11._pkcs11._unpack_attributes
  File "/usr/lib/python3.7/site-packages/python_pkcs11-0.5.1.dev33+g46ed66a-py3.7-linux-x86_64.egg/pkcs11/defaults.py", line 132, in <lambda>
    lambda v: type_(unpack(v)))
  File "/usr/lib/python3.7/site-packages/python_pkcs11-0.5.1.dev33+g46ed66a-py3.7-linux-x86_64.egg/pkcs11/defaults.py", line 117, in <lambda>
    _ulong = (Struct('L').pack, lambda v: Struct('L').unpack(v)[0])
struct.error: unpack requires a buffer of 8 bytes

Benjamin-Shengming avatar Oct 04 '19 02:10 Benjamin-Shengming

Something is wrong with the length of that type. I would start by having a look at how long the value v is, and at which item v refers to.

danni avatar Oct 08 '19 06:10 danni

It might be related to how Python was built. I can reproduce the issue with the following steps:

  1. On Ubuntu 14.04, download the Python-3.7.4 source code and untar it.

  2. Make sure libssl-dev is installed; if not: sudo apt install libssl-dev

  3. Enter the Python source folder and build Python:

    * ./configure --enable-shared
    * make
    * sudo make install

  4. sudo pip3 install python-pkcs11

Now run python3, import pkcs11, and create an AES key: the same error appears again.

Python 3.7.4 (default, Oct  4 2019, 09:08:14) 
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pkcs11
>>> lib = pkcs11.lib("/opt/bvhsm/usr/lib64/libbvpkcs.so")
>>> token = lib.get_token()
>>> session = token.open()
>>> key = session.generate_key(pkcs11.KeyType.AES, 256)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pkcs11/_pkcs11.pyx", line 409, in pkcs11._pkcs11.Session.generate_key
    return Object._make(self, key)
  File "pkcs11/_pkcs11.pyx", line 564, in pkcs11._pkcs11.Object._make
    object_class = self[Attribute.CLASS]
  File "pkcs11/_pkcs11.pyx", line 615, in pkcs11._pkcs11.Object.__getitem__
    return _unpack_attributes(key, value)
  File "pkcs11/_utils.pyx", line 35, in pkcs11._pkcs11._unpack_attributes
    return unpack(bytes(value))
  File "/home/service/python-pkcs11-0.5.0/pkcs11/defaults.py", line 129, in <lambda>
    lambda v: type_(unpack(v)))
  File "/home/service/python-pkcs11-0.5.0/pkcs11/defaults.py", line 114, in <lambda>
    _ulong = (Struct('L').pack, lambda v: Struct('L').unpack(v)[0])
struct.error: unpack requires a buffer of 8 bytes

Benjamin-Shengming avatar Oct 08 '19 21:10 Benjamin-Shengming

Okay so I'm guessing it's a mismatch between how long Python thinks an L is and how long PKCS#11 thinks an L is. Is one being compiled for a different word length than the other?

danni avatar Oct 08 '19 23:10 danni
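The suspected mismatch can be illustrated directly: the same CKA_CLASS value serialized by a 32-bit and a 64-bit CK_ULONG produces buffers of different lengths, which is exactly the discrepancy in the traces that follow. A sketch, assuming little-endian data:

```python
from struct import pack

# The same value (4) written as a 32-bit and a 64-bit unsigned integer,
# little-endian. A 64-bit Python unpacking the 32-bit form fails with
# "unpack requires a buffer of 8 bytes".
print(pack('<I', 4))  # b'\x04\x00\x00\x00'
print(pack('<Q', 4))  # b'\x04\x00\x00\x00\x00\x00\x00\x00'
```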

more info

import pkcs11
from pkcs11 import KeyType
lib = pkcs11.lib("/opt/bvhsm/usr/lib64/libbvpkcs.so")
token = lib.get_token()
session = token.open() 
key = session.generate_key(KeyType.AES, 256)
unpack attributes
Attribute.CLASS <MemoryView of 'array' object>
len of value 4
b'\x04\x00\x00\x00'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pkcs11/_pkcs11.pyx", line 409, in pkcs11._pkcs11.Session.generate_key
  File "pkcs11/_pkcs11.pyx", line 564, in pkcs11._pkcs11.Object._make
  File "pkcs11/_pkcs11.pyx", line 615, in pkcs11._pkcs11.Object.__getitem__
  File "pkcs11/_utils.pyx", line 39, in pkcs11._pkcs11._unpack_attributes
  File "/usr/local/lib/python3.7/site-packages/python_pkcs11-0.5.0-py3.7-linux-x86_64.egg/pkcs11/defaults.py", line 129, in <lambda>
    lambda v: type_(unpack(v)))
  File "/usr/local/lib/python3.7/site-packages/python_pkcs11-0.5.0-py3.7-linux-x86_64.egg/pkcs11/defaults.py", line 114, in <lambda>
    _ulong = (Struct('L').pack, lambda v: Struct('L').unpack(v)[0])
struct.error: unpack requires a buffer of 8 bytes

Benjamin-Shengming avatar Oct 09 '19 02:10 Benjamin-Shengming

Same OS, same pkcs11 lib, but with Ubuntu's python3.4:

>>> import pkcs11
>>> from pkcs11 import KeyType
>>> lib = pkcs11.lib("/opt/bvhsm/usr/lib64/libbvpkcs.so")
>>> token = lib.get_token()
>>> session = token.open()
>>> key = session.generate_key(KeyType.AES, 256)
unpack attributes
Attribute.CLASS <MemoryView of 'array' object>
len of value 8
b'\x04\x00\x00\x00\x00\x00\x00\x00'
unpack attributes
Attribute.ENCRYPT <MemoryView of 'array' object>
len of value 1
b'\x01'
unpack attributes
Attribute.DECRYPT <MemoryView of 'array' object>
len of value 1
b'\x01'
unpack attributes
Attribute.SIGN <MemoryView of 'array' object>
len of value 1
b'\x01'
unpack attributes
Attribute.VERIFY <MemoryView of 'array' object>
len of value 1
b'\x01'
unpack attributes
Attribute.WRAP <MemoryView of 'array' object>
len of value 1
b'\x01'
unpack attributes
Attribute.UNWRAP <MemoryView of 'array' object>
len of value 1
b'\x01'
unpack attributes
Attribute.DERIVE <MemoryView of 'array' object>
len of value 1
b'\x00'

Benjamin-Shengming avatar Oct 09 '19 02:10 Benjamin-Shengming
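Both traces above carry the same underlying value; only the byte width differs. Decoding them (little-endian, per the lscpu output earlier in the thread) makes that visible. 4 is CKO_SECRET_KEY in the PKCS#11 spec:

```python
# CKA_CLASS as seen by each Python build: 4 bytes vs 8 bytes, same value.
v37 = b'\x04\x00\x00\x00'                  # Python 3.7 build: len 4
v34 = b'\x04\x00\x00\x00\x00\x00\x00\x00'  # Python 3.4 build: len 8

print(int.from_bytes(v37, 'little'))  # 4
print(int.from_bytes(v34, 'little'))  # 4
```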

Okay so I'm guessing it's a mismatch between how long Python thinks an L is and how long PKCS#11 thinks an L is. Is one being compiled for a different word length than the other?

Does "PKCS#11" mean the pkcs11 module or python-pkcs11?

The pkcs11 module is a shared library provided by the HSM vendor. It works with Ubuntu's python3.4 + python-pkcs11; however, the same pkcs11 module does not work with python-pkcs11 on Python 3.7 built from source.

I hooked my own trace library into the vendor's HSM and captured some logs:

[2019-10-10 08:50:24.564] [PKCS11_Trace] [info] virtual CK_RV PKCS11_Trace::C_GenerateKey(CK_SESSION_HANDLE, CK_MECHANISM_PTR, CK_ATTRIBUTE_PTR, CK_ULONG, CK_OBJECT_HANDLE_PTR)
[2019-10-10 08:50:24.564] [PKCS11_Trace] [info] session 100
[2019-10-10 08:50:24.564] [PKCS11_Trace] [info] MECHANISM type 4224/CKM_AES_KEY_GEN               
[2019-10-10 08:50:24.564] [PKCS11_Trace] [info] MECHANISM pParameter 0x0
[2019-10-10 08:50:24.564] [PKCS11_Trace] [info] MECHANISM ulParameterLen 0
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] ulcount 14
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] type 0 /  CKA_CLASS              
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] pValue 0x2130ec0 
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] ulValueLen 8
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] 0400000000000000
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] type 258 /  CKA_ID                 
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] pValue 0x15d0210 
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] ulValueLen 0
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] 
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] type 3 /  CKA_LABEL              
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] pValue 0x15d0210 
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] ulValueLen 0
[2019-10-10 08:50:24.565] [PKCS11_Trace] [info] 
.....
[2019-10-10 08:50:24.824] [PKCS11_Trace] [info] virtual CK_RV PKCS11_Trace::C_GetAttributeValue(CK_SESSION_HANDLE, CK_OBJECT_HANDLE, CK_ATTRIBUTE_PTR, CK_ULONG)
[2019-10-10 08:50:24.824] [PKCS11_Trace] [info] session 100
[2019-10-10 08:50:24.824] [PKCS11_Trace] [info] hObject 101
[2019-10-10 08:50:24.824] [PKCS11_Trace] [info] ulCount 1
[2019-10-10 08:50:24.825] [PKCS11_Trace] [info] pTemplate 0x7ffdcda2d970
[2019-10-10 08:50:24.825] [PKCS11_Trace] [info] type 0 /  CKA_CLASS              
[2019-10-10 08:50:24.825] [PKCS11_Trace] [info] pValue 0x0 
[2019-10-10 08:50:24.826] [PKCS11_Trace] [info] ulValueLen 0
[2019-10-10 08:50:25.070] [PKCS11_Trace] [info]  CKR_OK
[2019-10-10 08:50:25.070] [PKCS11_Trace] [info] type 0 /  CKA_CLASS              
[2019-10-10 08:50:25.073] [PKCS11_Trace] [info] pValue 0x0 
[2019-10-10 08:50:25.073] [PKCS11_Trace] [info] ulValueLen 4

Notice that when generating an AES key, CKA_CLASS's ulValueLen is 8, but the ulValueLen returned when querying the key is 4.

If the problem were in the pkcs11 module (the vendor's shared library), we couldn't explain why the module works with python3.4 + python-pkcs11.

Benjamin-Shengming avatar Oct 09 '19 22:10 Benjamin-Shengming

We'll assume the PKCS#11 module is correct, but our compilation of the headers might be wrong. Especially if the LFS machine is missing some #define (those headers are full of dynamically sized types).

danni avatar Oct 10 '19 03:10 danni

Especially if the LFS machine is missing some #define (those headers are full of dynamically sized types).

I can reproduce the issue on Ubuntu 14.04 as well:

  1. Build Python 3.7.4 from source
  2. Install python-pkcs11 0.5.0 (pip3 install or install from source)

Benjamin-Shengming avatar Oct 10 '19 22:10 Benjamin-Shengming

I got the same problem. Did you manage to solve this issue?

ghost avatar Jan 21 '20 13:01 ghost

I got this on Arch Linux with Python 3.8.1. @Benjamin-Shengming what happened then? Did you find any workaround?

figbux avatar Jan 29 '20 13:01 figbux

@figbux @mehmetozcaan I tried GCC 9.0 to build Python 3.7 from source, and the bug seems to have disappeared. But I have only tried that once, and I can't upgrade GCC from 4.8 to 9 at the moment, so I am not sure whether it is a reliable workaround. If you get a chance to try it, please share your results. Thanks.

Benjamin-Shengming avatar Jan 29 '20 21:01 Benjamin-Shengming

It feels a lot like it's about what is being #defined to set all of the sizes. Did anyone check the sizes of the types in the compile?

danni avatar Jan 30 '20 03:01 danni

@Benjamin-Shengming Thank you. @danni, you're right: it's due to a wrong size definition in my vendor's cryptoki. Until they fix it, using "I" instead of "L" in the ulong definition is a workaround for me.

_ulong = (Struct('I').pack, lambda v: Struct('I').unpack(v)[0])     # defaults.py:117

figbux avatar Feb 19 '20 10:02 figbux

@danni I believe this issue stems from the difference between Python's default and standard struct sizes. When I checked the sizes with all of the byte-order prefixes (in both Python 2 and Python 3):

    The optional first format char indicates byte order, size and alignment:
      @: native order, size & alignment (default)
      =: native order, std. size & alignment
      <: little-endian, std. size & alignment
      >: big-endian, std. size & alignment
      !: same as >
>>> from struct import calcsize
>>> calcsize("L")
8
>>> calcsize("@L")
8
>>> calcsize("=L")
4
>>> calcsize("<L")
4
>>> calcsize(">L")
4
>>> calcsize("I")
4

It's clear that my system does not use the standard size for unsigned long by default. To make sure that the standard size is used, we could modify _ulong as:

_ulong = (Struct('=L').pack, lambda v: Struct('=L').unpack(v)[0])     # defaults.py:117

Would this be the correct fix?

fubber1nflux avatar Apr 09 '20 13:04 fubber1nflux

No. You want to be using native sizes and endianness, not standard sizes. The definition from PKCS#11 for CK_ULONG is:

typedef unsigned long int CK_ULONG;

Which is your architecture's native long.

Standard sizes serve to provide a platform-independent way for Python implementations to exchange binary data. We could consider measuring the length of v as a workaround for broken vendor implementations, but I'm not wild about that idea.

Are you by any chance using the same implementation as @figbux above?

danni avatar Apr 10 '20 02:04 danni
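The length-measuring workaround mentioned above could look roughly like the sketch below. `unpack_ulong` is a hypothetical replacement for the fixed-width lambda in defaults.py, and it assumes little-endian attribute data as on the x86_64 machines in this thread:

```python
# Hypothetical length-tolerant unpacker: derive the width from the buffer
# rather than assuming the native CK_ULONG size. Assumes little-endian data.
def unpack_ulong(v: bytes) -> int:
    return int.from_bytes(v, 'little')

print(unpack_ulong(b'\x04\x00\x00\x00'))                  # 4
print(unpack_ulong(b'\x04\x00\x00\x00\x00\x00\x00\x00'))  # 4
```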

I don't use cross-platform things much, but in this situation, whether native or standard, there is no single righteous length for CK_ULONG (or any other type). Even the ISO C standard defines type lengths with "the values shall be replaced..." and "...shall be equal or greater in magnitude to those shown".

So everyone had better hope their vendor uses a compiler that follows the same specs as the ones on their system. I don't have a true solution to suggest, but measuring the length, as @danni says, seems nice to me. This will matter even more with network HSMs, where at the other end of the cable you are communicating with an embedded device with a possibly smaller address space. So again: malleability rocks, immutability sucks.

figbux avatar Apr 10 '20 07:04 figbux

That's not strictly true here. You have a C calling convention, defined by your operating system, that specifies the lengths of types, endianness, etc.; otherwise you'd never be able to link two libraries together at all. This means that unsigned long int is the same type across your whole system, unless you're doing something nutty like using Cygwin on Windows, in which case that's likely to always break.

Also, this has nothing to do with the network, only with the library that we're linking to. The library can do whatever it wants to deal with endianness and type length; it could talk XML internally for all we care, we don't know anything about that.

Are all of you using the same HSM vendor? What type does their documentation tell you to use?

danni avatar Apr 10 '20 07:04 danni

Are you by any chance using the same implementation as @figbux above?

That would be correct. However, as I just discovered, the issue runs deeper than that.

The code that generates this exception is this:

_pkcs11.pyx:617

    def __getitem__(self, key):
        cdef CK_ATTRIBUTE template
        template.type = key
        template.pValue = NULL
        template.ulValueLen = <CK_ULONG> 0

        # Find out the attribute size
        assertRV(_funclist.C_GetAttributeValue(self.session._handle, self._handle,
                                     &template, 1))

        if template.ulValueLen == 0:
            return _unpack_attributes(key, b'')

        # Put a buffer of the right length in place
        cdef CK_CHAR [:] value = CK_BYTE_buffer(template.ulValueLen)
        template.pValue = <CK_CHAR *> &value[0]

        # Request the value
        assertRV(_funclist.C_GetAttributeValue(self.session._handle, self._handle,
                                     &template, 1))

        return _unpack_attributes(key, value)

The problem here is that C_GetAttributeValue fetches object attributes from the PKCS#11 library, which are returned in raw form in the template struct. This matters because in our case the PKCS#11 library actually communicates with a piece of software running on a remote platform whose architecture differs from our own. As a result, even though both the library and the remote application are compiled with identical headers, the actual type sizes differ (ulong is 64 bits on my machine while it is 32 bits on the remote one). And since C_GetAttributeValue returns unprocessed raw data, python-pkcs11 ends up trying to unpack the data using an incompatible type.

Now, an argument can be made: "Why is the PKCS#11 library not handling the type differences?" But I don't see how that would be possible, since it is packed raw data that is unpacked by the application using the library. The interesting thing here is that the PKCS#11 standard (p. 26) states:

Note that pValue is a “void” pointer, facilitating the passing of arbitrary values.  Both the application and Cryptoki library MUST ensure that the pointer can be safely cast to the expected type (i.e., without word-alignment errors).

If I understand correctly, as long as the raw data can be safely cast to the attribute type, it is valid. At this point, since python-pkcs11 unpacks the data, I believe it should also handle the casting.

Don't get me wrong, I'm not trying to suggest integrating anything vendor-specific here. This seems more like an unforeseen but valid case that's well within the standard.

fubber1nflux avatar Apr 10 '20 11:04 fubber1nflux

I am not an expert on the PKCS#11 spec by any means, but I think you've misunderstood things.

By word alignment errors, I take it to mean alignment within the struct, that is to say, on many architectures you can only cast pointers that are aligned to 4 byte boundaries (e.g. ARM). So pValue has to be correctly aligned for your architecture. This has nothing to do with the data type you expect to cast pValue to, which is given in section 4.

Again, it's important to note that types in PKCS#11 are architecture-dependent. If they weren't, there would be a lot more code to normalise word size and endianness between operating systems, architectures and platforms. If you're just packaging up binary data from one system and transporting it across the network to another machine, expecting those types to have the same meaning, I think the PKCS#11 implementation is broken.

You would have to expect the transfer to happen over an intermediate, network-independent format that maps to the native typing on both ends. I don't see how it can work any other way; it's not safe to read raw binary off another system.

danni avatar Apr 10 '20 11:04 danni
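A network-independent intermediate format of the kind described above might look like this sketch: a fixed 8-byte big-endian encoding on the wire, converted to and from native types at each end. The names and the 8-byte width are illustrative, not part of any actual vendor protocol:

```python
from struct import pack, unpack

# Illustrative wire format: always 8 bytes, big-endian, regardless of
# either end's native CK_ULONG width. Each side converts to its own
# native representation after decoding.
def ulong_to_wire(n: int) -> bytes:
    return pack('>Q', n)

def ulong_from_wire(b: bytes) -> int:
    return unpack('>Q', b)[0]

wire = ulong_to_wire(4)
print(wire.hex())             # 0000000000000004
print(ulong_from_wire(wire))  # 4
```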

These are valid (and obvious, but I'm just not experienced enough, I guess) points. It looks like I had a narrow perspective on the bigger picture. Thank you for the clear explanation.

fubber1nflux avatar Apr 10 '20 12:04 fubber1nflux