lupa Better error handling for calls to Lua functions from Python

Problem 1: error objects and stack tracebacks

Currently, when you call a Lua function from Python, Lupa uses debug.traceback as the message handler, which appends the stack traceback to the error message string. This pollutes the original error message: if you handle Lua errors from Python, you have to search for a substring instead of doing a cleaner equality check. So we need to keep the error object intact. How, then, do we attach the Lua traceback to the LuaError exception? We can convert the Lua traceback (obtainable via the Lua debug library) into Python traceback objects and link them to the Python ones.

Solution: add a message handler that creates a Python exception according to the error object and adds a Python stack traceback extracted from the Lua debug library (lua_getstack and lua_getinfo). When calling Python functions from Lua, the exception information (obtained from sys.exc_info) is stored inside an instance of the (new) _PyException class, which is wrapped in a Lua userdatum.

>>> lua.execute('error("spam")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>                                  # <<< Python traceback
  File "lupa/_lupa.pyx", line 335, in lupa._lupa.LuaRuntime.execute
  File "lupa/_lupa.pyx", line 1669, in lupa._lupa.run_lua
  File "lupa/_lupa.pyx", line 1683, in lupa._lupa.call_lua
  File "lupa/_lupa.pyx", line 1708, in lupa._lupa.execute_lua_call
  File "lupa/_lupa.pyx", line 1651, in lupa._lupa.py_from_lua_error
  File "[C]", line 1, in <module>                                      # <<< Lua traceback
  File "[string "<python>"]", line 1, in <module>
lupa._lupa.LuaError: [string "<python>"]:1: spam
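
For reference, entries like the two Lua frames marked above can also be built from pure Python. The following is only a sketch of one possible technique (compile a stub under the Lua chunk name and raise from it), not the Cython code in this PR. The helper names fake_entry and build_traceback are made up for illustration, and it assumes Python 3.7+, where types.TracebackType can be instantiated directly.

```python
import sys
import traceback
import types


def fake_entry(filename, lineno):
    """Return a traceback entry that appears to originate from filename:lineno."""
    # Pad with newlines so that the raise statement sits on the requested line.
    code = compile("\n" * (lineno - 1) + "raise __exc__", filename, "exec")
    try:
        exec(code, {"__exc__": RuntimeError("placeholder")})
    except RuntimeError:
        # The first entry belongs to this helper; the next one is the stub.
        return sys.exc_info()[2].tb_next


def build_traceback(frames):
    """Link (filename, lineno) pairs, outermost first, into one traceback."""
    tb = None
    for filename, lineno in reversed(frames):
        entry = fake_entry(filename, lineno)
        # Each new entry points at the previously built (inner) entries.
        tb = types.TracebackType(tb, entry.tb_frame, entry.tb_lasti, entry.tb_lineno)
    return tb


# Hypothetical frame list, as it might be collected from the Lua debug library.
lua_frames = [("[C]", 1), ('[string "<python>"]', 1)]
tb = build_traceback(lua_frames)
summary = [(f.filename, f.lineno, f.name) for f in traceback.extract_tb(tb)]
```

Entries built this way are reported with the function name `<module>`, and a traceback assembled like this can be attached to an exception with `exc.with_traceback(tb)`.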

Problem 2: error re-raising is not re-entrant

Currently, when you call a Python function from Lua, and it raises a Python exception, it is converted to a Lua error and stored in _raised_exception inside the LuaRuntime instance. It is easy to see that this solution is not re-entrant, that is, it doesn't work for arbitrarily recursive calls between Lua and Python. So, instead of storing exception information (which includes the stack traceback) in the LuaRuntime instance, we need to propagate exception information via the error object, which is unique to each protected call in Lua.

Solution: Handle _PyException and BaseException instances raised from protected calls to Lua functions from Python.

#   1         2            3          4
>>> lua.eval('python.eval("lua.eval(\'python.eval([[0/0]])\')")')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lupa/_lupa.pyx", line 327, in lupa._lupa.LuaRuntime.eval
    return run_lua(self, b'return ' + lua_code, args)
  File "lupa/_lupa.pyx", line 1669, in lupa._lupa.run_lua
    return call_lua(runtime, L, args)                                # <<< Lua call (1)
  File "lupa/_lupa.pyx", line 1683, in lupa._lupa.call_lua
    return execute_lua_call(runtime, L, len(args))
  File "lupa/_lupa.pyx", line 1708, in lupa._lupa.execute_lua_call
    py_from_lua_error(runtime, L, result_status)
  File "lupa/_lupa.pyx", line 1651, in lupa._lupa.py_from_lua_error
    raise pyexc.etype, pyexc.value, pyexc.traceback
  File "lupa/_lupa.pyx", line 1879, in lupa._lupa.py_call_with_gil
    return call_python(runtime, L, py_obj)                             # <<< Python call (2)
  File "lupa/_lupa.pyx", line 1866, in lupa._lupa.call_python
    result = f(*args, **kwargs)
  File "<string>", line 1, in <module>
  File "lupa/_lupa.pyx", line 327, in lupa._lupa.LuaRuntime.eval
    return run_lua(self, b'return ' + lua_code, args)
  File "lupa/_lupa.pyx", line 1669, in lupa._lupa.run_lua
    return call_lua(runtime, L, args)                                # <<< Lua call (3)
  File "lupa/_lupa.pyx", line 1683, in lupa._lupa.call_lua
    return execute_lua_call(runtime, L, len(args))
  File "lupa/_lupa.pyx", line 1708, in lupa._lupa.execute_lua_call
    py_from_lua_error(runtime, L, result_status)
  File "lupa/_lupa.pyx", line 1651, in lupa._lupa.py_from_lua_error
    raise pyexc.etype, pyexc.value, pyexc.traceback
  File "lupa/_lupa.pyx", line 1879, in lupa._lupa.py_call_with_gil
    return call_python(runtime, L, py_obj)                             # <<< Python call (4)
  File "lupa/_lupa.pyx", line 1866, in lupa._lupa.call_python
    result = f(*args, **kwargs)
  File "<string>", line 1, in <module>
ZeroDivisionError: division by zero
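
The re-entrancy problem described in this section can be reproduced with a toy model in plain Python. This is a simplified sketch, not Lupa's code: SharedSlotRuntime, pcall_with_error_object and fail are hypothetical names, and the "protected call" is just a try/except. It contrasts one shared slot on the runtime with an error object that travels with each call.

```python
class SharedSlotRuntime:
    """Toy runtime that keeps the last Python exception in one shared slot."""

    def __init__(self):
        self.raised_exception = None

    def pcall(self, func):
        try:
            return True, func()
        except Exception as exc:
            # Every failing call overwrites the same slot.
            self.raised_exception = exc
            return False, "python error"


def pcall_with_error_object(func):
    """Toy protected call that returns the exception as its own error object."""
    try:
        return True, func()
    except Exception as exc:
        return False, exc


def fail(exc):
    """Build a function that raises the given exception."""
    def thrower():
        raise exc
    return thrower


# Shared slot: a second failing call (for example, one handled during cleanup)
# replaces the exception of the first call before anyone re-raises it.
runtime = SharedSlotRuntime()
runtime.pcall(fail(ValueError("outer")))
runtime.pcall(fail(KeyError("inner")))
slot_result = type(runtime.raised_exception).__name__

# Error objects: each call keeps its own exception, whatever happens in between.
_, outer_error = pcall_with_error_object(fail(ValueError("outer")))
_, inner_error = pcall_with_error_object(fail(KeyError("inner")))
object_result = type(outer_error).__name__
```

In the shared-slot variant the outer caller ends up seeing the inner KeyError, while the error-object variant still reports the original ValueError.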

Problem 3: clearing the stack

I never understood why Lupa clears the stack before it calls a Lua function from Python or vice versa. The Lua stack can be indexed either from the bottom (positive indices) or from the top (negative indices), which makes manipulating only the top n values very easy.

Solution: Use negative indices to navigate through the top-most values in the Lua stack.
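
To make the indexing rule concrete, here is a small Python model of a Lua-style stack. It is an illustration only: LuaLikeStack is a hypothetical class, not part of Lupa or the Lua C API, and it ignores pseudo-indices. It follows the usual Lua convention that index 1 is the bottom and index -1 is the top.

```python
class LuaLikeStack:
    """Minimal model of a stack with 1-based positive and negative indices."""

    def __init__(self):
        self._slots = []

    def push(self, value):
        self._slots.append(value)

    def top(self):
        # Number of values on the stack, which is also the index of the top value.
        return len(self._slots)

    def absindex(self, index):
        # Positive indices count from the bottom, negative ones from the top.
        return index if index > 0 else self.top() + index + 1

    def get(self, index):
        return self._slots[self.absindex(index) - 1]


stack = LuaLikeStack()
# Values that were already on the stack before the call.
stack.push("existing-1")
stack.push("existing-2")
# Arguments for the call: the top three values.
for arg in (10, 20, 30):
    stack.push(arg)

# The arguments are reachable through -3..-1 without clearing anything below.
args = [stack.get(i) for i in (-3, -2, -1)]
```

The values below the arguments stay untouched and still answer to their positive indices.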

Problem 4: type checking Python objects from Lua

Thanks to python.builtins.type the user is able to check the type of Python objects from Lua. However, this does not tell whether the object is a wrapped Python object or not. Ergo, python.builtins.type(nil) and python.builtins.type(python.none) output the same type, NoneType.

Solution: Add python.is_object for checking whether a Lua value is a wrapped Python object.

>>> lua.eval('python.is_object(nil)')
False
>>> lua.eval('python.is_object(python.none)')
True

Additional changes

Add python.is_error for checking if a Lua value is a wrapped _PyException instance

>>> lua.execute('''
... local ok, err = pcall(python.eval, '0/0')
... assert(not ok, "raises an error")
... assert(python.is_error(err), "raises Python error")
... return err.etype, err.value, err.traceback''')
(<class 'ZeroDivisionError'>, ZeroDivisionError('division by zero'), <traceback object at 0x7f5eba484e80>)

Restore the original lock_runtime (before #188) and add try_lock_runtime (returns boolean)
Simplify _LuaObject.__dealloc__ and _LuaIter.__dealloc__ (it's OK to call luaL_unref with LUA_NOREF or LUA_REFNIL)
Check the stack of the main Lua thread before calling lua_xmove in resume_lua_thread
Add tests for new features and adjust old tests for new behaviour of error handling
Add documentation for error handling in README

Jul 22 '21 17:07 guidanoli

@scoder I think these are really important changes, could you review them?

Jul 22 '21 18:07 guidanoli

One issue is that it seems to become less clear when exceptions and Python frames (which can hold on to arbitrarily large amounts of data) are getting cleaned up – and if at all. Cannot say if that's a real problem in practice.

So, I've stress-tested the Lua and Python garbage collectors with the following script. It throws 1000 Lua errors and converts each of them into a Python exception (with frames and code objects). Then it plots the number of Python objects and the memory allocated by Lua.

import gc
import lupa

pygcdata = []
luagcdata = []

def checkgc():
    luagccount = checkluagc()
    while gc.collect() != 0:
        # collect garbage from Python and Lua until Python finds nothing more
        luagccount = checkluagc()
    # record the Python object count and the Lua heap size (KB -> bytes)
    pygcdata.append(len(gc.get_objects()))
    luagcdata.append(luagccount*1024)

def showgcdata():
    import matplotlib.pyplot as plt
    plt.title('GC utilization by Lua and Python')
    plt.plot(pygcdata, label='Python (#objects)')
    plt.plot(luagcdata, label='Lua (bytes)')
    plt.legend()
    plt.show()

lua = lupa.LuaRuntime()

checkluagc = lua.eval('''function()
    while true do
        local before = collectgarbage('count')
        collectgarbage()
        local after = collectgarbage('count')
        if before == after then
            return after
        end
    end
end''')

s = 'error()'

checkgc()

for i in range(1000):
    try:
        lua.eval(s)
    except Exception:
        # the Lua error, converted to a Python exception, is expected here
        pass
    finally:
        checkgc()

showgcdata()

Here are the results (plot hosted on Imgur). We detected no memory leaks whatsoever.
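
A plot is one way to read this data; a numeric check is another. The sketch below is not part of the script above and does not use Lupa at all: settled_object_count and growth_per_iteration are hypothetical helpers that measure how many GC-tracked Python objects an action leaves behind per iteration, shown here on a harmless action and on a deliberately leaky one.

```python
import gc


def settled_object_count():
    """Run the collector until it finds nothing, then count tracked objects."""
    while gc.collect() != 0:
        pass
    return len(gc.get_objects())


def growth_per_iteration(action, iterations=1000, warmup=10):
    """Average number of tracked objects left behind by one call to action."""
    for _ in range(warmup):
        action()
    before = settled_object_count()
    for _ in range(iterations):
        action()
    return (settled_object_count() - before) / iterations


def raise_and_swallow():
    # Raises and handles an exception; nothing should survive the call.
    try:
        raise ValueError("spam")
    except ValueError:
        pass


leaked = []


def leak_one_list():
    # Deliberate leak: one new GC-tracked list per call.
    leaked.append([])


clean_growth = growth_per_iteration(raise_and_swallow)
leaky_growth = growth_per_iteration(leak_one_list)
```

A growth close to zero corresponds to a flat line in the plot, while a growth close to one object per iteration would show up as a steady slope.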

Sep 03 '21 01:09 guidanoli

There was a related discussion going on on python-dev, so I asked and there seems to be general interest in improving the C-API for this. Not something for right now, but once that's available and used (and backported) by Cython, Lupa (and probably several other projects that handle foreign languages, DSLs, templates, …) could benefit from it.

That's awesome to hear! It would be a great help for projects like ours, and for many others such as Jinja.

Sep 03 '21 09:09 guidanoli