Create a dialect for checking Lua resource management

Open PappasBrent opened this issue 1 year ago • 3 comments

Macroni should offer a dialect and corresponding static analyses for checking the usage of resource management macros in the Lua programming language's source code.

Lua's source provides an internal API for checking that a lock is held before executing code that touches the language's core functionality:

/*
** macros that are executed whenever program enters the Lua core
** ('lua_lock') and leaves the core ('lua_unlock')
*/
#if !defined(lua_lock)
#define lua_lock(L)	((void) 0)
#define lua_unlock(L)	((void) 0)
#endif

By default, the macros expand to the no-op expressions above; developers can override them to add dynamic lock checking to the language. The Lua developers do this in some of Lua's tests:

#define lua_lock(l)     lua_assert((*getlock(l)->plock)++ == 0)
#define lua_unlock(l)   lua_assert(--(*getlock(l)->plock) == 0)
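
To make the override concrete, here is a minimal, self-contained sketch of the same counter-based check outside of Lua. The names demo_state, demo_lock, demo_unlock, demo_balanced, and demo_double_lock are hypothetical and exist only for this illustration. Lua's tests use lua_assert and getlock instead; this sketch counts violations rather than asserting, so the behavior is easy to observe.

```c
/* Hypothetical stand-in for the per-state lock bookkeeping. */
typedef struct demo_state {
    int lock_depth;   /* 0 = unlocked, 1 = locked, anything else is a bug */
    int violations;   /* number of failed lock checks seen so far */
} demo_state;

/* Same shape as the test macros above: locking expects the depth to be 0
   before the increment, unlocking expects it to be 0 after the decrement.
   A failed check bumps the violation counter instead of asserting. */
#define demo_lock(s)   ((void) (((s)->lock_depth++ == 0) || (s)->violations++))
#define demo_unlock(s) ((void) ((--(s)->lock_depth == 0) || (s)->violations++))

/* Balanced use: one lock followed by one unlock. */
int demo_balanced(void) {
    demo_state s = {0, 0};
    demo_lock(&s);
    demo_unlock(&s);
    return s.violations;
}

/* Unbalanced use: the lock is taken twice and released once. */
int demo_double_lock(void) {
    demo_state s = {0, 0};
    demo_lock(&s);     /* depth 0 -> 1: check passes */
    demo_lock(&s);     /* depth 1 -> 2: check fails */
    demo_unlock(&s);   /* depth 2 -> 1: check fails, depth is not back to 0 */
    return s.violations;
}
```

A check like this only fires on paths that the tests actually execute, which is the gap a compile-time analysis would aim to close.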

Lua's resource management API is similar to the Linux kernel's RCU API: both interfaces are defined using macros, and both offer locking and unlocking facilities that let users mark critical sections in which special resources may be accessed. The main difference is that the RCU API is much more developed. It provides specific functions for accessing resources in critical sections (e.g., rcu_dereference() and friends), as well as function attributes for specifying that a function must be called while holding a lock (__must_hold()), that a function acquires a lock (__acquires()), or that a function releases a lock (__releases()). Lua's API offers none of these features. Since we've already implemented a dialect for the more complex API, it should be (relatively) easy to implement a new MLIR dialect and static analysis in Macroni for Lua's.
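
For readers unfamiliar with this style of annotation, the sketch below shows how such attributes read at a definition site. The sketch_ names are hypothetical stand-ins modeled on __must_hold(), __acquires(), and __releases(). They expand to nothing here and only document intent; the kernel's real macros are meant to be consumed by a checker and are not reproduced in this example.

```c
/* Hypothetical stand-ins for the kernel-style annotations.
   They expand to nothing, so the compiled code is unaffected. */
#define sketch_must_hold(lock)
#define sketch_acquires(lock)
#define sketch_releases(lock)

/* Hypothetical lock type that only records whether it is held. */
typedef struct sketch_lock_t { int held; } sketch_lock_t;

/* Not held on entry, held on exit. */
void sketch_enter(sketch_lock_t *lock) sketch_acquires(lock) {
    lock->held = 1;
}

/* Held on entry, not held on exit. */
void sketch_exit(sketch_lock_t *lock) sketch_releases(lock) {
    lock->held = 0;
}

/* Caller is expected to hold the lock; returns 1 if it actually does. */
int sketch_read(const sketch_lock_t *lock) sketch_must_hold(lock) {
    return lock->held;
}
```

The annotations carry the contract that a static analysis needs: without them, a checker cannot tell whether sketch_read() is being called correctly or not.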

If this dialect proves useful for catching bugs at compile time (rather than at run time, which is how the Lua developers currently catch them in their tests), then it would be a great example of Macroni's applicability for analyzing real-world code.

I foresee two main challenges in creating the Lua API dialect and its analyses:

  1. Unlike the RCU API, whose locking and unlocking mechanisms are defined as functions, the Lua API defines its locking and unlocking mechanisms as macros. This means that in order to reliably match on them and lower them to MLIR, we have two options: either use PASTA to match on the macro expansions regardless of how they are defined, or require that developers use the default ((void) 0) definitions of the locking and unlocking macros in order for Macroni to be able to detect them. The first option is better for end-users since it is more robust, while the second is better for the Macroni developers and those who wish to build Macroni from source since it doesn't require building a patched version of LLVM+Clang for PASTA. I'd rather pursue the second option now to accelerate development time and change to the first later if the dialect proves to be effective at finding bugs.

  2. As mentioned earlier, the Lua API is much simpler than the RCU API, and offers no mechanisms for constraining the lock state when a function is called (like the RCU API does with attributes like __must_hold()). Many functions in the Lua source code, however, do assume that their callers obtain a lock before calling them. Therefore, to make this new dialect practically useful, we would need to extend Lua's internal API with some way of specifying that a function definition requires its caller to hold a lock before it is called (probably with attributes, as the RCU API does), and hook into this with Macroni to avoid spuriously emitting errors every time a function calls lua_unlock() without calling lua_lock() beforehand (since the caller is supposed to do this). We would make these changes in a fork of Lua as a proof of concept, and then reach out to the Lua developers on the Lua mailing list to gauge their interest.
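
The second challenge can be illustrated with a toy model. The sketch below is not Macroni's implementation and does not use MLIR or PASTA. It models a function body as a flat sequence of events and tracks the lock state, with the hypothetical requires_lock flag playing the role of a __must_hold()-style annotation. All names (lock_event, check_body, and the EV_ constants) are invented for this example, and real code would also need to handle branches and loops.

```c
#include <stddef.h>

/* Events a function body may contain, in program order (toy model). */
typedef enum lock_event {
    EV_LOCK,            /* an expansion of lua_lock(L) */
    EV_UNLOCK,          /* an expansion of lua_unlock(L) */
    EV_CALL_MUST_HOLD   /* a call to a function that expects the lock to be held */
} lock_event;

/* Returns the number of diagnostics for one function body.
   requires_lock is nonzero if the function is annotated as
   "the caller holds the lock on entry". */
int check_body(const lock_event *events, size_t n, int requires_lock) {
    int entry_state = requires_lock ? 1 : 0;
    int held = entry_state;
    int diagnostics = 0;

    for (size_t i = 0; i < n; i++) {
        switch (events[i]) {
        case EV_LOCK:
            if (held) diagnostics++;    /* lock taken twice */
            held = 1;
            break;
        case EV_UNLOCK:
            if (!held) diagnostics++;   /* unlock without a matching lock */
            held = 0;
            break;
        case EV_CALL_MUST_HOLD:
            if (!held) diagnostics++;   /* callee expects the lock */
            break;
        }
    }

    /* The lock state on exit should match the state on entry. */
    if (held != entry_state) diagnostics++;
    return diagnostics;
}
```

In this model, a body that unlocks and then relocks (for example, around a callback) produces two diagnostics when analyzed without the annotation and none when requires_lock is set, which is the kind of spurious error the annotation is meant to suppress.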

PappasBrent, Jul 08 '24 20:07

Some resources I found explaining more about this API:

  • https://www.lua.org/manual/2.1/subsection3_5_6.html
    • This is for an older version of Lua and the functions lua_getlocked() and lua_pushlocked() appear to have been deprecated since then.
  • https://stackoverflow.com/questions/3010974/purpose-of-lua-lock-and-lua-unlock
  • http://lua-users.org/lists/lua-l/2007-06/msg00411.html
  • https://www.gamedev.net/forums/topic/677883-just-discovered-this-about-luawtf/
  • http://lua-users.org/lists/lua-l/2021-01/msg00285.html

PappasBrent, Jul 16 '24 14:07

More resources:

  • http://lua-users.org/wiki/EnvironmentTables
  • http://lua-users.org/wiki/SpeedingUpStrings
  • http://lua-users.org/wiki/ThreadsTutorial

According to these sources, the locking API is still a part of Lua 5.0 (though it appears to be undocumented in the official Lua manual). Also, it seems that users of the API are not supposed to invoke the lock and unlock macros themselves, only define them. This means that an analysis of these macros will not generalize to other codebases, since they won't call the locking API directly. This reduces the applicability of an analysis of this API, but it does simplify the implementation: since the only place that calls the locking API is Lua itself, we can customize our matchers to match on the default implementations of the locking macros (((void) 0)), and avoid using PASTA.

PappasBrent, Jul 16 '24 16:07

The consensus, however, appears to be that using Lua in a multithreaded context is a bad idea and that better alternatives exist.

PappasBrent, Jul 16 '24 16:07