
bug: class specifier parsed as function definition

Open • lsp-ableton opened this issue 1 year ago • 1 comment

Did you check existing issues?

  • [X] I have read all the tree-sitter docs if it relates to using the parser
  • [X] I have searched the existing issues of tree-sitter-cpp

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter 0.23.0

Describe the bug

tree-sitter-cpp does not correctly parse macros in a class declaration, causing the class to be misinterpreted as a function definition.

Steps To Reproduce/Bad Parse Tree

The following results in a parse error:

#define AttributeMacro __attribute__((visibility("default")))
#define ClassDeclarationMacro(Class, BaseClass) \
   public:                                      \
    static TInt sId;                            \

class AttributeMacro Class : public BaseClass
{
public:
  ClassDeclarationMacro(Class, BaseClass);
};

Resulting parse tree:

(translation_unit [0, 0] - [10, 0]
  (preproc_def [0, 0] - [1, 0]
    name: (identifier [0, 8] - [0, 22])
    value: (preproc_arg [0, 23] - [0, 61]))
  (preproc_function_def [1, 0] - [5, 0]
    name: (identifier [1, 8] - [1, 29])
    parameters: (preproc_params [1, 29] - [1, 47]
      (identifier [1, 30] - [1, 35])
      (identifier [1, 37] - [1, 46]))
    value: (preproc_arg [2, 3] - [4, 0]))
  (function_definition [5, 0] - [9, 1]   <----- Class specifier incorrectly parsed as function definition
    type: (class_specifier [5, 0] - [5, 20]
      name: (type_identifier [5, 6] - [5, 20]))
    (ERROR [5, 21] - [5, 35]
      (identifier [5, 21] - [5, 26]))
    declarator: (identifier [5, 36] - [5, 45])
    body: (compound_statement [6, 0] - [9, 1]
      (labeled_statement [7, 0] - [8, 51]
        label: (statement_identifier [7, 0] - [7, 6])
        (expression_statement [8, 2] - [8, 51]
          (call_expression [8, 2] - [8, 50]
            function: (identifier [8, 2] - [8, 23])
            arguments: (argument_list [8, 23] - [8, 50]
              (identifier [8, 24] - [8, 38])
              (identifier [8, 40] - [8, 49])))))))
  (expression_statement [9, 1] - [9, 2]))

I understand it won't be possible to handle macros correctly in every case, but it seems like it should be possible to determine that the last node is still a class specifier and not a function definition. Removing either the AttributeMacro or the ClassDeclarationMacro causes it to be parsed correctly again.

Expected Behavior/Parse Tree

It's fine if the macro tokens themselves result in parsing errors, but ideally the rest of the class declaration should still be parsed:

(translation_unit [0, 0] - [10, 0]
  (preproc_def [0, 0] - [1, 0]
    name: (identifier [0, 8] - [0, 22])
    value: (preproc_arg [0, 23] - [0, 61]))
  (preproc_function_def [1, 0] - [5, 0]
    name: (identifier [1, 8] - [1, 29])
    parameters: (preproc_params [1, 29] - [1, 47]
      (identifier [1, 30] - [1, 35])
      (identifier [1, 37] - [1, 46]))
    value: (preproc_arg [2, 3] - [4, 0]))
  (class_specifier [5, 0] - [9, 1]
    (ERROR [5, 6] - [5, 20])            <------- Unrecognized token between the `class` keyword and the class name
    name: (type_identifier [5, 21] - [5, 26])
    (base_class_clause [5, 27] - [5, 45]
      (access_specifier [5, 29] - [5, 34])
      (type_identifier [5, 36] - [5, 44]))
    body: (field_declaration_list [6, 0] - [9, 1]
      (access_specifier [7, 0] - [7, 6])
      (declaration [8, 2] - [8, 51]
        declarator: (function_declarator [8, 2] - [8, 50]
          declarator: (identifier [8, 2] - [8, 23])
          parameters: (parameter_list [8, 23] - [8, 50]
            (parameter_declaration [8, 24] - [8, 38]
              type: (type_identifier [8, 24] - [8, 38]))
            (parameter_declaration [8, 40] - [8, 49]
              type: (type_identifier [8, 40] - [8, 49]))))))))

Repro

#define AttributeMacro __attribute__((visibility("default")))
#define ClassDeclarationMacro(Class, BaseClass) \
   public:                                      \
    static TInt sId;                            \

class AttributeMacro Class : public BaseClass
{
public:
  ClassDeclarationMacro(Class, BaseClass);
};

lsp-ableton • Nov 11 '24 19:11

Having this exact issue as well. I get that macros are hard. I don't really care about the macro, but I want to correctly get the class name and the scope of its definition.

Curious if there is some workaround or if this is just fundamentally impossible given the architecture of tree-sitter.

robmck1995 • Nov 25 '24 18:11
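
One possible workaround, going by the report that removing AttributeMacro makes the class parse correctly again: blank out known attribute-style macro names in a copy of the source before handing it to the parser. The sketch below is not part of tree-sitter or tree-sitter-cpp. `blankOutMacros` and `onDirectiveLine` are hypothetical helper names, and the list of macro names has to be supplied by the caller. Each match is overwritten with spaces of the same length, so the row/column positions the parser reports should still line up with the original file. Occurrences on lines starting with `#` are left alone so the `#define` itself is untouched; continuation lines of a multi-line `#define` are not detected.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `c` can be part of a C/C++ identifier.
static bool isIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}

// Returns true if the line containing `pos` begins with '#', i.e. looks like
// a preprocessor directive. Continuation lines are NOT detected (assumption:
// the macro names of interest are not used inside multi-line #defines).
static bool onDirectiveLine(const std::string& s, std::size_t pos) {
    std::size_t lineStart = s.rfind('\n', pos);
    lineStart = (lineStart == std::string::npos) ? 0 : lineStart + 1;
    const std::size_t first = s.find_first_not_of(" \t", lineStart);
    return first != std::string::npos && s[first] == '#';
}

// Hypothetical helper: overwrite every whole-word use of each name in
// `macroNames` with spaces. The text keeps its length, so byte offsets and
// row/column positions are unchanged.
std::string blankOutMacros(std::string source,
                           const std::vector<std::string>& macroNames) {
    for (const std::string& name : macroNames) {
        if (name.empty()) continue;
        std::size_t pos = 0;
        while ((pos = source.find(name, pos)) != std::string::npos) {
            const std::size_t end = pos + name.size();
            const bool startsWord = pos == 0 || !isIdentChar(source[pos - 1]);
            const bool endsWord =
                end == source.size() || !isIdentChar(source[end]);
            if (startsWord && endsWord && !onDirectiveLine(source, pos)) {
                source.replace(pos, name.size(), name.size(), ' ');
            }
            pos = end;  // continue searching after this occurrence
        }
    }
    return source;
}
```

Calling `blankOutMacros(source, {"AttributeMacro"})` on the repro turns the class line into `class`, a run of spaces, then `Class : public BaseClass`. The returned string is what would be passed to the parser in place of the original text. This only helps for macros whose names are known ahead of time and that can be dropped without changing the structure of the declaration.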