mlir Invalid code passes verification: return type mismatch

I wasn't expecting this code to pass verification but it does. (The return op's operand type isn't the same as the function's return type here.)

  func @foo(%arg0: !llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">) -> !llvm<"{ <8 x float>*, i64, [2 x i64], [2 x i64] }"> {
    %0 = llvm.load %arg0 : !llvm<"{ float*, i64, [2 x i64], [2 x i64] }*">
    %1 = llvm.extractvalue %0[0 : index] : !llvm<"{ float*, i64, [2 x i64], [2 x i64] }">
    %2 = llvm.bitcast %1 : !llvm<"float*"> to !llvm<"<8 x float>*">
    llvm.return %2 : !llvm<"<8 x float>*">
  }

For that matter, changing the type of %2 from an !llvm<"<8 x float>*"> to !llvm<"<4 x i32>"> (no longer a pointer) still doesn't fail verification.

Oct 10 '19 14:10 bondhugula

The problem is of a much larger scope than this issue, almost none(only 3) of the operations in the LLVM dialect have verification.

Oct 10 '19 16:10 River707

The problem is of a much larger scope than this issue, almost none(only 3) of the operations in the LLVM dialect have verification.

Right - but as far as the LLVM dialect goes, I think there isn't a need to immediately go overboard fixing all the verifiers to be "op semantic perfect", but just the important ones that impact debugging experience. The lack of verification for LLVM dialect ops often leads to direct tell tale errors when the LLVM verifier itself runs post translation. However, in many other common cases, it's time consuming to backtrack and identity the source from downstream. The following are I believe more pressing:

llvm.extractvalue
llvm.insertvalue
llvm.bitcast
llvm.return

Oct 10 '19 21:10 bondhugula

As @River707 mentioned, almost none of the LLVM operations have verifiers. There are two reasons for that: (1) LLVM type system is essentially opaque to MLIR, especially in ODS where all the operations are defined; (2) we don't model many aspects of the LLVM yet so it's non-trivial to write verifiers for them. For example, some type mismatches between function declaration and definition are allowed by LLVM for specific calling conventions, which we don't model.

A proper solution to (1) goes through an extension of the LLVM type system modeling in ODS to differentiate between types.

On a conceptual level, I would be interested in a solution that prevents us from replicating LLVM's IR verifiers in MLIR. This is less true for "core" instructions, but intrinsics may evolve in LLVM and somebody will have to keep track and update their counerparts in MLIR.

Oct 10 '19 21:10 ftynse

I'm not sure the current way we handle the LLVM dialect is really nice. I suspect on a medium term horizon we should consider making it more "first class" even at the cost of duplication. It seems just not great that the main target dialect does not have a good MLIR experience. This is likely a larger discussion to have in the future though.

Oct 11 '19 00:10 joker-eph

This is a discussion that I would love to have with the LLVM community. We have a way to describe the validity conditions of instructions in a declarative way. It may be interesting to have a complete definition of those, in addition to the English in LangRef.

Oct 11 '19 09:10 ftynse

Are you talking about using MLIR for LLVM itself? This is more ambitious than what I had in mind (which was focused on the LLVM Dialect alone), especially in terms of projects dependencies.

Oct 11 '19 20:10 joker-eph

Having some sort of machine-readable specification of the instruction set would be great, it does not have to use the MLIR ODS.

Oct 14 '19 08:10 ftynse