
[Bug][CodeGen] V 0.18.0 compilation after tir.Simplify causes Segmentation Faults

Open talha-ahsan opened this issue 1 year ago • 2 comments

Environment Information

OS: Ubuntu 20.04 LTS
Python: 3.10.4
TVM: v0.18.0, built from source, CPU only (no GPU)

Steps to Reproduce:

import tvm
from tvm import tir
from tvm.tir.analysis.analysis import verify_well_formed, verify_memory

from tvm.script import tir as T

@T.prim_func
def main(p0: T.Buffer((1, 4, 4, 512), "float32"), T_relu: T.Buffer((1, 4, 4, 512), "float32")):
    T.func_attr({"from_legacy_te_schedule": T.bool(True), "hash": "8a54a445c9c66af6", "target": T.target({"host": {"keys": ["cpu"], "kind": "llvm", "mtriple": "x86_64-pc-linux-gnu", "tag": ""}, "keys": ["cpu"], "kind": "llvm", "mtriple": "x86_64-pc-linux-gnu", "tag": ""}), "tir.noalias": T.bool(True)})
    for ax0_ax1_fused in T.parallel(4):
        for ax2, ax3_outer in T.grid(4, 32):
            f6a = T.uint32()
            cse_var_1: T.int32 = T.max(-1390372897, T.Shuffle([T.Broadcast(T.Cast("int32", T.ldexp(T.Cast("float32", T.Shuffle([T.Broadcast(1845025892, 3)], [0])), T.Cast("float32", T.Shuffle([T.Broadcast(928309885, 3) - T.Broadcast(-929172350, 3)], [2])))), 3), T.Broadcast(1237026474, 4) // T.Broadcast(-136990005, 4) * T.min(T.Broadcast(2087815492, 4), T.Broadcast(73625980, 4)), T.Broadcast(-1604583078, 3), T.Broadcast(T.Cast("int32", T.nextafter(T.Cast("float32", T.Shuffle([T.Broadcast(-1615280813, 2)], [0])), T.Cast("float32", T.Shuffle([T.Broadcast(217806831, 2)], [1])))), 2), T.min(T.Cast("int32x3", T.max(T.Broadcast(T.fabs(T.Shuffle([T.Broadcast(T.float32(0.18184940914562642), 3)], [2])), 3), T.truncmod(T.max(T.Broadcast(T.float32(0.25323582565477309), 3), T.Broadcast(T.float32(0.45124937914803442), 3)), T.Broadcast(T.float32(0.84180151259624503), 3))) * T.Broadcast(T.float32(0.85337293296299876), 3)), T.Div(T.Broadcast(1075650555, 3), T.Broadcast(-1611576096, 3))) % T.Cast("int32x3", T.min(T.min(T.Broadcast(T.Cast("uint32", T.sqrt(T.Cast("float32", T.Mul(T.uint32(172697918), T.uint32(41149579))))), 3), T.Broadcast(T.uint32(1159702194), 3)), T.Broadcast(T.Cast("uint32", T.popcount(T.Cast("int32", T.Shuffle([T.Broadcast(T.uint32(908195236), 3) + T.Broadcast(T.uint32(653293327), 3)], [2])))), 3)))], [T.Let(T.Broadcast(T.uint32(1121529639), 3), where={f6a: T.uint32(192542249)})])) + ax2 * 512 + ax3_outer * 16
            T_relu_1 = T.Buffer((8192,), data=T_relu.data)
            p0_1 = T.Buffer((8192,), data=p0.data)
            T_relu_1[cse_var_1:cse_var_1 + 16] = T.max(p0_1[cse_var_1:cse_var_1 + 16], T.Broadcast(T.float32(0.0), 16))
func = main
mod = tvm.ir.IRModule({'main': func})
# Validation fails unless BOTH checks pass; parenthesize so `not` covers both.
if not (verify_well_formed(mod) and verify_memory(func)):
    print("Validation failed")
else: 
    print("Beginning Compilation")
    with tvm.transform.PassContext(opt_level=4):
        nopt_mod = tvm.build(mod)
    print("Success!")

Expected Behavior:

Successful compilation, or an error message explaining why the compilation target is invalid.

Actual Behavior:

Segmentation fault (core dumped) when running the script above.

talha-ahsan • Jan 13 '25 23:01

Simplifying the huge expression tree reveals this error:

InternalError: Check failed: (val && *val >= 0 && *val < total_lanes) is false: Shuffled indeces are suppose to be int, but get T.Let(T.Broadcast(T.uint32(1121529639), 3), where={f6a: T.uint32(192542249)})

The index passed to T.Shuffle is a T.Let expression rather than a constant integer. Changing it to T.uint32(1) makes compilation succeed.
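The condition in that error message can be modeled in plain Python. This is only a sketch, not TVM's implementation: `check_shuffle_indices` and `ShuffleIndexError` are hypothetical names, indices are modeled as ordinary Python values, and the lane count of 4 is arbitrary. It rejects anything that is not a constant integer in the range [0, total_lanes), which is what the quoted check `val && *val >= 0 && *val < total_lanes` appears to require.

```python
class ShuffleIndexError(ValueError):
    """Hypothetical error type for this sketch (not a TVM class)."""


def check_shuffle_indices(indices, total_lanes):
    """Reject any index that is not a constant int in [0, total_lanes).

    Models the condition quoted in the error message:
        val && *val >= 0 && *val < total_lanes
    """
    for idx in indices:
        # bool is a subclass of int in Python; exclude it explicitly.
        is_const_int = isinstance(idx, int) and not isinstance(idx, bool)
        if not is_const_int or not (0 <= idx < total_lanes):
            raise ShuffleIndexError(
                f"shuffle index {idx!r} is not a constant int in [0, {total_lanes})"
            )
    return list(indices)


# A constant in-range index passes. A non-constant expression (modeled here
# as a string standing in for the T.Let node) is rejected with a clear error.
check_shuffle_indices([1], total_lanes=4)
try:
    check_shuffle_indices(["T.Let(...)"], total_lanes=4)
except ShuffleIndexError as err:
    print(err)
```

Raising a descriptive error at this point, before any later pass dereferences the index, is the behavior one would hope to see in place of a crash.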

mac-op • Jan 14 '25 04:01

Thank you!

In this case, is it reasonable to expect the compiler to surface the internal error, rather than crash with a segmentation fault, when the TIR is invalid?
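In the meantime, a crash like this can be kept from taking down a test harness by running the build in a child process and inspecting its exit status. The following is a minimal sketch using only the Python standard library; `run_isolated` is a hypothetical helper, and the source passed to it would be the reproduction script. It relies on the documented POSIX behavior that `subprocess` reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(source, timeout=600):
    """Run Python source in a child interpreter and classify the outcome.

    Hypothetical helper for illustration. Returns "ok", "error" (the child
    exited with a nonzero status, e.g. an uncaught exception), or
    "signal:<NAME>" (the child was killed by a signal, e.g. SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode == 0:
        return "ok"
    if proc.returncode < 0:
        # On POSIX a negative return code is the negated signal number.
        return f"signal:{signal.Signals(-proc.returncode).name}"
    return "error"
```

With this, a segmentation fault in the child shows up as a "signal:SIGSEGV" result that the parent can log, while an InternalError raised as a Python exception shows up as "error".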

talha-ahsan • Jan 17 '25 02:01