#116 Add shared memory support for L0 path

Open lmontigny opened this issue 2 years ago • 4 comments

lmontigny avatar Apr 17 '23 11:04 lmontigny

Facing an issue with the RuntimeFunctions.bc file for L0:

$ ./Tests/GpuSharedMemoryTestIntel
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from SingleColumn
[ RUN      ] SingleColumn.VariableEntries_CountQuery_4B_Group
warning: Linking two modules of different target triples: '/localdisk/lmontign/hdk/omniscidb/build/QueryEngine/RuntimeFunctions.bc' is 'nvptx64-nvidia-cuda' whereas '/localdisk/lmontign/hdk/omniscidb/build/QueryEngine/RuntimeFunctions.bc' is 'spir-unknown-unknown'
InvalidTargetTriple: Expects spir-unknown-unknown or spir64-unknown-unknown. Actual target triple is nvptx64-nvidia-cuda
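
For reference, the check behind the InvalidTargetTriple message can be sketched as a plain string comparison. This is only an illustration: `isAcceptedSpirTriple` is a hypothetical helper, not an HDK or LLVM function, and the two accepted triples are taken verbatim from the error text above.

```cpp
#include <string>

// Hypothetical helper (not part of HDK or LLVM): returns true when `triple`
// is one of the two target triples named in the InvalidTargetTriple message.
bool isAcceptedSpirTriple(const std::string& triple) {
  return triple == "spir-unknown-unknown" ||
         triple == "spir64-unknown-unknown";
}
```

With this sketch, the triple reported in the log, `nvptx64-nvidia-cuda`, is rejected, which is consistent with a CUDA-targeted RuntimeFunctions.bc being picked up on the L0 path.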

lmontigny avatar Apr 17 '23 11:04 lmontigny

Solved the previous .bc mismatch. Now hitting an address space casting issue: [image]

lmontigny avatar Apr 17 '23 13:04 lmontigny

Casting issue still going on: %4 = addrspacecast i64* %3 to i64 addrspace(3)*

It fails to generate SPIR-V here:

  std::unique_ptr<L0DeviceCompilationContext> gpu_context(compile_and_link_gpu_code(
      module_str, module_, l0_mgr_, getWrapperKernel()->getName().str()));

  auto success = writeSpirv(module, opts, ss, err);

It is unclear where the cast is generated in the application. Not related to CreatePointerCast; need to double-check GpuSharedMemoryUtils.cpp.

lmontigny avatar Apr 18 '23 15:04 lmontigny

The casting issue for shared memory is happening here, with address_space = 3:

  auto ptr_type = [&context](const size_t slot_bytes, const hdk::ir::Type* type) {
    if (slot_bytes == sizeof(int32_t)) {
      return llvm::Type::getInt32PtrTy(context, /*address_space=*/3);
    } else {
      CHECK(slot_bytes == sizeof(int64_t));
      return llvm::Type::getInt64PtrTy(context, /*address_space=*/3);
    }
    UNREACHABLE() << "Invalid slot size encountered: " << std::to_string(slot_bytes);
    return llvm::Type::getInt32PtrTy(context, /*address_space=*/3);
  };

  const auto casted_dest_slot_address = ir_builder.CreatePointerCast(
      ir_builder.CreateGEP(
          dest_byte_stream->getType()->getScalarType()->getPointerElementType(),
          dest_byte_stream,
          byte_offset),
      ptr_type(slot_bytes, type),
      "dest_slot_adr_" + std::to_string(slot_idx));
  return casted_dest_slot_address;
}
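
A possible reading of why this cast breaks SPIR-V generation, sketched below under assumptions: the SPIR/OpenCL convention numbers address spaces as 0 = private, 1 = global, 2 = constant, 3 = local, 4 = generic, and the translator is assumed to accept an `addrspacecast` only when it is a no-op or goes to or from the generic space. `SpirAddrSpace` and `isTranslatableAddrSpaceCast` are hypothetical names for illustration, not HDK or LLVM APIs.

```cpp
// Address space numbering assumed here (SPIR/OpenCL convention).
enum class SpirAddrSpace : unsigned {
  Private = 0,
  Global = 1,
  Constant = 2,
  Local = 3,
  Generic = 4
};

// Hypothetical helper: models the assumed rule that a cast is acceptable only
// if it is a no-op, or if one side is the generic address space and the other
// side is not the constant address space.
constexpr bool isTranslatableAddrSpaceCast(SpirAddrSpace src, SpirAddrSpace dst) {
  return src == dst ||
         (src == SpirAddrSpace::Generic && dst != SpirAddrSpace::Constant) ||
         (dst == SpirAddrSpace::Generic && src != SpirAddrSpace::Constant);
}

// The failing IR casts i64* (address space 0) to i64 addrspace(3)*.
static_assert(!isTranslatableAddrSpaceCast(SpirAddrSpace::Private,
                                           SpirAddrSpace::Local),
              "private -> local is rejected under the assumed rule");
static_assert(isTranslatableAddrSpaceCast(SpirAddrSpace::Generic,
                                          SpirAddrSpace::Local),
              "generic -> local is accepted under the assumed rule");
```

If that assumption holds, the `dest_byte_stream` pointer would need to already live in address space 3 (or in the generic address space) before `CreatePointerCast` is asked for an `addrspace(3)` pointer type.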

lmontigny avatar Apr 19 '23 14:04 lmontigny