Eddie Liao

AMD Santa Clara, CA

Results: 22 comments of Eddie Liao

"File has unexpected size" when running run_dev.sh; Failed to build base image

No I haven't; am I supposed to? Why would I need `deb https://isaac.download.nvidia.cn/isaac-ros/ubuntu/main focal main` in my local apt source list if it's supposed to be running in a Docker...

Cuda bindings mismatch error using custom trained model

I believe I'm having the same issue. I'm trying to run a model with the following input/output dimensions: ![image](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_object_detection/assets/54926923/a58fd7cf-fac3-48ff-a6f2-9a9d09f7925c) I've already updated my `yolov8_decoder_node.cpp` to match the number of classes...

Cuda bindings mismatch error using custom trained model

I've managed to get it to work with an image, but if I input a video it will crash when the video ends.

> Could you please provide the full...

Add weight streaming at runtime

Looking to instead allocate a certain amount based on liveness and then overlap running kernels and loading during runtime to shorten the amount of time spent waiting.

Add weight streaming at runtime

Added a stream for copies in this [weight_streaming](https://github.com/ROCm/AMDMIGraphX/tree/weight_streaming) branch (not sure why I can't link it directly to this issue). Currently, the `@literal` instruction is taking up the majority of...

Add weight streaming at runtime

It appears that the `std::copy()` call in `make_shared_array` is responsible for the slowdown. The `@literal` instruction doesn't show up during the scheduling pass, so not sure what optimizations can be...

Add weight streaming at runtime

Adding the `@literal` instruction to the stream marginally improves performance (e.g. using a budget of 50000000 on resnet50 speeds up `@literals` from ~8.2ms to ~7.6ms). This does cause a lot...
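As a rough illustration of what a byte budget like the one mentioned above could mean, here is a minimal sketch, assuming a simple first-fit policy: literals are kept resident on the GPU until the budget runs out, and the rest are marked for streaming at runtime. The function name, the literal names, and the policy itself are hypothetical and are not taken from the MIGraphX branch.

```python
from typing import Dict, List, Tuple


def split_by_budget(literal_sizes: Dict[str, int], budget: int) -> Tuple[List[str], List[str]]:
    """Split literals into (resident, streamed) lists under a byte budget.

    Hypothetical first-fit policy: walk the literals in program order and
    keep each one resident if it still fits in the remaining budget;
    otherwise mark it to be streamed in at runtime.
    """
    resident: List[str] = []
    streamed: List[str] = []
    remaining = budget
    for name, size in literal_sizes.items():
        if size <= remaining:
            resident.append(name)
            remaining -= size
        else:
            streamed.append(name)
    return resident, streamed


# Example with made-up literal sizes (in bytes) and a budget of 80 bytes.
sizes = {"conv1_w": 30, "conv2_w": 50, "fc_w": 40}
resident, streamed = split_by_budget(sizes, budget=80)
```

With these made-up numbers, the first two literals fit within the budget and the last one would be streamed. A real implementation would also need to account for liveness and for memory used by parameters and scratch buffers.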

Add weight streaming at runtime

Removed the use of `std::copy` when weight streaming, which drastically decreases the time spent on `@literal` instructions. Still need to investigate why increasing the number of streams does not help performance.

Add weight streaming at runtime

After some testing, it appears that weight streaming does work, although with a few caveats:
- There doesn't appear to be a good way to find how much GPU memory...

Add weight streaming at runtime

> Parameters also take up space and are allocated after compilation, meaning they can't be considered during the write_literals pass

Part of this is also due to this issue #3310.
