
🐛 [Bug] Expected ivalues_maps.count(input) to be true but got false

Open Charlyo opened this issue 2 years ago • 8 comments

Bug Description

RuntimeError: [Error thrown at core/partitioning/shape_analysis.cpp:183] Expected ivalues_maps.count(input) to be true but got false Could not find torch::jit::Value* hidden.1 produced from %hidden.1 : (Tensor, Tensor) = prim::TupleConstruct(%440, %440) in lowering graph for mini graph input.

To Reproduce

Steps to reproduce the behavior:

hidden: Tuple[torch.Tensor, torch.Tensor] = (
            torch.zeros(
                size=(batch_size, self.hidden_size),
                dtype=torch.float,
                device=self.device
            ),
            torch.zeros(
                size=(batch_size, self.hidden_size),
                dtype=torch.float,
                device=self.device
            )
        )
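For context, the snippet above can be wrapped in a minimal scriptable module (the wrapper class, sizes, and CPU device here are assumed for illustration; the reporter's original class is not shown):

```python
import torch
from typing import Tuple

class HiddenInit(torch.nn.Module):
    """Hypothetical wrapper around the reported `hidden` initialization."""

    def __init__(self, hidden_size: int = 8):
        super().__init__()
        self.hidden_size = hidden_size

    def forward(self, batch_size: int) -> Tuple[torch.Tensor, torch.Tensor]:
        # Mirrors the snippet from the bug report (device omitted for CPU).
        hidden: Tuple[torch.Tensor, torch.Tensor] = (
            torch.zeros(size=(batch_size, self.hidden_size), dtype=torch.float),
            torch.zeros(size=(batch_size, self.hidden_size), dtype=torch.float),
        )
        return hidden

scripted = torch.jit.script(HiddenInit())
h, c = scripted(2)
print(h.shape)  # torch.Size([2, 8])
```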

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

torch==2.1.0.dev20230418+cu117 torch-tensorrt==1.4.0.dev0+a245b861

  • CPU Architecture: Intel 10750h
  • OS (e.g., Linux): Linux
  • Are you using local sources or building from archives: NVIDIA TensorRT 8.5.3.1
  • Python version: 3.8
  • CUDA version: 11.7

Additional context

Charlyo avatar Apr 18 '23 15:04 Charlyo

Hello - could you please share any information/sample of the model being compiled which led to the error, and/or the full debug logs associated with the failure?

Related: #1815

gs-olive avatar Apr 18 '23 16:04 gs-olive

It happens with just a simple class that has this code in its forward method, where batch_size and self.hidden_size can be any positive integers.

Charlyo avatar Apr 18 '23 21:04 Charlyo

I tried the following sample on my machine, which has a linear layer and uses the tensors in the hidden tuple, and it compiles successfully at commit 6f7627f. Please let me know what changes to the code below would elicit the error.

Code Sample
import torch
import torch_tensorrt

class Network(torch.nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.lin = torch.nn.Linear(5, 5)
        self.hidden_size = 5
        self.device = "cuda:0"

    def forward(self, X: torch.Tensor):
        out_1 = self.lin(X)
        hidden = (
            torch.zeros(
                size=(5, self.hidden_size),
                dtype=torch.float,
                device=self.device,
            ),
            torch.zeros(
                size=(5, self.hidden_size),
                dtype=torch.float,
                device=self.device,
            )
        )
        a = out_1 + hidden[0] + hidden[1]
        return a


net = Network().eval().cuda()

kwargs = {
    'inputs': [torch_tensorrt.Input(shape=[5, 5], dtype=torch.float32)],
    'enabled_precisions': {torch.float32},
    'min_block_size': 1,
}

trt_network = torch_tensorrt.compile(torch.jit.script(net), **kwargs)

out = trt_network(torch.rand(5, 5).cuda())

gs-olive avatar Apr 18 '23 21:04 gs-olive

@gs-olive the hidden variable was the input to an AttentionCell:

import torch
import torch.nn as nn
from torch.nn import functional
from typing import Tuple

class AttentionCell(nn.Module):
    """Attention cell class."""

    def __init__(self, input_size, hidden_size, num_embeddings):
        super(AttentionCell, self).__init__()
        self.i2h = nn.Linear(input_size, hidden_size, bias=False)
        self.h2h = nn.Linear(hidden_size, hidden_size)
        # either i2h or h2h should have bias
        self.score = nn.Linear(hidden_size, 1, bias=False)
        self.rnn = nn.LSTMCell(input_size + num_embeddings, hidden_size)
        self.hidden_size = hidden_size

    def forward(
        self,
        prev_hidden: Tuple[torch.Tensor, torch.Tensor],
        batch_h,
        char_onehots
    ):
        # [batch_size x num_encoder_step x num_channel] ->
        # [batch_size x num_encoder_step x hidden_size]
        batch_h_proj = self.i2h(batch_h)
        prev_hidden_proj = self.h2h(prev_hidden[0]).unsqueeze(1)
        e = self.score(torch.tanh(batch_h_proj + prev_hidden_proj))
        # batch_size x num_encoder_step x 1

        alpha = functional.softmax(e, dim=1)
        context = torch.bmm(alpha.permute(0, 2, 1), batch_h).squeeze(1)
        # batch_size x num_channel
        concat_context = torch.cat([context, char_onehots], 1)
        # batch_size x (num_channel + num_embedding)
        cur_hidden = self.rnn(concat_context, prev_hidden)
        return cur_hidden, alpha
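The tuple-hidden usage above can be exercised standalone with nn.LSTMCell on CPU; a minimal sketch (sizes are made up for illustration, this is not the reporter's full model):

```python
import torch
import torch.nn as nn

batch_size, input_size, hidden_size = 2, 4, 8
cell = nn.LSTMCell(input_size, hidden_size)

# The (h, c) tuple mirrors the `hidden` initialization from the bug report.
hidden = (
    torch.zeros(batch_size, hidden_size),
    torch.zeros(batch_size, hidden_size),
)

x = torch.rand(batch_size, input_size)
for _ in range(3):
    hidden = cell(x, hidden)  # returns a new (h, c) tuple each step

print(hidden[0].shape)  # torch.Size([2, 8])
```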

Charlyo avatar Apr 19 '23 06:04 Charlyo

@bowang007 - this may be similar to your work on torch::jit::Value* errors. Do you have any suggestions to resolve this issue?

gs-olive avatar Apr 24 '23 19:04 gs-olive

This issue has not seen activity for 90 days. Remove the stale label or add a comment, or this will be closed in 10 days.

github-actions[bot] avatar Aug 15 '23 00:08 github-actions[bot]

@Charlyo The TorchScript path might be incapable of supporting recurrent networks. I tested your model on the Dynamo path and it works. I would suggest trying the Dynamo path, since it is our top priority right now. Please refer to this link for more details. Thanks!
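For reference, a sketch of selecting the Dynamo frontend via the `ir` argument of torch_tensorrt.compile (the model here is a placeholder, and argument availability depends on the installed torch-tensorrt version):

```python
import torch
import torch_tensorrt

model = MyModel().eval().cuda()  # placeholder for the reporter's model

trt_model = torch_tensorrt.compile(
    model,
    ir="dynamo",  # use the Dynamo frontend instead of TorchScript
    inputs=[torch_tensorrt.Input(shape=[5, 5], dtype=torch.float32)],
    enabled_precisions={torch.float32},
)
```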

bowang007 avatar Oct 25 '23 00:10 bowang007

Hey @Charlyo, I have a similar error (referencing this issue in #1684) and was wondering if there was any update on a fix. Appreciate your help!

willianck avatar Jan 24 '24 20:01 willianck