
Segmentation fault (core dumped) on torch.jit.optimized_execution() with Amazon Elastic Inference and PyTorch (through the amazonei_pytorch_p36 conda environment)

nima-akram opened this issue 5 years ago · 0 comments

Hi,

I am getting a Segmentation fault (core dumped) error on Ubuntu 16.04 when I execute torch.jit.optimized_execution(). This happens specifically in the amazonei_pytorch_p36 conda environment, which uses torch==1.3.1 and a specialised torchei package (presumably for the Elastic Inference support).

Running the code under gdb gives a more verbose but confusing backtrace:

#1  0x00007fffe5fb4eeb in c10::detail::LogAPIUsageFakeReturn(std::string const&) ()
   from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libc10.so
#2  0x00007fffe899ed82 in torch::jit::GraphExecutorImplBase::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) () from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libtorch.so
#3  0x00007fffe8b97df0 in torch::jit::Function::run(std::vector<c10::IValue, std::allocator<c10::IValue> >&) ()
   from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libtorch.so
#4  0x00007fffeb31a544 in torch::jit::runAndInsertCall(torch::jit::Function&, torch::jit::tuple_slice, pybind11::kwargs, c10::optional<c10::IValue>, std::function<torch::jit::Value* (torch::jit::Graph&, torch::jit::script::MatchedSchema const&)>) ()
   from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libtorch_python.so
#5  0x00007fffeb31a9c2 in torch::jit::invokeScriptMethodFromPython(torch::jit::script::Method&, torch::jit::tuple_slice, pybind11::kwargs) ()
   from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libtorch_python.so
#6  0x00007fffeb2f1b01 in void pybind11::cpp_function::initialize<torch::jit::script::initJitScriptBindings(_object*)::{lambda(pybind11::args, pybind11::kwargs)#33}, pybind11::object, pybind11::args, pybind11::kwargs, pybind11::name, pybind11::is_method, pybind11::sibling>(torch::jit::script::initJitScriptBindings(_object*)::{lambda(pybind11::args, pybind11::kwargs)#33}&&, pybind11::object (*)(pybind11::args, pybind11::kwargs), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) ()
   from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libtorch_python.so
#7  0x00007fffeaf94264 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) ()
   from /home/ubuntu/anaconda3/envs/amazonei_pytorch_p36/lib/python3.6/site-packages/torch/lib/libtorch_python.so

To Reproduce

Steps to reproduce the behavior:

  1. Use amazonei_pytorch_p36 conda environment from the Ubuntu 16.04 Deep Learning AMI EC2 image
  2. Load a pretrained ".pt" JIT-traced PyTorch model and run a prediction on sample data
  3. Run the forward pass inside a torch.jit.optimized_execution() context

My code sample:

import numpy as np
import torch

# Load the traced model onto the CPU
model = torch.jit.load('traced_bert.pt', map_location=torch.device('cpu'))

from transformers import BertTokenizer, BertForSequenceClassification
tokenizer = BertTokenizer.from_pretrained('tokenizer/', do_lower_case=True)

# Set the maximum sequence length. The longest sequence in our training set is 47, but we'll leave room on the end anyway.
# In the original paper, the authors used a length of 512.
MAX_LEN = 256

## Encode each sentence into BERT token IDs, padding/truncating to MAX_LEN
## (`sentences` is defined under "Sample data" below)
input_ids = [tokenizer.encode(sent, add_special_tokens=True, max_length=MAX_LEN, pad_to_max_length=True) for sent in sentences]

print('input_ids done')

## Create an attention mask: 1.0 for real input tokens, 0.0 for padding tokens
attention_masks = [[float(i > 0) for i in seq] for seq in input_ids]

print('attention_masks done')

# convert all our data into torch tensors, required data type for our model
inputs = torch.tensor(input_ids)
masks = torch.tensor(attention_masks)

print('input and mask tensors done')
print('model ready')

input_id = inputs
input_mask = masks

print('inputs ready')

with torch.no_grad():
    # Forward pass, calculate logit predictions
    with torch.jit.optimized_execution(True, {'target_device': 'eia:0'}):
        print('creating logits')
        logits = model(input_id, attention_mask=input_mask)[0]
        print('logits done')

logits = logits.to('cpu').numpy()

pred_flat = np.argmax(logits, axis=1).flatten()
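As a side note, the attention-mask step in the snippet above is plain Python and can be sanity-checked without torch or the EI environment; nonzero token IDs map to 1.0 and padding zeros map to 0.0 (the token IDs below are made up for illustration):

```python
# Hypothetical token IDs; trailing zeros stand in for the padding
# produced by pad_to_max_length=True
input_ids = [[101, 7592, 2026, 102, 0, 0]]

# Same mask logic as in the repro script
attention_masks = [[float(i > 0) for i in seq] for seq in input_ids]

print(attention_masks)  # [[1.0, 1.0, 1.0, 1.0, 0.0, 0.0]]
```

This at least confirms the crash is not coming from the mask construction itself, but from the forward pass inside the optimized_execution context.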

Sample data (define `sentences` before the encoding step when running the snippet):

sentences = [
    'hello my name is bob',
    'hello my name is not bob'
]

nima-akram, Nov 16 '20 15:11