Fix dock pypath

Open jperez999 opened this issue 3 years ago • 5 comments

remove excess pythonpath declaration. Should be set only in base or secondary installation phase. No where else.

Jul 05 '22 12:07 jperez999

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

Jul 05 '22 12:07 review-notebook-app[bot]

Click to view CI Results

GitHub pull request #431 of commit 63750285986911e40df3133900b412be4849a189, no merge conflicts.
Running as SYSTEM
Setting status of 63750285986911e40df3133900b412be4849a189 to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/223/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/431/*:refs/remotes/origin/pr/431/* # timeout=10
 > git rev-parse 63750285986911e40df3133900b412be4849a189^{commit} # timeout=10
Checking out Revision 63750285986911e40df3133900b412be4849a189 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 63750285986911e40df3133900b412be4849a189 # timeout=10
Commit message: "remove excess python path setting"
 > git rev-list --no-walk 071da271939cdc0956baab073333435a307ab260 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins5073629776611346628.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items
tests/unit/test_version.py .                                             [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py .      [100%]
========================= 2 passed in 90.14s (0:01:30) =========================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins13744331202960576196.sh

Jul 05 '22 12:07 nvidia-merlin-bot

Documentation preview

https://nvidia-merlin.github.io/Merlin/review/pr-431

Jul 05 '22 12:07 github-actions[bot]

Click to view CI Results

GitHub pull request #431 of commit 17aec990e1b1d43a6e93f2656d98f0d3d09ea58a, no merge conflicts.
Running as SYSTEM
Setting status of 17aec990e1b1d43a6e93f2656d98f0d3d09ea58a to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/243/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/431/*:refs/remotes/origin/pr/431/* # timeout=10
 > git rev-parse 17aec990e1b1d43a6e93f2656d98f0d3d09ea58a^{commit} # timeout=10
Checking out Revision 17aec990e1b1d43a6e93f2656d98f0d3d09ea58a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 17aec990e1b1d43a6e93f2656d98f0d3d09ea58a # timeout=10
Commit message: "Merge branch 'main' into fix-dock-pypath"
 > git rev-list --no-walk b7bfb46270dcffbea3ee582e96556b85fab4e3b0 # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins7588720284986569121.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.5.0, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items
tests/unit/test_version.py .                                             [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py .      [100%]
======================== 2 passed in 146.96s (0:02:26) =========================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins12271827748898218482.sh

Jul 07 '22 21:07 nvidia-merlin-bot

Click to view CI Results

GitHub pull request #431 of commit c6c42ab73191527377e55fd406003ad81c6c5a46, no merge conflicts.
Running as SYSTEM
Setting status of c6c42ab73191527377e55fd406003ad81c6c5a46 to PENDING with url https://10.20.13.93:8080/job/merlin_merlin/250/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_merlin
using credential systems-login
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/Merlin # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/Merlin
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/Merlin +refs/pull/431/*:refs/remotes/origin/pr/431/* # timeout=10
 > git rev-parse c6c42ab73191527377e55fd406003ad81c6c5a46^{commit} # timeout=10
Checking out Revision c6c42ab73191527377e55fd406003ad81c6c5a46 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f c6c42ab73191527377e55fd406003ad81c6c5a46 # timeout=10
Commit message: "Merge branch 'main' into fix-dock-pypath"
 > git rev-list --no-walk 3f0d332dce3a80d86abc017b0283f44c985ec79a # timeout=10
[merlin_merlin] $ /bin/bash /tmp/jenkins1797933822103593977.sh
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_merlin/merlin
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 2 items
tests/unit/test_version.py .                                             [ 50%]
tests/unit/examples/test_building_deploying_multi_stage_RecSys.py F      [100%]
=================================== FAILURES ===================================
__________________________________ test_func ___________________________________
self = <testbook.client.TestbookNotebookClient object at 0x7f82a17ca850>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53
def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:


          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)


/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:

args = (<testbook.client.TestbookNotebookClient object at 0x7f82a17ca850>, {'id': '2de94ed0', 'cell_type': 'code', 'metadata'...ast.py, line 299 in transform>]"\n\nAt:\n  /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}, 53)
kwargs = {}
def wrapped(*args, **kwargs):


  return just_run(coro(*args, **kwargs))


/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:

coro = <coroutine object NotebookClient.async_execute_cell at 0x7f82a113bd40>
def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()


  return loop.run_until_complete(coro)


/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:

self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-365' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dis...ps/feast.py, line 299 in transform>]"\n\nAt:\n  /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n\n')>
def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')


  return future.result()


/usr/lib/python3.8/asyncio/base_events.py:616:

self = <testbook.client.TestbookNotebookClient object at 0x7f82a17ca850>
cell = {'id': '2de94ed0', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-11T15:14:57.443020Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n  /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53, execution_count = None, store_history = True
async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )


  await self._check_raise_for_error(cell, cell_index, exec_reply)


/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1022:

self = <testbook.client.TestbookNotebookClient object at 0x7f82a17ca850>
cell = {'id': '2de94ed0', 'cell_type': 'code', 'metadata': {'execution': {'iopub.status.busy': '2022-07-11T15:14:57.443020Z',...ps/feast.py, line 299 in transform>]"\n\nAt:\n  /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute\n']}]}
cell_index = 53
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '2b84...e, 'engine': '2b848c57-4fc3-4c05-82e7-85989a45721c', 'started': '2022-07-11T15:14:57.443274Z', 'status': 'error'}, ...}
async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:


      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)


E           nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E           ------------------
E

E           import shutil
E           from merlin.models.loader.tf_utils import configure_tensorflow
E           configure_tensorflow()
E           from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E           response = run_ensemble_on_tritonserver(
E               "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E           )
E           response = [x.tolist()[0] for x in response["ordered_ids"]]
E           shutil.rmtree("/tmp/examples/", ignore_errors=True)
E

E           ------------------
E

E           [0;31m---------------------------------------------------------------------------[0m
E           [0;31mInferenceServerException[0m                  Traceback (most recent call last)
E           Input [0;32mIn [32][0m, in [0;36m<cell line: 5>[0;34m()[0m
E           [1;32m      3[0m configure_tensorflow()
E           [1;32m      4[0m [38;5;28;01mfrom[39;00m [38;5;21;01mmerlin[39;00m[38;5;21;01m.[39;00m[38;5;21;01msystems[39;00m[38;5;21;01m.[39;00m[38;5;21;01mtriton[39;00m[38;5;21;01m.[39;00m[38;5;21;01mutils[39;00m [38;5;28;01mimport[39;00m run_ensemble_on_tritonserver
E           [0;32m----> 5[0m response [38;5;241m=[39m [43mrun_ensemble_on_tritonserver[49m[43m([49m
E           [1;32m      6[0m [43m    [49m[38;5;124;43m"[39;49m[38;5;124;43m/tmp/examples/poc_ensemble[39;49m[38;5;124;43m"[39;49m[43m,[49m[43m [49m[43moutputs[49m[43m,[49m[43m [49m[43mrequest[49m[43m,[49m[43m [49m[38;5;124;43m"[39;49m[38;5;124;43mensemble_model[39;49m[38;5;124;43m"[39;49m
E           [1;32m      7[0m [43m)[49m
E           [1;32m      8[0m response [38;5;241m=[39m [x[38;5;241m.[39mtolist()[[38;5;241m0[39m] [38;5;28;01mfor[39;00m x [38;5;129;01min[39;00m response[[38;5;124m"[39m[38;5;124mordered_ids[39m[38;5;124m"[39m]]
E           [1;32m      9[0m shutil[38;5;241m.[39mrmtree([38;5;124m"[39m[38;5;124m/tmp/examples/[39m[38;5;124m"[39m, ignore_errors[38;5;241m=[39m[38;5;28;01mTrue[39;00m)
E

E           File [0;32m/usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93[0m, in [0;36mrun_ensemble_on_tritonserver[0;34m(tmpdir, output_columns, df, model_name)[0m
E           [1;32m     91[0m response [38;5;241m=[39m [38;5;28;01mNone[39;00m
E           [1;32m     92[0m [38;5;28;01mwith[39;00m run_triton_server(tmpdir) [38;5;28;01mas[39;00m client:
E           [0;32m---> 93[0m     response [38;5;241m=[39m [43msend_triton_request[49m[43m([49m[43mdf[49m[43m,[49m[43m [49m[43moutput_columns[49m[43m,[49m[43m [49m[43mclient[49m[38;5;241;43m=[39;49m[43mclient[49m[43m,[49m[43m [49m[43mtriton_model[49m[38;5;241;43m=[39;49m[43mmodel_name[49m[43m)[49m
E           [1;32m     95[0m [38;5;28;01mreturn[39;00m response
E

E           File [0;32m/usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141[0m, in [0;36msend_triton_request[0;34m(df, outputs_list, client, endpoint, request_id, triton_model)[0m
E           [1;32m    139[0m outputs [38;5;241m=[39m [grpcclient[38;5;241m.[39mInferRequestedOutput(col) [38;5;28;01mfor[39;00m col [38;5;129;01min[39;00m outputs_list]
E           [1;32m    140[0m [38;5;28;01mwith[39;00m client:
E           [0;32m--> 141[0m     response [38;5;241m=[39m [43mclient[49m[38;5;241;43m.[39;49m[43minfer[49m[43m([49m[43mtriton_model[49m[43m,[49m[43m [49m[43minputs[49m[43m,[49m[43m [49m[43mrequest_id[49m[38;5;241;43m=[39;49m[43mrequest_id[49m[43m,[49m[43m [49m[43moutputs[49m[38;5;241;43m=[39;49m[43moutputs[49m[43m)[49m
E           [1;32m    143[0m results [38;5;241m=[39m {}
E           [1;32m    144[0m [38;5;28;01mfor[39;00m col [38;5;129;01min[39;00m outputs_list:
E

E           File [0;32m/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/init.py:1322[0m, in [0;36mInferenceServerClient.infer[0;34m(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)[0m
E           [1;32m   1320[0m     [38;5;28;01mreturn[39;00m result
E           [1;32m   1321[0m [38;5;28;01mexcept[39;00m grpc[38;5;241m.[39mRpcError [38;5;28;01mas[39;00m rpc_error:
E           [0;32m-> 1322[0m     [43mraise_error_grpc[49m[43m([49m[43mrpc_error[49m[43m)[49m
E

E           File [0;32m/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/init.py:62[0m, in [0;36mraise_error_grpc[0;34m(rpc_error)[0m
E           [1;32m     61[0m [38;5;28;01mdef[39;00m [38;5;21mraise_error_grpc[39m(rpc_error):
E           [0;32m---> 62[0m     [38;5;28;01mraise[39;00m get_error_grpc(rpc_error) [38;5;28;01mfrom[39;00m [38;5;28mNone[39m
E

E           [0;31mInferenceServerException[0m: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:
E               1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E

E           Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E

E           At:
E             /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E

E           InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:
E               1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E

E           Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E

E           At:
E             /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
/usr/local/lib/python3.8/dist-packages/nbclient/client.py:916: CellExecutionError
During handling of the above exception, another exception occurred:
def test_func():
    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "01-Building-Recommender-Systems-with-Merlin.ipynb",
        execute=False,
    ) as tb1:
        tb1.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["NUM_ROWS"] = "10000"
            os.system("mkdir -p /tmp/examples")
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        tb1.execute()
        assert os.path.isdir("/tmp/examples/dlrm")
        assert os.path.isdir("/tmp/examples/feature_repo")
        assert os.path.isdir("/tmp/examples/query_tower")
        assert os.path.isfile("/tmp/examples/item_embeddings.parquet")
        assert os.path.isfile("/tmp/examples/feature_repo/user_features.py")
        assert os.path.isfile("/tmp/examples/feature_repo/item_features.py")

    with testbook(
        REPO_ROOT
        / "examples"
        / "Building-and-deploying-multi-stage-RecSys"
        / "02-Deploying-multi-stage-RecSys-with-Merlin-Systems.ipynb",
        execute=False,
    ) as tb2:
        tb2.inject(
            """
            import os
            os.environ["DATA_FOLDER"] = "/tmp/data/"
            os.environ["BASE_DIR"] = "/tmp/examples/"
            """
        )
        NUM_OF_CELLS = len(tb2.cells)
        tb2.execute_cell(list(range(0, NUM_OF_CELLS - 3)))
        top_k = tb2.ref("top_k")
        outputs = tb2.ref("outputs")
        assert outputs[0] == "ordered_ids"


      tb2.inject(


            """
            import shutil
            from merlin.models.loader.tf_utils import configure_tensorflow
            configure_tensorflow()
            from merlin.systems.triton.utils import run_ensemble_on_tritonserver
            response = run_ensemble_on_tritonserver(
                "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
            )
            response = [x.tolist()[0] for x in response["ordered_ids"]]
            shutil.rmtree("/tmp/examples/", ignore_errors=True)
            """
        )

tests/unit/examples/test_building_deploying_multi_stage_RecSys.py:57:

/usr/local/lib/python3.8/dist-packages/testbook/client.py:237: in inject
cell = TestbookNode(self.execute_cell(inject_idx)) if run else TestbookNode(code_cell)

self = <testbook.client.TestbookNotebookClient object at 0x7f82a17ca850>
cell = [53], kwargs = {}, cell_indexes = [53], executed_cells = [], idx = 53
def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:


          raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))


E               testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E               ------------------
E

E               import shutil
E               from merlin.models.loader.tf_utils import configure_tensorflow
E               configure_tensorflow()
E               from merlin.systems.triton.utils import run_ensemble_on_tritonserver
E               response = run_ensemble_on_tritonserver(
E                   "/tmp/examples/poc_ensemble", outputs, request, "ensemble_model"
E               )
E               response = [x.tolist()[0] for x in response["ordered_ids"]]
E               shutil.rmtree("/tmp/examples/", ignore_errors=True)
E

E               ------------------
E

E               [0;31m---------------------------------------------------------------------------[0m
E               [0;31mInferenceServerException[0m                  Traceback (most recent call last)
E               Input [0;32mIn [32][0m, in [0;36m<cell line: 5>[0;34m()[0m
E               [1;32m      3[0m configure_tensorflow()
E               [1;32m      4[0m [38;5;28;01mfrom[39;00m [38;5;21;01mmerlin[39;00m[38;5;21;01m.[39;00m[38;5;21;01msystems[39;00m[38;5;21;01m.[39;00m[38;5;21;01mtriton[39;00m[38;5;21;01m.[39;00m[38;5;21;01mutils[39;00m [38;5;28;01mimport[39;00m run_ensemble_on_tritonserver
E               [0;32m----> 5[0m response [38;5;241m=[39m [43mrun_ensemble_on_tritonserver[49m[43m([49m
E               [1;32m      6[0m [43m    [49m[38;5;124;43m"[39;49m[38;5;124;43m/tmp/examples/poc_ensemble[39;49m[38;5;124;43m"[39;49m[43m,[49m[43m [49m[43moutputs[49m[43m,[49m[43m [49m[43mrequest[49m[43m,[49m[43m [49m[38;5;124;43m"[39;49m[38;5;124;43mensemble_model[39;49m[38;5;124;43m"[39;49m
E               [1;32m      7[0m [43m)[49m
E               [1;32m      8[0m response [38;5;241m=[39m [x[38;5;241m.[39mtolist()[[38;5;241m0[39m] [38;5;28;01mfor[39;00m x [38;5;129;01min[39;00m response[[38;5;124m"[39m[38;5;124mordered_ids[39m[38;5;124m"[39m]]
E               [1;32m      9[0m shutil[38;5;241m.[39mrmtree([38;5;124m"[39m[38;5;124m/tmp/examples/[39m[38;5;124m"[39m, ignore_errors[38;5;241m=[39m[38;5;28;01mTrue[39;00m)
E

E               File [0;32m/usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:93[0m, in [0;36mrun_ensemble_on_tritonserver[0;34m(tmpdir, output_columns, df, model_name)[0m
E               [1;32m     91[0m response [38;5;241m=[39m [38;5;28;01mNone[39;00m
E               [1;32m     92[0m [38;5;28;01mwith[39;00m run_triton_server(tmpdir) [38;5;28;01mas[39;00m client:
E               [0;32m---> 93[0m     response [38;5;241m=[39m [43msend_triton_request[49m[43m([49m[43mdf[49m[43m,[49m[43m [49m[43moutput_columns[49m[43m,[49m[43m [49m[43mclient[49m[38;5;241;43m=[39;49m[43mclient[49m[43m,[49m[43m [49m[43mtriton_model[49m[38;5;241;43m=[39;49m[43mmodel_name[49m[43m)[49m
E               [1;32m     95[0m [38;5;28;01mreturn[39;00m response
E

E               File [0;32m/usr/local/lib/python3.8/dist-packages/merlin/systems/triton/utils.py:141[0m, in [0;36msend_triton_request[0;34m(df, outputs_list, client, endpoint, request_id, triton_model)[0m
E               [1;32m    139[0m outputs [38;5;241m=[39m [grpcclient[38;5;241m.[39mInferRequestedOutput(col) [38;5;28;01mfor[39;00m col [38;5;129;01min[39;00m outputs_list]
E               [1;32m    140[0m [38;5;28;01mwith[39;00m client:
E               [0;32m--> 141[0m     response [38;5;241m=[39m [43mclient[49m[38;5;241;43m.[39;49m[43minfer[49m[43m([49m[43mtriton_model[49m[43m,[49m[43m [49m[43minputs[49m[43m,[49m[43m [49m[43mrequest_id[49m[38;5;241;43m=[39;49m[43mrequest_id[49m[43m,[49m[43m [49m[43moutputs[49m[38;5;241;43m=[39;49m[43moutputs[49m[43m)[49m
E               [1;32m    143[0m results [38;5;241m=[39m {}
E               [1;32m    144[0m [38;5;28;01mfor[39;00m col [38;5;129;01min[39;00m outputs_list:
E

E               File [0;32m/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/init.py:1322[0m, in [0;36mInferenceServerClient.infer[0;34m(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)[0m
E               [1;32m   1320[0m     [38;5;28;01mreturn[39;00m result
E               [1;32m   1321[0m [38;5;28;01mexcept[39;00m grpc[38;5;241m.[39mRpcError [38;5;28;01mas[39;00m rpc_error:
E               [0;32m-> 1322[0m     [43mraise_error_grpc[49m[43m([49m[43mrpc_error[49m[43m)[49m
E

E               File [0;32m/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/init.py:62[0m, in [0;36mraise_error_grpc[0;34m(rpc_error)[0m
E               [1;32m     61[0m [38;5;28;01mdef[39;00m [38;5;21mraise_error_grpc[39m(rpc_error):
E               [0;32m---> 62[0m     [38;5;28;01mraise[39;00m get_error_grpc(rpc_error) [38;5;28;01mfrom[39;00m [38;5;28mNone[39m
E

E               [0;31mInferenceServerException[0m: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:
E                   1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E

E               Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E

E               At:
E                 /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
E

E               InferenceServerException: [StatusCode.INTERNAL] in ensemble 'ensemble_model', Failed to process the request(s) for model instance '3_queryfeast', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:
E                   1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
E

E               Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
E

E               At:
E                 /tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
----------------------------- Captured stdout call -----------------------------
Signal (2) received.
----------------------------- Captured stderr call -----------------------------
2022-07-11 15:13:56.650500: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-11 15:13:58.662853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory:  -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-11 15:13:58.663679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14532 MB memory:  -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/init.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/init.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
WARNING clustering 247 points to 32 centroids: please provide at least 1248 training points
2022-07-11 15:14:50.448837: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-11 15:14:52.484993: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1627 MB memory:  -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-11 15:14:52.485786: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 14532 MB memory:  -> device: 1, name: Tesla P100-DGXS-16GB, pci bus id: 0000:08:00.0, compute capability: 6.0
I0711 15:14:57.709383 21573 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fab26000000' with size 268435456
I0711 15:14:57.710119 21573 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 15:14:57.717432 21573 model_repository_manager.cc:1191] loading: 1_predicttensorflow:1
I0711 15:14:57.817698 21573 model_repository_manager.cc:1191] loading: 0_queryfeast:1
I0711 15:14:57.917935 21573 model_repository_manager.cc:1191] loading: 2_queryfaiss:1
I0711 15:14:58.018266 21573 model_repository_manager.cc:1191] loading: 3_queryfeast:1
I0711 15:14:58.098923 21573 tensorflow.cc:2181] TRITONBACKEND_Initialize: tensorflow
I0711 15:14:58.098962 21573 tensorflow.cc:2191] Triton TRITONBACKEND API version: 1.9
I0711 15:14:58.098970 21573 tensorflow.cc:2197] 'tensorflow' TRITONBACKEND API version: 1.9
I0711 15:14:58.098975 21573 tensorflow.cc:2221] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}}
I0711 15:14:58.099010 21573 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 1_predicttensorflow (version 1)
I0711 15:14:58.104494 21573 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 1_predicttensorflow (GPU device 0)
I0711 15:14:58.118507 21573 model_repository_manager.cc:1191] loading: 4_unrollfeatures:1
I0711 15:14:58.218784 21573 model_repository_manager.cc:1191] loading: 5_predicttensorflow:1
I0711 15:14:58.319085 21573 model_repository_manager.cc:1191] loading: 6_softmaxsampling:1
2022-07-11 15:14:58.464305: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-11 15:14:58.467784: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-11 15:14:58.467841: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-11 15:14:58.467949: I tensorflow/core/platform/cpu_feature_guard.cc:152] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-11 15:14:58.515764: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10540 MB memory:  -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-11 15:14:58.553229: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-11 15:14:58.632719: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/1_predicttensorflow/1/model.savedmodel
2022-07-11 15:14:58.658115: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 193830 microseconds.
I0711 15:14:58.658362 21573 model_repository_manager.cc:1345] successfully loaded '1_predicttensorflow' version 1
I0711 15:14:58.662208 21573 tensorflow.cc:2281] TRITONBACKEND_ModelInitialize: 5_predicttensorflow (version 1)
I0711 15:14:58.663909 21573 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 2_queryfaiss (GPU device 0)
I0711 15:15:01.064150 21573 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_queryfeast (GPU device 0)
I0711 15:15:01.066019 21573 model_repository_manager.cc:1345] successfully loaded '2_queryfaiss' version 1
I0711 15:15:03.418976 21573 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 3_queryfeast (GPU device 0)
I0711 15:15:03.419228 21573 model_repository_manager.cc:1345] successfully loaded '0_queryfeast' version 1
I0711 15:15:05.804694 21573 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 4_unrollfeatures (GPU device 0)
I0711 15:15:05.804934 21573 model_repository_manager.cc:1345] successfully loaded '3_queryfeast' version 1
I0711 15:15:07.899642 21573 tensorflow.cc:2330] TRITONBACKEND_ModelInstanceInitialize: 5_predicttensorflow (GPU device 0)
I0711 15:15:07.899893 21573 model_repository_manager.cc:1345] successfully loaded '4_unrollfeatures' version 1
2022-07-11 15:15:07.901233: I tensorflow/cc/saved_model/reader.cc:43] Reading SavedModel from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-11 15:15:07.918121: I tensorflow/cc/saved_model/reader.cc:78] Reading meta graph with tags { serve }
2022-07-11 15:15:07.918175: I tensorflow/cc/saved_model/reader.cc:119] Reading SavedModel debug info (if present) from: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-11 15:15:07.920301: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10540 MB memory:  -> device: 0, name: Tesla P100-DGXS-16GB, pci bus id: 0000:07:00.0, compute capability: 6.0
2022-07-11 15:15:07.943590: I tensorflow/cc/saved_model/loader.cc:230] Restoring SavedModel bundle.
2022-07-11 15:15:08.101939: I tensorflow/cc/saved_model/loader.cc:214] Running initialization op on SavedModel bundle at path: /tmp/examples/poc_ensemble/5_predicttensorflow/1/model.savedmodel
2022-07-11 15:15:08.155606: I tensorflow/cc/saved_model/loader.cc:321] SavedModel load for tags { serve }; Status: success: OK. Took 254386 microseconds.
I0711 15:15:08.155746 21573 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 6_softmaxsampling (GPU device 0)
I0711 15:15:08.156571 21573 model_repository_manager.cc:1345] successfully loaded '5_predicttensorflow' version 1
I0711 15:15:10.286184 21573 model_repository_manager.cc:1345] successfully loaded '6_softmaxsampling' version 1
I0711 15:15:10.289855 21573 model_repository_manager.cc:1191] loading: ensemble_model:1
I0711 15:15:10.390659 21573 model_repository_manager.cc:1345] successfully loaded 'ensemble_model' version 1
I0711 15:15:10.390832 21573 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0711 15:15:10.390955 21573 server.cc:583]
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend    | Path                                                            | Config                                                                                                                                                                       |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","version":"2","default-max-batch-size":"4"}} |
| python     | /opt/tritonserver/backends/python/libtriton_python.so           | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}               |
+------------+-----------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0711 15:15:10.391079 21573 server.cc:626]
+---------------------+---------+--------+
| Model               | Version | Status |
+---------------------+---------+--------+
| 0_queryfeast        | 1       | READY  |
| 1_predicttensorflow | 1       | READY  |
| 2_queryfaiss        | 1       | READY  |
| 3_queryfeast        | 1       | READY  |
| 4_unrollfeatures    | 1       | READY  |
| 5_predicttensorflow | 1       | READY  |
| 6_softmaxsampling   | 1       | READY  |
| ensemble_model      | 1       | READY  |
+---------------------+---------+--------+
I0711 15:15:10.455449 21573 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 15:15:10.456294 21573 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                        |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                       |
| server_version                   | 2.22.0                                                                                                                                                                                       |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0]         | /tmp/examples/poc_ensemble                                                                                                                                                                   |
| model_control_mode               | MODE_NONE                                                                                                                                                                                    |
| strict_model_config              | 1                                                                                                                                                                                            |
| rate_limit                       | OFF                                                                                                                                                                                          |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                                     |
| response_cache_byte_size         | 0                                                                                                                                                                                            |
| min_supported_compute_capability | 6.0                                                                                                                                                                                          |
| strict_readiness                 | 1                                                                                                                                                                                            |
| exit_timeout                     | 30                                                                                                                                                                                           |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0711 15:15:10.457062 21573 grpc_server.cc:4589] Started GRPCInferenceService at 0.0.0.0:8001
I0711 15:15:10.457286 21573 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
I0711 15:15:10.498186 21573 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
W0711 15:15:11.477680 21573 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 15:15:11.477747 21573 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0711 15:15:12.477902 21573 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 15:15:12.477955 21573 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
W0711 15:15:13.498439 21573 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 15:15:13.498494 21573 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
0711 15:15:14.985303 21830 pb_stub.cc:749] Failed to process the request(s) for model '3_queryfeast', message: TypeError: init(): incompatible constructor arguments. The following argument types are supported:
1. c_python_backend_utils.InferenceResponse(output_tensors: List[c_python_backend_utils.Tensor], error: c_python_backend_utils.TritonError = None)
Invoked with: kwargs: tensors=[], error="<class 'TypeError'>, int() argument must be a string, a bytes-like object or a number, not 'NoneType', [<FrameSummary file /tmp/examples/poc_ensemble/3_queryfeast/1/model.py, line 105 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/op_runner.py, line 38 in execute>, <FrameSummary file /usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py, line 299 in transform>]"
At:
/tmp/examples/poc_ensemble/3_queryfeast/1/model.py(122): execute
I0711 15:15:14.989861 21573 server.cc:257] Waiting for in-flight requests to complete.
I0711 15:15:14.989889 21573 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 15:15:14.989897 21573 model_repository_manager.cc:1223] unloading: ensemble_model:1
I0711 15:15:14.989943 21573 model_repository_manager.cc:1223] unloading: 6_softmaxsampling:1
I0711 15:15:14.989975 21573 model_repository_manager.cc:1223] unloading: 5_predicttensorflow:1
I0711 15:15:14.990006 21573 model_repository_manager.cc:1223] unloading: 4_unrollfeatures:1
I0711 15:15:14.990046 21573 model_repository_manager.cc:1223] unloading: 3_queryfeast:1
I0711 15:15:14.990115 21573 model_repository_manager.cc:1223] unloading: 2_queryfaiss:1
I0711 15:15:14.990109 21573 model_repository_manager.cc:1328] successfully unloaded 'ensemble_model' version 1
I0711 15:15:14.990149 21573 model_repository_manager.cc:1223] unloading: 1_predicttensorflow:1
I0711 15:15:14.990200 21573 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0711 15:15:14.990239 21573 model_repository_manager.cc:1223] unloading: 0_queryfeast:1
I0711 15:15:14.990270 21573 server.cc:288] All models are stopped, unloading models
I0711 15:15:14.990281 21573 server.cc:295] Timeout 30: Found 7 live models and 0 in-flight non-inference requests
I0711 15:15:14.990408 21573 tensorflow.cc:2368] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0711 15:15:14.990500 21573 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0711 15:15:14.990628 21573 tensorflow.cc:2307] TRITONBACKEND_ModelFinalize: delete model state
I0711 15:15:15.006423 21573 model_repository_manager.cc:1328] successfully unloaded '1_predicttensorflow' version 1
I0711 15:15:15.016160 21573 model_repository_manager.cc:1328] successfully unloaded '5_predicttensorflow' version 1
I0711 15:15:15.990423 21573 server.cc:295] Timeout 29: Found 5 live models and 0 in-flight non-inference requests
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0711 15:15:16.403806 21573 model_repository_manager.cc:1328] successfully unloaded '6_softmaxsampling' version 1
/usr/local/lib/python3.8/dist-packages/merlin/systems/dag/ops/feast.py:15: DeprecationWarning: np.float is a deprecated alias for the builtin float. To silence this warning, use float by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.float64 here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ValueType.FLOAT: (np.float, False, False),
I0711 15:15:16.447350 21573 model_repository_manager.cc:1328] successfully unloaded '4_unrollfeatures' version 1
I0711 15:15:16.566330 21573 model_repository_manager.cc:1328] successfully unloaded '0_queryfeast' version 1
I0711 15:15:16.594458 21573 model_repository_manager.cc:1328] successfully unloaded '2_queryfaiss' version 1
I0711 15:15:16.676566 21573 model_repository_manager.cc:1328] successfully unloaded '3_queryfeast' version 1
I0711 15:15:16.990579 21573 server.cc:295] Timeout 28: Found 0 live models and 0 in-flight non-inference requests
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/logging/init.py", line 2127, in shutdown
h.close()
File "/usr/local/lib/python3.8/dist-packages/absl/logging/init.py", line 934, in close
self.stream.close()
File "/usr/local/lib/python3.8/dist-packages/ipykernel/iostream.py", line 438, in close
self.watch_fd_thread.join()
AttributeError: 'OutStream' object has no attribute 'watch_fd_thread'
=========================== short test summary info ============================
FAILED tests/unit/examples/test_building_deploying_multi_stage_RecSys.py::test_func
==================== 1 failed, 1 passed in 92.43s (0:01:32) ====================
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script  : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/Merlin/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_merlin] $ /bin/bash /tmp/jenkins4538807349618237391.sh

Jul 11 '22 15:07 nvidia-merlin-bot

I'm going to close this one because it seems like these changes are in the notebooks now

Oct 05 '22 13:10 nv-alaiacano