NVTabular Allows int64 hugectr keysetfile

Jan 07 '22 03:01 albert17

@jershi425 Initially, we can just write all values using 64 bits.

Do you know the expected dtype before reading the file?

Jan 07 '22 03:01 albert17

Click to view CI Results

GitHub pull request #1351 of commit 52e625cd42f31344e919319d7d12d3abdd6eaecb, no merge conflicts. Running as SYSTEM Setting status of 52e625cd42f31344e919319d7d12d3abdd6eaecb to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/4018/ and message: 'Pending' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1351/*:refs/remotes/origin/pr/1351/* # timeout=10 > git rev-parse 52e625cd42f31344e919319d7d12d3abdd6eaecb^{commit} # timeout=10 Checking out Revision 52e625cd42f31344e919319d7d12d3abdd6eaecb (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 52e625cd42f31344e919319d7d12d3abdd6eaecb # timeout=10 Commit message: "Increases size" > git rev-list --no-walk a375d77fbd715d5a7e4c28fe7811b6583de0fbda # timeout=10 First time build. Skipping changelog. [nvtabular_tests] $ /bin/bash /tmp/jenkins4763363702340837413.sh Installing NVTabular Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.3.1) Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (59.4.0) Collecting setuptools Downloading setuptools-60.3.1-py3-none-any.whl (953 kB) Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.1) Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.9.0) Requirement already satisfied: numpy==1.20.3 in /var/jenkins_home/.local/lib/python3.8/site-packages (1.20.3) Found existing installation: nvtabular 0.8.0+7.gb459467 Can't uninstall 'nvtabular'. No files were found to uninstall. running develop running egg_info creating nvtabular.egg-info writing nvtabular.egg-info/PKG-INFO writing dependency_links to nvtabular.egg-info/dependency_links.txt writing requirements to nvtabular.egg-info/requires.txt writing top-level names to nvtabular.egg-info/top_level.txt writing manifest file 'nvtabular.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' /var/jenkins_home/.local/lib/python3.8/site-packages/setuptools/command/easy_install.py:156: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools. warnings.warn( /var/jenkins_home/.local/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools. warnings.warn( warning: no files found matching '*.h' under directory 'cpp' warning: no files found matching '*.cu' under directory 'cpp' warning: no files found matching '*.cuh' under directory 'cpp' adding license file 'LICENSE' writing manifest file 'nvtabular.egg-info/SOURCES.txt' running build_ext x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17 building 'nvtabular_cpp' extension creating build creating build/temp.linux-x86_64-3.8 creating build/temp.linux-x86_64-3.8/cpp creating build/temp.linux-x86_64-3.8/cpp/nvtabular creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+23.g52e625c -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0 x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+23.g52e625c -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0 x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+23.g52e625c -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0 x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+23.g52e625c -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0 creating build/lib.linux-x86_64-3.8 x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .) nvtabular 0.8.0+23.g52e625c is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular Processing dependencies for nvtabular==0.8.0+23.g52e625c Searching for packaging==21.3 Best match: packaging 21.3 Adding packaging 21.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for protobuf==3.19.1 Best match: protobuf 3.19.1 Adding protobuf 3.19.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for tensorflow-metadata==1.5.0 Best match: tensorflow-metadata 1.5.0 Processing tensorflow_metadata-1.5.0-py3.8.egg tensorflow-metadata 1.5.0 is already the active version in easy-install.pth

Using /usr/local/lib/python3.8/dist-packages/tensorflow_metadata-1.5.0-py3.8.egg Searching for pyarrow==4.0.1 Best match: pyarrow 4.0.1 Adding pyarrow 4.0.1 to easy-install.pth file Installing plasma_store script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages Searching for tqdm==4.62.3 Best match: tqdm 4.62.3 Processing tqdm-4.62.3-py3.8.egg tqdm 4.62.3 is already the active version in easy-install.pth Installing tqdm script to /var/jenkins_home/.local/bin

Using /usr/local/lib/python3.8/dist-packages/tqdm-4.62.3-py3.8.egg Searching for numba==0.54.1 Best match: numba 0.54.1 Adding numba 0.54.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for pandas==1.3.5 Best match: pandas 1.3.5 Processing pandas-1.3.5-py3.8-linux-x86_64.egg pandas 1.3.5 is already the active version in easy-install.pth

Using /usr/local/lib/python3.8/dist-packages/pandas-1.3.5-py3.8-linux-x86_64.egg Searching for distributed==2021.7.1 Best match: distributed 2021.7.1 Processing distributed-2021.7.1-py3.8.egg distributed 2021.7.1 is already the active version in easy-install.pth Installing dask-ssh script to /var/jenkins_home/.local/bin Installing dask-scheduler script to /var/jenkins_home/.local/bin Installing dask-worker script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages/distributed-2021.7.1-py3.8.egg Searching for dask==2021.7.1 Best match: dask 2021.7.1 Processing dask-2021.7.1-py3.8.egg dask 2021.7.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg Searching for pyparsing==3.0.6 Best match: pyparsing 3.0.6 Adding pyparsing 3.0.6 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for googleapis-common-protos==1.54.0 Best match: googleapis-common-protos 1.54.0 Adding googleapis-common-protos 1.54.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for absl-py==0.12.0 Best match: absl-py 0.12.0 Processing absl_py-0.12.0-py3.8.egg absl-py 0.12.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/absl_py-0.12.0-py3.8.egg Searching for numpy==1.20.3 Best match: numpy 1.20.3 Adding numpy 1.20.3 to easy-install.pth file Installing f2py script to /var/jenkins_home/.local/bin Installing f2py3 script to /var/jenkins_home/.local/bin Installing f2py3.8 script to /var/jenkins_home/.local/bin

Using /var/jenkins_home/.local/lib/python3.8/site-packages Searching for setuptools==59.7.0 Best match: setuptools 59.7.0 Adding setuptools 59.7.0 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for llvmlite==0.37.0 Best match: llvmlite 0.37.0 Processing llvmlite-0.37.0-py3.8-linux-x86_64.egg llvmlite 0.37.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/llvmlite-0.37.0-py3.8-linux-x86_64.egg Searching for pytz==2021.3 Best match: pytz 2021.3 Adding pytz 2021.3 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for python-dateutil==2.8.2 Best match: python-dateutil 2.8.2 Adding python-dateutil 2.8.2 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for zict==2.0.0 Best match: zict 2.0.0 Processing zict-2.0.0-py3.8.egg zict 2.0.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/zict-2.0.0-py3.8.egg Searching for tornado==6.1 Best match: tornado 6.1 Processing tornado-6.1-py3.8-linux-x86_64.egg tornado 6.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tornado-6.1-py3.8-linux-x86_64.egg Searching for toolz==0.11.2 Best match: toolz 0.11.2 Processing toolz-0.11.2-py3.8.egg toolz 0.11.2 is already the active version in easy-install.pth

Using /usr/local/lib/python3.8/dist-packages/toolz-0.11.2-py3.8.egg Searching for tblib==1.7.0 Best match: tblib 1.7.0 Processing tblib-1.7.0-py3.8.egg tblib 1.7.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/tblib-1.7.0-py3.8.egg Searching for sortedcontainers==2.4.0 Best match: sortedcontainers 2.4.0 Processing sortedcontainers-2.4.0-py3.8.egg sortedcontainers 2.4.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/sortedcontainers-2.4.0-py3.8.egg Searching for PyYAML==5.4.1 Best match: PyYAML 5.4.1 Processing PyYAML-5.4.1-py3.8-linux-x86_64.egg PyYAML 5.4.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/PyYAML-5.4.1-py3.8-linux-x86_64.egg Searching for psutil==5.8.0 Best match: psutil 5.8.0 Processing psutil-5.8.0-py3.8-linux-x86_64.egg psutil 5.8.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/psutil-5.8.0-py3.8-linux-x86_64.egg Searching for msgpack==1.0.3 Best match: msgpack 1.0.3 Processing msgpack-1.0.3-py3.8-linux-x86_64.egg msgpack 1.0.3 is already the active version in easy-install.pth

Using /usr/local/lib/python3.8/dist-packages/msgpack-1.0.3-py3.8-linux-x86_64.egg Searching for cloudpickle==2.0.0 Best match: cloudpickle 2.0.0 Processing cloudpickle-2.0.0-py3.8.egg cloudpickle 2.0.0 is already the active version in easy-install.pth

Using /usr/local/lib/python3.8/dist-packages/cloudpickle-2.0.0-py3.8.egg Searching for click==8.0.3 Best match: click 8.0.3 Processing click-8.0.3-py3.8.egg click 8.0.3 is already the active version in easy-install.pth

Using /usr/local/lib/python3.8/dist-packages/click-8.0.3-py3.8.egg Searching for partd==1.2.0 Best match: partd 1.2.0 Processing partd-1.2.0-py3.8.egg partd 1.2.0 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/partd-1.2.0-py3.8.egg Searching for fsspec==2021.11.1 Best match: fsspec 2021.11.1 Adding fsspec 2021.11.1 to easy-install.pth file

Using /usr/local/lib/python3.8/dist-packages Searching for six==1.15.0 Best match: six 1.15.0 Adding six 1.15.0 to easy-install.pth file

Using /var/jenkins_home/.local/lib/python3.8/site-packages Searching for HeapDict==1.0.1 Best match: HeapDict 1.0.1 Processing HeapDict-1.0.1-py3.8.egg HeapDict 1.0.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/HeapDict-1.0.1-py3.8.egg Searching for locket==0.2.1 Best match: locket 0.2.1 Processing locket-0.2.1-py3.8.egg locket 0.2.1 is already the active version in easy-install.pth

Using /var/jenkins_home/.local/lib/python3.8/site-packages/locket-0.2.1-py3.8.egg Finished processing dependencies for nvtabular==0.8.0+23.g52e625c Running black --check All done! ✨ 🍰 ✨ 172 files would be left unchanged. Running flake8 Running isort Skipped 2 files Running bandit Running pylint ************* Module nvtabular.dispatch nvtabular/dispatch.py:607:11: I1101: Module 'numpy.random.mtrand' has no 'RandomState' member, but source is unavailable. Consider adding this module to extension-pkg-allow-list if you want to perform analysis based on run-time introspection of living objects. (c-extension-no-member)

Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

Running flake8-nb Building docs make: Entering directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs' /usr/lib/python3/dist-packages/requests/init.py:89: RequestsDependencyWarning: urllib3 (1.26.7) or chardet (3.0.4) doesn't match a supported version! warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported " /usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document warn("Container node skipped: type={0}".format(mdnode.t)) /usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document warn("Container node skipped: type={0}".format(mdnode.t)) /usr/local/lib/python3.8/dist-packages/recommonmark/parser.py:75: UserWarning: Container node skipped: type=document warn("Container node skipped: type={0}".format(mdnode.t)) make: Leaving directory '/var/jenkins_home/workspace/nvtabular_tests/nvtabular/docs' ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1629 items / 3 skipped / 1626 selected

tests/unit/test_dask_nvt.py ............................................ [ 2%] ....................................................................... [ 7%] tests/unit/test_io.py .................................................. [ 10%] ........................................................................ [ 14%] ..................ssssssss.............................................. [ 18%] ......... [ 19%] tests/unit/test_notebooks.py ...... [ 19%] tests/unit/test_tf4rec.py . [ 19%] tests/unit/test_tools.py ...................... [ 21%] tests/unit/test_triton_inference.py ................................ [ 23%] tests/unit/framework_utils/test_tf_feature_columns.py . [ 23%] tests/unit/framework_utils/test_tf_layers.py ........................... [ 24%] ................................................... [ 28%] tests/unit/framework_utils/test_torch_layers.py . [ 28%] tests/unit/graph/test_column_schemas.py ................................ [ 30%] .................................................. [ 33%] tests/unit/graph/test_column_selector.py .................... [ 34%] tests/unit/graph/ops/test_selection.py ... [ 34%] tests/unit/inference/test_graph.py . [ 34%] tests/unit/inference/test_inference_ops.py .. [ 34%] tests/unit/inference/test_op_runner.py ... [ 34%] tests/unit/inference/test_tensorflow_inf_op.py ... [ 35%] tests/unit/loader/test_dataloader_backend.py ...... [ 35%] tests/unit/loader/test_tf_dataloader.py ..........F

=================================== FAILURES =================================== ________________ test_tf_gpu_dl[cpu-False-True-1-parquet-0.01] _________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-32/test_tf_gpu_dl_cpu_False_True_0') paths = ['/tmp/pytest-of-jenkins/pytest-32/parquet0/dataset-0.parquet', '/tmp/pytest-of-jenkins/pytest-32/parquet0/dataset-1.parquet'] use_paths = True, device = 'cpu', cpu_true = False dataset = <nvtabular.io.dataset.Dataset object at 0x7f41e84ec280> batch_size = 1, gpu_memory_frac = 0.01, engine = 'parquet'

@pytest.mark.parametrize("gpu_memory_frac", [0.01, 0.06])
@pytest.mark.parametrize("engine", ["parquet"])
@pytest.mark.parametrize("batch_size", [1, 10, 100])
@pytest.mark.parametrize("use_paths", [True, False])
@pytest.mark.parametrize("cpu_true", [False, True])
@pytest.mark.parametrize("device", ["cpu", 0])
def test_tf_gpu_dl(
    tmpdir, paths, use_paths, device, cpu_true, dataset, batch_size, gpu_memory_frac, engine
):
    cont_names = ["x", "y", "id"]
    cat_names = ["name-string"]
    label_name = ["label"]
    if engine == "parquet":
        cat_names.append("name-cat")

    columns = cont_names + cat_names

    conts = cont_names >> ops.FillMedian() >> ops.Normalize()
    cats = cat_names >> ops.Categorify()

    workflow = nvt.Workflow(conts + cats + label_name)

  workflow.fit(dataset)

tests/unit/loader/test_tf_dataloader.py:255:

nvtabular/workflow/workflow.py:216: in fit results = dask.compute(stats, scheduler="synchronous")[0] ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/base.py:568: in compute results = schedule(dsk, keys, **kwargs) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:560: in get_sync return get_async( ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:503: in get_async for key, res_info, failed in queue_get(queue).result(): /usr/lib/python3.8/concurrent/futures/_base.py:437: in result return self.__get_result() /usr/lib/python3.8/concurrent/futures/_base.py:389: in __get_result raise self._exception ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:545: in submit fut.set_result(fn(*args, **kwargs)) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:237: in batch_execute_tasks return [execute_task(a) for a in it] ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:237: in return [execute_task(a) for a in it] ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:228: in execute_task result = pack_exception(e, dumps) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/local.py:223: in execute_task result = _execute_task(task, data) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/core.py:121: in _execute_task return func((_execute_task(a, cache) for a in args)) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/optimization.py:969: in call return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args))) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/core.py:151: in get result = _execute_task(task, cache) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/core.py:121: in _execute_task return func((_execute_task(a, cache) for a in args)) ../../../.local/lib/python3.8/site-packages/dask-2021.7.1-py3.8.egg/dask/utils.py:35: in apply return func(*args, **kwargs) nvtabular/workflow/workflow.py:462: in _transform_partition output_df = node.op.transform(selection, input_df) /usr/lib/python3.8/contextlib.py:75: in inner return func(*args, **kwds) nvtabular/ops/fill.py:126: in transform df[col] = df[col].fillna(self.medians[col]) /usr/local/lib/python3.8/dist-packages/cudf/core/series.py:2659: in fillna return super().fillna( /usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:1334: in fillna copy_data[name] = copy_data[name].fillna(value[name], method)

self = <cudf.core.column.numerical.NumericalColumn object at 0x7f41e80f69c0> [ 975, 1039, 1060, 1009, 1009, 1030,...3, 991, 1019, 1026, ... 979, 960, 1074, 947, 977, 968, 1012, 1044, 1024, 967 ] dtype: int64 fill_value = 1000.5, method = None, dtype = None, fill_nan = True

def fillna(
    self,
    fill_value: Any = None,
    method: str = None,
    dtype: Dtype = None,
    fill_nan: bool = True,
) -> NumericalColumn:
    """
    Fill null values with *fill_value*
    """
    if fill_nan:
        col = self.nans_to_nulls()
    else:
        col = self

    if method is not None:
        return super(NumericalColumn, col).fillna(fill_value, method)

    if fill_value is None:
        raise ValueError("Must specify either 'fill_value' or 'method'")

    if (
        isinstance(fill_value, cudf.Scalar)
        and fill_value.dtype == col.dtype
    ):
        return super(NumericalColumn, col).fillna(fill_value, method)

    if np.isscalar(fill_value):
        # cast safely to the same dtype as self
        fill_value_casted = col.dtype.type(fill_value)
        if not np.isnan(fill_value) and (fill_value_casted != fill_value):

          raise TypeError(

                f"Cannot safely cast non-equivalent "
                f"{type(fill_value).__name__} to {col.dtype.name}"
            )

E TypeError: Cannot safely cast non-equivalent float to int64

/usr/local/lib/python3.8/dist-packages/cudf/core/column/numerical.py:380: TypeError ------------------------------ Captured log call ------------------------------- ERROR nvtabular:workflow.py:464 Failed to transform operator <nvtabular.ops.fill.FillMedian object at 0x7f41d86588b0> Traceback (most recent call last): File "/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py", line 462, in _transform_partition output_df = node.op.transform(selection, input_df) File "/usr/lib/python3.8/contextlib.py", line 75, in inner return func(*args, **kwds) File "/var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/ops/fill.py", line 126, in transform df[col] = df[col].fillna(self.medians[col]) File "/usr/local/lib/python3.8/dist-packages/cudf/core/series.py", line 2659, in fillna return super().fillna( File "/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py", line 1334, in fillna copy_data[name] = copy_data[name].fillna(value[name], method) File "/usr/local/lib/python3.8/dist-packages/cudf/core/column/numerical.py", line 380, in fillna raise TypeError( TypeError: Cannot safely cast non-equivalent float to int64 =============================== warnings summary =============================== tests/unit/test_dask_nvt.py: 3 warnings tests/unit/test_io.py: 24 warnings tests/unit/test_tf4rec.py: 1 warning tests/unit/test_tools.py: 2 warnings tests/unit/test_triton_inference.py: 7 warnings tests/unit/loader/test_tf_dataloader.py: 2 warnings /usr/local/lib/python3.8/dist-packages/numba/cuda/compiler.py:865: NumbaPerformanceWarning: [1mGrid size (1) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.[0m warn(NumbaPerformanceWarning(msg))

tests/unit/test_dask_nvt.py: 12 warnings /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 2 files did not have enough partitions to create 8 files. warnings.warn(

tests/unit/test_dask_nvt.py: 2 warnings tests/unit/test_io.py: 36 warnings /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/workflow/workflow.py:86: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for execution. Please use the client argument to initialize a Workflow object with distributed-execution enabled. warnings.warn(

tests/unit/test_dask_nvt.py: 2 warnings tests/unit/test_io.py: 52 warnings /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dask.py:375: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler will be used for this write operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 6 files did not have enough partitions to create 7 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 8 files did not have enough partitions to create 9 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 8 files did not have enough partitions to create 10 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 8 files did not have enough partitions to create 11 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 13 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 14 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 15 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 16 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 17 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 18 files. warnings.warn(

tests/unit/test_io.py::test_io_partitions_push /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 12 files did not have enough partitions to create 19 files. warnings.warn(

tests/unit/test_io.py: 96 warnings /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/init.py:38: DeprecationWarning: ColumnGroup is deprecated, use ColumnSelector instead warnings.warn("ColumnGroup is deprecated, use ColumnSelector instead", DeprecationWarning)

tests/unit/test_io.py: 12 warnings /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 2 files did not have enough partitions to create 20 files. warnings.warn(

tests/unit/test_io.py::test_multifile_parquet[False-Shuffle.PER_WORKER-5-0-csv] tests/unit/test_io.py::test_multifile_parquet[False-Shuffle.PER_WORKER-5-2-csv] tests/unit/test_io.py::test_multifile_parquet[False-None-5-0-csv] tests/unit/test_io.py::test_multifile_parquet[False-None-5-2-csv] /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 1 files did not have enough partitions to create 5 files. warnings.warn(

tests/unit/test_io.py::test_to_parquet_output_files[Shuffle.PER_WORKER-4-6] tests/unit/test_io.py::test_to_parquet_output_files[False-4-6] /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 2 files did not have enough partitions to create 6 files. warnings.warn(

tests/unit/test_io.py::test_parquet_lists[2-Shuffle.PER_PARTITION-0] tests/unit/test_io.py::test_parquet_lists[2-Shuffle.PER_PARTITION-1] tests/unit/test_io.py::test_parquet_lists[2-Shuffle.PER_PARTITION-2] tests/unit/test_io.py::test_parquet_lists[2-None-0] tests/unit/test_io.py::test_parquet_lists[2-None-1] tests/unit/test_io.py::test_parquet_lists[2-None-2] /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:868: UserWarning: Only created 1 files did not have enough partitions to create 2 files. warnings.warn(

tests/unit/test_io.py: 20 warnings /var/jenkins_home/workspace/nvtabular_tests/nvtabular/nvtabular/io/dataset.py:521: UserWarning: A global dask.distributed client has been detected, but the single-threaded scheduler is being used for this shuffle operation. Please use the client argument to initialize a Dataset and/or Workflow object with distributed-execution enabled. warnings.warn(

tests/unit/test_tools.py::test_cat_rep[None-1000] tests/unit/test_tools.py::test_cat_rep[distro1-1000] /usr/local/lib/python3.8/dist-packages/numba/cuda/compiler.py:865: NumbaPerformanceWarning: [1mGrid size (3) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.[0m warn(NumbaPerformanceWarning(msg))

tests/unit/test_tools.py::test_cat_rep[None-10000] tests/unit/test_tools.py::test_cat_rep[distro1-10000] /usr/local/lib/python3.8/dist-packages/numba/cuda/compiler.py:865: NumbaPerformanceWarning: [1mGrid size (30) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.[0m warn(NumbaPerformanceWarning(msg))

tests/unit/loader/test_tf_dataloader.py::test_nested_list /usr/local/lib/python3.8/dist-packages/numba/cuda/compiler.py:865: NumbaPerformanceWarning: [1mGrid size (2) < 2 * SM count (112) will likely result in GPU under utilization due to low occupancy.[0m warn(NumbaPerformanceWarning(msg))

-- Docs: https://docs.pytest.org/en/stable/warnings.html

---------- coverage: platform linux, python 3.8.10-final-0 ----------- Name Stmts Miss Branch BrPart Cover Missing

nvtabular/init.py 18 0 0 0 100% nvtabular/dispatch.py 341 124 166 35 60% 37-39, 42-46, 51-53, 59-69, 76-77, 107, 110, 112, 116-120, 128-130, 135-138, 142-147, 154, 173, 184, 190, 195->197, 207-210, 223-226, 231-234, 245, 248, 265-266, 274, 278-280, 286, 303, 311, 318, 346-362, 371-377, 382, 397, 408-411, 416, 432, 438, 445-448, 462-464, 466, 468, 472-483, 523, 540, 547, 549, 556, 571-585, 600, 607 nvtabular/framework_utils/init.py 0 0 0 0 100% nvtabular/framework_utils/tensorflow/init.py 1 0 0 0 100% nvtabular/framework_utils/tensorflow/feature_column_utils.py 134 78 90 15 39% 30, 99, 103, 114-130, 140, 143-158, 162, 166-167, 173-198, 207-217, 220-227, 229->233, 234, 239-279, 282 nvtabular/framework_utils/tensorflow/layers/init.py 4 0 0 0 100% nvtabular/framework_utils/tensorflow/layers/embedding.py 153 12 89 6 91% 60, 68->49, 122, 179, 231-239, 335->343, 357->360, 363-364, 367 nvtabular/framework_utils/tensorflow/layers/interaction.py 47 25 22 1 45% 49, 74-103, 106-110, 113 nvtabular/framework_utils/tensorflow/layers/outer_product.py 30 24 12 0 19% 37-38, 41-60, 71-84, 87 nvtabular/framework_utils/tensorflow/tfrecords_to_parquet.py 58 58 30 0 0% 16-111 nvtabular/framework_utils/torch/init.py 0 0 0 0 100% nvtabular/framework_utils/torch/layers/init.py 2 0 0 0 100% nvtabular/framework_utils/torch/layers/embeddings.py 32 2 18 2 92% 50, 91 nvtabular/framework_utils/torch/models.py 45 1 30 4 93% 57->61, 87->89, 93->96, 103 nvtabular/framework_utils/torch/utils.py 75 9 34 7 85% 51->53, 64, 71->76, 75, 109, 118-120, 129-131 nvtabular/graph/init.py 4 0 0 0 100% nvtabular/graph/base_operator.py 72 7 22 3 87% 97-102, 129, 196, 200 nvtabular/graph/graph.py 55 1 36 2 97% 47, 109->108 nvtabular/graph/node.py 284 78 151 20 69% 49, 63, 73-81, 86->89, 132, 135, 201-217, 220-239, 282, 300, 309, 319-324, 329, 331, 337, 351, 361-372, 377->380, 391-399, 408, 409->404, 423-424, 432, 433->416, 439-442, 446, 473, 480-485, 504 nvtabular/graph/ops/init.py 5 0 0 0 100% nvtabular/graph/ops/concat_columns.py 16 0 2 0 100% nvtabular/graph/ops/identity.py 6 1 2 0 88% 41 nvtabular/graph/ops/selection.py 22 0 2 0 100% nvtabular/graph/ops/subset_columns.py 16 1 2 0 94% 62 nvtabular/graph/ops/subtraction.py 20 10 4 0 50% 26, 35, 44-50, 53-54 nvtabular/graph/schema.py 126 9 59 6 92% 38, 65, 157, 160, 163, 176, 183, 208, 211, 216->215 nvtabular/graph/schema_io/init.py 0 0 0 0 100% nvtabular/graph/schema_io/schema_writer_base.py 8 0 2 0 100% nvtabular/graph/schema_io/schema_writer_pbtxt.py 122 11 58 13 87% 45, 61->68, 64->66, 75, 92->97, 95->97, 118->133, 121-122, 124-127, 129, 169->185, 177, 181 nvtabular/graph/selector.py 78 1 40 0 99% 121 nvtabular/graph/tags.py 16 0 2 0 100% nvtabular/inference/init.py 0 0 0 0 100% nvtabular/inference/graph/init.py 3 0 0 0 100% nvtabular/inference/graph/ensemble.py 57 42 26 0 20% 39-103, 107-118 nvtabular/inference/graph/graph.py 27 4 14 2 80% 42, 50-57 nvtabular/inference/graph/node.py 15 9 4 0 42% 22-23, 26-27, 31-36 nvtabular/inference/graph/op_runner.py 21 0 8 0 100% nvtabular/inference/graph/ops/init.py 0 0 0 0 100% nvtabular/inference/graph/ops/operator.py 32 6 12 1 80% 13-14, 19, 36, 40, 49 nvtabular/inference/graph/ops/tensorflow.py 48 16 16 2 66% 34-47, 79-83, 92 nvtabular/inference/graph/ops/workflow.py 30 0 4 0 100% nvtabular/inference/triton/init.py 36 12 14 1 58% 42-49, 68, 72, 76-82 nvtabular/inference/triton/benchmarking_tools.py 52 52 10 0 0% 2-103 nvtabular/inference/triton/data_conversions.py 87 3 58 4 95% 32-33, 84 nvtabular/inference/triton/ensemble.py 285 147 100 9 51% 89-93, 156-192, 236-284, 301-305, 377-385, 414-430, 483-493, 542-582, 588-604, 608-675, 682->685, 685->681, 702->701, 751, 757-776, 782-806, 813 nvtabular/inference/triton/model/init.py 0 0 0 0 100% nvtabular/inference/triton/model/model_pt.py 101 101 42 0 0% 27-220 nvtabular/inference/triton/model_config_pb2.py 299 0 2 0 100% nvtabular/inference/triton/workflow_model.py 52 52 22 0 0% 27-124 nvtabular/inference/workflow/init.py 0 0 0 0 100% nvtabular/inference/workflow/base.py 114 114 62 0 0% 27-210 nvtabular/inference/workflow/hugectr.py 37 37 16 0 0% 27-87 nvtabular/inference/workflow/pytorch.py 10 10 6 0 0% 27-46 nvtabular/inference/workflow/tensorflow.py 32 32 10 0 0% 26-68 nvtabular/io/init.py 5 0 0 0 100% nvtabular/io/avro.py 88 88 32 0 0% 16-189 nvtabular/io/csv.py 57 6 22 5 86% 22-23, 99, 103->107, 108, 110, 124 nvtabular/io/dask.py 183 9 74 12 92% 111, 114, 150, 226, 401, 411, 428->431, 439, 443->445, 445->441, 450, 452 nvtabular/io/dataframe_engine.py 61 5 30 6 88% 19-20, 50, 69, 88->92, 92->97, 94->97, 97->116, 125 nvtabular/io/dataframe_iter.py 21 1 14 1 94% 42 nvtabular/io/dataset.py 346 43 168 28 85% 48-49, 268, 270, 283, 308-322, 446->520, 451-454, 459->469, 476->474, 477->481, 494->498, 509, 520->529, 580-581, 582->586, 634, 762, 764, 766, 772, 776-778, 780, 840-841, 875, 882-883, 889, 895, 992-993, 1111-1116, 1122, 1134-1135 nvtabular/io/dataset_engine.py 31 2 6 1 92% 48, 74 nvtabular/io/fsspec_utils.py 115 101 64 0 8% 26-27, 42-98, 103-114, 151-198, 220-270, 275-291, 295-297, 311-322 nvtabular/io/hugectr.py 45 2 26 2 92% 34, 74->97, 101 nvtabular/io/parquet.py 591 50 218 34 88% 35-36, 59, 81->161, 92, 106, 118-132, 145-158, 181, 210-211, 228->253, 239->253, 247->253, 284->300, 290-298, 318, 324, 342->344, 358, 376->386, 379, 432, 440, 554-559, 597-602, 718->725, 786->791, 792-793, 913, 917, 921, 927, 959, 976, 980, 987->989, 1097->exit, 1101->1098, 1108->1113, 1118->1128, 1133, 1155, 1182, 1186 nvtabular/io/shuffle.py 31 7 18 4 73% 42, 44-45, 49, 62-64 nvtabular/io/writer.py 184 13 78 5 92% 24-25, 51, 79, 125, 128, 212, 221, 224, 267, 299-301 nvtabular/io/writer_factory.py 18 2 8 2 85% 35, 60 nvtabular/loader/init.py 0 0 0 0 100% nvtabular/loader/backend.py 372 30 154 15 91% 27-28, 93, 98-99, 126, 143, 158-160, 294, 300->302, 312-316, 363-364, 403->407, 404->403, 479, 483-484, 509-518, 589-590, 624-628, 633 nvtabular/loader/tensorflow.py 168 40 58 7 77% 55-57, 66, 83, 92-97, 311, 339, 350, 365-367, 390-400, 404, 408-416, 419-422, 425-429 nvtabular/loader/tf_utils.py 57 10 22 6 80% 32->35, 35->37, 42->44, 46, 47->68, 53-54, 62-64, 70-74 nvtabular/loader/torch.py 87 39 26 3 50% 28-30, 33-39, 114, 119, 124-130, 154-166, 169, 174-179, 182-187, 190-191 nvtabular/ops/init.py 23 0 0 0 100% nvtabular/ops/add_metadata.py 17 7 4 0 57% 32, 35, 38, 43-46 nvtabular/ops/bucketize.py 37 19 20 2 39% 52-54, 58->exit, 59-64, 71-87, 90, 93 nvtabular/ops/categorify.py 661 147 354 80 73% 252, 254, 272, 276, 280, 284, 288, 292, 294, 298, 321, 324-329, 342-343, 372-376, 390->394, 398-405, 433, 443, 461, 466, 469, 492-493, 508-511, 526-531, 595, 622->624, 625, 626->628, 632, 634, 643, 722, 724->727, 730, 747, 756-761, 792, 826, 870-871, 886-890, 891->855, 909, 917, 924-925, 942-943, 948, 951->954, 980, 1000-1018, 1034, 1053->1055, 1058, 1060-1063, 1068, 1071, 1073->1076, 1081->1050, 1089-1096, 1097->1099, 1101-1104, 1116, 1120, 1124, 1131, 1136-1139, 1217, 1219, 1281, 1289->1312, 1295->1312, 1313-1318, 1336, 1340-1348, 1351, 1362-1370, 1377, 1383->1388, 1387, 1396-1401, 1402->1394, 1409, 1412, 1417-1431, 1452-1460 nvtabular/ops/clip.py 18 2 8 3 81% 44, 52->54, 55 nvtabular/ops/column_similarity.py 123 87 40 0 23% 19-20, 29-30, 73-79, 82-89, 93-115, 126, 129-133, 136-138, 141, 144, 147, 173-202, 211-212, 221-223, 231-247, 256-281, 285-288, 292-293 nvtabular/ops/data_stats.py 56 1 24 3 95% 91->93, 95, 97->87 nvtabular/ops/difference_lag.py 33 14 12 1 49% 60->63, 70-79, 84, 87, 92, 95, 98 nvtabular/ops/dropna.py 8 3 2 0 70% 39-41 nvtabular/ops/fill.py 91 26 40 9 64% 53-55, 63-67, 72->74, 75-80, 86-87, 91-94, 121, 125, 127, 150->152, 157-158, 162-165 nvtabular/ops/filter.py 20 3 8 3 79% 49, 56, 60 nvtabular/ops/groupby.py 128 16 82 9 84% 72, 83, 93->95, 105->110, 122, 137, 142, 148-153, 225, 253, 259-266 nvtabular/ops/hash_bucket.py 40 21 22 2 37% 69, 73, 82-93, 98-102, 105-112, 115, 118 nvtabular/ops/hashed_cross.py 36 19 17 1 38% 53, 59-70, 75, 78, 81, 86-91 nvtabular/ops/join_external.py 95 29 38 11 61% 20-21, 115, 117, 119, 132, 136-162, 166, 170-173, 178-179, 184-185, 230-237 nvtabular/ops/join_groupby.py 104 15 38 7 80% 107, 109, 116, 122-125, 132-135, 140-142, 215, 225-226 nvtabular/ops/lambdaop.py 39 6 20 6 80% 59, 63, 77, 89, 94, 103 nvtabular/ops/list_slice.py 85 39 42 5 45% 21-22, 67-68, 74, 86-94, 105, 121->127, 141-155, 163-185 nvtabular/ops/logop.py 19 2 6 1 88% 48-49 nvtabular/ops/moments.py 69 1 24 1 98% 71 nvtabular/ops/normalize.py 89 28 22 3 65% 72, 77, 82, 89, 104, 124-126, 132-140, 146, 153-157, 160-161, 165, 173, 176 nvtabular/ops/operator.py 12 1 2 0 93% 53 nvtabular/ops/rename.py 41 7 24 5 82% 47, 61->63, 64-69, 88-90 nvtabular/ops/stat_operator.py 8 0 2 0 100% nvtabular/ops/target_encoding.py 157 111 68 0 21% 167-208, 211-214, 217, 226, 229-240, 243-244, 247-248, 251-255, 259-338, 342-370, 373, 383-392 nvtabular/ops/value_counts.py 32 18 6 0 42% 37-53, 56, 59, 62, 65 nvtabular/tools/init.py 0 0 0 0 100% nvtabular/tools/data_gen.py 251 12 86 7 94% 25-26, 124-127, 137-139, 161-162, 313, 323, 347->346, 349 nvtabular/tools/dataset_inspector.py 50 7 22 1 81% 32-39 nvtabular/tools/inspector_script.py 46 46 0 0 0% 17-168 nvtabular/utils.py 106 43 48 8 54% 31-32, 36-37, 50, 61-62, 64-66, 69, 72, 78, 84, 90-126, 145, 149->153 nvtabular/worker.py 80 5 38 7 90% 24-25, 81->97, 89, 90->97, 97->100, 106, 108, 109->111 nvtabular/workflow/init.py 2 0 0 0 100% nvtabular/workflow/node.py 7 0 4 0 100% nvtabular/workflow/workflow.py 201 15 84 10 91% 28-29, 47, 177, 183->197, 209-211, 324, 339-340, 375, 451, 467-469, 482

TOTAL 8372 2287 3524 449 70% Coverage XML written to file coverage.xml

FAIL Required test coverage of 70% not reached. Total coverage: 69.55% =========================== short test summary info ============================ SKIPPED [1] ../../../../../usr/local/lib/python3.8/dist-packages/dask_cudf/io/tests/test_s3.py:16: could not import 's3fs': No module named 's3fs' SKIPPED [1] tests/unit/inference/test_ensemble.py:32: could not import 'nvtabular.loader.tf_utils.configure_tensorflow': No module named 'nvtabular.loader.tf_utils.configure_tensorflow'; 'nvtabular.loader.tf_utils' is not a package SKIPPED [1] tests/unit/inference/test_export.py:8: could not import 'nvtabular.loader.tf_utils.configure_tensorflow': No module named 'nvtabular.loader.tf_utils.configure_tensorflow'; 'nvtabular.loader.tf_utils' is not a package SKIPPED [8] tests/unit/test_io.py:613: could not import 'uavro': No module named 'uavro' !!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!! ===== 1 failed, 581 passed, 11 skipped, 299 warnings in 644.70s (0:10:44) ====== Build step 'Execute shell' marked build as failure Performing Post build task... Match found for : : True Logical operation result is TRUE Running script : #!/bin/bash cd /var/jenkins_home/ CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/NVTabular/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log" [nvtabular_tests] $ /bin/bash /tmp/jenkins2781857893364032677.sh

Jan 07 '22 03:01 nvidia-merlin-bot

I think the expected dtype should be consistent with the dtypes specified in workflow.transform().to_parquet().

Jan 07 '22 03:01 jershi425

@jershi425 I have pushed an update. Please, let me know what you think now

Jan 07 '22 14:01 albert17

Click to view CI Results

GitHub pull request #1351 of commit eded46d89f537caf4623d3959fb60d7a773976a0, no merge conflicts. Running as SYSTEM Setting status of eded46d89f537caf4623d3959fb60d7a773976a0 to PENDING with url http://10.20.13.93:8080/job/nvtabular_tests/4019/ and message: 'Pending' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1351/*:refs/remotes/origin/pr/1351/* # timeout=10 > git rev-parse eded46d89f537caf4623d3959fb60d7a773976a0^{commit} # timeout=10 Checking out Revision eded46d89f537caf4623d3959fb60d7a773976a0 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f eded46d89f537caf4623d3959fb60d7a773976a0 # timeout=10 Commit message: "Writes based on cardinality size" > git rev-list --no-walk 52e625cd42f31344e919319d7d12d3abdd6eaecb # timeout=10 [nvtabular_tests] $ /bin/bash /tmp/jenkins1472503853033228929.sh Installing NVTabular Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com Requirement already satisfied: pip in /var/jenkins_home/.local/lib/python3.8/site-packages (21.3.1) Requirement already satisfied: setuptools in /var/jenkins_home/.local/lib/python3.8/site-packages (59.4.0) Collecting setuptools Downloading setuptools-60.3.1-py3-none-any.whl (953 kB) Requirement already satisfied: wheel in /var/jenkins_home/.local/lib/python3.8/site-packages (0.37.1) Requirement already satisfied: pybind11 in /var/jenkins_home/.local/lib/python3.8/site-packages (2.9.0) Requirement already satisfied: numpy==1.20.3 in /var/jenkins_home/.local/lib/python3.8/site-packages (1.20.3) Found existing installation: nvtabular 0.8.0+7.gb459467 Can't uninstall 'nvtabular'. No files were found to uninstall. running develop running egg_info creating nvtabular.egg-info writing nvtabular.egg-info/PKG-INFO writing dependency_links to nvtabular.egg-info/dependency_links.txt writing requirements to nvtabular.egg-info/requires.txt writing top-level names to nvtabular.egg-info/top_level.txt writing manifest file 'nvtabular.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' /var/jenkins_home/.local/lib/python3.8/site-packages/setuptools/command/easy_install.py:156: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools. warnings.warn( /var/jenkins_home/.local/lib/python3.8/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools. warnings.warn( warning: no files found matching '*.h' under directory 'cpp' warning: no files found matching '*.cu' under directory 'cpp' warning: no files found matching '*.cuh' under directory 'cpp' adding license file 'LICENSE' writing manifest file 'nvtabular.egg-info/SOURCES.txt' running build_ext x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.8 -c flagcheck.cpp -o flagcheck.o -std=c++17 building 'nvtabular_cpp' extension creating build creating build/temp.linux-x86_64-3.8 creating build/temp.linux-x86_64-3.8/cpp creating build/temp.linux-x86_64-3.8/cpp/nvtabular creating build/temp.linux-x86_64-3.8/cpp/nvtabular/inference x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+24.geded46d -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o -std=c++17 -fvisibility=hidden -g0 x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+24.geded46d -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/__init__.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o -std=c++17 -fvisibility=hidden -g0 x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+24.geded46d -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/categorify.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o -std=c++17 -fvisibility=hidden -g0 x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -DVERSION_INFO=0.8.0+24.geded46d -I./cpp/ -I/var/jenkins_home/.local/lib/python3.8/site-packages/pybind11/include -I/usr/include/python3.8 -c cpp/nvtabular/inference/fill.cc -o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -std=c++17 -fvisibility=hidden -g0 creating build/lib.linux-x86_64-3.8 x86_64-linux-gnu-g++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 build/temp.linux-x86_64-3.8/cpp/nvtabular/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/__init__.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/categorify.o build/temp.linux-x86_64-3.8/cpp/nvtabular/inference/fill.o -o build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so copying build/lib.linux-x86_64-3.8/nvtabular_cpp.cpython-38-x86_64-linux-gnu.so -> Generating nvtabular/inference/triton/model_config_pb2.py from nvtabular/inference/triton/model_config.proto Creating /var/jenkins_home/.local/lib/python3.8/site-packages/nvtabular.egg-link (link to .) nvtabular 0.8.0+24.geded46d is already the active version in easy-install.pth

Installed /var/jenkins_home/workspace/nvtabular_tests/nvtabular Processing dependencies for nvtabular==0.8.0+24.geded46d Searching for packaging==21.3 Best match: packaging 21.3 Adding packaging 21.3 to easy-install.pth file