astroid.exceptions.NameInferenceError: 'spark' not found in ...
Is there an existing issue for this?
- [X] I have searched the existing issues
Current Behavior
We're missing property/method inference for SparkSession, so linting code that uses the spark global fails with:
astroid.exceptions.NameInferenceError: 'spark' not found in <Module.root l.0 at 0x107d06d40>.
19:36:23 DEBUG [d.l.u.s.linters.dbfs] Could not infer value of spark.read.parquet('/mnt/foo/bar')
Expected Behavior
No response
Steps To Reproduce
spark.read.parquet('/mnt/foo/bar')
Cloud
AWS
Operating System
macOS
Version
latest via Databricks CLI
Relevant log output
19:36:23 DEBUG [d.l.u.s.linters.python_infer] When inferring Call(func=<Attribute.parquet l.1 at 0x107d05f30>,
args=[<Const.str l.1 at 0x107d04100>],
keywords=[]): Traceback (most recent call last):
  File "/Users/serge.smertin/git/labs/ucx/src/databricks/labs/ucx/source_code/linters/python_infer.py", line 76, in _infer_internal
    for inferred in node.inferred():
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_ng.py", line 586, in inferred
    return list(self.infer())
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_ng.py", line 170, in infer
    for i, result in enumerate(self._infer(context=context, **kwargs)):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 90, in inner
    yield next(generator)
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 49, in wrapped
    for res in _func(node, context, **kwargs):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_classes.py", line 1757, in _infer
    for callee in self.func.infer(context):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_ng.py", line 170, in infer
    for i, result in enumerate(self._infer(context=context, **kwargs)):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 90, in inner
    yield next(generator)
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 49, in wrapped
    for res in _func(node, context, **kwargs):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_classes.py", line 1092, in _infer_attribute
    for owner in node.expr.infer(context):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_ng.py", line 170, in infer
    for i, result in enumerate(self._infer(context=context, **kwargs)):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 90, in inner
    yield next(generator)
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 49, in wrapped
    for res in _func(node, context, **kwargs):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_classes.py", line 1092, in _infer_attribute
    for owner in node.expr.infer(context):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_ng.py", line 170, in infer
    for i, result in enumerate(self._infer(context=context, **kwargs)):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 90, in inner
    yield next(generator)
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/decorators.py", line 49, in wrapped
    for res in _func(node, context, **kwargs):
  File "/Users/serge.smertin/git/labs/ucx/.venv/lib/python3.10/site-packages/astroid/nodes/node_classes.py", line 595, in _infer
    raise NameInferenceError(
astroid.exceptions.NameInferenceError: 'spark' not found in <Module.root l.0 at 0x107d06d40>.
19:36:23 DEBUG [d.l.u.s.linters.dbfs] Could not infer value of spark.read.parquet('/mnt/foo/bar')
@nfx this is a debug log, useful primarily to us.
We can remove the exception stack trace from it.
But in data = spark.read.parquet('/mnt/foo/bar'), we genuinely can't infer the value of data, because spark is never defined in the module being linted.
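To illustrate why the name lookup fails, here is a minimal sketch that uses only the standard library ast module rather than astroid. The helper unbound_names is hypothetical and not part of UCX; it ignores scoping and simply lists names that are read somewhere in a module but never assigned, imported, or defined in it. For the reproduction snippet, spark is such a name, which is consistent with the NameInferenceError above.

```python
import ast
import builtins


def unbound_names(source: str) -> set[str]:
    """Return names that are read in `source` but never bound in it.

    Hypothetical helper for illustration only: it is scope-insensitive and
    treats every builtin as bound.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins))
    loaded = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:
                # Store or Del context: the module binds this name itself
                bound.add(node.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
    return loaded - bound


print(unbound_names("data = spark.read.parquet('/mnt/foo/bar')"))  # {'spark'}
```

Nothing in the snippet binds spark, so a purely static analysis has no definition to start inference from.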
Lint no longer fails with the exception; it now reports:
test_lint_error.py:0:0: [implicit-dbfs-usage] The use of default dbfs: references is deprecated: /mnt/foo/bar
test_lint_error.py:0:19: [dbfs-usage] Deprecated file system path: /mnt/foo/bar
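The path can still be reported even though spark itself is unresolved, because the argument is a plain string literal. A rough sketch of that idea, again using only the standard library ast module: the function literal_dbfs_paths and the prefix list DBFS_PREFIXES are assumptions made for this example and are not taken from the UCX linters.

```python
import ast

# Assumed prefixes for illustration; the real linter's rules may differ.
DBFS_PREFIXES = ("/mnt/", "/dbfs/", "dbfs:/")


def literal_dbfs_paths(source: str) -> list[tuple[int, int, str]]:
    """Return (line, column, path) for string-literal call arguments
    that start with one of the assumed DBFS prefixes."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for arg in node.args:
            # No inference of the callee is needed: only the literal is inspected
            if (
                isinstance(arg, ast.Constant)
                and isinstance(arg.value, str)
                and arg.value.startswith(DBFS_PREFIXES)
            ):
                hits.append((arg.lineno, arg.col_offset, arg.value))
    return hits


print(literal_dbfs_paths("spark.read.parquet('/mnt/foo/bar')"))
# [(1, 19, '/mnt/foo/bar')]
```

Note that ast reports 1-based line numbers, while the column offset 19 lines up with the dbfs-usage message above.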
I'm facing the same bug while working on an integration test.