Exception while creating a udf
I get an exception as soon as any code containing a udf is encountered, e.g.:

    @udf("int")
    def night(hour, weekday):
        if ((16 <= hour <= 20) and (weekday < 5)):
            return int(1)
        else:
            return int(0)
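To rule out the function body itself, the same logic can be run as plain Python without the `@udf` decorator. This is only a minimal sketch: `night_plain` is a hypothetical name for the undecorated copy, and the inputs are assumed to be plain integers.

```python
def night_plain(hour, weekday):
    # Same condition as the udf body: hour in 16..20 inclusive and weekday < 5.
    if (16 <= hour <= 20) and (weekday < 5):
        return 1
    return 0


print(night_plain(18, 2))  # 1
print(night_plain(18, 6))  # 0
print(night_plain(9, 2))   # 0
```

The undecorated function behaves as expected, which is consistent with the traceback pointing at the `@udf("int")` line rather than at the function body.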
Using Ray 2.41, pyarrow 19.0.0.
      File "/home/ankur/dev/apps/ML/learn/ray/ray_spark/taxi_data_preprocess.py", line 35, in add_time_features
        @udf("int")
         ^^^^^^^^^^
      File "/home/ankur/miniconda3/envs/py3_12/lib/python3.12/site-packages/pyspark/sql/udf.py", line 127, in _create_py_udf
        else session.conf.get("spark.sql.execution.pythonUDF.arrow.enabled") == "true"
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/ankur/miniconda3/envs/py3_12/lib/python3.12/site-packages/pyspark/sql/conf.py", line 54, in get
        return self._jconf.get(key)
               ^^^^^^^^^^^^^^^^^^^^
      File "/home/ankur/miniconda3/envs/py3_12/lib/python3.12/site-packages/py4j/java_gateway.py", line 1322, in __call__
        return_value = get_return_value(
                       ^^^^^^^^^^^^^^^^^
      File "/home/ankur/miniconda3/envs/py3_12/lib/python3.12/site-packages/pyspark/errors/exceptions/captured.py", line 179, in deco
        return f(*a, **kw)
               ^^^^^^^^^^^
      File "/home/ankur/miniconda3/envs/py3_12/lib/python3.12/site-packages/py4j/protocol.py", line 326, in get_return_value
        raise Py4JJavaError(
    py4j.protocol.Py4JJavaError: An error occurred while calling o53.get.
    : java.util.NoSuchElementException: spark.sql.execution.pythonUDF.arrow.enabled
        at org.apache.spark.sql.errors.QueryExecutionErrors$.noSuchElementExceptionError(QueryExecutionErrors.scala:1678)
        at org.apache.spark.sql.internal.SQLConf.$anonfun$getConfString$3(SQLConf.scala:4568)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.internal.SQLConf.getConfString(SQLConf.scala:4568)
        at org.apache.spark.sql.RuntimeConfig.get(RuntimeConfig.scala:72)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:569)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
        at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
        at java.base/java.lang.Thread.run(Thread.java:840)
I know I'm exceptional, but this is excessive.
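Reading the trace, the failing call is `session.conf.get("spark.sql.execution.pythonUDF.arrow.enabled")` with no default, and the JVM side answers with `NoSuchElementException` because it has no value for that key. One thing that might be worth trying is to set the key explicitly before any `@udf` is defined. The sketch below is an untested assumption, not a confirmed fix: `ensure_conf` is a hypothetical helper, and `FakeConf` is a hypothetical stand-in that mimics only the `get`/`set` behaviour seen in the trace so that the snippet runs without Spark.

```python
class FakeConf:
    """Hypothetical stand-in for a Spark runtime conf: get() raises on unknown keys."""

    def __init__(self):
        self._values = {}

    def get(self, key):
        if key not in self._values:
            raise KeyError(key)  # plays the role of NoSuchElementException
        return self._values[key]

    def set(self, key, value):
        self._values[key] = value


def ensure_conf(conf, key, value):
    """Set `key` to `value` only if reading it currently fails."""
    try:
        conf.get(key)
    except Exception:
        conf.set(key, value)


ARROW_UDF_KEY = "spark.sql.execution.pythonUDF.arrow.enabled"

conf = FakeConf()
ensure_conf(conf, ARROW_UDF_KEY, "false")
print(conf.get(ARROW_UDF_KEY))  # false
```

With a real session this would presumably be `ensure_conf(spark.conf, ARROW_UDF_KEY, "false")`, called before the first `@udf`, where `spark` is assumed to be the active SparkSession. Whether that is enough depends on why the key is missing in the first place.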