[Bug]: `UDF_CACHING` persistence mode persists input if `persistent_id` is set.
Steps to reproduce
This code persists the input, and I am not sure whether it should. Note that `persistence_mode` is set to `UDF_CACHING`:
import pathway as pw

class InSchema(pw.Schema):
    a: int
    b: int

t = pw.io.csv.read("a.csv", persistent_id="abc", schema=InSchema, mode="static")
persistence_backend = pw.persistence.Backend.filesystem("./xyz")
persistence_config = pw.persistence.Config.simple_config(
    persistence_backend,
    persistence_mode=pw.PersistenceMode.UDF_CACHING,
)
pw.debug.compute_and_print_update_stream(t, persistence_config=persistence_config)
If you run the code twice, you'll see that the values are read from persistence on the second run.
Relevant log output
First run:
            | a | b | __time__      | __diff__
^31NXFBM... | 1 | 3 | 1718180081298 | 1
^TC3B0CF... | 2 | 4 | 1718180081298 | 1
^VH8R9JC... | 3 | 5 | 1718180081298 | 1
Second run:
            | a | b | __time__ | __diff__
^31NXFBM... | 1 | 3 | 0        | 1
^TC3B0CF... | 2 | 4 | 0        | 1
^VH8R9JC... | 3 | 5 | 0        | 1
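To make the difference between the two runs easy to check automatically, here is a minimal sketch that parses the printed update stream and reports whether every row carries `__time__` equal to 0, which is how the replayed rows show up in the second run above. The helper name `replayed_from_persistence` is hypothetical and not part of Pathway. Treating `__time__ == 0` as the sign of a replay is an assumption based only on the logs in this report.

```python
def replayed_from_persistence(update_stream: str) -> bool:
    """Return True if every data row in the printed stream has __time__ == 0.

    Assumption: rows restored from persistence are printed with __time__ == 0,
    as seen in the second run of this report.
    """
    lines = [line for line in update_stream.splitlines() if line.strip()]
    # The header row names the columns; the first cell (row id) is unnamed.
    header = [cell.strip() for cell in lines[0].split("|")]
    time_col = header.index("__time__")
    times = [int(line.split("|")[time_col].strip()) for line in lines[1:]]
    return bool(times) and all(t == 0 for t in times)


FIRST_RUN = """\
            | a | b | __time__      | __diff__
^31NXFBM... | 1 | 3 | 1718180081298 | 1
^TC3B0CF... | 2 | 4 | 1718180081298 | 1
^VH8R9JC... | 3 | 5 | 1718180081298 | 1
"""

SECOND_RUN = """\
            | a | b | __time__ | __diff__
^31NXFBM... | 1 | 3 | 0        | 1
^TC3B0CF... | 2 | 4 | 0        | 1
^VH8R9JC... | 3 | 5 | 0        | 1
"""

if __name__ == "__main__":
    print("first run replayed: ", replayed_from_persistence(FIRST_RUN))
    print("second run replayed:", replayed_from_persistence(SECOND_RUN))
```

Capturing the output of each run into a string and passing it to this helper would give a quick regression check for the behaviour described here.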
What did you expect to happen?
I expected `UDF_CACHING` mode not to persist the input even if `persistent_id` is set, or to raise an error saying that `persistent_id` must not be set in `UDF_CACHING` mode.
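The second option (failing loudly) could look roughly like the sketch below. It is purely illustrative. The `PersistenceMode` enum here is a local stand-in rather than `pw.PersistenceMode`, and `check_connector` is a hypothetical helper. It says nothing about how Pathway validates its configuration internally.

```python
from enum import Enum, auto
from typing import Optional


class PersistenceMode(Enum):
    """Local stand-in for illustration only; not pw.PersistenceMode."""

    PERSISTING = auto()
    UDF_CACHING = auto()


def check_connector(persistent_id: Optional[str], mode: PersistenceMode) -> None:
    """Hypothetical validation: reject input persistence in UDF_CACHING mode."""
    if mode is PersistenceMode.UDF_CACHING and persistent_id is not None:
        raise ValueError(
            f"persistent_id={persistent_id!r} is set, but the persistence mode "
            "is UDF_CACHING, which is not expected to persist connector input"
        )


if __name__ == "__main__":
    check_connector(None, PersistenceMode.UDF_CACHING)  # accepted
    try:
        check_connector("abc", PersistenceMode.UDF_CACHING)
    except ValueError as exc:
        print("rejected:", exc)
```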
Version
0.12.0
Docker Versions (if used)
No response
OS
Linux
On which CPU architecture did you run Pathway?
None
In general, `persistence_mode` is not documented well enough.
I agree that it is confusing that enabling UDF caching enables the rest of the persistence mechanisms.
Hello, I am closing this issue because the problem has been fixed in release 0.27.0. The related commit improves the documentation, fixes the problem, and adds a test.