Update table REST API is not working
Willingness to contribute
Yes. I would be willing to contribute a fix for this bug with guidance from the OpenHouse community.
OpenHouse version
latest
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 20.0): WSL (Windows)
- JDK version: 1.8.0_402
Describe the problem
I am trying to set up OpenHouse locally, but when I call the update table REST API I receive a conflict error.
I am following the steps mentioned in https://github.com/linkedin/openhouse/blob/main/SETUP.md
- Create table request:
curl "${curlArgs[@]}" -XPOST http://localhost:8000/v1/databases/d3/tables/ \
--data-raw '{
"tableId": "t1",
"databaseId": "d3",
"baseTableVersion": "INITIAL_VERSION",
"clusterId": "LocalFSCluster",
"schema": "{\"type\": \"struct\", \"fields\": [{\"id\": 1,\"required\": true,\"name\": \"id\",\"type\": \"string\"},{\"id\": 2,\"required\": true,\"name\": \"name\",\"type\": \"string\"},{\"id\": 3,\"required\": true,\"name\": \"ts\",\"type\": \"timestamp\"}]}",
"timePartitioning": {
"columnName": "ts",
"granularity": "HOUR"
},
"clustering": [
{
"columnName": "name"
}
],
"tableProperties": {
"key": "value"
}
}'
- Create table response
{
"tableId": "t1",
"databaseId": "d3",
"clusterId": "LocalFSCluster",
"tableUri": "LocalFSCluster.d3.t1",
"tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
"tableLocation": "file:/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
"tableVersion": "INITIAL_VERSION",
"tableCreator": "DUMMY_ANONYMOUS_USER",
"schema": "{\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"id\",\"required\":true,\"type\":\"string\"},{\"id\":2,\"name\":\"name\",\"required\":true,\"type\":\"string\"},{\"id\":3,\"name\":\"ts\",\"required\":true,\"type\":\"timestamp\"}]}",
"lastModifiedTime": 1715285373822,
"creationTime": 1715285373822,
"tableProperties": {
"policies": "",
"write.metadata.delete-after-commit.enabled": "true",
"openhouse.tableId": "t1",
"openhouse.clusterId": "LocalFSCluster",
"openhouse.lastModifiedTime": "1715285373822",
"openhouse.tableVersion": "INITIAL_VERSION",
"openhouse.creationTime": "1715285373822",
"openhouse.tableUri": "LocalFSCluster.d3.t1",
"write.format.default": "orc",
"write.metadata.previous-versions-max": "28",
"openhouse.databaseId": "d3",
"openhouse.tableType": "PRIMARY_TABLE",
"openhouse.tableLocation": "/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
"openhouse.tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
"key": "value",
"openhouse.tableCreator": "DUMMY_ANONYMOUS_USER"
},
"timePartitioning": {
"columnName": "ts",
"granularity": "HOUR"
},
"clustering": [
{
"columnName": "name",
"transform": null
}
],
"policies": null,
"tableType": "PRIMARY_TABLE"
}
- GET table request
curl "${curlArgs[@]}" -XGET http://localhost:8000/v1/databases/d3/tables/t1
- GET table response
{
"tableId": "t1",
"databaseId": "d3",
"clusterId": "LocalFSCluster",
"tableUri": "LocalFSCluster.d3.t1",
"tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
"tableLocation": "file:/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
"tableVersion": "INITIAL_VERSION",
"tableCreator": "DUMMY_ANONYMOUS_USER",
"schema": "{\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"id\",\"required\":true,\"type\":\"string\"},{\"id\":2,\"name\":\"name\",\"required\":true,\"type\":\"string\"},{\"id\":3,\"name\":\"ts\",\"required\":true,\"type\":\"timestamp\"}]}",
"lastModifiedTime": 1715285373822,
"creationTime": 1715285373822,
"tableProperties": {
"policies": "",
"write.metadata.delete-after-commit.enabled": "true",
"openhouse.tableId": "t1",
"openhouse.clusterId": "LocalFSCluster",
"openhouse.lastModifiedTime": "1715285373822",
"openhouse.tableVersion": "INITIAL_VERSION",
"openhouse.creationTime": "1715285373822",
"openhouse.tableUri": "LocalFSCluster.d3.t1",
"write.format.default": "orc",
"write.metadata.previous-versions-max": "28",
"openhouse.databaseId": "d3",
"openhouse.tableType": "PRIMARY_TABLE",
"openhouse.tableLocation": "/tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json",
"openhouse.tableUUID": "e307fe92-56af-403d-983b-6cb0da61ef82",
"key": "value",
"openhouse.tableCreator": "DUMMY_ANONYMOUS_USER"
},
"timePartitioning": {
"columnName": "ts",
"granularity": "HOUR"
},
"clustering": [
{
"columnName": "name",
"transform": null
}
],
"policies": null,
"tableType": "PRIMARY_TABLE"
}
- Update table request
curl "${curlArgs[@]}" -XPUT http://localhost:8000/v1/databases/d3/tables/t1 \
--data-raw '{
"tableId": "t1",
"databaseId": "d3",
"baseTableVersion":"INITIAL_VERSION",
"clusterId": "LocalFSCluster",
"schema": "{\"type\": \"struct\", \"fields\": [{\"id\": 1,\"required\": true,\"name\": \"id\",\"type\": \"string\"},{\"id\": 2,\"required\": true,\"name\": \"name\",\"type\": \"string\"},{\"id\": 3,\"required\": true,\"name\": \"ts\",\"type\": \"timestamp\"}, {\"id\": 4,\"required\": true,\"name\": \"country\",\"type\": \"string\"}]}",
"timePartitioning": {
"columnName": "ts",
"granularity": "HOUR"
},
"clustering": [
{
"columnName": "name"
}
],
"tableProperties": {
"key": "value"
}
}'
- Update table response
{
"status": "CONFLICT",
"error": "Conflict",
"message": "Entity with key[LocalFSCluster.d3.t1] is modified by another process already, nested exception message: Conflict detected for databaseId: d3, tableId: t1, expected version: /tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json actual version INITIAL_VERSION: The requested user table has been modified/created by other processes.",
"stacktrace": null,
"cause": "Conflict detected for databaseId: d3, tableId: t1, expected version: /tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json actual version INITIAL_VERSION: The requested user table has been modified/created by other processes.",
}
Stacktrace, metrics and logs
No response
Code to reproduce bug
No response
What component does this bug affect?
- [X] Table Service: This is the RESTful catalog service that stores table metadata. :services:tables
- [ ] Jobs Service: This is the job orchestrator that submits data services for table maintenance. :services:jobs
- [ ] Data Services: These are the jobs that perform table maintenance. apps:spark
- [ ] Iceberg internal catalog: This is the internal Iceberg catalog for OpenHouse Catalog Service. :iceberg:openhouse
- [ ] Spark Client Integration: This is the Apache Spark integration for OpenHouse catalog. :integration:spark
- [ ] Documentation: This is the documentation for OpenHouse. docs
- [X] Local Docker: This is the local Docker environment for OpenHouse. infra/recipes/docker-compose
- [ ] Other: Please specify the component.
I created a table called 'table10' and tried to update it with "baseTableVersion": "INITIAL_VERSION" and "clusterId": "LocalFSCluster", which I got from the successful table creation response. The update failed due to a conflict, suggesting the table had been modified by another process. However, when I retried after about 4 hours, the update worked.
Hi @aditya-sjsu, in your update request the baseTableVersion still points to INITIAL_VERSION. Can you change it to /tmp/openhouse/d3/t1-e307fe92-56af-403d-983b-6cb0da61ef82/00000-1eba8f50-4d83-49fe-b968-b59c5f77c6e7.metadata.json (the "expected version" reported in the error message) and try again?
OH table versions are used for atomic updates. Each change/update targets a specific version. If the version in HTS has evolved past the version you specified, you'll see the error "Entity with <> is modified by another process already".
An example update scenario is as follows:
| Action | targetVersion | versionAfterUpdate |
|---|---|---|
| CREATE_TABLE | INITIAL_VERSION | TBL_LOC_1 |
| UPDATE_TABLE_1 | TBL_LOC_1 | TBL_LOC_2 |
| INSERT_DATA | TBL_LOC_2 | TBL_LOC_3 |
and so on.
Let me know if you still face this issue.
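For anyone who finds this later, the version check described above can be pictured with a small toy model. This is only a sketch and not OpenHouse code: `current_version` and `try_update` are hypothetical names, and the variable simply stands in for whatever version the server has stored for the table.

```shell
# Toy model of the version check; NOT OpenHouse code.
# current_version stands in for the version stored on the server side.
current_version="TBL_LOC_1"   # state right after CREATE_TABLE

# try_update BASE_VERSION NEW_VERSION
# Applies the update only if BASE_VERSION matches the stored version,
# otherwise reports a conflict and leaves the stored version untouched.
try_update() {
  if [ "$1" != "$current_version" ]; then
    echo "CONFLICT: expected $current_version, got $1"
    return 1
  fi
  current_version="$2"
  echo "OK: table is now at $current_version"
}
```

In this model, `try_update INITIAL_VERSION TBL_LOC_2` reports a conflict because the table has already moved to TBL_LOC_1, while `try_update TBL_LOC_1 TBL_LOC_2` succeeds and advances the stored version.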
Hi @HotSushi
Thanks, it worked. I also had to change the tableProperties field in the update request to the one returned by the GET request.
Update response I got
{
"tableId": "t1",
"databaseId": "d3",
"clusterId": "LocalFSCluster",
"tableUri": "LocalFSCluster.d3.t1",
"tableUUID": "782cbf27-314a-4fd7-871f-9ce8eefced09",
"tableLocation": "file:/tmp/d3/t1-782cbf27-314a-4fd7-871f-9ce8eefced09/00001-dd5968a4-813f-4fd3-9e62-bdfa146450bd.metadata.json",
"tableVersion": "/tmp/d3/t1-782cbf27-314a-4fd7-871f-9ce8eefced09/00000-aeca28ec-2bf2-44a4-a398-f1f88dd69ca0.metadata.json",
"tableCreator": "DUMMY_ANONYMOUS_USER",
"schema": "{\"type\":\"struct\",\"schema-id\":1,\"fields\":[{\"id\":1,\"name\":\"id\",\"required\":true,\"type\":\"string\"},{\"id\":2,\"name\":\"name\",\"required\":true,\"type\":\"string\"},{\"id\":3,\"name\":\"ts\",\"required\":true,\"type\":\"timestamp\"},{\"id\":4,\"name\":\"country\",\"required\":false,\"type\":\"string\"}]}",
"lastModifiedTime": 1716766531096,
"creationTime": 1716765289450,
"tableProperties": {
"policies": "",
"write.metadata.delete-after-commit.enabled": "true",
"openhouse.tableId": "t1",
"openhouse.clusterId": "LocalFSCluster",
"openhouse.lastModifiedTime": "1716766531096",
"openhouse.tableVersion": "/tmp/d3/t1-782cbf27-314a-4fd7-871f-9ce8eefced09/00000-aeca28ec-2bf2-44a4-a398-f1f88dd69ca0.metadata.json",
"openhouse.creationTime": "1716765289450",
"openhouse.tableUri": "LocalFSCluster.d3.t1",
"write.format.default": "orc",
"write.metadata.previous-versions-max": "28",
"openhouse.databaseId": "d3",
"openhouse.tableType": "PRIMARY_TABLE",
"openhouse.tableLocation": "/tmp/d3/t1-782cbf27-314a-4fd7-871f-9ce8eefced09/00001-dd5968a4-813f-4fd3-9e62-bdfa146450bd.metadata.json",
"openhouse.tableUUID": "782cbf27-314a-4fd7-871f-9ce8eefced09",
"key": "value",
"openhouse.tableCreator": "DUMMY_ANONYMOUS_USER"
},
"timePartitioning": {
"columnName": "ts",
"granularity": "HOUR"
},
"clustering": [
{
"columnName": "name",
"transform": null
}
],
"policies": null,
"tableType": "PRIMARY_TABLE"
}
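In the response above, tableVersion holds the previous metadata path (00000-...) and tableLocation the new one (00001-...), so a follow-up update would presumably need the new path as its baseTableVersion. A minimal sketch of deriving that value, assuming (based only on the values in this thread) that the version string is tableLocation without the leading `file:` scheme; `base_version_from_location` is a hypothetical helper name:

```shell
# Hypothetical helper: derive the next baseTableVersion from the
# tableLocation field of the latest GET or PUT response.
# Assumption: the version string is the location without the "file:" prefix,
# which matches the openhouse.tableLocation property seen in this thread.
base_version_from_location() {
  # ${1#file:} removes a leading "file:" if present, else leaves $1 unchanged
  printf '%s\n' "${1#file:}"
}
```

Passing the tableLocation from the response above returns the same path that appears under openhouse.tableLocation in its tableProperties, so reading that property directly should work equally well.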
Thanks @aditya-sjsu