[BUG] Local Docker fails to run spark-shell on Mac M1
Willingness to contribute
Yes. I can contribute a fix for this bug independently.
OpenHouse version
v0.5.62
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 20.0): Apple M1, macOS Sonoma 14.5, Docker Desktop v4.30
- JDK version: 1.8
Describe the problem
While running the spark-shell commands from SETUP.md, the shell consistently crashes with a fatal error from the Java Runtime Environment.
After investigating, I found that this is a known Docker issue on Apple Silicon MacBooks, caused by a bug in Rosetta (the x86_64/amd64 emulation layer on Apple Silicon).
More details about this issue can be found in https://github.com/docker/for-mac/issues/7006
While waiting for a fix from Apple, there are several workarounds for this issue. For me, downgrading Docker Desktop to [version 4.27.2](https://docs.docker.com/desktop/release-notes/#4272) worked; other methods are mentioned in https://github.com/docker/for-mac/issues/7006#issuecomment-2122869966.
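For reference, here is a minimal sketch of checking the relevant versions and the Rosetta setting on the host. The Docker Desktop settings-file path and key name below are assumptions and may differ between releases:

```bash
# Engine version reported by the local daemon.
docker version --format '{{.Server.Version}}'

# Docker Desktop app version on macOS (standard install path assumed).
defaults read /Applications/Docker.app/Contents/Info.plist CFBundleShortVersionString

# One workaround from the linked thread is turning off Rosetta-based
# x86_64/amd64 emulation (Settings > General), which falls back to QEMU.
# The settings file location and key are assumptions for recent releases:
grep -i rosetta "$HOME/Library/Group Containers/group.com.docker/settings.json"
```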
Stacktrace, metrics and logs
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ffffe0b8e1e, pid=692, tid=0x00007fffe86e6700
#
# JRE version: OpenJDK Runtime Environment (8.0_232-b09) (build 1.8.0_232-8u232-b09-1~deb9u1-b09)
# Java VM: OpenJDK 64-Bit Server VM (25.232-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x628e1e]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /opt/spark/hs_err_pid692.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
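In case the core dump itself is useful for debugging, it can be enabled inside the Spark container before rerunning spark-shell, per the hint in the crash log above. A rough sketch; the container name is an assumption (check `docker ps`):

```bash
# Open a shell in the Spark container (name is an assumption; see `docker ps`).
docker exec -it local.spark-master /bin/bash

# Inside the container: lift the core-dump limit as the crash log suggests,
# then rerun the failing spark-shell command from this same shell.
ulimit -c unlimited
```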
Code to reproduce bug
bin/spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.1_2.12:1.2.0 \
  --jars openhouse-spark-runtime_2.12-*-all.jar \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,com.linkedin.openhouse.spark.extensions.OpenhouseSparkSessionExtensions \
  --conf spark.sql.catalog.openhouse=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.openhouse.catalog-impl=com.linkedin.openhouse.spark.OpenHouseCatalog \
  --conf spark.sql.catalog.openhouse.metrics-reporter-impl=com.linkedin.openhouse.javaclient.OpenHouseMetricsReporter \
  --conf spark.sql.catalog.openhouse.uri=http://openhouse-tables:8080 \
  --conf spark.sql.catalog.openhouse.auth-token=$(cat /var/config/$(whoami).token) \
  --conf spark.sql.catalog.openhouse.cluster=LocalHadoopCluster
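To rule out anything OpenHouse- or Spark-specific, it also helps to start a bare JVM inside the same amd64 container first; under the affected Docker versions even this tends to segfault, which points at the emulation layer rather than the catalog setup. A sketch, with the container name again an assumption:

```bash
# If plain JVM startup also crashes here, the bug is in the x86_64/amd64
# emulation, not in the OpenHouse/Iceberg configuration above.
docker exec -it local.spark-master java -version
```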
What component does this bug affect?
- [ ] Table Service: This is the RESTful catalog service that stores table metadata. :services:tables
- [ ] Jobs Service: This is the job orchestrator that submits data services for table maintenance. :services:jobs
- [ ] Data Services: These are the jobs that perform table maintenance. apps:spark
- [ ] Iceberg internal catalog: This is the internal Iceberg catalog for OpenHouse Catalog Service. :iceberg:openhouse
- [ ] Spark Client Integration: This is the Apache Spark integration for OpenHouse catalog. :integration:spark
- [ ] Documentation: This is the documentation for OpenHouse. docs
- [X] Local Docker: This is the local Docker environment for OpenHouse. infra/recipes/docker-compose
- [ ] Other: Please specify the component.
I ran into the same issue; downgrading Docker as suggested fixed it for me as well.
There is now an internal build of Docker Desktop 4.32 that resolved this issue for me: https://github.com/docker/for-mac/issues/7006#issuecomment-2163112416
Hopefully there will be an official release soon!
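Once a fixed build is installed, a quick smoke test is to start an amd64 JVM under emulation and confirm it comes up cleanly. A sketch (the `openjdk:8` image is just an example of an amd64 JVM image):

```bash
# A clean `java -version` under amd64 emulation suggests the Rosetta
# bug is resolved on this Docker Desktop build.
docker run --rm --platform linux/amd64 openjdk:8 java -version
```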