Sandboxed (macOS) batch transcription fails with a permission error when accessing '/private/etc/apache2/mime.types'
Batch transcription works when the same code is run from the Python console.
When the app is sandboxed, batch transcription fails because an underlying library tries to read "/private/etc/apache2/mime.types".
To prevent this permission error, the file would need to be readable from inside the app's sandbox, or the lookup would need to be avoided altogether, if that is possible.
Real-time transcription works in the sandboxed context.
My first idea was to bundle a local copy of the mime.types file and track down where Speechmatics accesses it, so that the library could be redirected to the local copy. I could not find the call site, though, and I suspect there is a better solution.
If there isn't a straightforward fix using the Speechmatics Python client, I plan to test a lower-level approach in Python.
Batch transcription test:
import ssl

import certifi
import speechmatics
from speechmatics.batch_client import BatchClient

# LANGUAGE, englishLocale, operatingPoint, speechmaticsAPIkey and audio_file
# are defined elsewhere in the app.

# Verify TLS against the certifi CA bundle.
ssl_context = ssl.create_default_context()
ssl_context.load_verify_locations(certifi.where())

conf = speechmatics.models.BatchTranscriptionConfig(
    language=LANGUAGE,
    output_locale=englishLocale if LANGUAGE == "en" else None,
    operating_point=operatingPoint,
)
settings = speechmatics.models.ConnectionSettings(
    url="https://asr.api.speechmatics.com/v2",
    auth_token=speechmaticsAPIkey,
    ssl_context=ssl_context,
)

try:
    with BatchClient(settings) as client:
        job_id = client.submit_job(audio=audio_file, transcription_config=conf)
        transcript = client.wait_for_completion(job_id, transcription_format='json-v2')
except Exception as exc:
    # Surface the failure (e.g. the sandbox permission error) and re-raise.
    print(f"Batch transcription failed: {exc!r}")
    raise
Hi @petiatil
We did some digging and it seems like the httpx lib imports mimetypes here:
https://github.com/encode/httpx/blob/2318fd822cdb16435ccb5cabcba16c0b7969c1e4/httpx/_utils.py#L4
So maybe this is the issue you're seeing:
https://github.com/python/cpython/blob/3.12/Lib/mimetypes.py#L48
We do have an open issue to replace httpx, but it is unlikely to be done soon.
Hopefully that helps a little.
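For context on the mimetypes link above: CPython's mimetypes module keeps a module-level list, knownfiles, of system paths that it reads the first time a type is guessed, and on macOS /etc/apache2/mime.types resolves to /private/etc/apache2/mime.types. One possible workaround, sketched below, is to empty that list and load a bundled copy instead. This assumes the code runs before anything (httpx included) triggers the first lookup. The helper name use_bundled_mime_types and the demo type are hypothetical, and this has not been verified inside a macOS sandbox.

```python
import mimetypes
import os
import tempfile


def use_bundled_mime_types(local_files=()):
    """Stop mimetypes from probing system paths; optionally load bundled files.

    Hypothetical helper: call it early, before any library guesses a MIME type.
    """
    # knownfiles is the list of system locations that init() reads by default.
    mimetypes.knownfiles.clear()
    # Build the database from the built-in defaults plus the given files only.
    mimetypes.init(files=list(local_files))


# Demo: a temporary file stands in for a mime.types copy bundled with the app.
with tempfile.TemporaryDirectory() as tmp:
    bundled = os.path.join(tmp, "mime.types")
    with open(bundled, "w", encoding="utf-8") as fh:
        fh.write("application/x-demo demoext\n")
    use_bundled_mime_types([bundled])
    demo_type = mimetypes.guess_type("clip.demoext")[0]

# Built-in defaults still work without touching any system file.
text_type = mimetypes.guess_type("notes.txt")[0]
```

Calling the helper with no arguments keeps only the defaults that ship with Python, which may be enough if no custom types are needed.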
Fortunately, using requests directly resolved the sandbox issue.
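For anyone landing here later, a rough sketch of what calling the batch REST API with requests can look like. It assumes the standard multipart job submission (a data_file part plus a JSON config part) and the /jobs/{id}/transcript endpoint. The function names are hypothetical, not taken from the code that fixed the issue, and polling for job completion is left out.

```python
import json

API_URL = "https://asr.api.speechmatics.com/v2"  # same base URL as in the report


def build_job_config(language, operating_point=None):
    """Return the JSON string sent in the multipart `config` field."""
    transcription_config = {"language": language}
    if operating_point is not None:
        transcription_config["operating_point"] = operating_point
    return json.dumps(
        {"type": "transcription", "transcription_config": transcription_config}
    )


def submit_batch_job(session, api_key, audio_path, language, operating_point=None):
    """POST the audio file to /jobs and return the new job id.

    `session` is expected to be a requests.Session (or any object with the
    same .post() interface).
    """
    with open(audio_path, "rb") as audio:
        response = session.post(
            f"{API_URL}/jobs",
            headers={"Authorization": f"Bearer {api_key}"},
            data={"config": build_job_config(language, operating_point)},
            files={"data_file": audio},
            timeout=60,
        )
    response.raise_for_status()
    return response.json()["id"]


def fetch_transcript(session, api_key, job_id):
    """GET the json-v2 transcript; only succeeds once the job has finished."""
    response = session.get(
        f"{API_URL}/jobs/{job_id}/transcript",
        headers={"Authorization": f"Bearer {api_key}"},
        params={"format": "json-v2"},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()


# Usage (needs network access and a real key):
#   import requests
#   with requests.Session() as session:
#       job_id = submit_batch_job(session, speechmaticsAPIkey, audio_file, "en")
```

Passing the session in keeps the helpers easy to exercise with a stub, and lets the caller configure TLS verification (for example with the certifi bundle) in one place.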
Thank you