Mp4Muxer and FragmentedMp4Muxer throw IllegalStateException at random times, causing a crash
Version
Media3 1.4.1
More version details
No response
Devices that reproduce the issue
Samsung Galaxy S20
Devices that do not reproduce the issue
No response
Reproducible in the demo app?
Not tested
Reproduction steps
I'm migrating to Media3. I'm encountering an IllegalStateException when I use Mp4Muxer or FragmentedMp4Muxer to mux a video with two tracks: one video track from the camera and one audio track from the mic. I'm using the following format parameters:
Video format parameters:
- frame rate: 30
- container MIME type: MimeTypes.VIDEO_H264
- sample MIME type: MimeTypes.VIDEO_H264
Audio format parameters:
- channels: 1
- sample rate: 44100
- bit rate: 128000
Expected result
If recording stops before the IllegalStateException is thrown, the video records fine and can be played back; I've successfully recorded short 15-second videos. I expect longer videos to record just as reliably.
Actual result
If recording stops before the IllegalStateException is thrown, the video records fine and can be played back. But after 30 to 60 seconds of recording, an IllegalStateException is thrown by the following assertion, which crashes the app:
Mp4Writer.java (if using Mp4Muxer):
...
private void flushPending(Track track) throws IOException {
checkState(track.pendingSamplesByteBuffer.size() == track.pendingSamplesBufferInfo.size()); //<< triggered here
...
FragmentedMp4Writer.java (if using FragmentedMp4Muxer):
...
private ProcessedTrackInfo processTrack(int trackId, Track track) {
checkState(track.pendingSamplesByteBuffer.size() == track.pendingSamplesBufferInfo.size()); //<< triggered here
...
It's not immediately clear why the sizes of pendingSamplesByteBuffer and pendingSamplesBufferInfo would ever differ, and therefore why the IllegalStateException would ever be thrown. There are no comments or documentation that explain why this would occur or how to avoid it.
Example stack trace:
java.lang.IllegalStateException
at androidx.media3.common.util.Assertions.checkState(Assertions.java:85)
at androidx.media3.muxer.FragmentedMp4Writer.processTrack(FragmentedMp4Writer.java:296)
at androidx.media3.muxer.FragmentedMp4Writer.processAllTracks(FragmentedMp4Writer.java:289)
at androidx.media3.muxer.FragmentedMp4Writer.createFragment(FragmentedMp4Writer.java:239)
at androidx.media3.muxer.FragmentedMp4Writer.writeSampleData(FragmentedMp4Writer.java:126)
at androidx.media3.muxer.FragmentedMp4Muxer.writeSampleData(FragmentedMp4Muxer.java:185)
at xx.xx.encoder.VideoEncoderCore.drainVideoEncoder(VideoEncoderCore.java:374)
at xx.xx.TextureMovieEncoder.handleFrameAvailable(TextureMovieEncoder.java:1401)
Media
N/A
Bug Report
- [X] You will email the zip file produced by adb bugreport to [email protected] after filing this issue.
I ran the code again today and got this crash (also at a random interval), which is possibly related:
java.util.ConcurrentModificationException
at java.util.ArrayDeque.nonNullElementAt(ArrayDeque.java:278)
at java.util.ArrayDeque$DeqIterator.next(ArrayDeque.java:708)
at com.google.common.collect.ImmutableCollection$Builder.addAll(ImmutableCollection.java:450)
at com.google.common.collect.ImmutableCollection$ArrayBasedBuilder.addAll(ImmutableCollection.java:555)
at com.google.common.collect.ImmutableList$Builder.addAll(ImmutableList.java:818)
at androidx.media3.muxer.FragmentedMp4Writer.processTrack(FragmentedMp4Writer.java:317)
at androidx.media3.muxer.FragmentedMp4Writer.processAllTracks(FragmentedMp4Writer.java:289)
at androidx.media3.muxer.FragmentedMp4Writer.createFragment(FragmentedMp4Writer.java:239)
at androidx.media3.muxer.FragmentedMp4Writer.writeSampleData(FragmentedMp4Writer.java:126)
at androidx.media3.muxer.FragmentedMp4Muxer.writeSampleData(FragmentedMp4Muxer.java:185)
at com.xx.xx.VideoEncoderCore.drainVideoEncoder(VideoEncoderCore.java:376)
I'm using a Handler for video frames, but not for audio frames.
Hey @GhostCoder7, the muxer APIs are expected to work on a single-threaded model only. Since you are feeding audio and video from different threads, there could be some synchronisation issues. If you need to feed data from different threads, then maybe you could create some synchronised wrapper methods which then call the muxer APIs.
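A minimal sketch of the synchronised-wrapper idea, written in plain Java so it runs off-device. `SampleWriter` and `SynchronizedSampleWriter` are hypothetical names for illustration, not Media3 types; in an app the delegate would forward to the muxer's own write call, whose exact signature depends on the Media3 version in use.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

/** Hypothetical stand-in for the muxer's write call; not a Media3 type. */
interface SampleWriter {
  void writeSample(int trackId, ByteBuffer data, long presentationTimeUs) throws IOException;
}

/** Serializes calls from the audio and video threads through a single lock. */
final class SynchronizedSampleWriter implements SampleWriter {
  private final SampleWriter delegate;
  private final Object lock = new Object();

  SynchronizedSampleWriter(SampleWriter delegate) {
    this.delegate = delegate;
  }

  @Override
  public void writeSample(int trackId, ByteBuffer data, long presentationTimeUs)
      throws IOException {
    // Only one thread at a time may be inside the delegate.
    synchronized (lock) {
      delegate.writeSample(trackId, data, presentationTimeUs);
    }
  }
}
```

Any other call that touches the same muxer (adding tracks, closing) would presumably need to take the same lock. Note that a lock only serializes access: a slow write on one thread still blocks the other thread while it waits.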
Thanks, I've implemented this now, but I'm noticing that every now and then a writeSampleData() call takes a long time to finish (> 200ms), causing jitter in the output video. I would say this happens roughly every 10-20 seconds while recording, but it's unpredictable.
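@GhostCoder7 Which Media3 version are you using?
The muxer writes samples in batches, so a call that triggers a batch write takes longer. We have made some improvements to reduce this latency. The changes (https://github.com/androidx/media/commit/4be5b74366cc55b91ce971cd3ae2575f18482ae7) are present in release-1.5.0-rc01 (https://github.com/androidx/media/tree/release-1.5.0-rc01).
We have also added an API to disable this sample batching (https://github.com/androidx/media/commit/f181855c5e1577b2df2d61ca49b04e6a202679b0#diff-4e18d0e246dd34a51eb8749f23d582e891836721e3dbeebd9d847f917347a288), but it will only be available in the next release.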
Ok - but is there any reason there is a visible hiccup in the video when this happens? Given I'm providing the timestamps to my encoder and then to the muxer in BufferInfo, why would a visible blip in the video be noticeable?
@GhostCoder7 Just to clarify, what do you mean by "jitter in output video"?
Does it mean that the output MP4 file has issues when you play it?
Yes, there is a noticeable skip of frames which, when observing with the debugger, coincides with the batch writing. Here's a link to view a sample. You can visibly see the skipping within the first couple of seconds: https://drive.google.com/file/d/1UfXBlIbNF8XcC9Kq6Cu80eygHesP0uTc/view?usp=sharing
In this video I'm moving around so you can see motion in the video, and I'm watching the debugger, which I've programmed to measure how long writeSampleData takes and to write to the debugger if it takes longer than 50ms. The blips in the video coincide with the batch writing.
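The measurement described above can be sketched roughly as follows. This is a plain-Java illustration rather than the reporter's actual code: `SlowCallMonitor` is a hypothetical helper, and the 50 ms threshold is the one mentioned in the comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Times a blocking call and records the durations that exceed a threshold. */
final class SlowCallMonitor {
  private final long thresholdNanos;
  private final List<Long> slowCallsMs = new ArrayList<>();

  SlowCallMonitor(long thresholdMs) {
    this.thresholdNanos = thresholdMs * 1_000_000L;
  }

  /** Runs the call and records its duration if it took longer than the threshold. */
  void time(Runnable call) {
    long start = System.nanoTime(); // monotonic clock, unaffected by wall-clock changes
    call.run();
    long elapsed = System.nanoTime() - start;
    if (elapsed > thresholdNanos) {
      slowCallsMs.add(elapsed / 1_000_000L);
    }
  }

  /** Durations, in milliseconds, of every call that exceeded the threshold. */
  List<Long> slowCallsMs() {
    return slowCallsMs;
  }
}
```

A write call that throws a checked exception would need to be wrapped before being passed as a `Runnable`. The helper is not thread-safe, so it assumes one monitor per calling thread.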
This also results in the video effectively losing frames which causes the video to fall out of sync with other videos if filming on multiple cameras/devices, for example. If I record at the same time on both Android and iOS versions of the app, the video on the Android app will be shorter over time due to lost frames.
Regards
Hi there
I was actually able to find a workaround. I'm sharing it in case it's relevant for improving the performance of writeSampleData(); check out this article: https://petrakeas.medium.com/optimizing-mediamuxer-s-writing-speed-4b2a7641db72
When encoding video from a GL surface, you generally have to drain your encoder on the same thread that draws, before you swap your buffers. The delayed writeSampleData call had the flow-on effect of delaying swapBuffers, which in turn interrupted camera frames being handled and drawn. The solution was to implement a circular buffer in our muxer wrapper class to hold frame information while writeSampleData hangs on its batch processing. My videos are so much smoother and no more frames are missing.
Hope it helps.
Regards
@GhostCoder7 Thanks for sharing the article. The muxer already handles this problem by creating a copy of the sample, although it does not use a circular buffer. So ideally there should not be any frame dropping, because the muxer creates a copy and releases the original sample immediately. There is an API to disable the sample copy, in which case the app needs to create the copy itself (as you have done now). Can you please confirm whether you have disabled the sample copy using this API: https://developer.android.com/reference/androidx/media3/muxer/Mp4Muxer.Builder#setSampleCopyEnabled(boolean)
We are not calling the sample copy API, so the default sample copying should be enabled.
The main reason we wanted to move to Mp4Muxer is so we could leverage the fragmented nature that would allow videos to not be corrupt in the event of a crash while the user is recording.
Unfortunately we still find the new Mp4Muxer class unusable due to the dropped frames / hangs on encoder waiting for writeSampleData to return when batch writing. We've had to resort back to MediaMuxer class instead.
The issue is that the batch write takes too long and we can't release the encoded buffer until writeSampleData returns. When a batch write occurs, an encoder buffer is held for that whole period, which stalls our encoder and produces a "blip" or lost frames in the output video. Running writeSampleData on a background thread makes no difference, as we still have to wait for it to return before we can release the buffer. This is why we were taking our own copy and using a circular buffer, so that the encoder buffer could be released sooner. The circular buffer method we adopted solved the issue and gave us smooth video, but unfortunately we would get an exception after about 20 seconds of recording: "java.lang.RuntimeException: java.lang.IllegalStateException: Sample does not start with a NAL unit"
For now, we have to keep using MediaMuxer class and can't take advantage of fragmented Mp4. If you find a cause, workaround or fix please let me know.
Regards
Hey @GhostCoder7
> The main reason we wanted to move to Mp4Muxer is so we could leverage the fragmented nature that would allow videos to not be corrupt in the event of a crash while the user is recording.
I would like to mention that Mp4Muxer is robust even in the non-fragmented case, i.e. if there is a crash while recording, the MP4 will still be valid with some partial data. It is mentioned here: https://developer.android.com/reference/androidx/media3/muxer/Mp4Muxer#:~:text=When%20writing%20a%20file%2C%20if%20an%20error%20occurs%20and%20the%20muxer%20is%20not%20closed%2C%20then%20the%20output%20MP4%20file%20may%20still%20have%20some%20partial%20data.
> The circular buffer method we adopted solved the issue and gave us smooth video, but unfortunately we would get an exception after about 20 seconds of recording: "java.lang.RuntimeException: java.lang.IllegalStateException: Sample does not start with a NAL unit"
Ideally this solution should work as it immediately unblocks encoder. Could it be possible that when you are creating a copy, the ByteBuffer is not copied properly resulting in an invalid NAL unit? Or maybe a ByteBuffer was not recycled properly and was reused again?
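To illustrate the copying question above, here is one way a sample copy could be written so that it neither depends on nor disturbs the source buffer's state. It is a sketch under assumptions: `SampleCopies` is a hypothetical helper, and it assumes the source's position and limit already bracket the encoded sample (if they do not, the caller would set them from the buffer info's offset and size first).

```java
import java.nio.ByteBuffer;

final class SampleCopies {
  private SampleCopies() {}

  /**
   * Copies the bytes between the source's position and limit into a new buffer. The source's own
   * position and limit are left unchanged.
   */
  static ByteBuffer copyOf(ByteBuffer source) {
    // duplicate() shares the bytes but has an independent position and limit,
    // so reading through it does not move the source's position.
    ByteBuffer view = source.duplicate();
    ByteBuffer copy = ByteBuffer.allocate(view.remaining());
    copy.put(view);
    copy.flip(); // position = 0, limit = number of bytes copied
    return copy;
  }
}
```

Two mistakes this avoids: forgetting `flip()`, which leaves the copy with nothing remaining to read, and handing the same backing buffer to a consumer that reads it after the encoder has already reused it.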
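Also, would it be possible for you to build from the GitHub main branch and try this API to disable sample batching (https://github.com/androidx/media/blob/3c01500a4e6a7105a2f1c1439ceb451c035a9fe7/libraries/muxer/src/main/java/androidx/media3/muxer/Mp4Muxer.java#L275) to see if it solves the issue?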
Yes, I had noticed that Mp4Muxer alone was sufficient to prevent a corrupt video after a crash (without using FragmentedMp4Muxer). But the issues we had persisted with both. I suspect the NAL unit issue was caused by a bad circular buffer implementation on our part.
Per your suggestion, I built from the main branch and tried the API to disable batch writing, and it seems to have solved the problem. I've been able to record video for long periods of time (>2 minutes) with little to no lost frames or sync issues. When is this scheduled to be released, so that I don't have to rely on the local dependency?
I think the API to disable batch writing is useful, especially for use cases beyond simple camera recording. Our frames come from the GPU, as we process camera frames with GLES before encoding them, so we need minimal hold-ups on that thread.
Thanks
Thanks for trying the API.
This API will be available in Media3 1.6. The first alpha of 1.6 is expected to be released by the first week of February, and the stable version by the end of March.
As an immediate solution, I would suggest that you:
- Disable sample copy on the muxer using the setSampleCopyEnabled API.
- Create a sample copy at your end before writing to the muxer.
This will release the encoder buffer ASAP and will also avoid having two copies of the same sample in memory.
I would suggest first trying a simple ByteBuffer copy and, once that works, trying the circular buffer optimization.
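One possible shape for the copy-then-write approach is sketched below in plain Java. All names are hypothetical: `SampleSink` stands in for a wrapper around the muxer's write call, and a bounded `ArrayBlockingQueue` takes the place of a hand-written circular buffer. The producer copies the sample and returns, so the encoder buffer can be released right away, and a single writer thread is the only caller of the sink.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical record of one copied sample. */
final class QueuedSample {
  final int trackId;
  final ByteBuffer data;
  final long presentationTimeUs;

  QueuedSample(int trackId, ByteBuffer data, long presentationTimeUs) {
    this.trackId = trackId;
    this.data = data;
    this.presentationTimeUs = presentationTimeUs;
  }
}

/** Hypothetical stand-in for a wrapper around the muxer's write call. */
interface SampleSink {
  void write(QueuedSample sample) throws Exception;
}

final class BufferedSampleWriter implements AutoCloseable {
  private static final QueuedSample END = new QueuedSample(-1, ByteBuffer.allocate(0), 0);
  private final BlockingQueue<QueuedSample> queue;
  private final Thread writerThread;
  private volatile Exception failure;

  BufferedSampleWriter(SampleSink sink, int capacity) {
    queue = new ArrayBlockingQueue<>(capacity);
    writerThread = new Thread(() -> {
      try {
        for (QueuedSample s = queue.take(); s != END; s = queue.take()) {
          if (failure == null) { // after a failure, keep draining so producers never block
            try {
              sink.write(s);
            } catch (Exception e) {
              failure = e;
            }
          }
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "sample-writer");
    writerThread.start();
  }

  /** Copies the sample; the caller may release the encoder buffer once this returns. */
  void enqueue(int trackId, ByteBuffer encoderBuffer, long presentationTimeUs)
      throws InterruptedException {
    ByteBuffer view = encoderBuffer.duplicate(); // leaves the source position untouched
    ByteBuffer copy = ByteBuffer.allocate(view.remaining());
    copy.put(view);
    copy.flip();
    queue.put(new QueuedSample(trackId, copy, presentationTimeUs)); // blocks only when full
  }

  /** Flushes the queued samples, stops the writer thread, and rethrows the first sink failure. */
  @Override
  public void close() throws Exception {
    queue.put(END);
    writerThread.join();
    if (failure != null) {
      throw failure;
    }
  }
}
```

The queue capacity bounds memory use: it has to cover the longest stall expected from the sink, otherwise `enqueue` blocks and the original hang returns. This version allocates a new buffer per sample; reusing a fixed pool of buffers would be the circular buffer optimization mentioned above.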
I assume that all the queries on this issue have been resolved so I will close this issue now. Please feel free to reopen if you need any further help.
Thanks, it is much appreciated.
Is there any harm in using the main branch as a local dependency until the release?
It should be fine to use the local dependency and later switch to a released version.