
🐛 iOS pause audio/video output incorrect

Open xHeinrich opened this issue 1 year ago • 2 comments

What's happening?

When taking a video and pausing/resuming the recording on iOS, the audio and video are chopped into separate segments rather than one continuous video. For example:

1. Start recording and say "one two three"
2. Pause recording and say "four five six"
3. Resume recording and say "seven eight nine"
4. Stop recording

Example video output:

https://github.com/mrousavy/react-native-vision-camera/assets/7674587/c50c74ab-23be-404e-a81f-22128227475a

Reproducible Code

Full repo with a minimal reproduction: https://github.com/xHeinrich/vision-camera-reproduction

const camera = useRef<Camera>(null)

const device = useCameraDevice('back', {
    physicalDevices: [
        'ultra-wide-angle-camera',
        'wide-angle-camera',
        'telephoto-camera'
    ]
})

const [targetFps, setTargetFps] = useState(30)

const filters = [
    { fps: targetFps },
    { videoStabilizationMode: 'auto' },
    {
        videoResolution: { width: 1280, height: 720 }
    },
    {
        photoResolution: { width: 1280, height: 720 }
    }
]

const format = useCameraFormat(device, filters)
const [torchOn, setTorchOn] = useState('off')
const onError = useCallback((error: any) => {
    console.error(error)
}, [])
                      
<Camera
    ref={camera}
    device={device}
    format={format}
    zoom={device.neutralZoom}
    enableZoomGesture={true}
    exposure={0}
    style={StyleSheet.absoluteFill}
    isActive={isActive}
    torch={torchOn}
    orientation={'portrait'}
    audio={micPermissionState.hasPermission}
    photo={permissionState.hasPermission}
    video={permissionState.hasPermission}
    onError={onError}
/>


camera.current!.takePhoto({
    qualityPrioritization: 'speed',
    enableShutterSound: false
}).then((file: PhotoFile) => {
}).catch((error) => {
    reject(error) // `reject` comes from an enclosing Promise in the full repro
})

// sleep 3 seconds

camera.current.startRecording({
    flash: 'on',
    fileType: "mp4",
    videoCodec: "h264",
    videoBitRate: 5, // 5 Mbps target, affected by target fps
    onRecordingFinished: (video: VideoFile) => {},
    onRecordingError: (error) => console.error(error)
})

// sleep 3 seconds
await camera.current.pauseRecording()

// sleep 3 seconds
await camera.current.resumeRecording()

// sleep 3 seconds
await camera.current.stopRecording()
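For reference, the pause/resume sequence above can be tied together in one async helper. The `sleep` function and the minimal `Recorder` interface here are illustrative assumptions, not part of the repro repo; in the real app these calls go through `camera.current`:

```typescript
// Hypothetical helper modeling the timing described above.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Minimal shape of the camera ref's recording methods used by the repro.
interface Recorder {
  pauseRecording(): Promise<void>;
  resumeRecording(): Promise<void>;
  stopRecording(): Promise<void>;
}

// Assumes startRecording() was already called on the same camera.
async function runPauseResumeSequence(recorder: Recorder, stepMs = 3000): Promise<void> {
  await sleep(stepMs);               // "one two three" is spoken here
  await recorder.pauseRecording();
  await sleep(stepMs);               // "four five six" (should be excluded)
  await recorder.resumeRecording();
  await sleep(stepMs);               // "seven eight nine"
  await recorder.stopRecording();
}
```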

Relevant log output

13:23:26.080: [info] :camera_with_flash: VisionCamera.didSetProps(_:): Updating 27 props: [onInitialized, cameraId, position, enableBufferCompression, preview, onStarted, onCodeScanned, collapsable, top, right, isActive, video, onViewReady, onError, onStopped, enableFrameProcessor, format, orientation, left, bottom, audio, enableZoomGesture, exposure, torch, photo, onShutter, zoom]
13:23:26.081: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Waiting for lock...
13:23:26.082: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Updating CameraSession Configuration... Difference(inputChanged: true, outputsChanged: true, videoStabilizationChanged: true, orientationChanged: true, formatChanged: true, sidePropsChanged: true, torchChanged: true, zoomChanged: true, exposureChanged: true, audioSessionChanged: true, locationChanged: true)
13:23:26.082: [info] :camera_with_flash: VisionCamera.configureDevice(configuration:): Configuring Input Device...
13:23:26.082: [info] :camera_with_flash: VisionCamera.configureDevice(configuration:): Configuring Camera com.apple.avfoundation.avcapturedevice.built-in_video:7...
13:23:26.086: [info] :camera_with_flash: VisionCamera.configureDevice(configuration:): Successfully configured Input Device!
13:23:26.086: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Configuring Outputs...
13:23:26.086: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Adding Photo output...
13:23:26.088: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Adding Video Data output...
13:23:26.088: [info] :camera_with_flash: VisionCamera.configureOutputs(configuration:): Successfully configured all outputs!
13:23:26.089: [info] :camera_with_flash: VisionCamera.configureFormat(configuration:device:): Configuring Format (2112x1188 | [email protected] (ISO: 34.0..3264.0))...
13:23:26.089: [info] :camera_with_flash: VisionCamera.configureFormat(configuration:device:): Successfully configured Format!
13:23:26.090: [info] :camera_with_flash: VisionCamera.getPixelFormat(for:): Available Pixel Formats: ["420v", "420f", "BGRA", "&8v0", "-8v0", "&8f0", "-8f0", "&BGA", "-BGA"], finding best match... (pixelFormat="yuv", enableHdr={false}, enableBufferCompression={true})
13:23:26.090: [info] :camera_with_flash: VisionCamera.getPixelFormat(for:): Using PixelFormat: -8f0...
13:23:26.485: [info] :camera_with_flash: VisionCamera.onCameraStarted(): Camera started!
13:23:26.485: [info] :camera_with_flash: VisionCamera.onSessionInitialized(): Camera initialized!
13:23:26.486: [info] :camera_with_flash: VisionCamera.configure(_:): Beginning AudioSession configuration...
13:23:26.486: [info] :camera_with_flash: VisionCamera.configureAudioSession(configuration:): Configuring Audio Session...
13:23:26.486: [info] :camera_with_flash: VisionCamera.configureAudioSession(configuration:): Adding Audio input...
13:23:26.487: [info] :camera_with_flash: VisionCamera.configure(_:): Beginning Location Output configuration...
13:23:26.490: [info] :camera_with_flash: VisionCamera.configureAudioSession(configuration:): Adding Audio Data output...
13:23:26.491: [info] :camera_with_flash: VisionCamera.configure(_:): Committed AudioSession configuration!
13:23:26.495: [info] :camera_with_flash: VisionCamera.configure(_:): Finished Location Output configuration!
13:23:44.113: [info] :camera_with_flash: VisionCamera.takePhoto(options:promise:): Capturing photo...
13:23:47.175: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): Starting Video recording...
13:23:47.177: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): Will record to temporary file: /private/var/mobile/Containers/Data/Application/0F78673F-DB67-4F14-8017-D356A019D118/tmp/ReactNative/01121C78-1A01-4C95-9FFF-38CC93C40AF1.mp4
13:23:47.186: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): Enabling Audio for Recording...
13:23:47.186: [info] :camera_with_flash: VisionCamera.activateAudioSession(): Activating Audio Session...
13:23:47.194: [info] :camera_with_flash: VisionCamera.initializeAudioWriter(withSettings:format:): Initializing Audio AssetWriter with settings: ["AVSampleRateKey": 48000, "AVNumberOfChannelsKey": 1, "AVFormatIDKey": 1633772320]
13:23:47.194: [info] :camera_with_flash: VisionCamera.updateCategory(_:mode:options:): Changing AVAudioSession category from AVAudioSessionCategoryPlayAndRecord -> AVAudioSessionCategoryPlayAndRecord
13:23:47.357: [info] :camera_with_flash: VisionCamera.updateCategory(_:mode:options:): AVAudioSession category changed!
13:23:47.362: [info] :camera_with_flash: VisionCamera.didSetProps(_:): Updating 1 props: [torch]
13:23:47.362: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Waiting for lock...
13:23:47.773: [info] :camera_with_flash: VisionCamera.activateAudioSession(): Audio Session activated!
13:23:47.785: [info] :camera_with_flash: VisionCamera.initializeAudioWriter(withSettings:format:): Initialized Audio AssetWriter.
13:23:47.795: [info] :camera_with_flash: VisionCamera.recommendedVideoSettings(forOptions:): Using codec AVVideoCodecType(_rawValue: avc1)...
13:23:47.795: [info] :camera_with_flash: VisionCamera.recommendedVideoSettings(forOptions:): Setting Video Bit-Rate from 14358528.0 bps to 5000000.0 bps...
13:23:47.795: [info] :camera_with_flash: VisionCamera.initializeVideoWriter(withSettings:): Initializing Video AssetWriter with settings: ["AVVideoCompressionPropertiesKey": ["AverageNonDroppableFrameRate": 30, "Priority": 80, "RealTime": 1, "ExpectedFrameRate": 60, "MaxAllowedFrameQP": 41, "H264EntropyMode": CABAC, "MaxKeyFrameIntervalDuration": 1, "AverageBitRate": 5000000, "AllowFrameReordering": 0, "MinAllowedFrameQP": 15, "QuantizationScalingMatrixPreset": 3, "ProfileLevel": H264_High_AutoLevel], "AVVideoWidthKey": 720, "AVVideoHeightKey": 1280, "AVVideoCodecKey": avc1]
13:23:47.819: [info] :camera_with_flash: VisionCamera.initializeVideoWriter(withSettings:): Initialized Video AssetWriter.
13:23:47.819: [info] :camera_with_flash: VisionCamera.start(clock:): Starting Asset Writer(s)...
13:23:48.441: [info] :camera_with_flash: VisionCamera.start(clock:): Asset Writer(s) started!
13:23:48.442: [info] :camera_with_flash: VisionCamera.start(clock:): Started RecordingSession at time: 127794.095554541
13:23:48.442: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): RecordingSesssion started in 1266.5345ms!
13:23:48.443: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Updating CameraSession Configuration... Difference(inputChanged: false, outputsChanged: false, videoStabilizationChanged: false, orientationChanged: false, formatChanged: false, sidePropsChanged: false, torchChanged: true, zoomChanged: false, exposureChanged: false, audioSessionChanged: false, locationChanged: false)
13:24:00.905: [info] :camera_with_flash: VisionCamera.stop(clock:): Requesting stop at 127806.559768166 seconds for AssetWriter with status "writing"...
13:24:00.988: [info] :camera_with_flash: VisionCamera.appendBuffer(_:clock:type:): Successfully appended last audio Buffer (at 127806.56072916667 seconds), finishing RecordingSession...
13:24:00.988: [info] :camera_with_flash: VisionCamera.finish(): Stopping AssetWriter with status "writing"...
13:24:01.008: [info] :camera_with_flash: VisionCamera.startRecording(options:onVideoRecorded:onError:): RecordingSession finished with status completed.
13:24:01.008: [info] :camera_with_flash: VisionCamera.deactivateAudioSession(): Deactivating Audio Session...
13:24:01.013: [info] :camera_with_flash: VisionCamera.deactivateAudioSession(): Audio Session deactivated!
13:24:01.034: [info] :camera_with_flash: VisionCamera.didSetProps(_:): Updating 1 props: [torch]
13:24:01.034: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Waiting for lock...
13:24:01.034: [info] :camera_with_flash: VisionCamera.configure(_:): configure { ... }: Updating CameraSession Configuration... Difference(inputChanged: false, outputsChanged: false, videoStabilizationChanged: false, orientationChanged: false, formatChanged: false, sidePropsChanged: false, torchChanged: true, zoomChanged: false, exposureChanged: false, audioSessionChanged: false, locationChanged: false)

Camera Device

{
 "id": "com.apple.avfoundation.avcapturedevice.built-in_video:7",
 "formats": [],
 "hasFlash": true,
 "name": "Back Triple Camera",
 "minExposure": -8,
 "neutralZoom": 2,
 "physicalDevices": [
  "ultra-wide-angle-camera",
  "wide-angle-camera",
  "telephoto-camera"
 ],
 "supportsFocus": true,
 "supportsRawCapture": false,
 "isMultiCam": true,
 "minZoom": 1,
 "minFocusDistance": 2,
 "maxZoom": 61.875,
 "maxExposure": 8,
 "supportsLowLightBoost": false,
 "sensorOrientation": "landscape-right",
 "position": "back",
 "hardwareLevel": "full",
 "hasTorch": true
}

Device

iPhone 13 Pro

VisionCamera Version

4.0.0

Can you reproduce this issue in the VisionCamera Example app?

Yes, I can reproduce the same issue in the Example app here

Additional information

xHeinrich avatar Apr 24 '24 03:04 xHeinrich

Any update on this? I'm facing a similar issue.

"react": "18.2.0",
"react-native": "0.74.1",
"react-native-vision-camera": "^4.0.3"

thanhtungkhtn avatar May 07 '24 10:05 thanhtungkhtn

I think it must be some timing issue with the asset writer, but I don't know enough Swift to find the actual issue.

xHeinrich avatar May 08 '24 00:05 xHeinrich

same here

qper228 avatar May 29 '24 09:05 qper228

This is my naive workaround patch for 3.9.2, only tested on an iPhone 12 Pro. I'm not an expert on iOS or Swift. Even though we pause recording, the capture session's clock keeps running (we cannot stop the capture session, because the camera preview must stay visible to the user). AVAssetWriter seems to only consider the timestamps recorded in the CMSampleBuffers, so the idea is to adjust the timestamps in each buffer.

Here is a demo video following the same steps as the author's.

https://github.com/mrousavy/react-native-vision-camera/assets/14037793/13a6d5dc-4422-4b30-abe7-2d84040b5ef5
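The core idea can be modeled with plain numbers (a simplified sketch only: the real patch operates on `CMTime` values and `CMSampleBuffer`s, and all names here are illustrative). The capture clock keeps running while paused, so we accumulate a pause offset and subtract it from every buffer timestamp before the writer sees it:

```typescript
// Simplified model of the workaround: timestamps are seconds as numbers
// instead of CMTime, and buffers are reduced to a single timestamp.
class PauseOffsetTracker {
  private pauseStartedAt: number | null = null;
  private totalPauseOffset = 0;

  // Remember when the pause began (ignore nested pause calls).
  pause(clockTime: number): void {
    if (this.pauseStartedAt !== null) return; // already paused
    this.pauseStartedAt = clockTime;
  }

  // Add the elapsed pause duration to the accumulated offset.
  resume(clockTime: number): void {
    if (this.pauseStartedAt === null) return; // not paused
    this.totalPauseOffset += clockTime - this.pauseStartedAt;
    this.pauseStartedAt = null;
  }

  // Shift a buffer's timestamp back by everything paused so far,
  // so the written track has no gaps.
  adjust(bufferTimestamp: number): number {
    return bufferTimestamp - this.totalPauseOffset;
  }
}
```

In the actual patch, `adjust` corresponds to rewriting the buffer's timing info with `CMSampleBufferCreateCopyWithNewTiming`.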

diff --git a/ios/Core/CameraSession+Video.swift b/ios/Core/CameraSession+Video.swift
index 00ff941b1d4cee15323f1f960a19a14613acab01..69e57e4092d99104793b994e9273a37dd301c18f 100644
--- a/ios/Core/CameraSession+Video.swift
+++ b/ios/Core/CameraSession+Video.swift
@@ -157,11 +157,12 @@ extension CameraSession {
   func pauseRecording(promise: Promise) {
     CameraQueues.cameraQueue.async {
       withPromise(promise) {
-        guard self.recordingSession != nil else {
+        guard let recordingSession = self.recordingSession else {
           // there's no active recording!
           throw CameraError.capture(.noRecordingInProgress)
         }
         self.isRecording = false
+        try recordingSession.pause(clock: self.captureSession.clock)
         return nil
       }
     }
@@ -173,11 +174,12 @@ extension CameraSession {
   func resumeRecording(promise: Promise) {
     CameraQueues.cameraQueue.async {
       withPromise(promise) {
-        guard self.recordingSession != nil else {
+        guard let recordingSession = self.recordingSession else {
           // there's no active recording!
           throw CameraError.capture(.noRecordingInProgress)
         }
         self.isRecording = true
+        try recordingSession.resume(clock: self.captureSession.clock)
         return nil
       }
     }
diff --git a/ios/Core/RecordingSession.swift b/ios/Core/RecordingSession.swift
index 85e9c622573143bd38f0b0ab6f81ad2f40e03cc3..8c4836c97b562bbda362c14f314a0ce96f113d2a 100644
--- a/ios/Core/RecordingSession.swift
+++ b/ios/Core/RecordingSession.swift
@@ -33,6 +33,8 @@ class RecordingSession {
 
   private var startTimestamp: CMTime?
   private var stopTimestamp: CMTime?
+  private var pauseTimestamp: CMTime?
+  private var pauseTimestampOffset: CMTime?
 
   private var lastWrittenTimestamp: CMTime?
 
@@ -67,7 +69,12 @@ class RecordingSession {
           let startTimestamp = startTimestamp else {
       return 0.0
     }
-    return (lastWrittenTimestamp - startTimestamp).seconds
+
+    if let pauseTimestampOffset = pauseTimestampOffset {
+      return (lastWrittenTimestamp - startTimestamp - pauseTimestampOffset).seconds
+    } else {
+      return (lastWrittenTimestamp - startTimestamp).seconds
+    }
   }
 
   init(url: URL,
@@ -158,6 +165,8 @@ class RecordingSession {
     // Start the sesssion at the given time. Frames with earlier timestamps (e.g. late frames) will be dropped.
     assetWriter.startSession(atSourceTime: currentTime)
     startTimestamp = currentTime
+    pauseTimestamp = nil
+    pauseTimestampOffset = nil
     ReactLogger.log(level: .info, message: "Started RecordingSession at time: \(currentTime.seconds)")
 
     if audioWriter == nil {
@@ -195,6 +204,56 @@ class RecordingSession {
     }
   }
 
+  /**
+   Record pause timestamp to calculate timestamp offset using the current time of the provided synchronization clock.
+   The clock must be the same one that was passed to start() method.
+   */
+  func pause(clock: CMClock) throws {
+    lock.wait()
+    defer {
+      lock.signal()
+    }
+
+    let currentTime = CMClockGetTime(clock)
+    ReactLogger.log(level: .info, message: "Pausing Asset Writer(s)...")
+
+    guard pauseTimestamp == nil else {
+      ReactLogger.log(level: .error, message: "pauseTimestamp is already non-nil")
+      return
+    }
+
+    pauseTimestamp = currentTime
+  }
+
+  /**
+   Update pause timestamp offset using the current time of the provided synchronization clock.
+   The clock must be the same one that was passed to start() method.
+   */
+  func resume(clock: CMClock) throws {
+    lock.wait()
+    defer {
+      lock.signal()
+    }
+
+    let currentTime = CMClockGetTime(clock)
+    ReactLogger.log(level: .info, message: "Resuming Asset Writer(s)...")
+
+    guard let pauseTimestamp = pauseTimestamp else {
+      ReactLogger.log(level: .error, message: "Tried resume but recording has not been paused")
+      return
+    }
+
+    let pauseOffset = currentTime - pauseTimestamp
+    self.pauseTimestamp = nil
+    if let currentPauseTimestampOffset = pauseTimestampOffset {
+      pauseTimestampOffset = currentPauseTimestampOffset + pauseOffset
+      ReactLogger.log(level: .info, message: "Current pause offset is \(pauseTimestampOffset!.seconds)")
+    } else {
+      pauseTimestampOffset = pauseOffset
+      ReactLogger.log(level: .info, message: "Current pause offset is \(pauseTimestampOffset!.seconds)")
+    }
+  }
+
   /**
    Appends a new CMSampleBuffer to the Asset Writer.
    - Use clock to specify the CMClock instance this CMSampleBuffer uses for relative time
@@ -238,12 +297,32 @@ class RecordingSession {
     }
 
     // 3. Actually write the Buffer to the AssetWriter
+    let buf: CMSampleBuffer
+    if let pauseTimestampOffset = pauseTimestampOffset {
+      // let newTime = timestamp - pauseTimestampOffset
+      var count: CMItemCount = 0
+      CMSampleBufferGetSampleTimingInfoArray(buffer, entryCount: 0, arrayToFill: nil, entriesNeededOut: &count)
+      var info = [CMSampleTimingInfo](repeating: CMSampleTimingInfo(duration: CMTimeMake(value: 0, timescale: 0), presentationTimeStamp: CMTimeMake(value: 0, timescale: 0), decodeTimeStamp: CMTimeMake(value: 0, timescale: 0)), count: count)
+      CMSampleBufferGetSampleTimingInfoArray(buffer, entryCount: count, arrayToFill: &info, entriesNeededOut: &count)
+
+      for i in 0..<count {
+        info[i].decodeTimeStamp = info[i].decodeTimeStamp - pauseTimestampOffset
+        info[i].presentationTimeStamp = info[i].presentationTimeStamp - pauseTimestampOffset
+      }
+
+      var out: CMSampleBuffer?
+      CMSampleBufferCreateCopyWithNewTiming(allocator: nil, sampleBuffer: buffer, sampleTimingEntryCount: count, sampleTimingArray: &info, sampleBufferOut: &out)
+      buf = out!
+    } else {
+      buf = buffer
+    }
     let writer = getAssetWriter(forType: bufferType)
     guard writer.isReadyForMoreMediaData else {
       ReactLogger.log(level: .warning, message: "\(bufferType) AssetWriter is not ready for more data, dropping this Frame...")
       return
     }
-    writer.append(buffer)
+    writer.append(buf)
+    ReactLogger.log(level: .info, message: "append \(bufferType) Buffer (at \(timestamp.seconds) seconds)...")
     lastWrittenTimestamp = timestamp
 
     // 4. If we failed to write the frames, stop the Recording

My concerns about this workaround are:

  1. Because only the latest pause and resume timestamps are considered, there could be a race condition due to out-of-order buffer processing (I guess it is rare).
  2. The only way I found to change a buffer's timestamps is to copy it, and I am not sure how much this affects performance.

lee-byeoksan avatar May 31 '24 13:05 lee-byeoksan

Found relevant PR but it's closed. https://github.com/mrousavy/react-native-vision-camera/pull/1546

lee-byeoksan avatar May 31 '24 14:05 lee-byeoksan

Hey all!

I just spent a few days thinking about a battleproof timestamp-synchronization solution, and I came up with a great idea. I built a TrackTimeline helper class which represents a video or audio track - it can be started & stopped, paused & resumed, and even supports nested pauses without issues.

  • The total duration of the video is the difference between the first and last actually written timestamps, minus the total duration of all pauses within the video. No more incorrect video.duration! 🥳
  • Whereas before I had a flat 4-second timeout if no frames arrived, I now wait at most twice the frame latency (a few milliseconds) to ensure no frames are left out! 🎉
  • A video can be stopped while it is paused without any issues, as a pause call is taken into consideration before stopping 💪
  • A video file's session now starts exactly at the start() timestamp and ends at the exact timestamp of the last video frame - this ensures there can never be any blank frames in the video, even if the audio track is longer 🤩

This was really complex to build, as I had to synchronize timestamps between capture sessions, and the entire thing is a producer model - a video buffer can arrive a second or so later than the audio buffer, but I need to make sure the video track starts before the audio track starts and ends after the audio track ends - that's a huge brainf*ck! 🤯😅

There are also no helper APIs for this on iOS, and it looks like no other camera framework (not even native Swift/ObjC iOS camera libraries) supports this - they all break when timestamps have a delay (e.g. with video stabilization enabled), or don't support delays at all; so I had to build the thing myself.

Check out this PR and try if it fixes the issue for you; https://github.com/mrousavy/react-native-vision-camera/pull/2948

Thanks! ❤️

mrousavy avatar Jun 08 '24 13:06 mrousavy