Audio goes out of sync when source files have mismatched audio track lengths #327

I've been testing Editly to concatenate various home videos, with transitions and subtitles. Overall it works well, but I found that the audio gradually drifts out of sync. #117 suggests using `detached-audio` layers as a workaround, and that works, but it is cumbersome because I basically have to list each video clip twice.

Then I started comparing the audio clips extracted into the temp directory when Editly runs. The first thing I noticed was that the length of the concatenated audio file was different depending on whether or not `keepSourceAudio` was enabled. If it was, then the file was shorter, in some cases by several seconds; if it wasn't, then the length was correct (i.e. matched the length of the final video).

Then I looked at the individual clips and found that in some cases the extracted audio clip was shorter than the silence clip generated for the same video clip (when disabling `keepSourceAudio`). This only happened with some video files, and only if there was no `cutTo` specified. Turns out, in those files the audio track is shorter than the video track. For example:

```shell
$ mediainfo --output=JSON SAM_4235.AVI | jq -r '.media.track[] | ."@type", .Duration'
General
20.767
Video
20.767
Audio
20.758
$ ffmpeg -nostdin -i SAM_4235.AVI -t 20.767 -sample_fmt s32 -ar 48000 -map a:0 -c:a flac -y test.flac 2>/dev/null
$ mediainfo --Output="General;%Duration%" test.flac
20758
```

It's a small difference, but over dozens of files it adds up. It also explains why the problem doesn't occur when `keepSourceAudio` is disabled - the silence files are generated to the exact same length as the video clip, so the concatenated audio file is also the correct length.
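
To put a number on "adds up": the file above is 9 ms short, so 100 clips like it would accumulate roughly 0.9 s of drift. Below is a rough sketch for measuring the gap across a set of source files. It assumes `ffprobe` is on the `PATH` and that the container reports per-stream durations (some formats print `N/A` instead, which this does not handle). The function names `stream_duration` and `gap_ms` are made up for illustration.

```shell
#!/bin/sh
# Sketch: report how much shorter the audio track is than the video track
# in each file given on the command line, plus the cumulative difference.

# stream_duration FILE SELECTOR -> duration in seconds (SELECTOR e.g. v:0 or a:0)
stream_duration() {
  ffprobe -v error -select_streams "$2" -show_entries stream=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# gap_ms VIDEO_SECONDS AUDIO_SECONDS -> video minus audio, rounded to whole ms
gap_ms() {
  awk -v v="$1" -v a="$2" 'BEGIN { printf "%.0f\n", (v - a) * 1000 }'
}

total=0
for f in "$@"; do
  gap=$(gap_ms "$(stream_duration "$f" v:0)" "$(stream_duration "$f" a:0)")
  printf '%s\t%s ms\n' "$f" "$gap"
  total=$((total + gap))
done
printf 'cumulative drift: %s ms\n' "$total"
```

Saved as a script and run with the source clips as arguments, a positive value means the audio track is shorter than the video track.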

A couple of possible fixes I tested:

 1. Pad the extracted audio to the length of the video:

    ```shell
    $ ffmpeg -nostdin -i SAM_4235.AVI -t 20.767 -sample_fmt s32 -ar 48000 -map a:0 -c:a flac -af apad -shortest -y test.flac 2>/dev/null
    $ mediainfo --Output="General;%Duration%" test.flac
    20767
    ```

    This works, but I'm not sure how it might affect files where the audio track is *longer* than the video track.

 2. Always generate the silence clips and merge them with the extracted audio clips:

    ```shell
    $ ffmpeg -nostdin -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 -sample_fmt s32 -ar 48000 -t 20.767 -c:a flac -y silence.flac 2>/dev/null
    $ mediainfo --Output="General;%Duration%" silence.flac
    20767
    $ ffmpeg -nostdin -i SAM_4235.AVI -i silence.flac -filter_complex "[0:1][1:0] amix=inputs=2:duration=longest[a]" -t 20.767 -sample_fmt s32 -ar 48000 -map "[a]" -c:a flac -y test.flac 2>/dev/null
    $ mediainfo --Output="General;%Duration%" test.flac
    20767
    ```

    This also works, but I'm not sure how it might affect files with more than one audio stream. Alternatively, the silence and extracted audio can be merged as a separate step.
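
On the two open questions above, here is an untested sketch rather than a proposed patch. For audio that is longer than the video, the `-t` output option should already cap the output at the video length; chaining `atrim` before `apad` just makes both directions explicit. For files with several audio streams, the `a:0` stream specifier (or `[0:a:0]` inside `-filter_complex`) selects the first audio stream regardless of its absolute index, unlike `[0:1]`. The sketch assumes an ffmpeg build recent enough for `apad` to support `whole_dur`; the function names are hypothetical.

```shell
# fit_audio_filter SECONDS -> filter chain that forces audio to exactly SECONDS:
# atrim cuts anything past the target, apad fills with silence up to it.
fit_audio_filter() {
  printf 'atrim=end=%s,apad=whole_dur=%s\n' "$1" "$1"
}

# extract_fitted_audio INPUT SECONDS OUTPUT
# Same encoding flags as the commands above; -t stays as an extra guard.
extract_fitted_audio() {
  ffmpeg -nostdin -i "$1" -map a:0 -af "$(fit_audio_filter "$2")" \
    -t "$2" -sample_fmt s32 -ar 48000 -c:a flac -y "$3"
}

# Example (not run here):
#   extract_fitted_audio SAM_4235.AVI 20.767 test.flac
```

Because `whole_dur` gives `apad` a finite target length, this variant does not need `-shortest`.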

I guess another possible solution would be to generate a silence clip that's the full length of the final video and merge the extracted audio clips to it at the correct positions. Or something else I'm not thinking of, I just trial&error'd my way through this with different ffmpeg flags.
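
The full-length-silence idea could look roughly like this: delay each extracted clip to its start position with `adelay`, then `amix` everything onto the silence track. This is only a sketch of the filter graph, not how Editly does it. The helper `build_mix_filter` and the file names are invented, the start offsets (in milliseconds) would have to come from Editly's own timeline, and it assumes an ffmpeg build where `adelay` supports `all=1`.

```shell
# build_mix_filter START_MS... -> filter_complex string
# Input 0 is assumed to be the full-length silence track; inputs 1..N are the
# extracted clip audio files, in the same order as the start offsets.
build_mix_filter() {
  graph=''
  labels='[0:a]'
  i=1
  for start_ms in "$@"; do
    # Shift clip i so that it begins at start_ms (all=1 delays every channel)
    graph="${graph}[${i}:a]adelay=delays=${start_ms}:all=1[d${i}];"
    labels="${labels}[d${i}]"
    i=$((i + 1))
  done
  # duration=first keeps the output exactly as long as the silence track
  printf '%samix=inputs=%s:duration=first[a]\n' "${graph}${labels}" "$i"
}

# Example (not run here), two clips starting at 0 s and 20.767 s:
#   ffmpeg -nostdin -i silence.flac -i clip1.flac -i clip2.flac \
#     -filter_complex "$(build_mix_filter 0 20767)" \
#     -map "[a]" -c:a flac -y mixed.flac
```

One caveat: as far as I can tell `amix` scales its inputs down by default, so the mixed clips may come out quieter unless that is compensated for (newer ffmpeg versions have a `normalize` option for this).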
