Editing immersive video with spatial audio must be done with care to preserve the audio and its spatial relationship with the video. This video walks through the necessary steps for doing this using Adobe Premiere and the FB360 Encoder.
Read Time: 5 Minutes
Adobe Premiere Pro or other NLE that supports multichannel audio tracks
Immersive video with spatial audio
Introduction
Conventional flat videos can be edited in any video editor, but immersive videos need special handling. At a minimum, the edited videos need to be injected with metadata, and videos that contain spatial audio need to have that audio extracted before the edit and preserved through it.
For example, the Facebook 360 audio format (TBE8.2) consists of 8 channels of spatial audio plus an optional 2 channels of headlocked audio. The .mkv container supports many channels per track, but .mp4 containers support a maximum of 4 per track. This means that the .mp4 files that are uploaded to Meta Quest Media Studio carry their 10 channels across three audio tracks in a 4+4+2 arrangement. It is possible to import a video file with audio in this format into a tool such as Adobe Premiere Pro and work around the track layout to export spatial audio and headlocked audio separately, but it is better to extract the audio and treat it as new masters. Here is an overview of the full process:
FFmpeg commands
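You can download the FB360 Audio Extraction script from https://github.com/facebookincubator/. The script uses the following ffmpeg commands, which extract the 8-channel spatial mix, the full 10-channel mix (spatial plus headlocked), and the 2-channel headlocked track, respectively: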
ffmpeg -i "input.mp4" -filter_complex "[0:1][0:2]amerge=inputs=2" "tbe_8.wav"
ffmpeg -i "input.mp4" -filter_complex "[0:1][0:2][0:3]amerge=inputs=3" "tbe_8.2.wav"
ffmpeg -i "input.mp4" -map 0:3 "headlocked.wav"
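The three commands above can be wrapped in a small shell sketch that first checks the input really has the 4+4+2 track layout. This is only an illustration, not the FB360 Audio Extraction script itself: the function names (build_amerge_filter, is_tbe_8_2_layout, probe_channels, extract_fb360_audio) are hypothetical, and the sketch assumes that stream 0:0 is the video and that streams 0:1 to 0:3 are the audio tracks, as in the commands above.

```shell
# Sketch only: helper names are hypothetical and not part of the FB360 script.
# Assumes stream 0:0 is video and 0:1, 0:2, 0:3 are the 4+4+2 audio tracks.

# Build an amerge filter graph that merges audio streams 0:1 .. 0:N.
build_amerge_filter() {
  local count="$1" labels="" i=1
  while [ "$i" -le "$count" ]; do
    labels="${labels}[0:${i}]"
    i=$((i + 1))
  done
  printf '%samerge=inputs=%d' "$labels" "$count"
}

# Succeed only if the per-track channel counts are exactly "4 4 2".
is_tbe_8_2_layout() {
  [ "$1" = "4 4 2" ]
}

# Print the channel count of every audio stream on one line, e.g. "4 4 2".
probe_channels() {
  ffprobe -v error -select_streams a \
    -show_entries stream=channels \
    -of default=noprint_wrappers=1:nokey=1 "$1" | paste -s -d ' ' -
}

# Run the three extraction commands after verifying the layout.
extract_fb360_audio() {
  local input="$1" layout
  layout="$(probe_channels "$input")"
  if ! is_tbe_8_2_layout "$layout"; then
    echo "Unexpected audio layout: '${layout}' (expected '4 4 2')" >&2
    return 1
  fi
  ffmpeg -i "$input" -filter_complex "$(build_amerge_filter 2)" "tbe_8.wav" &&
  ffmpeg -i "$input" -filter_complex "$(build_amerge_filter 3)" "tbe_8.2.wav" &&
  ffmpeg -i "$input" -map 0:3 "headlocked.wav"
}
```

After sourcing these functions, running extract_fb360_audio "input.mp4" would write the same three .wav files as the commands above. A file without headlocked audio (only two 4-channel tracks) is rejected by this sketch rather than handled, which keeps the example short.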
This is a video walkthrough of the concepts and process: