Editing immersive video with spatial audio

Immersive video with spatial audio must be edited with care so that the audio, and its spatial relationship to the video, survives the edit. This video walks through the necessary steps using Adobe Premiere Pro and the FB360 Encoder.

Categories: Skills & Principles
Tags: 360 Video, Audio Editing, Post Production, Video Editing
Skill Level:

Read Time: 5 Minutes

Updated 09/15/2022


Adobe Premiere Pro or other NLE that supports multichannel audio tracks

FB360 Encoder

Immersive video with spatial audio


Conventional videos can be edited in any video editor, but immersive videos need special handling. At a minimum, the edited video must be injected with metadata, and for videos that contain spatial audio, the audio must be extracted before the edit and preserved through it.

For example, the Facebook 360 audio format (TBE8.2) consists of 8 channels of spatial audio plus an optional 2 channels of headlocked audio. The .mkv container supports many channels per track, but .mp4 containers support a maximum of 4 per track. As a result, the .mp4 files uploaded to Meta Quest Media Studio carry their 10 channels split into a 4+4+2 arrangement. It is possible to import a video file with audio in this format into a tool such as Adobe Premiere Pro and work out a way to export the spatial audio and headlocked audio separately, but it is better to extract the audio first and treat the extracted files as new masters. Here is an overview of the full process:

  1. Extract spatial audio and headlocked audio
  2. Put the video and extracted audio on a timeline in Adobe Premiere Pro or another NLE that supports multichannel audio tracks
  3. Do your edit
  4. Export video, spatial audio, and headlocked tracks separately
  5. Use FB360 Encoder to encode audio, mux, and tag with appropriate metadata
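Before starting step 1, it can help to confirm that a source file really carries its audio in the 4+4+2 arrangement described above. The following is a minimal sketch, not part of the FB360 tooling: check_layout is a hypothetical helper name, and the usage line assumes that ffprobe (shipped with FFmpeg) is installed and that the audio streams appear in the order spatial, spatial, headlocked.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a file's audio streams follow the 4+4+2 arrangement.
# check_layout is a hypothetical helper. It reads one channel count per line
# on stdin (one line per audio stream, in stream order).

check_layout() {
  local layout
  # Join the per-stream channel counts with "+", e.g. "4+4+2".
  layout=$(paste -sd+ -)
  if [ "$layout" = "4+4+2" ]; then
    echo "ok: $layout"
  else
    echo "unexpected layout: ${layout:-none}"
    return 1
  fi
}

# Usage (assumes ffprobe is on PATH and input.mp4 exists):
#   ffprobe -v error -select_streams a -show_entries stream=channels \
#     -of csv=p=0 "input.mp4" | check_layout
```

A file that reports anything other than 4+4+2 may use a different stream order or lack the headlocked track, in which case the stream indexes used during extraction would need adjusting.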

FFmpeg commands

The FB360 Audio Extraction script can be downloaded from https://github.com/facebookincubator/. It uses the following ffmpeg commands:

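# Merge the two 4-channel streams into one 8-channel spatial audio file
ffmpeg -i "input.mp4" -filter_complex "[0:1][0:2]amerge=inputs=2" "tbe_8.wav"

# Merge all three audio streams into one 10-channel file (spatial plus headlocked)
ffmpeg -i "input.mp4" -filter_complex "[0:1][0:2][0:3]amerge=inputs=3" "tbe_8.2.wav"

# Copy out the 2-channel headlocked stream on its own
ffmpeg -i "input.mp4" -map 0:3 "headlocked.wav"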
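The three commands above can be wrapped in a small shell function so that they run together for one input file. This is an illustrative sketch rather than the contents of the FB360 Audio Extraction script: the function names run and extract_fb360_audio and the DRY_RUN variable are assumptions introduced here, and the output file names simply reuse the ones shown above.

```shell
#!/usr/bin/env bash
# Sketch: run the three extraction commands for a single input file.
# run, extract_fb360_audio, and DRY_RUN are hypothetical names for illustration.

# Execute a command, or just print it when DRY_RUN is set.
# (The printed preview omits the original quoting.)
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

extract_fb360_audio() {
  local input="$1"
  if [ -z "$input" ]; then
    echo "usage: extract_fb360_audio <input.mp4>" >&2
    return 2
  fi

  # 8-channel spatial audio: merge the two 4-channel streams.
  run ffmpeg -i "$input" -filter_complex "[0:1][0:2]amerge=inputs=2" "tbe_8.wav" || return 1

  # 10-channel file: spatial audio plus the headlocked pair.
  run ffmpeg -i "$input" -filter_complex "[0:1][0:2][0:3]amerge=inputs=3" "tbe_8.2.wav" || return 1

  # Headlocked stereo on its own.
  run ffmpeg -i "$input" -map 0:3 "headlocked.wav" || return 1
}

# Usage:
#   DRY_RUN=1 extract_fb360_audio "input.mp4"   # preview the commands
#   extract_fb360_audio "input.mp4"             # run them (requires ffmpeg)
```

Previewing with DRY_RUN first makes it easy to check the stream indexes against the file before any output is written.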

This is a video walkthrough of the concepts and process: