Transcode your videos to keep Lucille Ball happy
14/30 - The difference between Constant Frame Rate and Variable Frame Rate, explained with physical comedy
So I make music videos: I should know that when you’re about to edit any video, the first task is to transcode the footage. Welp, now I know there’s a good reason.
But because I’m a rebel, I’m cool, ain’t nobody got time for that. I didn’t transcode. That’s loser nerd shit (also it takes so much fucking disk space).
Transcoding basically copies your footage from one file to another. It’s kind of like opening a box of chocolates, taking each individual chocolate and carefully placing it in a spot in a different but identical box. Or so I thought.
Transcoding is more complicated than that: it “demuxes” and “remuxes” the footage (strictly, it also decodes and re-encodes the pictures in between). Video, just like a zoetrope or a film strip, exploits the persistence of vision effect, so a video is made up of lots of individual still frames, usually shown at a rate of 24, 25, or 30 per second. That is called the frame rate. Actually it’s way more varied – but let’s keep it simple.
Demuxing or “de-multiplexing” pulls apart all the information stored in the file: the individual frames, but also the audio. You see, the audio and the video have to be stored together so that the right sound plays with the right frames and everything stays in sync. In the old days of Vitaphone they’d just play a record along with the film and hope for the best. Today, the audio is split into segments, each stamped with a time that can be represented by an SMPTE timecode, something like 01:45:12:14 (hours, minutes, seconds, frames), which pairs it to a video frame.
This is important because if you don’t give every bit of audio and every frame a timestamp you might end up with “dropped frames” – which can look like a stutter – or glitchy audio.
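To make the timecode idea concrete, here is a minimal Python sketch that turns a frame count into an hours:minutes:seconds:frames timecode and back again. The function names are made up for this post, and it assumes a whole-number frame rate like 25; the fiddly "drop-frame" timecode used with 29.97 fps is ignored entirely.

```python
def frame_to_timecode(frame_index: int, fps: int = 25) -> str:
    """Format a frame count as HH:MM:SS:FF (whole-number frame rates only)."""
    total_seconds, frames = divmod(frame_index, fps)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def timecode_to_frame(timecode: str, fps: int = 25) -> int:
    """Convert an HH:MM:SS:FF timecode back into a frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


print(frame_to_timecode(157814))         # 01:45:12:14
print(timecode_to_frame("01:45:12:14"))  # 157814
```

Every chocolate gets exactly one wrapper: one frame number maps to one timecode, and the other way round.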
Now if we’ve demuxed the video, we need to remux it. When it goes wrong, I think of it like that famous I Love Lucy scene: Lucy and Ethel are working in a chocolate factory, wrapping chocolates coming on a conveyer belt. What could possibly go wrong with these two wacky ladies? You can imagine each individual chocolate is a frame of video (or a packet of audio), and each wrapper is a specific (destination) timecode.
(Actually you can store multiple videos, audio tracks, and subtitle tracks in the same file – so you may be taking two frames from two different streams and wrapping them together, along with an audio packet, or an English Dub and the original superior Japanese audio track for you Otakus).
So far I’ve assumed that the frame rates are constant. This means that, like a mechanical metronome, a new timecode is released precisely 25 times a second, always with the same gap. One chocolate for every single wrapper. The conveyer belt operates at a constant speed.
Things get wild and wacky when you introduce a “variable frame rate”. Smartphones will often mux the video with a variable frame rate. There are many reasons for this: to save space, or perhaps you were filming at night so it needed to keep the shutter open longer to let in more light. When you play it back and there’s a lag or a jolt, that’s probably a dropped frame. You can imagine Lucy had two chocolates coming down the conveyer belt when she only had one wrapper, or hadn’t finished wrapping the last one, so she stuffed the extra choccie in her mouth, or her hat.
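If you want to see the difference in numbers, here is a small Python sketch that looks at the gaps between frame timestamps and decides whether the conveyer belt is running at a steady speed. The timestamps (in milliseconds) are invented for illustration; with real footage you would have to pull them out of the file with a probing tool first. The function names and the 1 ms tolerance are my own assumptions.

```python
def frame_gaps(timestamps_ms):
    """Return the gap between each frame and the next one."""
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def is_constant_frame_rate(timestamps_ms, tolerance_ms=1):
    """True when every gap is the same, give or take the tolerance."""
    gaps = frame_gaps(timestamps_ms)
    if not gaps:  # zero or one frame: nothing to compare
        return True
    return max(gaps) - min(gaps) <= tolerance_ms


# Made-up examples: a steady 25 fps clip and a jittery phone clip.
steady_timestamps_ms = [0, 40, 80, 120, 160]
phone_timestamps_ms = [0, 40, 95, 120, 133, 200]

print(is_constant_frame_rate(steady_timestamps_ms))  # True
print(frame_gaps(phone_timestamps_ms))               # [40, 55, 25, 13, 67]
print(is_constant_frame_rate(phone_timestamps_ms))   # False
```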
Dropped frames look bad, especially when you’re meant to be a professional like me.
It just so happens the last project I was working on was sourced from mobile phone footage with… you guessed it… a variable frame rate.
The Lucy homunculus that lives in my computer clearly gave up at one point – because my output video splashed a big red “Media Offline” image for a fraction of a second. How many chocolates did she have to stuff in her hat? It didn’t just drop a frame; it couldn’t even find a frame.
How could I have avoided this egregious bout of uncouth unprofessionalism? I could have transcoded the footage.
Rather than the conveyer belt speeding up and slowing down and randomly dropping chocolates on poor Lucy, it would stop until Lucy had finished wrapping the chocolate, and only move one chocolate (or a 25th of a second) at a time.
What happens if you have a frame that straddles between two timecodes? If we’re dealing with 25 frames per second and this one is sort of between frames 5 and 6, in addition to frames that perfectly fit 5 and 6, what happens?
Good question. You can actually choose what happens if you use the right software. For example, FFmpeg, which is both my savior and the bane of my existence, is a command-line program that can do just that. In fact it lets you do pretty much anything to video.
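For the curious, here is roughly what the most basic transcode looks like, sketched in Python so the pieces are easy to see. The file names are placeholders I made up, and this is a bare-bones starting point rather than a recommended editing workflow. The only options used are "-i" (the input file) and "-r" (which, as an output option, asks FFmpeg to duplicate or drop frames to hit a constant frame rate). The sketch only builds and prints the command; it does not run anything.

```python
import shlex


def build_transcode_command(source, destination, fps=25):
    """Build (but do not run) an ffmpeg command that forces a constant frame rate."""
    return [
        "ffmpeg",
        "-i", source,     # input file
        "-r", str(fps),   # output option: constant frame rate to conform to
        destination,      # output file; codec defaults are left to ffmpeg
    ]


# Hypothetical file names, purely for illustration.
command = build_transcode_command("phone_clip.mp4", "phone_clip_cfr.mp4")
print(shlex.join(command))  # ffmpeg -i phone_clip.mp4 -r 25 phone_clip_cfr.mp4
```

To actually run it you would pass the list to something like subprocess.run(command, check=True), which assumes FFmpeg is installed and on your PATH.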
To control straddling frames, it has an option called “vsync” (video synchronization, I guess? Newer versions call it “fps_mode”). If you set vsync to zero (also known as “passthrough”), it will hand every frame over with its original timecode untouched. There’s also a “drop” mode, which will disregard the original timecodes and just put every frame in order.
Won’t that mean there’ll be a sudden jump, since you might have three frames taken twice as close together as the ones around them? Yes.
If you don’t like that, you can set it to mode “1” (“cfr”), which will drop frames that straddle, and if there’s a timecode that is missing a frame, it will simply copy the previous one. No big ugly “Media Offline” errors, but it might cause slight stutters or frozen vision.
If you set it to “2” (“vfr”) it will keep the variable frame rate timings, only dropping a frame when two would land on the same timecode, leaving it for the next poor sap to worry about.
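The drop-and-repeat behaviour described for mode “1” can be sketched as a toy model in Python. To be clear, this is my own simplification and not FFmpeg's actual algorithm: each frame is snapped to the nearest 40 ms slot (25 fps), a frame that lands on an already-filled slot is dropped, and an empty slot repeats the previous frame. The function name and the timestamps are invented.

```python
def conform_to_constant_rate(timestamps_ms, interval_ms=40):
    """Toy model: snap frames onto fixed slots, dropping and repeating as needed.

    Returns (output, dropped, repeated_slots), where output lists which source
    frame is shown in each slot.
    """
    if not timestamps_ms:
        return [], [], []

    slot_to_frame = {}
    dropped = []
    for frame_number, timestamp in enumerate(timestamps_ms):
        # Nearest slot, using integer maths to avoid rounding surprises.
        slot = (timestamp + interval_ms // 2) // interval_ms
        if slot in slot_to_frame:
            dropped.append(frame_number)       # Lucy's hat
        else:
            slot_to_frame[slot] = frame_number

    output = []
    repeated_slots = []
    previous = None
    for slot in range(min(slot_to_frame), max(slot_to_frame) + 1):
        if slot in slot_to_frame:
            previous = slot_to_frame[slot]
        else:
            repeated_slots.append(slot)        # empty wrapper: reuse last chocolate
        output.append(previous)
    return output, dropped, repeated_slots


output, dropped, repeated_slots = conform_to_constant_rate([0, 40, 95, 120, 133, 200])
print(output)          # [0, 1, 2, 3, 3, 5]
print(dropped)         # [4]
print(repeated_slots)  # [4]
```

Frame 4 arrived too soon after frame 3 and got dropped, and the empty slot before frame 5 shows frame 3 again. That repeated frame is the slight stutter or frozen vision.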
Now, I haven’t even gotten into the whole fact that most video you see has at some point been encoded using H.264 (MPEG-4 Part 10 “Advanced Video Coding”), which means that most frames aren’t actually a frame at all; they’re more of a Frankenstein frame, often based on the Discrete Cosine Transform. That’s a process that breaks each frame into, say, 8x8 pixel blocks, then describes each block as a mix of cosine waves that matches the values of the block’s pixels as closely as it can. Then there are I-frames and P-frames, which determine how Frankenstein each frame is, and sometimes a time-travelling B-frame. Ugh.
The thing I’m trying to say is that even though it seems uncool, fellow kids: transcode your footage, especially on a major project. It will make the little Lucy in your machine much happier. It will also make you happier as you avoid unsightly glitches or error messages, and maybe even appear more professional.
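If the cosine business sounds like witchcraft, here is a tiny Python sketch of the idea. It is a plain, unnormalised one-dimensional DCT over a single row of 8 pixel values. Real encoders work on two-dimensional blocks and use their own variants of the transform, so treat this as an illustration of the principle only. The function name and pixel values are made up.

```python
import math


def dct_1d(samples):
    """Unnormalised DCT-II: how much of each cosine wave is in the samples."""
    count = len(samples)
    return [
        sum(
            value * math.cos(math.pi / count * (index + 0.5) * wave)
            for index, value in enumerate(samples)
        )
        for wave in range(count)
    ]


# A perfectly flat row of pixels: all the "energy" ends up in the first number.
flat_row = [100] * 8
coefficients = dct_1d(flat_row)
print(round(coefficients[0]))                     # 800
print([round(c, 6) + 0.0 for c in coefficients[1:]])
```

A flat row needs just one number to describe it, while a busy row needs many. That is why boring, smooth parts of a picture compress so well.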
