Video transcoding is the process of converting an input video file into multiple versions with different formats, resolutions, and bitrates.
For instance, if a creator uploads a 4K video (2160p), the system transcodes it into resolutions like 1080p, 720p, 480p, and 360p. These versions are prepared to suit different devices and network speeds.
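As a rough illustration of the example above, the set of output resolutions can be derived from the source height. This is only a sketch: the STANDARD_HEIGHTS list and the target_heights function are hypothetical names, and the rule "only produce versions below the source height" is an assumption taken from the 4K example, not a description of any particular platform.

```python
# Hypothetical list of standard output heights, highest first.
STANDARD_HEIGHTS = [2160, 1080, 720, 480, 360]


def target_heights(source_height):
    """Return the standard heights strictly below the source height.

    Assumption: the platform never upscales, so a 1080p upload
    does not get a 2160p version.
    """
    return [h for h in STANDARD_HEIGHTS if h < source_height]


print(target_heights(2160))  # heights generated for a 4K upload
print(target_heights(1080))  # heights generated for a Full HD upload
```

A real pipeline would typically also keep a version at the source resolution and attach a bitrate to each height.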
Video Resolution and Aspect Ratios
Resolution refers to the number of pixels in a video frame, defining its clarity and detail. The most common resolutions and their dimensions (assuming a 16:9 aspect ratio) include:
- 2160p (4K): 3840×2160 pixels
- 1080p (Full HD): 1920×1080 pixels
- 720p (HD): 1280×720 pixels
- 480p (SD): 854×480 pixels
- 360p: 640×360 pixels
The “p” indicates progressive scan, where the entire frame is displayed in one pass, as opposed to interlaced scan (denoted by “i”), which alternates between odd and even lines of the frame. Progressive scan provides smoother motion and better clarity, making it the standard for modern video streaming.
The aspect ratio, typically 16:9 for videos, determines the width-to-height relationship of the frame. For example:
- A 720p video (16:9) has dimensions of 1280×720 pixels.
- A 1080p video (16:9) has dimensions of 1920×1080 pixels.
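For a given vertical resolution, a different aspect ratio, such as 4:3 (used in older television broadcasts) or 21:9 (ultrawide screens), changes the frame width but not its height. For example, a 4:3 video with 1080 lines is 1440×1080 pixels rather than 1920×1080.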
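The relationship between height, aspect ratio, and width can be checked with a little arithmetic. The sketch below uses a hypothetical frame_width helper; rounding the width to the nearest even number is an assumption (many encoders expect even dimensions), and it is what turns 853.33 into the 854 listed for 480p.

```python
def frame_width(height, aspect_w=16, aspect_h=9):
    """Width in pixels for a given height and aspect ratio.

    Assumption: the result is rounded to the nearest even number.
    """
    exact = height * aspect_w / aspect_h
    return 2 * round(exact / 2)


for height in (2160, 1080, 720, 480, 360):
    print(f"{height}p at 16:9 -> {frame_width(height)}x{height}")

# Same height, different aspect ratio: only the width changes.
print(f"1080p at 4:3 -> {frame_width(1080, 4, 3)}x1080")
```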
Adaptive Bitrate Streaming and Bandwidth Optimization
Transcoding also involves creating bitrate-optimized versions of the video, enabling platforms to deliver the best possible quality based on the user’s available bandwidth. This is achieved through adaptive bitrate streaming, where the video is divided into small chunks (e.g., 5-second segments) for each resolution and bitrate.
For example, a video may have chunks encoded at:
- 1080p at 6 Mbps (high bandwidth)
- 720p at 3 Mbps (medium bandwidth)
- 360p at 0.5 Mbps (low bandwidth)
When a user plays the video, the streaming client selects, chunk by chunk, the version that best fits the current network conditions. If bandwidth drops, the player switches to a lower-bitrate version at the next chunk boundary, which reduces the risk of buffering.
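The per-chunk selection can be sketched as a simple rule: pick the highest-bitrate version that fits within the measured bandwidth. This is a minimal illustration, not how any specific player works. The LADDER values come from the example above, while pick_rendition, the 0.8 headroom factor, and the sample bandwidth measurements are assumptions made for the sketch.

```python
# Versions from the example above, highest bitrate first: (label, Mbps).
LADDER = [("1080p", 6.0), ("720p", 3.0), ("360p", 0.5)]


def pick_rendition(bandwidth_mbps, headroom=0.8):
    """Pick the best version whose bitrate fits the usable bandwidth.

    Assumption: only a fraction (headroom) of the measured bandwidth
    is treated as usable, to leave a safety margin.
    """
    budget = bandwidth_mbps * headroom
    for label, bitrate in LADDER:
        if bitrate <= budget:
            return label
    # Nothing fits: fall back to the lowest-bitrate version.
    return LADDER[-1][0]


# Hypothetical bandwidth measurements (Mbps), one per chunk.
measured = [10.0, 8.0, 3.9, 1.2, 7.6]
choices = [pick_rendition(b) for b in measured]
print(choices)
```

Real players usually also consider how much video is already buffered, so they do not switch on every short-lived dip.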
How Transcoded Formats Are Chosen
The choice of formats, resolutions, and bitrates is based on:
- Device Capability: Different devices have limitations on the resolutions they can display. For example, a smartphone might display up to 1080p, while a 4K TV supports 2160p.
- Screen Size: Smaller screens, like those on mobile phones, benefit less from higher resolutions because the extra pixels are hard to distinguish on a small display at a normal viewing distance.
- Network Conditions: Users on fast connections can stream higher resolutions and bitrates, while users on slower connections are served lower-quality versions to ensure smooth playback.
- Content Type: Videos with fast motion (e.g., sports) require higher bitrates to avoid artifacts, while static content (e.g., slideshows) can be encoded at lower bitrates.
For example:
- A user with a 4K TV and a fiber connection might stream at 2160p.
- A user on a smartphone with a 3G connection might stream at 360p.
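The two examples above combine a device limit with a network limit. A minimal sketch of that combination follows; the Rendition class, the select function, the 16 Mbps figure for 2160p, and the bandwidth numbers for "fiber" and "3G" are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    height: int          # vertical resolution, e.g. 1080 for 1080p
    bitrate_mbps: float  # encoded bitrate in megabits per second


# Assumed ladder; the 2160p bitrate is a placeholder value.
LADDER = [
    Rendition(2160, 16.0),
    Rendition(1080, 6.0),
    Rendition(720, 3.0),
    Rendition(360, 0.5),
]


def select(device_max_height, bandwidth_mbps):
    """Highest version the device can show and the network can carry."""
    candidates = [
        r for r in LADDER
        if r.height <= device_max_height and r.bitrate_mbps <= bandwidth_mbps
    ]
    if not candidates:
        # Nothing fits both limits: fall back to the smallest version.
        return min(LADDER, key=lambda r: r.height)
    return max(candidates, key=lambda r: r.height)


print(select(2160, 100.0).height)  # 4K TV on an assumed 100 Mbps fiber link
print(select(1080, 1.0).height)    # smartphone on an assumed 1 Mbps 3G link
```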
Why Mbps Is Used in Streaming
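Streaming services use Mbps (megabits per second) because it is the standard unit for network speed and directly reflects how much data can be transferred per second over a connection. Expressing video bitrates in the same unit makes them easy to compare with a connection's speed: a 6 Mbps stream needs a connection that sustains at least 6 Mbps. Note that Mbps counts bits, not bytes; since one byte is 8 bits, 8 Mbps corresponds to 1 megabyte per second.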
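To make the unit concrete, the amount of data behind a bitrate can be worked out directly: megabits per second times seconds gives megabits, and dividing by 8 gives megabytes. The helper name megabytes_for below is hypothetical, and the 5-second chunk length is taken from the earlier example.

```python
def megabytes_for(bitrate_mbps, seconds):
    """Data volume in megabytes for a stream at a constant bitrate.

    Mbps is megabits per second, so divide by 8 to convert to bytes.
    """
    megabits = bitrate_mbps * seconds
    return megabits / 8


# One 5-second chunk at each bitrate from the earlier example.
for label, mbps in [("1080p", 6.0), ("720p", 3.0), ("360p", 0.5)]:
    print(f"{label}: {megabytes_for(mbps, 5)} MB per 5-second chunk")

# One hour of 1080p at 6 Mbps.
print(f"1 hour at 6 Mbps: {megabytes_for(6.0, 3600)} MB")
```

These figures assume a constant bitrate; real encodes vary around their target, so actual chunk sizes differ somewhat.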