Run Open Source FFmpeg at Lower Cost and with Better Performance on a VT1 Instance for VOD Encoding Workloads

February 28, 2023 By Mark Otto

FFmpeg is an open source tool commonly used by media technology companies to encode and transcode video and audio formats. FFmpeg users can leverage a cost efficient Amazon Web Services (AWS) instance for their video on demand (VOD) encoding workloads now that AWS offers VT1 support on Amazon Elastic Compute Cloud (Amazon EC2).

VT1 offers improved visual quality for 4K video, support for a newer version of FFmpeg (4.4), expanded OS/kernel support, and bug fixes. These instances are powered by the AMD-Xilinx Alveo U30 media accelerator. Xilinx can add a single line to FFmpeg to enable the Alveo U30 to do the transcoding work. The Xilinx Video SDK includes an enhanced version of FFmpeg that communicates with the hardware-accelerated transcode pipeline in Xilinx devices to deliver up to 30% lower cost per stream than Amazon EC2 GPU-based instances and up to 60% lower cost per stream than Amazon EC2 CPU-based instances.

Companies typically use EC2 CPU instances such as the C5 and C6 coupled with FFmpeg for their VOD encoding workloads. These workloads can be costly in cases where companies encode thousands of VOD assets. The cost of an EC2 workload is influenced by the number of concurrent encoding jobs that an instance can support and this subsequently affects the time it takes to encode targeted outputs. As VOD libraries expand, companies typically auto scale to increase the size or number of C5 and C6 instances or allow the instances to operate longer. In both cases, these workloads experience an increase in cost. Important note: There is no additional charge for AWS Auto Scaling. You pay only for the AWS resources needed to run your applications and Amazon CloudWatch monitoring fees.

Amazon EC2 VT1 instances are designed to accelerate real-time video transcoding and deliver low-cost transcoding for live video streams. VT1 is also a cost effective and performance-enhancing alternative for VOD encoding workloads. Using FFmpeg as the transcoding tool, AWS performed an evaluation of VT1, C5, and C6 instances to compare price performance and speed of encode for VOD assets. When compared to C5 and C6 instances, VT1 instances can achieve up to 75% cost savings. The results show that you could operate two VT1 instances for the price of one C5 or C6 instance. Additionally, the  .

Benchmarking Method

First, let’s determine the best instance type to use for our VOD workload. C5 and C6 instances are commonly used for transcoding. We used C5.9xl and C6i.8xl instances and compared them against VT1.3xl instances to transcode 4K and 1080p VOD assets. The two assets were encoded into the output targets shown in the Evaluation ABR targets section below. For each of the VT1.3xl, C5.9xl, and C6i.8xl instances, we measured the time it took to complete the encode of all output targets.

As shown here in the screenshot from the AWS console for various instance types, VT1.3xl is the smallest instance type in the VT1 family. Even though VT1.6xl compares closely to C5 and C6 in terms of CPU/memory, we chose VT1.3xl for a closer price/performance comparison.

VT1 instance types

VT1 family instance type comparison to C5 in terms of CPU/memory.

Input data points

Sample input content

The following table summarizes the key parameters of the source video files used to measure encoding performance in the benchmark.

| Clip Name | Frame Count | Duration | Frame Rate | Codec | Resolution | Chroma Sampling |
|---|---|---|---|---|---|---|
| 1080p | 43092 | 12 mins | 60 | H.264, High Profile | 1920×1080 | 4:2:0 YUV |
| 4K | 776 | 13 secs | 60 | H.264, High Profile | 3840×2160 | 4:2:0 YUV |

Evaluation Adaptive Bit Rate (ABR) targets

Adaptive bitrate streaming (ABR or ABS) is a technology designed to stream files efficiently over HTTP networks. Multiple versions of the same content, encoded at different resolutions and bitrates, are offered to a user’s video player, and the client chooses the most suitable version to play back on the device. This involves transcoding a single input stream to multiple output formats optimized for different viewing resolutions.

For the benchmarking tests, the 4K and 1080p input files were transcoded to several target resolutions that support different device and network capabilities: 1080p, 720p, 540p, and 360p. The bitrate (br) in the graphic shown here is the bitrate associated with each target resolution. For example, a 4K input file was transcoded to 360p resolution at a bitrate of 640.

Figure 1: Adaptive bitrate ladder used to transcode output video files. Source: https://developer.att.com/video-optimizer/docs/best-practices/adaptive-bitrate-video-streaming
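
To make the player-side half of ABR concrete, here is a minimal Python sketch of how a client might pick a rendition from a ladder. It is an illustration only, not how any particular player works. The `Rendition` type, the `select_rendition` function, and every bitrate except the 360p/640 pair are hypothetical, and the kbps unit is an assumption because the article gives only the number.

```python
from typing import NamedTuple, Sequence


class Rendition(NamedTuple):
    name: str
    height: int
    bitrate_kbps: int  # unit is assumed; the article states only the number


def select_rendition(ladder: Sequence[Rendition], bandwidth_kbps: float) -> Rendition:
    """Return the highest-bitrate rendition that fits the measured bandwidth.

    Falls back to the lowest-bitrate rendition when nothing fits.
    """
    if not ladder:
        raise ValueError("ladder must contain at least one rendition")
    by_bitrate = sorted(ladder, key=lambda r: r.bitrate_kbps)
    chosen = by_bitrate[0]
    for rendition in by_bitrate:
        if rendition.bitrate_kbps <= bandwidth_kbps:
            chosen = rendition
    return chosen


# Only the 360p/640 rung comes from the article; the others are placeholders.
EXAMPLE_LADDER = [
    Rendition("1080p", 1080, 5000),  # hypothetical bitrate
    Rendition("720p", 720, 3000),    # hypothetical bitrate
    Rendition("540p", 540, 1500),    # hypothetical bitrate
    Rendition("360p", 360, 640),     # from the article's example
]

if __name__ == "__main__":
    print(select_rendition(EXAMPLE_LADDER, bandwidth_kbps=2000).name)
```

Real players also weigh buffer level, screen size, and switching history, so treat this as the simplest possible selection rule.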

Output results

Target duration analysis

For the 4K clip, the VT1.3xl instance completed the targeted encodes 15.709 seconds faster than the C5.9xl instance and 12.58 seconds faster than the C6i.8xl instance. The results in the following tables show that the VT1.3xl instance has better speed and price performance than the C5.9xl and C6i.8xl instances. The percentages are calculated as follows, where "C5 or C6" is the value for the instance being compared against VT1:

% Price Performance = ((C5 or C6 price performance – VT1 price performance) / (C5 or C6 price performance)) × 100

% Speed = ((C5 or C6 duration – VT1 duration) / (C5 or C6 duration)) × 100

H.264 4K Clip (13 seconds duration)

| Instance Type | VT1.3xl | C5.9xl | C6i.8xl |
|---|---|---|---|
| Codec | mpsoc_vcu_h264 | x264 | x264 |
| Duration to complete ABR targets, in seconds (see Evaluation ABR targets above) | 14.47632 | 30.186221 | 27.05856 |
| Speed Compared to VT1.3xl (%) | N/A | 52.043284 % | 46.500035 % |
| Instance Cost ($/hour) | $0.65 | $1.53 | $1.22 |
| Instance Cost ($/second) | $0.00018 | $0.00043 | $0.000338 |
| Price Performance: $/(clip transcoded) | $0.002605 | $0.01298 | $0.009145 |
| Price Performance Compared to VT1.3xl (%) | N/A | 79.930662 % | 71.514488 % |

H.264 1080p Clip (12 minutes duration)

| Instance Type | VT1.3xl | C5.9xl | C6i.8xl |
|---|---|---|---|
| Codec | mpsoc_vcu_h264 | x264 | x264 |
| Duration to complete ABR targets, in seconds (see Evaluation ABR targets above) | 490.82074 | 837.20001 | 762.63252 |
| Speed Compared to VT1.3xl (%) | N/A | 41.373538 % | 35.641252 % |
| Instance Cost ($/hour) | $0.65 | $1.53 | $1.22 |
| Instance Cost ($/second) | $0.00018 | $0.00043 | $0.000338 |
| Price Performance: $/(clip transcoded) | $0.0883 | $0.3599 | $0.2577 |
| Price Performance Compared to VT1.3xl (%) | N/A | 75.465407 % | 65.735351 % |
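
The percentages in the tables above can be reproduced with a few lines of Python. The sketch below applies the two formulas to the 4K clip, using the durations and per-second prices exactly as listed in the table. The function and variable names are ours and are only illustrative.

```python
def percent_improvement(baseline: float, vt1: float) -> float:
    """((baseline - VT1) / baseline) * 100, as in the formulas above."""
    return (baseline - vt1) / baseline * 100.0


def cost_per_clip(duration_s: float, price_per_second: float) -> float:
    """Price performance: dollars spent to transcode one clip."""
    return duration_s * price_per_second


# 4K clip values copied from the table (seconds, and $/second as rounded there).
DURATION_S = {"VT1.3xl": 14.47632, "C5.9xl": 30.186221, "C6i.8xl": 27.05856}
PRICE_PER_SECOND = {"VT1.3xl": 0.00018, "C5.9xl": 0.00043, "C6i.8xl": 0.000338}

COST = {
    name: cost_per_clip(DURATION_S[name], PRICE_PER_SECOND[name])
    for name in DURATION_S
}

if __name__ == "__main__":
    for name in ("C5.9xl", "C6i.8xl"):
        speed = percent_improvement(DURATION_S[name], DURATION_S["VT1.3xl"])
        price = percent_improvement(COST[name], COST["VT1.3xl"])
        print(f"{name}: speed {speed:.2f} %, price performance {price:.2f} %")
```

The price performance percentages computed this way differ from the table in the second decimal place, because the table rounds the per-clip cost before taking the ratio.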

The following section explains the encoding parameters used for testing. FFmpeg was installed on one C5.9xl, one C6i.8xl, and one VT1.3xl instance. The two input files described under Sample input content were transcoded in parallel on each instance type, and the total time to complete the transcoding to all output target resolutions was measured.
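
As a rough illustration of this kind of measurement, the Python sketch below starts several jobs at the same time and reports the wall-clock time until the last one finishes. It is not the harness used for the benchmark. The `time_parallel_jobs` helper is hypothetical, and the placeholder jobs are short Python sleeps so that the sketch runs without FFmpeg; in practice each entry would be a full FFmpeg command line.

```python
import subprocess
import sys
import time
from typing import Sequence


def time_parallel_jobs(commands: Sequence[Sequence[str]]) -> float:
    """Start every command at once; return wall-clock seconds until all exit."""
    start = time.perf_counter()
    procs = [subprocess.Popen(list(cmd)) for cmd in commands]
    return_codes = [proc.wait() for proc in procs]
    elapsed = time.perf_counter() - start
    failed = [rc for rc in return_codes if rc != 0]
    if failed:
        raise RuntimeError(f"{len(failed)} job(s) exited with a non-zero status")
    return elapsed


if __name__ == "__main__":
    # Stand-in jobs (assumption): replace with one FFmpeg command per input file.
    placeholder_jobs = [
        [sys.executable, "-c", "import time; time.sleep(0.2)"],
        [sys.executable, "-c", "import time; time.sleep(0.1)"],
    ]
    print(f"total duration: {time_parallel_jobs(placeholder_jobs):.3f} s")
```

Because the jobs overlap, the reported figure is the duration of the slowest job plus process start-up overhead, not the sum of the individual durations.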

Technical specifications

  • EC2 Instances: 1 x C5.9xl, 1 x C6i.8xl, 1 x VT1.3xl
  • Video framework: FFmpeg
  • Video codecs: x264 (CPU), XMA (Xilinx U30)
  • Quality objective: x264 faster
  • Operating system
    • For C5 and C6 (x264): Amazon Linux 2 (Linux kernel 4.14)
    • For VT1 (Xilinx): Amazon Linux 2 (Linux kernel 5.4.0-1038-aws)

Video codec settings for encoding performance tests

| Instance | C5, C6 | VT1 |
|---|---|---|
| Codec | x264 | mpsoc_vcu_h264 |
| Preset | Faster | n/a |
| Output Bitrate (CBR) | See Evaluation Targets above | See Evaluation Targets above |
| Chroma Subsampling | YUV 4:2:0 | YUV 4:2:0 |
| Color Bit Depth | 8 bits | 8 bits |
| Profile | High | n/a |
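
For readers who want to see how the CPU-side settings above map onto an FFmpeg command line, here is a hedged Python sketch that assembles one. It is not the command used in the benchmark, which the article does not list. The `build_x264_ladder_command` helper and the output file names are hypothetical, dropping audio with `-an` is our simplification, and pairing `-b:v` with `-minrate`, `-maxrate`, and `-bufsize` is a common way to approximate CBR with libx264, not a setting confirmed by the source. On VT1 the encoder would be `mpsoc_vcu_h264` instead, and the Xilinx Video SDK provides its own scaling options, which this sketch does not cover.

```python
import shlex
from typing import List, Sequence, Tuple

Rung = Tuple[int, int]  # (output height in pixels, bitrate in kbps; unit assumed)


def build_x264_ladder_command(input_path: str, rungs: Sequence[Rung]) -> List[str]:
    """Build one FFmpeg invocation that encodes every rung from a single decode."""
    if not rungs:
        raise ValueError("at least one rung is required")
    count = len(rungs)
    # Split the decoded video once, then scale each copy to its target height.
    split_labels = "".join(f"[v{i}]" for i in range(count))
    chains = [f"[0:v]split={count}{split_labels}"]
    for i, (height, _) in enumerate(rungs):
        chains.append(f"[v{i}]scale=-2:{height}[out{i}]")
    cmd = ["ffmpeg", "-y", "-i", input_path, "-filter_complex", ";".join(chains)]
    for i, (height, kbps) in enumerate(rungs):
        cmd += [
            "-map", f"[out{i}]",
            "-c:v", "libx264",
            "-preset", "faster",      # "Preset: Faster" from the table
            "-profile:v", "high",     # "Profile: High" from the table
            "-pix_fmt", "yuv420p",    # 8-bit YUV 4:2:0 from the table
            "-b:v", f"{kbps}k",
            "-minrate", f"{kbps}k",   # CBR approximation (assumption)
            "-maxrate", f"{kbps}k",
            "-bufsize", f"{2 * kbps}k",
            "-an",                    # video only (assumption)
            f"out_{height}p.mp4",     # hypothetical output name
        ]
    return cmd


if __name__ == "__main__":
    # Only the 360p/640 rung is taken from the article; 720p/3000 is a placeholder.
    print(shlex.join(build_x264_ladder_command("input_4k.mp4", [(720, 3000), (360, 640)])))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess` without shell quoting issues.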

Conclusion

Amazon EC2 VT1 instances are typically used for live, real-time encoding; however, this blog post demonstrates VT1's VOD encoding speed and price performance advantages when compared to C5 and C6 EC2 instances. VT1 instances can encode VOD assets up to 52% faster and achieve up to a 75% reduction in cost when compared to C5 and C6 instances. VT1 is best suited to VOD encoding workloads that need to minimize the time it takes to complete the output targets. Please visit the Amazon EC2 VT1 instances page for more details.