Untangling Frame Rates for a Worldwide High Quality Internet TV Service

The Input Frame Rate [1]

This is the frame rate for the input video stream or file in frames per second. In input files or streams, the rate may be buried in "metadata", informing a display system how to play it at the correct average rate. Playing onto a screen at this rate gives "live" playback (i.e. not "fastmo" or "slowmo").

The Converted Frame Rate [2]

This is the frame rate of the frame rate converter's output: playing the converted video onto a screen at this rate gives live playback. Note: this value is NOT the same as the speed at which the converter can physically generate these converted frames (see [4] below)! For help with understanding the algorithmic process of frame rate conversion, follow this link:

frame rate conversion

The Legato command line option:

  -r <converted_frame_rate>:<input_frame_rate>

specifies the integer ratio corresponding to [2] / [1], which lets the server control this aspect of the frame rate converter. Note: the frame rate processing that converts from the input frame rate [1] to the converted frame rate [2] depends only on this ratio. It is purely an algorithmic process, with some similarities to other types of signal resampling. However, frame rate conversion is much more complex and compute intensive.
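As an illustration of how a server might derive this ratio, here is a minimal Python sketch. The helper name rate_ratio_option is hypothetical, and reducing the ratio to lowest terms is an assumption; the text above only says that the option takes an integer ratio corresponding to [2] / [1].

```python
from fractions import Fraction

def rate_ratio_option(converted_rate, input_rate):
    """Build a '-r' option string from the converted rate [2] and input rate [1].

    Hypothetical helper: the ratio is reduced to lowest terms, which is an
    assumption about what the converter accepts.
    """
    ratio = Fraction(converted_rate) / Fraction(input_rate)
    return f"-r {ratio.numerator}:{ratio.denominator}"

# 25 fps input converted for a "60" Hz (60000/1001) display
print(rate_ratio_option(Fraction(60000, 1001), 25))  # -r 2400:1001
# 25 fps input converted for a 50 Hz display
print(rate_ratio_option(50, 25))                     # -r 2:1
```

Exact fractions are used rather than floating point so that rates such as 60000/1001 do not pick up rounding error.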

The Refresh Frame Rate [3]

This is the rate at which the client device updates its own display. It is usually a fixed rate for each device. It is assumed that a client device knows its own display refresh rate. For large Internet TV displays, it goes by region: 50Hz in Europe, and "60" (actually 60,000/1001=59.94...) Hz in North America. For some cell phones, it may be a custom, much lower rate. In any case, as a client of the content server, the client device can inform the server of its refresh rate when it requests video.

The Render Rate [4]

The render rate is the target number of new frames per second that the server sends to the client device. This render rate is embedded in the served video as metadata sent back to the client. The client display device then does its best to render the video based on its refresh rate. For example, at a refresh rate of [3] = 60Hz, smooth results can be obtained by repeating each render frame a small, fixed integer number of times n = 1, 2, or 3 at the refresh rate, i.e. [4] = [3] / n. Bigger display devices generally need higher render rates (i.e. n=1, so [4]=60) in order to reduce motion judder. On a cell phone display, judder may be acceptable with a render rate lower than the refresh rate.
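The relation [4] = [3] / n can be tabulated with a short Python sketch. The function name and the default limit of three repetitions are illustrative choices taken from the example above, not part of Legato.

```python
from fractions import Fraction

def render_rate_candidates(refresh_rate, max_repeats=3):
    """Map each repetition count n to the render rate [4] = [3] / n."""
    refresh = Fraction(refresh_rate)
    return {n: refresh / n for n in range(1, max_repeats + 1)}

for n, rate in render_rate_candidates(60).items():
    print(f"n={n}: render rate {rate} Hz")
# n=1: render rate 60 Hz
# n=2: render rate 30 Hz
# n=3: render rate 20 Hz
```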

The client device is then supposed to attempt to render at this average rate, and should do a good job if the server uses the correct calculations above. The value of n may depend on server load, network bandwidth, free service versus paid subscription, or other considerations. In any case, the client player must be able to play the video at the metadata render rate sent by the server. In addition, if n is not an integer, low frequency components may become visible on the client display, which the eye also interprets as judder. The Legato option:

  -O <render_numerator>:<render_denominator>

(uppercase 'O', not zero) defines the output render rate metadata as an integer fraction. If the denominator above is equal to 1 (i.e. the render rate is an integer number of Hz), then the ':1' can be omitted.
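A minimal Python sketch of formatting this option, including the rule that ':1' can be omitted, might look like the following. The helper name render_option is hypothetical.

```python
from fractions import Fraction

def render_option(render_rate):
    """Format the render rate [4] as a '-O' option string.

    The ':1' suffix is dropped when the rate is a whole number of Hz,
    as the text above allows.
    """
    rate = Fraction(render_rate)
    if rate.denominator == 1:
        return f"-O {rate.numerator}"
    return f"-O {rate.numerator}:{rate.denominator}"

print(render_option(Fraction(60000, 1001)))  # -O 60000:1001
print(render_option(30))                     # -O 30
```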

Bringing it all Together

The client device usually buffers some number of frames prior to starting playback. Buffering is an attempt to keep playback smooth despite temporary pauses in Internet delivery. Once the client device starts to play, it can request more frames from the server if its buffer gets too empty, and it can temporarily stop the server from sending new video data if its buffer gets too full. The important thing is that the average rate of frame delivery corresponds to the render rate, and that the buffer is big enough that it does not frequently empty or overflow. Long network connection latencies or reduced data throughput may cause problems with buffering.

If the server cannot create converted frames at the average rate needed by the display device, then the buffer will empty and the playback will freeze or judder.
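To get a feel for how quickly this happens, here is a deliberately simplified Python sketch. It assumes constant server and render rates and ignores network jitter; the function and parameter names are hypothetical.

```python
from fractions import Fraction

def seconds_until_underrun(prebuffered_frames, server_rate, render_rate):
    """Estimate how long playback lasts before the client buffer empties.

    Simplified model: the buffer drains at (render_rate - server_rate)
    frames per second. Returns None if the server keeps up.
    """
    deficit = Fraction(render_rate) - Fraction(server_rate)
    if deficit <= 0:
        return None  # server is fast enough; no underrun in this model
    return Fraction(prebuffered_frames) / deficit

# 120 frames buffered, server produces 50 fps, client renders 60 fps
print(seconds_until_underrun(120, 50, 60))  # 12
print(seconds_until_underrun(120, 60, 60))  # None
```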

If the server is much faster than the render rate to the client, then the server will spend more time waiting for the client device to send a request for more data. This is a good thing - it allows the server to perform other tasks, such as serving other clients, while appearing to give the client its undivided attention.

As part of its initial request, the client could specify a slowmo value. A slowmo value of 2 means the video plays at half speed, and a slowmo of 1/2 means it plays at double speed. If a render rate has been chosen by the server, then:

  converted_rate [2] = slowmo * render_rate [4]
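The formula above can be checked with a short Python sketch; the function name is illustrative.

```python
from fractions import Fraction

def converted_rate(slowmo, render_rate):
    """Converted rate [2] = slowmo * render rate [4]."""
    return Fraction(slowmo) * Fraction(render_rate)

# Half-speed playback at a 30 Hz render rate needs frames converted to 60 fps
print(converted_rate(2, 30))               # 60
# Double-speed playback at a 60 Hz render rate needs only 30 fps
print(converted_rate(Fraction(1, 2), 60))  # 30
```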

For live video streaming, slowmo should be 1 (i.e. the converted rate [2] is chosen to equal the render rate [4]) because:
a) the input video cannot be told to stop or change, and
b) the client display render rate cannot easily change without reprogramming the frame rate converter, potentially introducing judder or causing client buffer over/underflow.

In a high quality live video streaming situation, [2] = [4] = [3] at the server. Furthermore, if the input video metadata is already at the desired converted rate, then [1] = [2], and no frame rate conversion is needed! The computational load is a lot less in this (hopefully common) case, allowing a server to handle many more typical requests than its GPU limits would normally impose. For file input, the slowmo value can be anything you like, without any buffer over/underflow considerations.

What happens without frame rate conversion?

The input frame rate [1] may be sent to the client's device directly. If the input frame rate matches the client's display refresh rate, fine. Otherwise the display device drops or duplicates frames to match the average input rate buried in the sent video metadata. This causes visible judder, and may waste bandwidth if the client system drops frames.
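The drop and duplicate patterns can be made concrete with a small Python sketch. It assumes the display simply shows the most recent input frame at each refresh tick; real devices may choose frames differently, and the function name is hypothetical.

```python
import math
from fractions import Fraction

def shown_input_frames(input_rate, refresh_rate, ticks):
    """Index of the input frame shown at each display refresh tick.

    Assumes a sample-and-hold display: each tick shows the latest input
    frame whose timestamp is at or before the tick time.
    """
    step = Fraction(input_rate) / Fraction(refresh_rate)
    return [math.floor(k * step) for k in range(ticks)]

# 24 fps on a 60 Hz display: frames repeat unevenly (3, 2, 3, 2, ...)
print(shown_input_frames(24, 60, 10))  # [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
# 60 fps on a 50 Hz display: frame 5 is never shown
print(shown_input_frames(60, 50, 6))   # [0, 1, 2, 3, 4, 6]
```

The uneven repetition in the first case and the skipped frame in the second are the sources of the judder described above.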
