Demeler Deinterlacer Performance
The Demeler deinterlacer uses high-performance graphics cards to
achieve realtime performance for 1080i input at 60 or 50 frames/sec.
At this performance level, many host CPU system properties can restrict
performance, such as CPU clock rate, number of processors, cache size,
PCIe bandwidth, the number of PCIe lanes, system load, etc.
The amount of motion in the video also has an effect on GPU performance.
For realtime 1080i60 deinterlacing at a high quality level, we recommend (as of November 2013) an Intel i7
3930K processor (with 40 PCIe lanes) overclocked to 4.2GHz, three Titan graphics cards, and
one or more Western Digital VelociRaptor disk drives or Samsung SSD 830/840 drives for local video
storage or caching if needed.
Demeler File I/O and Bandwidth Issues for 1080i Deinterlacing
Output to disk drives may limit throughput for 1080p, even
for reduced chrominance bandwidth. This subject is discussed
in depth elsewhere. The output rates in the Demeler
throughput table below are for the full pipeline: reading
pre-compressed files from an SSD drive, piping the resulting
uncompressed output into Demeler, piping Demeler output through
y4mzip compression, and finally writing the compressed output to a
Samsung SSD 830 solid-state drive. In the table, we give
input fields/sec figures for our suite of 1080p test sequences
(used to provide interlaced input to Demeler
for testing) and for the listed card
configurations. Like many software-based algorithms, the processing
times for both compression and deinterlacing are image-content-dependent. The standard
deviation of throughput across our tests is about 6% of the average.
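To see why disk output can become the bottleneck, the uncompressed data rate of 1080p video can be estimated from the frame size and the chroma subsampling. The sketch below is only an illustration: it assumes 8-bit samples and the common 4:4:4, 4:2:2 and 4:2:0 layouts, and the function name and the use of decimal megabytes are our own choices, not part of Demeler.

```python
# Bytes per pixel for 8-bit video at common chroma subsampling layouts.
# Assumption: 8 bits per sample, one luma plane plus two chroma planes.
BYTES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def uncompressed_rate_mb_per_s(width, height, fps, chroma="4:2:0"):
    """Return the raw video data rate in decimal megabytes per second."""
    bytes_per_frame = width * height * BYTES_PER_PIXEL[chroma]
    return bytes_per_frame * fps / 1e6


if __name__ == "__main__":
    # Hypothetical example: 1080p output at 60 frames/sec.
    for chroma in ("4:4:4", "4:2:2", "4:2:0"):
        rate = uncompressed_rate_mb_per_s(1920, 1080, 60, chroma)
        print(f"1080p60 {chroma}: {rate:.1f} MB/s")
```

Under these assumptions, even 4:2:0 output at 1080p60 needs roughly 187 MB/s of sustained writes, which is one reason to compress the stream before it reaches the drive.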
The table below is for graphics cards as delivered (no further overclocking).
||Nr. of cards||Avg 1080p output frames/sec||
(a) 720i input field rates are about 2.2x the rates given for 1080i.
(b) At the time of writing, multiple GTX cards in SLI mode provide no
CUDA performance improvement over a single card.
Instead, the software automatically
detects the number of graphics cards and seamlessly partitions video
processing. Partitioning gives performance almost linear with the
number of graphics cards up to a CPU compute limit. We have verified
1080i deinterlacing performance using a host i7 3930K CPU
(6-core) overclocked to 4.2GHz, with 16 lanes of PCIe 2.X to each of two
cards, and have also tested a third card with just eight PCIe 2.X lanes.
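The scaling behaviour and the lane allocation described above can be sketched with a simple model. This is a rough illustration, not Demeler's actual scheduler: the function names, the per-card rate, the CPU ceiling and the 0.95 scaling efficiency are all hypothetical values chosen for the example. Only the 40-lane budget and the 16/16/8 lane split come from the text.

```python
def estimated_throughput(n_cards, per_card_fps, cpu_limit_fps, efficiency=0.95):
    """Model near-linear multi-GPU scaling capped by a host CPU ceiling.

    The first card contributes its full rate; each additional card is
    assumed (hypothetically) to contribute `efficiency` times that rate.
    """
    if n_cards < 1:
        raise ValueError("at least one graphics card is required")
    gpu_rate = per_card_fps * (1 + efficiency * (n_cards - 1))
    return min(gpu_rate, cpu_limit_fps)


def pcie_lanes_fit(lanes_per_card, cpu_lanes=40):
    """Check that the requested lanes per card fit the CPU's lane budget."""
    return sum(lanes_per_card) <= cpu_lanes


if __name__ == "__main__":
    # Hypothetical figures: 30 frames/sec per card, CPU ceiling of 80.
    for n in (1, 2, 3):
        print(n, "card(s):", estimated_throughput(n, 30, 80), "frames/sec")
    # The tested configuration: two cards at x16 and a third at x8.
    print("16/16/8 fits 40 lanes:", pcie_lanes_fit([16, 16, 8]))
```

In this model the third card is where the CPU ceiling starts to matter. The lane check shows why the third card runs at eight lanes: three x16 links would exceed the 40 lanes the processor provides.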