Performance & Benchmark
Revision as of 05:56, 27 April 2021
If there is one thing CUVI gives you, it is a performance boost over competing libraries and solutions. Because CUVI runs on GPUs, its Imaging and Vision modules benefit fully from their inherently parallel algorithms. In addition to cutting the cost of CPU-based clusters, CUVI delivers up to a 15x speedup over Intel IPP.
Applications using CUVI are generally ten times faster than their CPU counterparts. The CUVI framework also makes it easy to scale an application across more than one GPU, so you can add GPUs to reach the speed you need.
Benchmark
The following benchmarks were measured with NVIDIA's performance tools on Windows and Linux. Each figure is a throughput in frames per second (fps), based only on the processing time on a single GPU. The benchmarks use 8-bit images, except where a function requires multiple images as input. The suffixes -D and -L denote desktop and laptop GPUs, respectively.
Function | 720p, GTX 1080-D | 720p, GTX 2060-L | 1080p, GTX 1080-D | 1080p, GTX 2060-L | 4k (3840x2160), GTX 1080-D | 4k (3840x2160), GTX 2060-L | 8k (7680x4320), GTX 1080-D | 8k (7680x4320), GTX 2060-L |
---|---|---|---|---|---|---|---|---|
Color Operations (fps) | | | | | | | | |
adjust | 9,478.67 | 6,804.97 | 3,837.30 | 2,646.75 | 1,036.70 | 773.81 | 266.68 | 204.30 |
autoColor | 13,793.10 | 9,938.47 | 5,760.37 | 4,205.68 | 1,536.10 | 1,209.18 | 392.84 | 316.55 |
borderMask | 26,720.11 | 20,384.87 | 11,828.72 | 8,466.54 | 2,888.34 | 2,077.94 | 696.01 | 494.56 |
channelMix | 20,927.94 | 16,801.08 | 9,416.46 | 7,642.46 | 2,370.36 | 1,975.85 | 644.93 | 497.26 |
channelSplit | 25,508.90 | 18,242.85 | 11,241 | 8,300.13 | 3,005.86 | 2,102.54 | 724.73 | 508.90 |
channelMerge | 22,643.90 | 13,510.59 | 10,694.39 | 5,939.97 | 2,661.41 | 1,474.96 | 635.85 | 365.76 |
colorPick | 27,060.67 | 17,351.47 | 10,812.80 | 7,873.52 | 2,950.60 | 2,078.21 | 740.74 | 526.04 |
dehaze | 7,434.94 | 5,430.06 | 3,577.82 | 2,450.98 | 898.47 | 629.62 | 227.71 | 156.44 |
Demosaic (DFPD) | 1,707.94 | | 412.72 | | 101.86 | | | |
Demosaic (Linear) | 4,258.88 | | 1,025.64 | | 234.66 | | | |
gammaCorrect | 10,893.48 | | 4,786.91 | | 1,247.10 | | 303.36 | |
gray2rgb | 31,131.31 | | 13,053.81 | | 3,670.72 | | 923.3 | |
histEq 8UC1 | 12,195.12 | | 5,882.35 | | 1,721.17 | | 500.75 | |
hsv2rgb | 22,472.92 | | 10,252.41 | | 2,601.40 | | 645.35 | |
rgb2hsv | 10,495.93 | | 4,641.45 | | 1,287.87 | | 328.59 | |
imageBinary | 39,880.36 | | 17,540.78 | | 4,613.99 | | 1,166.32 | |
rgb2Lab | 5,702.20 | | 2,442.90 | | 461.48 | | 158.94 | |
Lab2rgb | 8,615.56 | | 3,922.55 | | 1,015.96 | | 255.68 | |
logTransform | 9,478.67 | | 2,861.23 | | 745.74 | | 168.05 | |
lowlight | 5,813.95 | | 2,793.30 | | 677.97 | | 171.73 | |
blackGammaLUT | 10,961.55 | | 4,798.12 | | 1,479.60 | | 396.91 | |
LUT | 6,744.22 | | 3,105.69 | | 982.46 | | 288.77 | |
rgb2gray | 40,779.71 | | 16,732.76 | | 4,500.98 | | 1,106.40 | |
rgb2ycbcr | 25,690.43 | | 10,987.80 | | 2,721.55 | | 716.28 | |
ycbcr2rgb | 25,760.58 | | 11,001.95 | | 2,723.90 | | 785.62 | |
rgb2yuv | 26,668.80 | | 11,274.72 | | 2,786.93 | | 766.75 | |
yuv2rgb | 25,477.71 | | 11,258.98 | | 2,782.69 | | 708.79 | |
Geometry Transforms (fps) | | | | | | | | |
rotate | 14,520.32 | | 6,324.15 | | 1,564.89 | | 393.55 | |
rotateNoCrop | 13,156.86 | | 5,872.54 | | 1,479.81 | | 363.07 | |
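
As a rough aid for reading the fps figures, the Python sketch below converts them into per-frame processing time, pixel throughput, and a desktop-to-laptop ratio, using the adjust row as sample data. The helper names are hypothetical and not part of CUVI. The 720p and 1080p dimensions are assumed to be the standard 1280x720 and 1920x1080, which the table does not spell out.

```python
# Illustrative helpers only; none of these names come from the CUVI API.

# Frame sizes in pixels. The 4k and 8k sizes are from the table header;
# 1280x720 and 1920x1080 are assumed for "720p" and "1080p".
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}

# fps figures copied from the "adjust" row of the benchmark table.
ADJUST_FPS = {
    "GTX 1080-D": {"720p": 9478.67, "1080p": 3837.30, "4k": 1036.70, "8k": 266.68},
    "GTX 2060-L": {"720p": 6804.97, "1080p": 2646.75, "4k": 773.81, "8k": 204.30},
}


def frame_time_ms(fps):
    """Processing time per frame in milliseconds for a given fps figure."""
    return 1000.0 / fps


def gigapixels_per_second(fps, resolution):
    """Pixel throughput in gigapixels per second at the given resolution."""
    width, height = RESOLUTIONS[resolution]
    return fps * width * height / 1e9


def speedup(fps_a, fps_b):
    """How many times faster configuration A is than configuration B."""
    return fps_a / fps_b


if __name__ == "__main__":
    desktop = ADJUST_FPS["GTX 1080-D"]
    laptop = ADJUST_FPS["GTX 2060-L"]
    for res in RESOLUTIONS:
        print(
            f"adjust @ {res}: "
            f"{frame_time_ms(desktop[res]):.4f} ms/frame, "
            f"{gigapixels_per_second(desktop[res], res):.2f} Gpx/s on GTX 1080-D, "
            f"{speedup(desktop[res], laptop[res]):.2f}x the GTX 2060-L rate"
        )
```

Because the figures cover GPU processing time only, these derived numbers exclude host-to-device transfers and any other application overhead.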