Difference between revisions of "Performance & Benchmark"

From CUVI Wiki
(30 intermediate revisions by the same user not shown)
Line 8: Line 8:


==Benchmark==
==Benchmark==
The following benchmark is performed on NVIDIA GTX 1080 via Nsight for Performance tool on Windows 10 (64-bit) and CUDA toolkit version 9.1. Timing figure represents frames per second (fps) based on only the processing time on the single GPU. The benchmarks are performed on 8-bit images except if mentioned otherwise. The benchmarks for 16-bit demosaicDFPD on 1080p, 4k and 8k image are 1550fps, 412fps and 94fps.
The following benchmark was performed using NVIDIA's performance tools for Windows and Linux. Timing figures represent frames per second (fps), based only on the processing time on a single GPU. The benchmarks are performed on 8-bit images except where multiple images are required as input. The suffixes -L and -D in the GPU names denote laptop and desktop.


{|class="wikitable"
{|class="wikitable"
|-
|-
!  
!  
! 720p
!colspan="2"| 720p
! 1080p
!colspan="2"| 1080p
! 4k (3840x2160)
!colspan="2"| 4k (3840x2160)
! 8k (7680x4320)
!colspan="2"| 8k (7680x4320)
|-
|-
!  
!  
! GTX 1080
! GTX 1080-D
! GTX 1080
! GTX 2060-L
! GTX 1080
! GTX 1080-D
! GTX 1080
! GTX 2060-L
! GTX 1080-D
! GTX 2060-L
! GTX 1080-D
! GTX 2060-L
|-
|-
! Color Operations
! Color Operations
| fps
|colspan="8" style="text-align:center;"|fps
| fps
| fps
| fps
|-
|-
| [[Function:Adjust| adjust]]
| [[Function:Adjust| adjust]]
|9,478.67
|9,478.67
|6,804.97
|3,837.30
|3,837.30
|2,646.75
|1,036.70
|1,036.70
|773.81
|266.68
|266.68
|204.30
|-
|-
| [[Function:AutoColor| autoColor]]
| [[Function:AutoColor| autoColor]]
|13,793.10
|13,793.10
|9,938.47
|5,760.37
|5,760.37
|4,205.68
|1,536.10
|1,536.10
|1,209.18
|392.84
|392.84
|316.55
|-
|-
| [[Function:BorderMask| borderMask]]
| [[Function:BorderMask| borderMask]]
|26,720.11
|26,720.11
|20,384.87
|11,828.72
|11,828.72
|8,466.54
|2,888.34
|2,888.34
|2,077.93
|696.01
|696.01
|494.56
|-
|-
| [[Function:ChannelMix| channelMix]]
| [[Function:ChannelMix| channelMix]]
|20,927.94
|20,927.94
|16,801.08
|9,416.46
|9,416.46
|7,642.46
|2,370.36
|2,370.36
|1,975.85
|644.93
|644.93
|497.26
|-
| [[Function:ChannelSplit| channelSplit]]
|25,508.90
|18,242.85
|11,241
|8,300.13
|3,005.86
|2,102.54
|724.73
|508.90
|-
| [[Function:ChannelMerge| channelMerge]]
|22,643.90
|13,510.59
|10,694.39
|5,939.97
|2,661.41
|1,474.96
|635.85
|365.76
|-
| [[Function:ColorPick| colorPick]]
|27,060.67
|17,351.47
|10,812.80
|7,873.52
|2,950.60
|2,078.21
|740.74
|526.04
|-
| [[Function:Dehaze| dehaze]]
|7,434.94
|5,430.06
|3,577.82
|2,450.98
|898.47
|629.62
|227.71
|156.44
|-
|-
| [[Function:DemosaicDFPD|Demosaic (DFPD)]]
| [[Function:DemosaicDFPD|Demosaic (DFPD)]]
| 1707.94 fps
| 1707.94
| 412.72 fps
| 412.72
| 101.86 fps
| 101.86
|
|-
|-
| [[Function:Demosaic|Demosaic (Linear)]]
| [[Function:Demosaic|Demosaic (Linear)]]
| 4258.88 fps
| 4258.88
| 1025.64 fps
| 1025.64
| 234.66 fps
| 234.66
|
|-
| [[Function:gammaCorrect|gammaCorrect]]
|10,893.48
|4,786.91
|1,247.10
|303.36
|-
| [[Function:gray2rgb|gray2rgb]]
|31,131.31
|13,053.81
|3,670.72
|923.3
|-
|-
| [[Function:Lowlight| Low Light Enhancement]]
| [[Function:histEq|histEq 8UC1]]
| 2143.02 fps
|12,195.12
| 525.16 fps
|5,882.35
| 145.52 fps
|1,721.17
|500.75
|-
|-
| [[Function:Resize|Resize (2x - Nearest Neighbor)]]
| [[Function:hsv2rgb|hsv2rgb]]
| 4169.51 fps
|22,472.92
| 1048.44 fps
|10,252.41
| 260.164 fps
|2,601.40
|645.35
|-
|-
| [[Function:Resize|Resize (2x - Linear)]]
| [[Function:rgb2hsv|rgb2hsv]]
| 2494.80 fps
|10,495.93
| 613.65 fps
|4,641.45
| 151.53 fps
|1,287.87
|328.59
|-
|-
| [[Function:Resize|Resize (2x - Cubic)]]
| [[Function:hsv2rgb|hsv2rgb]]
| 1778.42 fps
|22,472.92
| 456.68 fps
|10,252.41
| 108.44 fps
|2,601.40
|645.35
|-
|-
| [[Function:Resize|Resize (0.5x - Nearest Neighbor)]]
| [[Function:imageBinary|imageBinary]]
| 47,265.68 fps
|39,880.36
| 12,396.48 fps
|17,540.78
| 3145.28 fps
|4,613.99
|1,166.32
|-
| [[Function:rgb2Lab|rgb2Lab]]
|5,702.20
|2,442.90
|461.48
|158.94
|-
| [[Function:Lab2rgb|Lab2rgb]]
|8,615.56
|3,922.55
|1,015.96
|255.68
|-
| [[Function:logTransform|logTransform]]
|9,478.67
|2,861.23
|745.74
|168.05
|-
| [[Function:lowlight|lowlight]]
|5,813.95
|2,793.30
|677.97
|171.73
|-
| [[Function:blackGammaLUT|blackGammaLUT]]
|10,961.55
|4,798.12
|1,479.60
|396.91
|-
| [[Function:LUT|LUT]]
|6,744.22
|3,105.69
|982.46
|288.77
|-
| [[Function:RGB2Gray|rgb2gray]]
|40,779.71
|16,732.76
|4,500.98
|1,106.40
|-
| [[Function:rgb2ycbcr|rgb2ycbcr]]
|25,690.43
|10,987.80
|2,721.55
|716.28
|-
| [[Function:ycbcr2rgb|ycbcr2rgb]]
|25,760.58
|11,001.95
|2,723.90
|785.62
|-
| [[Function:RGB2YUV|rgb2yuv]]
|26,668.80
|11,274.72
|2,786.93
|766.75
|-
| [[Function:YUV2RGB|yuv2rgb]]
|25,477.71
|11,258.98
|2,782.69
|708.79
|-
! Geometry Transforms
| fps
| fps
| fps
| fps
|-
|-
| [[Function:Resize|Resize (0.5x - Linear)]]
| [[Function:Rotate|rotate]]
| 26,365.05 fps
| 14,520.32
| 6793.71 fps
| 6,324.15
| 1703.32 fps
| 1,564.89
| 393.55
|-
|-
| [[Function:Resize|Resize (0.5x - Cubic)]]
| [[Function:RotateNoCrop|rotateNoCrop]]
| 11,232.92 fps
| 13,156.86
| 3143.94 fps
| 5,872.54
| 799.00 fps
| 1,479.81
| 363.07
|-
|-
|}
|}

Revision as of 20:12, 13 January 2021

If there is one thing CUVI gives you, it is a performance boost over competing libraries and solutions. Because the algorithms in the Imaging and Vision modules are inherently parallel, they benefit greatly from running on GPGPU hardware. In addition to cutting the cost of CPU-based clusters, CUVI delivers up to a 15x speedup over Intel IPP.

Applications using CUVI are generally ten times faster than their CPU counterparts. The CUVI framework also makes it easy to scale an application across more than one GPU, so throughput can be increased by adding GPUs.

Benchmark

The following benchmark was performed using NVIDIA's performance tools for Windows and Linux. Timing figures represent frames per second (fps), based only on the processing time on a single GPU. The benchmarks are performed on 8-bit images except where multiple images are required as input. The suffixes -L and -D in the GPU names denote laptop and desktop.

{|class="wikitable"
|-
!
!colspan="2"| 720p
!colspan="2"| 1080p
!colspan="2"| 4k (3840x2160)
!colspan="2"| 8k (7680x4320)
|-
!
! GTX 1080-D !! GTX 2060-L !! GTX 1080-D !! GTX 2060-L !! GTX 1080-D !! GTX 2060-L !! GTX 1080-D !! GTX 2060-L
|-
! Color Operations
|colspan="8" style="text-align:center;"| fps
|-
| adjust || 9,478.67 || 6,804.97 || 3,837.30 || 2,646.75 || 1,036.70 || 773.81 || 266.68 || 204.30
|-
| autoColor || 13,793.10 || 9,938.47 || 5,760.37 || 4,205.68 || 1,536.10 || 1,209.18 || 392.84 || 316.55
|-
| borderMask || 26,720.11 || 20,384.87 || 11,828.72 || 8,466.54 || 2,888.34 || 2,077.93 || 696.01 || 494.56
|-
| channelMix || 20,927.94 || 16,801.08 || 9,416.46 || 7,642.46 || 2,370.36 || 1,975.85 || 644.93 || 497.26
|-
| channelSplit || 25,508.90 || 18,242.85 || 11,241 || 8,300.13 || 3,005.86 || 2,102.54 || 724.73 || 508.90
|-
| channelMerge || 22,643.90 || 13,510.59 || 10,694.39 || 5,939.97 || 2,661.41 || 1,474.96 || 635.85 || 365.76
|-
| colorPick || 27,060.67 || 17,351.47 || 10,812.80 || 7,873.52 || 2,950.60 || 2,078.21 || 740.74 || 526.04
|-
| dehaze || 7,434.94 || 5,430.06 || 3,577.82 || 2,450.98 || 898.47 || 629.62 || 227.71 || 156.44
|-
| Demosaic (DFPD) || 1,707.94 || 412.72 || 101.86
|-
| Demosaic (Linear) || 4,258.88 || 1,025.64 || 234.66
|-
| gammaCorrect || 10,893.48 || 4,786.91 || 1,247.10 || 303.36
|-
| gray2rgb || 31,131.31 || 13,053.81 || 3,670.72 || 923.3
|-
| histEq 8UC1 || 12,195.12 || 5,882.35 || 1,721.17 || 500.75
|-
| hsv2rgb || 22,472.92 || 10,252.41 || 2,601.40 || 645.35
|-
| rgb2hsv || 10,495.93 || 4,641.45 || 1,287.87 || 328.59
|-
| imageBinary || 39,880.36 || 17,540.78 || 4,613.99 || 1,166.32
|-
| rgb2Lab || 5,702.20 || 2,442.90 || 461.48 || 158.94
|-
| Lab2rgb || 8,615.56 || 3,922.55 || 1,015.96 || 255.68
|-
| logTransform || 9,478.67 || 2,861.23 || 745.74 || 168.05
|-
| lowlight || 5,813.95 || 2,793.30 || 677.97 || 171.73
|-
| blackGammaLUT || 10,961.55 || 4,798.12 || 1,479.60 || 396.91
|-
| LUT || 6,744.22 || 3,105.69 || 982.46 || 288.77
|-
| rgb2gray || 40,779.71 || 16,732.76 || 4,500.98 || 1,106.40
|-
| rgb2ycbcr || 25,690.43 || 10,987.80 || 2,721.55 || 716.28
|-
| ycbcr2rgb || 25,760.58 || 11,001.95 || 2,723.90 || 785.62
|-
| rgb2yuv || 26,668.80 || 11,274.72 || 2,786.93 || 766.75
|-
| yuv2rgb || 25,477.71 || 11,258.98 || 2,782.69 || 708.79
|-
! Geometry Transforms
| fps || fps || fps || fps
|-
| rotate || 14,520.32 || 6,324.15 || 1,564.89 || 393.55
|-
| rotateNoCrop || 13,156.86 || 5,872.54 || 1,479.81 || 363.07
|}
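
Because the figures above are frames per second of GPU processing time only, it can help to convert them into per-frame latency and pixel throughput when comparing resolutions. The following is a minimal sketch in Python, not part of CUVI. The helper names are hypothetical, and the 720p and 1080p dimensions (1280x720 and 1920x1080) are assumed from the usual meaning of those labels, since the table only spells out the 4k and 8k sizes.

```python
# Frame sizes (width, height). The 4k and 8k sizes come from the table header;
# the 720p and 1080p sizes are assumed standard values.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}


def frame_time_ms(fps):
    """Per-frame processing time in milliseconds for a given fps figure."""
    return 1000.0 / fps


def gigapixels_per_second(fps, resolution):
    """Pixel throughput implied by an fps figure at the given resolution."""
    width, height = RESOLUTIONS[resolution]
    return fps * width * height / 1e9


# 'adjust' on the GTX 1080-D, copied from the benchmark table.
adjust_gtx1080_d = {
    "720p": 9478.67,
    "1080p": 3837.30,
    "4k": 1036.70,
    "8k": 266.68,
}

for resolution, fps in adjust_gtx1080_d.items():
    print(
        f"{resolution:>5}: {frame_time_ms(fps):.3f} ms/frame, "
        f"{gigapixels_per_second(fps, resolution):.2f} Gpixel/s"
    )
```

For the adjust row on the GTX 1080-D, the implied throughput stays between roughly 8 and 9 gigapixels per second at every resolution, which suggests that for this function fps scales approximately inversely with the number of pixels per frame.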