Performance & Benchmark

From CUVI Wiki
Revision as of 14:00, 26 October 2022 by Jawad

All timings were measured with NVIDIA's performance tools for Windows and Linux. Each figure is the execution time of the kernel/function in milliseconds (rounded) on a single GPU. The benchmarks use color images with 8 bits per channel unless noted otherwise.
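
To relate a kernel time to something more tangible, it can be converted into calls per second and megapixels per second. The sketch below is only an illustration: the helper name `throughput` and the `RESOLUTIONS` table are made up for this example and are not part of CUVI, and 720p and 1080p are assumed to mean 1280x720 and 1920x1080. It also assumes the listed figure covers kernel execution only, so the result is an upper bound that ignores host-to-device transfers and other overhead.

```python
# Resolutions named in the benchmark tables, as (width, height) in pixels.
# The 720p and 1080p sizes are the conventional ones (an assumption here).
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}


def throughput(kernel_ms, size):
    """Return (calls per second, megapixels per second) for one kernel time.

    Hypothetical helper for illustration; not a CUVI function.
    """
    width, height = RESOLUTIONS[size]
    seconds = kernel_ms / 1000.0
    calls_per_second = 1.0 / seconds
    megapixels_per_second = (width * height) / 1e6 / seconds
    return calls_per_second, megapixels_per_second


# gammaCorrect at 1080p on the Jetson Xavier NX is listed as 0.48 ms.
calls, mpix = throughput(0.48, "1080p")
print(f"{calls:.0f} calls/s, {mpix:.0f} MP/s")
```

Because the published times are rounded, the derived rates are approximate, especially for the very short kernels.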

Kernel time in milliseconds (ms) with CUVI v1.8.0 on Jetson Xavier NX

| Algorithm / Image Size | 720p | 1080p | 4k (3840x2160) | 8k (7680x4320) |
|---|---|---|---|---|
| add - 2 Images | 0.29 | 0.61 | 2.04 | 8.61 |
| channelMix | 0.27 | 0.61 | 2.31 | 9.02 |
| demosaic | 1.87 | 2.3 | 9.17 | 36.74 |
| demosaicDFPD | 2.33 | 4.96 | 19.07 | 77.75 |
| gammaCorrect | 0.22 | 0.48 | 1.89 | 7.47 |
| histEq - Single Channel | 0.68 | 0.92 | 3.24 | 9.20 |
| LUT | 0.10 | 0.30 | 0.86 | 3.28 |
| blackGammaLUT | 0.36 | 0.68 | 1.86 | 7.29 |
| rgb2gray | 0.14 | 0.25 | 0.96 | 3.83 |
| focusStack - Stacking 5 Images | 142.56 | 285.95 | 1103.14 | 4399.84 |
| bitConversion - From 8 to 16 bits | 0.38 | 0.77 | 3.12 | 12.34 |
| crop | 0.13 | 0.48 | 2.05 | 6.05 |
| resize - Scale=2.0 | 0.85 | 1.90 | 7.57 | 30.32 |
| rotate - Non Cropping, Angle = -3.76f | 0.23 | 0.49 | 1.90 | 7.64 |
| warpPerspective | 0.24 | 0.68 | 2.26 | 9.38 |
| imageFilter - 5x5 floating point window | 2.97 | 7.89 | 23.76 | 108.21 |
| underwaterFilter | 1.57 | 3.49 | 13.6 | 47.39 |
| haarFwd | 1.07 | 2.39 | 6.47 | 25.70 |

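
One way to read the Jetson Xavier NX figures is to check how the time grows when the pixel count quadruples from 4k to 8k. The following is a minimal sketch using a handful of rows copied from the table; the names `JETSON_MS` and `scaling_ratios` are invented for this example, and the interpretation of the ratios is an observation, not a claim made by the benchmark itself.

```python
# Kernel times in ms copied from the Jetson Xavier NX table: (4k, 8k).
JETSON_MS = {
    "add": (2.04, 8.61),
    "demosaic": (9.17, 36.74),
    "histEq": (3.24, 9.20),
    "imageFilter": (23.76, 108.21),
}

# An 8k frame holds four times as many pixels as a 4k frame.
PIXEL_RATIO = (7680 * 4320) / (3840 * 2160)


def scaling_ratios(times):
    """Map each algorithm to (8k time) / (4k time)."""
    return {name: t8k / t4k for name, (t4k, t8k) in times.items()}


ratios = scaling_ratios(JETSON_MS)
for name, ratio in ratios.items():
    print(f"{name:12s} 8k/4k time ratio = {ratio:.2f} (pixel ratio = {PIXEL_RATIO:.0f})")
```

A ratio near 4 suggests the time grows roughly in proportion to the pixel count, as for demosaic (about 4.01); histEq grows more slowly (about 2.84) and imageFilter somewhat faster (about 4.55).
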

Kernel time in milliseconds (ms) with CUVI v1.8.0 on RTX 2060

| Algorithm / Image Size | 720p | 1080p | 4k (3840x2160) | 8k (7680x4320) |
|---|---|---|---|---|
| add - 2 Images | 0.06 | 0.14 | 0.51 | 2.01 |
| channelMix | 0.07 | 0.14 | 0.55 | 2.25 |
| demosaic | 0.24 | 0.53 | 2.10 | 8.10 |
| demosaicDFPD | 0.52 | 1.22 | 4.53 | 18.1 |
| gammaCorrect | 0.12 | 0.28 | 1.02 | 4.30 |
| histEq - Single Channel | 0.21 | 0.24 | 0.84 | 3.10 |
| LUT | 0.03 | 0.08 | 0.29 | 1.20 |
| blackGammaLUT | 0.069 | 0.16 | 0.61 | 2.50 |
| rgb2gray | 0.04 | 0.09 | 0.34 | 1.43 |
| focusStack - Stacking 5 Images | 25.77 | 55.86 | 221.60 | 605.53 |
| bitConversion - From 8 to 16 bits | 0.01 | 0.24 | 0.95 | 3.81 |
| crop | 0.04 | 0.12 | 0.41 | 1.70 |
| resize - Scale=2.0 | 0.25 | 0.55 | 2.21 | 8.70 |
| rotate - Non Cropping, Angle = -3.76f | 0.04 | 0.09 | 0.36 | 1.11 |
| warpPerspective | 0.08 | 0.20 | 0.77 | 3.10 |
| imageFilter - 5x5 floating point window | 0.65 | 1.56 | 5.81 | 13.7 |
| underwaterFilter | 0.53 | 1.10 | 4.00 | 15.2 |
| haarFwd | 0.14 | 0.30 | 1.21 | 4.90 |
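
The two tables can be compared directly by dividing the Jetson Xavier NX time by the RTX 2060 time for the same algorithm and image size. The sketch below does this for a few 4k rows copied from the tables; `MS_4K` and `speedups` are hypothetical names used only for this illustration, and since the published times are rounded the results are approximate.

```python
# 4k kernel times in ms copied from the two tables:
# (Jetson Xavier NX, RTX 2060).
MS_4K = {
    "add": (2.04, 0.51),
    "demosaic": (9.17, 2.10),
    "resize": (7.57, 2.21),
    "focusStack": (1103.14, 221.60),
}


def speedups(times):
    """Map each algorithm to (Jetson time) / (RTX 2060 time)."""
    return {name: jetson / rtx for name, (jetson, rtx) in times.items()}


result = speedups(MS_4K)
for name, factor in result.items():
    print(f"{name:10s} RTX 2060 is about {factor:.1f}x faster at 4k")
```

For these rows the RTX 2060 comes out roughly 3.4x to 5.0x faster than the Jetson Xavier NX at 4k.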