Difference between revisions of "Performance & Benchmark"

From CUVI Wiki

Revision as of 14:43, 26 October 2022

All timings were measured with NVIDIA's performance tools for Windows and Linux. Each figure is the execution time of a single kernel/function in milliseconds (rounded) on a single GPU. The benchmarks were run on color images with 8 bits per channel unless stated otherwise. The list below is a small subset of the 100+ features in CUVI.

{| class="wikitable"
|+ Kernel time in milliseconds (ms) with CUVI v1.8.0 on Jetson Xavier NX
! Algorithm / Image Size !! 720p !! 1080p !! 4k (3840x2160) !! 8k (7680x4320)
|-
| add - 2 Images || 0.29 || 0.61 || 2.04 || 8.61
|-
| channelMix || 0.27 || 0.61 || 2.31 || 9.02
|-
| demosaic || 1.87 || 2.3 || 9.17 || 36.74
|-
| demosaicDFPD || 2.33 || 4.96 || 19.07 || 77.75
|-
| gammaCorrect || 0.22 || 0.48 || 1.89 || 7.47
|-
| histEq - Single Channel || 0.68 || 0.92 || 3.24 || 9.20
|-
| LUT || 0.10 || 0.30 || 0.86 || 3.28
|-
| blackGammaLUT || 0.36 || 0.68 || 1.86 || 7.29
|-
| rgb2gray || 0.14 || 0.25 || 0.96 || 3.83
|-
| focusStack - Stacking 5 Images || 142.56 || 285.95 || 1103.14 || 4399.84
|-
| bitConversion - From 8 to 16 bits || 0.38 || 0.77 || 3.12 || 12.34
|-
| crop || 0.13 || 0.48 || 2.05 || 6.05
|-
| resize - Scale=2.0 || 0.85 || 1.90 || 7.57 || 30.32
|-
| rotate - Non Cropping, Angle = -3.76f || 0.23 || 0.49 || 1.90 || 7.64
|-
| warpPerspective || 0.24 || 0.68 || 2.26 || 9.38
|-
| imageFilter - 5x5 floating point window || 2.97 || 7.89 || 23.76 || 108.21
|-
| underwaterFilter || 1.57 || 3.49 || 13.6 || 47.39
|-
| haarFwd || 1.07 || 2.39 || 6.47 || 25.70
|}
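
To put the millisecond figures in perspective, a kernel time can be converted into pixel throughput and an upper bound on frame rate. The sketch below does this for the demosaic row on Jetson Xavier NX. It is only an illustration: the helper name `throughput` is hypothetical, 720p and 1080p are assumed to mean 1280x720 and 1920x1080, and the frame rate counts the kernel alone, ignoring memory transfers and any other stage.

```python
# Convert a kernel time in milliseconds into throughput figures.
# Assumption: 720p = 1280x720 and 1080p = 1920x1080 (the table only
# spells out the 4k and 8k resolutions).
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4k": (3840, 2160),
    "8k": (7680, 4320),
}


def throughput(resolution: str, kernel_ms: float) -> tuple[float, float]:
    """Return (megapixels per second, frames per second) for one kernel."""
    width, height = RESOLUTIONS[resolution]
    seconds = kernel_ms / 1000.0
    megapixels_per_s = (width * height) / seconds / 1e6
    frames_per_s = 1.0 / seconds
    return megapixels_per_s, frames_per_s


# demosaic timings on Jetson Xavier NX, copied from the table above
DEMOSAIC_MS = [("720p", 1.87), ("1080p", 2.3), ("4k", 9.17), ("8k", 36.74)]

for res, ms in DEMOSAIC_MS:
    mpix, fps = throughput(res, ms)
    print(f"demosaic {res:>5}: {mpix:8.1f} MP/s, {fps:6.1f} frames/s (kernel only)")
```

At 4k, for example, 9.17 ms corresponds to roughly 905 megapixels per second, or about 109 frames per second if demosaic were the only work done per frame.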
{| class="wikitable"
|+ Kernel time in milliseconds (ms) with CUVI v1.8.0 on RTX 2060
! Algorithm / Image Size !! 720p !! 1080p !! 4k (3840x2160) !! 8k (7680x4320)
|-
| add - 2 Images || 0.06 || 0.14 || 0.51 || 2.01
|-
| channelMix || 0.07 || 0.14 || 0.55 || 2.25
|-
| demosaic || 0.24 || 0.53 || 2.10 || 8.10
|-
| demosaicDFPD || 0.52 || 1.22 || 4.53 || 18.1
|-
| gammaCorrect || 0.12 || 0.28 || 1.02 || 4.30
|-
| histEq - Single Channel || 0.21 || 0.24 || 0.84 || 3.10
|-
| LUT || 0.03 || 0.08 || 0.29 || 1.20
|-
| blackGammaLUT || 0.069 || 0.16 || 0.61 || 2.50
|-
| rgb2gray || 0.04 || 0.09 || 0.34 || 1.43
|-
| focusStack - Stacking 5 Images || 25.77 || 55.86 || 221.60 || 605.53
|-
| bitConversion - From 8 to 16 bits || 0.01 || 0.24 || 0.95 || 3.81
|-
| crop || 0.04 || 0.12 || 0.41 || 1.70
|-
| resize - Scale=2.0 || 0.25 || 0.55 || 2.21 || 8.70
|-
| rotate - Non Cropping, Angle = -3.76f || 0.04 || 0.09 || 0.36 || 1.11
|-
| warpPerspective || 0.08 || 0.20 || 0.77 || 3.10
|-
| imageFilter - 5x5 floating point window || 0.65 || 1.56 || 5.81 || 13.7
|-
| underwaterFilter || 0.53 || 1.10 || 4.00 || 15.2
|-
| haarFwd || 0.14 || 0.30 || 1.21 || 4.90
|}
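
The two tables can be compared row by row to see how much faster the desktop card is than the embedded module. The following sketch computes that ratio for a few 4k entries. The function name `speedup` and the choice of rows are illustrative only, and the ratios say nothing about end-to-end application performance.

```python
# Kernel times in ms at 4k (3840x2160), copied from the two tables above.
XAVIER_NX_4K = {"demosaic": 9.17, "gammaCorrect": 1.89, "LUT": 0.86, "resize": 7.57}
RTX_2060_4K = {"demosaic": 2.10, "gammaCorrect": 1.02, "LUT": 0.29, "resize": 2.21}


def speedup(slow_ms: dict, fast_ms: dict) -> dict:
    """Ratio slow/fast for every algorithm present in both dictionaries."""
    return {name: slow_ms[name] / fast_ms[name] for name in slow_ms if name in fast_ms}


ratios = speedup(XAVIER_NX_4K, RTX_2060_4K)
for name, ratio in ratios.items():
    print(f"{name:>13}: RTX 2060 is about {ratio:.1f}x faster than Xavier NX at 4k")
```

For these rows the ratio ranges from about 1.9x (gammaCorrect) to about 4.4x (demosaic), so the gain depends on the algorithm rather than being a single constant factor.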

==Color Pipeline==

Let's take a typical color pipeline and see how it performs on one of the least powerful GPU modules: the Jetson Nano. A color pipeline almost always starts with the raw image. Before converting to RGB, you might want to do some processing on the raw data, which may include applying lookup tables, fixed-pattern noise (FPN) removal and white balance adjustment. Next comes debayering, followed by several further enhancements and a color space conversion to your desired format. This pipeline can run in real time on a decent entry-level GPU.

[[File:color_pipeline.png|400px|thumb|left|Color pipeline where each box represents a function]]
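
One way to reason about the real-time claim is to add up the per-stage kernel times and compare the total with the time available per frame. The sketch below does this under loud assumptions: no Jetson Nano timings are listed on this page, so it borrows the 1080p figures for Jetson Xavier NX; the stage selection is hypothetical (FPN removal and white balance are left out because no timing is listed for them, and channelMix stands in for a color adjustment step); and host/device transfers are ignored.

```python
# Hypothetical stage list using the 1080p kernel times (ms) measured on
# Jetson Xavier NX. These are NOT Jetson Nano measurements.
PIPELINE_MS = {
    "LUT": 0.30,           # lookup table on the raw image
    "demosaic": 2.3,       # debayer to RGB
    "gammaCorrect": 0.48,  # enhancement step
    "channelMix": 0.61,    # stand-in for a color adjustment step
}


def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at the target frame rate."""
    return 1000.0 / target_fps


def fits_budget(stages_ms: dict, target_fps: float) -> tuple[float, float, bool]:
    """Return (total kernel time, per-frame budget, whether the total fits)."""
    total = sum(stages_ms.values())
    budget = frame_budget_ms(target_fps)
    return total, budget, total <= budget


total, budget, ok = fits_budget(PIPELINE_MS, target_fps=30.0)
print(f"kernel time per frame: {total:.2f} ms, budget at 30 fps: {budget:.2f} ms, fits: {ok}")
```

With these borrowed numbers the kernels take about 3.69 ms of a 33.33 ms budget at 30 frames per second. A less powerful module such as the Jetson Nano would need its own measurements before drawing the same conclusion.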