Description
I'm using 3D LUTs that span a large dynamic range and need trilinear interpolation for correct results (I'm happy to provide details of my use case if that's helpful).
Due to the nature of the LUTs, they should provide sufficiently accurate results even at relatively low LUT resolutions. However, much to my surprise, there is significant banding when OCIO applies them on the GPU.
The only way to mitigate this is to crank the LUT resolution extremely high, and even then especially dark colors exhibit some subtle banding. The CPU results, on the other hand, are correct even with the low-resolution LUT.
Looking into OCIO's source, this seems to be the culprit:
OpenColorIO/src/OpenColorIO/ops/lut3d/Lut3DOpGPU.cpp
Lines 221 to 236 in b008779
```cpp
// Trilinear interpolation
// Use texture3d and GL_LINEAR and the GPU's built-in trilinear algorithm.
// Note that the fractional components are quantized to 8-bits on some
// hardware, which introduces significant error with small grid sizes.
ss.newLine() << ss.float3Decl(name + "_coords")
             << " = (" << shaderCreator->getPixelName() << ".zyx * "
             << ss.float3Const(dim - 1) << " + "
             << ss.float3Const(0.5f) + ") / "
             << ss.float3Const(dim) << ";";
ss.newLine() << shaderCreator->getPixelName() << ".rgb = "
             << ss.sampleTex3D(name, name + "_coords") << ".rgb;";
}
shaderCreator->addToFunctionShaderCode(ss.string().c_str());
```
It's even noted in the comment that this can be an issue on some GPUs. For the record, I'm using an NVIDIA RTX 3060, so not exactly niche hardware.
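To put rough numbers on the quantization (this is my own back-of-the-envelope arithmetic, not a measurement of any particular GPU):

```cpp
#include <cassert>

// Assumption: the sampler quantizes the interpolation weight to 8 bits,
// as the comment in Lut3DOpGPU.cpp warns can happen. Each LUT cell can
// then produce at most 256 distinct outputs, so a dim-sized grid gives
// at most 256 * (dim - 1) distinct levels per axis across the full range.
int MaxDistinctLevels(int dim)
{
    return 256 * (dim - 1);
}
```

For a 17^3 LUT that's 4096 levels, about 12 bits, and those steps are spaced uniformly in the LUT's input encoding. With linear or wide-dynamic-range data, only a handful of them land in the darks, which would be consistent with the banding being worst there.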
I think it would make sense to replace the sampleTex3D() call with hand-written trilinear interpolation code, to ensure that it executes with full floating-point precision and gives correct results on all platforms. It might be a small perf hit, but for something like OCIO I suspect correctness and consistency are more important here.
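As a sketch of the math the hand-written path would compute (CPU-side C++ for clarity; the `Lut3D`/`SampleTrilinear` names are my own, not OCIO's, and an actual patch would emit equivalent GLSL through the shader creator):

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <vector>

// Minimal dim^3 RGB lattice (dim >= 2), indexed red-fastest.
struct Lut3D
{
    int dim;
    std::vector<std::array<float, 3>> data; // dim*dim*dim entries

    const std::array<float, 3> & at(int r, int g, int b) const
    {
        return data[(b * dim + g) * dim + r];
    }
};

// Manual trilinear interpolation with the fractional offsets kept in
// full float precision, instead of the sampler's fixed-point weights.
// Inputs are assumed normalized to [0, 1].
std::array<float, 3> SampleTrilinear(const Lut3D & lut, float r, float g, float b)
{
    const float scale = float(lut.dim - 1);
    const float x = r * scale, y = g * scale, z = b * scale;

    // Lower lattice corner, clamped so corner+1 stays in range.
    const int x0 = std::min(int(x), lut.dim - 2);
    const int y0 = std::min(int(y), lut.dim - 2);
    const int z0 = std::min(int(z), lut.dim - 2);
    const float fx = x - x0, fy = y - y0, fz = z - z0;

    std::array<float, 3> out{};
    for (int c = 0; c < 3; ++c)
    {
        // Interpolate along x, then y, then z.
        const float c00 = lut.at(x0, y0,     z0    )[c] * (1 - fx) + lut.at(x0 + 1, y0,     z0    )[c] * fx;
        const float c10 = lut.at(x0, y0 + 1, z0    )[c] * (1 - fx) + lut.at(x0 + 1, y0 + 1, z0    )[c] * fx;
        const float c01 = lut.at(x0, y0,     z0 + 1)[c] * (1 - fx) + lut.at(x0 + 1, y0,     z0 + 1)[c] * fx;
        const float c11 = lut.at(x0, y0 + 1, z0 + 1)[c] * (1 - fx) + lut.at(x0 + 1, y0 + 1, z0 + 1)[c] * fx;
        const float c0 = c00 * (1 - fy) + c10 * fy;
        const float c1 = c01 * (1 - fy) + c11 * fy;
        out[c] = c0 * (1 - fz) + c1 * fz;
    }
    return out;
}
```

The same eight fetches and lerps translate directly to GLSL (`texelFetch` on the 3D texture plus `mix`), which is why I'd expect only a modest perf hit.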
Would a PR along those lines be accepted?