Bluenoise Generator

Background

Bluenoise textures are commonly used in rendering to effectively mask a lack of samples via dithering.
The method used to generate bluenoise here is a variant of void-and-cluster, first described by Robert Ulichney (1993).
I'd also recommend giving the following articles a read if you're interested in a bit more background:

Algorithm Overview

1. Seeding

In order to make void-and-cluster non-deterministic, you typically seed points.
This implementation doesn't seed points; instead, it seeds a very small amount of background energy, in the range [0, 1.17549421069e-38], into each pixel.
This satisfies the non-determinism requirement without noticeably biasing the points that actually get written, and it can be done in a single pass on the GPU.
void_and_cluster_init.frag
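
A rough sketch of what such a pass can look like (this is not the author's void_and_cluster_init.frag - the hash, uniform name and channel layout are assumptions for illustration):

    #version 300 es
    precision highp float;
    precision highp int;

    // Hypothetical seeding pass: writes a tiny hashed "background energy" to every
    // pixel so ties between otherwise identical voids are broken per-run, without
    // meaningfully competing with the energy splatted by real samples.
    uniform uint u_seed;   // assumed per-run seed supplied by the host
    out vec4 o_state;      // assumed layout: .x = accumulated energy, .y = written flag

    // Small integer hash (PCG-style output function).
    uint hash(uint v)
    {
        uint state = v * 747796405u + 2891336453u;
        uint word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
        return (word >> 22u) ^ word;
    }

    void main()
    {
        uvec2 p = uvec2(gl_FragCoord.xy);
        uint h = hash(p.x ^ hash(p.y ^ u_seed));

        // Map the hash to [0, 1) and scale it into [0, 1.17549421069e-38],
        // small enough not to noticeably bias which pixels get picked.
        float noise01 = float(h) * (1.0 / 4294967296.0);
        o_state = vec4(noise01 * 1.17549421069e-38, 0.0, 0.0, 0.0);
    }

Because the seed energy is bounded by such a tiny value, it only ever acts as a tie-breaker between pixels that would otherwise have identical energy.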

2. The Main Loop

Each pixel of the final image should end up with a unique value, with the first value written being 1 and the last value written 0.
e.g. if we have a 3x3 image, the values and their write order would be:

    [1.0, 0.875, 0.75, 0.625, 0.5, 0.375, 0.25, 0.125, 0.0]

We decide which pixel to write to by finding the biggest "void" that hasn't been written to yet.
In this case, I reduce the buffers down 8x8 pixels at a time (64x64 => 8x8 => 1x1), keeping track of the biggest void's coordinate.
void_and_cluster_reduce_init.frag
void_and_cluster_reduce_iter.frag
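
As a sketch of the idea (again not the author's reduce shaders - the state layout and outputs are assumed), the first reduction pass can scan one 8x8 tile per output texel and keep the lowest-energy unwritten pixel, i.e. the biggest void in that tile:

    #version 300 es
    precision highp float;

    // Hypothetical first reduction pass: each output texel scans an 8x8 tile of the
    // energy buffer and emits the lowest-energy pixel that hasn't been written yet,
    // along with its coordinate. Repeating this (64x64 => 8x8 => 1x1) leaves a single
    // texel holding the coordinate of the biggest void in the whole image.
    uniform sampler2D u_state;   // assumed layout: .x = energy, .y = 1.0 once written
    out vec4 o_best;             // .x = lowest energy found, .yz = its pixel coordinate

    void main()
    {
        ivec2 tileOrigin = ivec2(gl_FragCoord.xy) * 8;
        float bestEnergy = 3.402823e38;   // ~FLT_MAX sentinel
        vec2  bestCoord  = vec2(-1.0);

        for (int y = 0; y < 8; ++y)
        {
            for (int x = 0; x < 8; ++x)
            {
                ivec2 p = tileOrigin + ivec2(x, y);
                vec4 s = texelFetch(u_state, p, 0);

                // Only unwritten pixels compete; the lowest accumulated energy
                // among them is the biggest void.
                if (s.y == 0.0 && s.x < bestEnergy)
                {
                    bestEnergy = s.x;
                    bestCoord  = vec2(p);
                }
            }
        }

        o_best = vec4(bestEnergy, bestCoord, 0.0);
    }

The iteration pass would look essentially the same, but compares the already-reduced .x values and carries the .yz coordinates through instead of recomputing them.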

The chosen coordinate is then fed into an update routine, where the energy of that pixel and its surrounding pixels is updated.
The amount of "energy" a pixel contributes to itself and its neighbours is:

    E(x) = exp(-x² / σ²)

Where x is the distance between pixels.

[Images: σ = 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8, 3.0]

void_and_cluster_partial_update.vert
void_and_cluster_update.frag
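
A sketch of the splat itself (not the author's update shaders - uniform names, the wrap-around distance, and the additive blend are assumptions, and writing the chosen pixel's final value and written flag is left out): the quad positioned by the vertex shader just evaluates the kernel above for every covered pixel and accumulates it, e.g. with gl.blendFunc(gl.ONE, gl.ONE):

    #version 300 es
    precision highp float;

    // Hypothetical energy splat: for each pixel covered by the update region,
    // deposit exp(-x^2 / sigma^2) of energy, where x is the distance to the
    // newly chosen pixel. Accumulated via additive blending.
    uniform vec2  u_chosen;   // coordinate picked by the reduction step
    uniform vec2  u_size;     // texture dimensions (for wrap-around distance)
    uniform float u_sigma;
    out vec4 o_state;

    void main()
    {
        // Wrap-around distance so the resulting texture tiles (assumption).
        vec2 d = abs(gl_FragCoord.xy - 0.5 - u_chosen);
        d = min(d, u_size - d);

        float x2 = dot(d, d);
        o_state = vec4(exp(-x2 / (u_sigma * u_sigma)), 0.0, 0.0, 0.0);
    }

The partial-update vertex shader (not sketched here) would then only need to position a quad of (2·span + 1) pixels over the chosen coordinate, which is where the span discussed in the next section comes in.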

3. Region To Update

3.a Correct version

We can derive a "true" bounds, given some target accuracy, as the fraction of the total energy that lands inside the update window:

    accuracy = (∫[-u,u] ∫[-u,u] E dy dx) / (∫[-∞,∞] ∫[-∞,∞] E dy dx)

Where u is the span of pixels to update (per axis).

Feeding in the kernel, exp(-(x² + y²) / σ²), gives us (it separates into the product of two identical 1D integrals):

    accuracy = (∫[-u,u] exp(-x² / σ²) dx)² / (∫[-∞,∞] exp(-x² / σ²) dx)²
             = (σ√π · erf(u / σ))² / (σ√π)²

We can therefore derive an accuracy percentage by:

    accuracy = erf(u / σ)²

And to get a usable span:

    u = ceil(σ · ierf(√accuracy))
e.g:

              multiplier         |                update span
    --------------------------------------------------------------------------
    ierf(sqrt(0.5))     ≈ 0.744  | [-ceil(0.744 * sigma), ceil(0.744 * sigma)]
    ierf(sqrt(0.6))     ≈ 0.857  | [-ceil(0.857 * sigma), ceil(0.857 * sigma)]
    ierf(sqrt(0.7))     ≈ 0.986  | [-ceil(0.986 * sigma), ceil(0.986 * sigma)]
    ierf(sqrt(0.8))     ≈ 1.144  | [-ceil(1.144 * sigma), ceil(1.144 * sigma)]
    ierf(sqrt(0.9))     ≈ 1.377  | [-ceil(1.377 * sigma), ceil(1.377 * sigma)]
    ierf(sqrt(0.99))    ≈ 1.981  | [-ceil(1.981 * sigma), ceil(1.981 * sigma)]
    ierf(sqrt(0.999))   ≈ 2.456  | [-ceil(2.456 * sigma), ceil(2.456 * sigma)]
    ierf(sqrt(0.9999))  ≈ 2.862  | [-ceil(2.862 * sigma), ceil(2.862 * sigma)]
    ierf(sqrt(0.99999)) ≈ 3.223  | [-ceil(3.223 * sigma), ceil(3.223 * sigma)]

    With a sigma of 1.9:
    0.5     => [-2, 2]
    0.6     => [-2, 2]
    0.7     => [-2, 2]
    0.8     => [-3, 3]
    0.9     => [-3, 3]
    0.99    => [-4, 4]
    0.999   => [-5, 5]
    0.9999  => [-6, 6]
    0.99999 => [-7, 7]

That being said, anything less than 99.99% very quickly becomes unusable, and as a UI control the value is quite unintuitive.
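
To turn the table above into code, an inverse error function is needed; it isn't built into GLSL, so an approximation such as Winitzki's does the job. The helpers below are illustrative sketches, not part of the linked shaders:

    // Winitzki's approximation of the inverse error function (accurate to a few
    // parts in a thousand, which is plenty here).
    float erfinv(float x)
    {
        const float a = 0.147;
        float ln1mx2 = log(1.0 - x * x);
        float b = 2.0 / (3.14159265359 * a) + 0.5 * ln1mx2;
        return sign(x) * sqrt(sqrt(b * b - ln1mx2 / a) - b);
    }

    // Section 3.a: accuracy = erf(u / sigma)^2  =>  u = ceil(sigma * ierf(sqrt(accuracy))).
    float updateSpan(float sigma, float accuracy)
    {
        return ceil(sigma * erfinv(sqrt(accuracy)));
    }

For example, updateSpan(1.9, 0.99) comes out at 4.0, matching the [-4, 4] row above.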

3.b More intuitive version

If we flip the problem around and instead ask what the greatest span is that we can reach before floating-point precision prevents anything from being written at all, we just need to solve:

    exp(-u² / σ²) = δ

Where δ is a very small number (e.g. the smallest positive subnormal float, 2^-149) and u would be the maximum viable span.

From which we can derive:

    u = σ · √(-ln(δ))

With δ = 2^-149 this becomes σ · √(149 · ln 2), and we arrive at:

    span = 10.16262416423198497411 * sigma

And can better control the accuracy via:

    span = ceil(10.16262416423198497411 * sigma * accuracy)

Being linear, this feels more intuitive; artefacts don't really start showing up until below 50%, and the span covered could also be reduced as the written pixel values decrease (although that hasn't been implemented here).
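
As a tiny illustrative helper (again not from the linked code), the 3.b span is just that constant scaled by σ and the accuracy slider; the magic number is simply √(-ln(2^-149)) = √(149 · ln 2) precomputed:

    // Largest span at which exp(-x^2 / sigma^2), for sigma = 1, still produces a
    // representable (subnormal) float, scaled linearly by an accuracy control.
    // 10.16262416423198497411 == sqrt(149.0 * log(2.0)).
    float maxViableSpan(float sigma, float accuracy)
    {
        return ceil(10.16262416423198497411 * sigma * accuracy);
    }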

Example of the first 64 chosen samples using different accuracies, with σ=1.9:

[Images: acc = 0.1, 0.2, 0.3, 0.4, 0.5, 1.0]

As a note, at least using WebGL2 (with only fragment shaders) on a PC, the time saved by limiting the region seems negligible. If we were updating regions using a compute shader this might be a different story, but even then it's unlikely we'd ever really saturate the available waves without being bottlenecked by the reduction step first.


The full JavaScript/WebGL2 stuff can be viewed here: void_and_cluster_webgl2.js.
It's worth mentioning that I primarily program in C++/HLSL/GLSL, which is probably apparent to anyone more familiar with JS.

Also for funsies here is a shadertoy port: https://www.shadertoy.com/view/cdfSD8