
Part 1: ASTC Introduction and Bounded Integer Sequence Encoding

Read 3284 times | 2016-2-28 16:29 | Category: Reposted

https://community.arm.com/groups/arm-mali-graphics/blog/2012/06/28/arm-unveils-details-of-astc-texture-compression-at-hpg-conference--part-1

The internal details of ARM's Adaptive Scalable Texture Compression (ASTC) technology were launched this week at the High Performance Graphics conference in Paris, France.

Tom Olson presented his paper entitled "Adaptive Scalable Texture Compression", as part of the session on texture and appearance on Wednesday 27 June 2012.

This is the first of two blog posts giving an overview of the ASTC technology as presented in the paper.

Features and Quality

As a quick recap, ASTC offers a number of advantages over existing texture compression schemes:

· It is flexible, allowing bit rates from 8 bits per pixel (bpp) down to less than 1 bpp. This allows content developers to fine-tune the tradeoff of space against quality.

· It supports from 1 to 4 color channels, together with modes for uncorrelated channels for use in mask textures and normal maps.

· It supports both low dynamic range (LDR) and high dynamic range (HDR) images.

· It supports both 2D and 3D images.

· All of these features are interoperable. You can choose any combination that suits your needs.

Despite all this flexibility, quality is universally better than existing texture compression schemes for LDR images, and is comparable to the de facto industry standard for HDR.

How Does It Work?

ASTC, like all current texture compression schemes, divides the image into fixed-size blocks. These blocks cover a fixed-size "footprint" in the texture image, and are encoded using a fixed number of bits. This feature makes it possible to access texels quickly in any order, with a well-bounded cost for that access.

This is in contrast to stream-based, variable-bitrate image formats such as PNG, where the decoding process requires that you have decoded the previous texels in the image. Obviously, this would be a problem if the texels you wish to access are at the bottom right of the texture.
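To make the random-access property concrete, here is a minimal sketch of locating the block that holds a given texel. It assumes blocks are stored row by row at 16 bytes (128 bits) each; that storage layout and the name `block_offset` are illustrative assumptions, not something taken from the post or the ASTC specification.

```python
def block_offset(x, y, width, block_w, block_h, block_bytes=16):
    """Return (byte offset of the block, texel position inside the block).

    Assumes blocks are laid out row by row, each block_bytes long
    (an illustrative layout, not taken from the ASTC specification).
    """
    # Number of blocks per row, rounding up for partial blocks at the edge.
    blocks_per_row = (width + block_w - 1) // block_w
    bx, by = x // block_w, y // block_h
    offset = (by * blocks_per_row + bx) * block_bytes
    return offset, (x % block_w, y % block_h)


# Bottom-right texel of a 256x256 image with a 4x4 footprint:
# a few integer operations, and no earlier texels need decoding.
print(block_offset(255, 255, 256, 4, 4))
```

The cost is the same for every texel, which is the "well-bounded cost" that a stream format such as PNG cannot offer.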

The 2D block footprints in ASTC range from 4x4 texels up to 12x12. Every block is encoded in 128 bits, so dividing 128 bits by the number of texels in the footprint gives bit rates from 8 bpp (128 bits / 16 texels) down to 0.89 bpp (128 bits / 144 texels).
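The bit-rate arithmetic can be checked with a few lines. The 4x4 and 12x12 footprints come from the text; 6x6 and 8x8 are included here on the assumption that they are the footprints behind the 3.56 bpp and 2 bpp rates quoted for the example images.

```python
BLOCK_BITS = 128  # every block is encoded in 128 bits


def bits_per_texel(block_w, block_h):
    """Bit rate of a footprint: block size in bits over texels covered."""
    return BLOCK_BITS / (block_w * block_h)


for w, h in [(4, 4), (6, 6), (8, 8), (12, 12)]:
    print(f"{w}x{h}: {bits_per_texel(w, h):.2f} bpp")
```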

Luoy

Each texel can have 4, 3, 2 or 1 components (RGBA, RGB, YA or Y), and the components can be split across one or two planes.

Original example image

Detail of ASTC compressed image, at 8bpp, 3.56bpp and 2bpp

In the simplest case, the encoder analyses each block in isolation and selects two colors which define the end points of a line in the color space. The approximate colors of texels can then be reconstructed from these color endpoints by interpolating between them. For each texel in the footprint, a weight value is stored, and the weighted average calculated. The weight, mathematically, is a value in the range 0 to 1, but for storage this is quantized to a few bits. Selecting the endpoint colors and the weights to make an optimal match to the texel colors in the original block is the job of the encoder.
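As a rough illustration of the decoding side, the sketch below reconstructs a texel color from two endpoints and a quantized weight. It uses floating-point arithmetic for readability, and the endpoint colors and function names are made up for the example; a real decoder would work on integers.

```python
def dequantize_weight(q, bits):
    """Map a stored integer weight 0..2**bits-1 back to the range 0.0..1.0."""
    return q / ((1 << bits) - 1)


def reconstruct(endpoint0, endpoint1, q, bits):
    """Weighted average of the two endpoint colors for one texel."""
    w = dequantize_weight(q, bits)
    return tuple((1 - w) * a + w * b for a, b in zip(endpoint0, endpoint1))


dark, light = (32, 16, 0), (232, 216, 200)  # hypothetical endpoint colors
for q in range(4):                           # 2-bit weights: 0..3
    print(q, reconstruct(dark, light, q, 2))
```

With 2-bit weights each texel can take one of only four colors on the line between the endpoints, which is why the number of weight bits matters so much for quality.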

Luoy

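The underlying assumption is that the colors within one block do not span the whole color space (and even if they do, the interpolated colors are simply coarser).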

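For each block, we select two endpoints in color space that define the block's color range. We then interpolate a color for every texel in the block (linear interpolation works), which requires an interpolation weight per texel. The weights can be computed in several ways; after interpolating, the result is compared against the original image, and the weights that give the smallest error are chosen. Encoding is exactly this process: find one or two pairs of endpoints, then compute the optimal interpolation weights.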
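The encoding loop described above can be sketched as follows. This is a toy encoder under stated assumptions, not the method used by any real ASTC compressor: the endpoints are simply the texels with the smallest and largest channel sum (a crude heuristic chosen for this sketch), and each weight is found by trying every quantized level and keeping the one with the smallest squared error.

```python
def interpolate(e0, e1, w):
    """Color at fraction w along the line from e0 to e1."""
    return tuple((1 - w) * a + w * b for a, b in zip(e0, e1))


def squared_error(c0, c1):
    return sum((a - b) ** 2 for a, b in zip(c0, c1))


def encode_block(texels, weight_bits=2):
    """Return (endpoint0, endpoint1, per-texel quantized weights)."""
    # Hypothetical endpoint heuristic: darkest and brightest texel.
    e0 = min(texels, key=sum)
    e1 = max(texels, key=sum)
    levels = (1 << weight_bits) - 1
    weights = []
    for t in texels:
        # Exhaustive search over every weight level for this texel.
        best = min(
            range(levels + 1),
            key=lambda q: squared_error(t, interpolate(e0, e1, q / levels)),
        )
        weights.append(best)
    return e0, e1, weights


block = [(10, 10, 10), (90, 90, 90), (170, 170, 170), (250, 250, 250)]
print(encode_block(block))
```

A production encoder searches over endpoint choices as well, since the best weights depend on the endpoints and vice versa.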

Most of the existing formats use similar methods, and it is possible to trace the origins of this technique as far back as 1979. However, most schemes use a fixed split between the number of bits used to represent the endpoint colors, and the number of bits used to represent the color weights.

Some formats offer different precision at different bit rates, but the number of bits for endpoints and weights is determined globally by the block footprint.

Tom's previous blog post on ASTC goes into some detail about the constraints of each of the existing texture compression methods.

Luoy

That post compares the shortcomings of several earlier texture compression schemes.

Trading Spaces

The "Adaptive" part of ASTC allows the encoder to tune the number of bits assigned to each piece of data, on a block-by-block basis. There are sixteen different color endpoint modes, any of which can be chosen for any block in the image. If a block is largely gray, then it is possible to encode the color endpoints more efficiently and devote the resulting extra bits to representing the texel weights more accurately.

Luoy

"Adaptive" means that the number of bits given to the endpoints and to the weights adapts per block (it does not mean that the block size adapts).

But here, apparently, is a problem. If we have a 4x4 texel footprint, and we want to increase the amount of data in each weight, then it would seem that the minimum increment would be one extra bit per texel. The resulting shift in the balance between weights and endpoints would thus be 16 bits, which is rather crude. For larger block footprints, the problem gets worse. For finer control, we need a way to add less than one bit per texel.

Luoy

But this kind of adjustment is rather coarse.

Bounded Integer Sequence Encoding

At first, fractional bits per pixel sounds implausible, or even impossible, but it is not quite as strange as it sounds.

In principle we can choose our quantization of the texel weights (and the color components of the endpoint colors) to use any number of values.

For the sake of illustration, let us assume that we can best represent each texel in a particular block using one of five weight values - 0.0, 0.25, 0.5, 0.75 and 1.0. We can easily quantize these to the integer values 0..4. Two bits is insufficient to represent these, as that would only represent four values, so conventionally we would need to allocate 3 bits. We would then either expand the quantization to use all eight possible 3-bit values, or leave three of the values unused.

However, a combination of any three of these texels has one of 5³ possible values, or 125. This is very close to the number of values that it is possible to encode in 7 bits (2⁷ = 128). So if we can group the texels into triplets, and find an appropriate encoding scheme for these base-5 values ("quints"), we can use just 7 bits, instead of the 9 we would need for storing three bits per value. This is a significant saving, and has the somewhat weird property of assigning a non-integer number of bits - 2.33 - to each value.

Luoy

At 3 bits per texel, three texels need 9 bits. Taken as a group, three texels have 5^3 = 125 possible values, so only 7 bits are needed: 7 bits / 3 texels = 2.33 bits per texel.
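The counting argument can be demonstrated with ordinary base-5 positional packing. This is only a sketch to show that three quints fit in 7 bits; it is not the bit layout ASTC actually uses, and the function names are invented for the example.

```python
def pack_quints(q0, q1, q2):
    """Pack three base-5 digits (each 0..4) into one integer 0..124."""
    assert all(0 <= q <= 4 for q in (q0, q1, q2))
    return q0 + 5 * q1 + 25 * q2


def unpack_quints(packed):
    """Recover the three base-5 digits from the packed integer."""
    return packed % 5, (packed // 5) % 5, packed // 25


packed = pack_quints(4, 4, 4)        # the largest possible triplet
print(packed, packed.bit_length())   # the value and how many bits it needs
print(unpack_quints(packed))
```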

Similar reasoning shows that it is possible to pack base-3 values (trits) in groups of five, each group taking 8 bits (3⁵ = 243, 2⁸ = 256), for 1.6 bits per value.

Luoy

For example, three weight values 0.0, 0.5 and 1.0: 8 bits / 5 values = 1.6 bits per value.
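To see how close these groupings come to the theoretical minimum, the following sketch compares the packed cost with log2 of the base and with the whole-number-of-bits alternative. The group sizes and bit counts are the ones given in the text; the helper name is illustrative.

```python
import math


def packed_bits_per_value(base, group_size, group_bits):
    """Average bits per value when group_size values share group_bits bits."""
    # Sanity check: the group must actually fit in the given number of bits.
    assert base ** group_size <= 2 ** group_bits
    return group_bits / group_size


for name, base, size, bits in [("trit", 3, 5, 8), ("quint", 5, 3, 7)]:
    ideal = math.log2(base)    # information-theoretic minimum
    whole = math.ceil(ideal)   # one value per whole number of bits
    packed = packed_bits_per_value(base, size, bits)
    print(f"{name}: ideal {ideal:.3f}, packed {packed:.3f}, unpacked {whole}")
```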

The Bounded Integer Sequence Encoding (BISE) technique used in ASTC always quantizes values to ranges which conform to one of three patterns: values from 0 up to 2ⁿ-1, using n bits; up to 3 x 2ⁿ-1, using n bits and a trit; or up to 5 x 2ⁿ-1, using n bits and a quint. This allows us to encode any ideal quantization range with much less waste than the traditional whole-number-of-bits approach.
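The three patterns give a much denser set of available level counts than powers of two alone. The sketch below enumerates them and picks the largest one not exceeding a desired count. The upper limit of 256 and the exclusion of a single-level range are assumptions made for the example, and the function names are hypothetical.

```python
def bise_level_counts(max_count):
    """All level counts of the form 2**n, 3 * 2**n or 5 * 2**n up to max_count."""
    counts = set()
    for multiplier in (1, 3, 5):      # plain bits, one trit, one quint
        count = multiplier
        while count <= max_count:
            if count > 1:             # a single level carries no information
                counts.add(count)
            count *= 2
    return sorted(counts)


def largest_fitting(desired, max_count=256):
    """Largest conforming level count that does not exceed the desired one."""
    return max(c for c in bise_level_counts(max_count) if c <= desired)


print(bise_level_counts(32))
print(largest_fitting(7))
```

Between 2 and 32 levels there are twelve usable counts instead of the five that whole bits alone would give.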

When the number of values is not a multiple of three or five, we need to avoid wastage at the end of the sequence. Thus, we have another constraint on the chosen encoding. If the last few values in the sequence to encode are zero, the last few bits in the encoded bit string must also be zero. Ideally, the number of non-zero bits should be easily calculated and not depend on the magnitudes of the previous encoded values.

This is a little tricky to arrange, but it is possible. This means that we do not need to store any padding after the end of the bit sequence, as we can happily assume that they are zero bits, safe in the knowledge that they will not affect the decoding of the actual values.
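One way to get a feel for this constraint is plain radix packing with the first value in the least significant position. This is a simplified stand-in, not the ASTC layout (which also has to interleave the extra bits), but it shows the desired behavior: when the trailing values of a group are zero, the high-order bits of the packed word are zero too, so they never need to be stored.

```python
def radix_pack(digits, base):
    """Pack digits so that digits[0] ends up least significant."""
    value = 0
    for d in reversed(digits):
        assert 0 <= d < base
        value = value * base + d
    return value


# Worst case for a partly filled group: k maximal digits, then zeros.
for base, group in ((3, 5), (5, 3)):
    for k in range(1, group + 1):
        digits = [base - 1] * k + [0] * (group - k)
        packed = radix_pack(digits, base)
        print(f"base {base}, {k} used: {packed.bit_length()} bits")
```

The number of bits needed depends only on how many values are in use, not on their magnitudes, which matches the property the text asks for.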

With this constraint in place, and by interleaving the bits, trits and quints appropriately, BISE encodes a sequence of length S (i.e. an array of S integer values) using a fixed number of bits:

· For S values in the range 0 up to 2ⁿ-1, it uses nS bits.

· For S values in the range 0 up to 3 x 2ⁿ-1, it uses nS + ceiling(8S/5) bits.

· For S values in the range 0 up to 5 x 2ⁿ-1, it uses nS + ceiling(7S