Projects/Liberty/Appendix 3
Appendix 3 - HUS/VIP Decompression
In the .hus and .vip format files, the attribute, X displacement, and Y displacement of each stitch are held compressed using a version of the ARJ compression. An attribute has the values: 0x80 (stitch), 0x81 (jump), 0x84 (thread stop), and 0x90 (end stop); that is bit 7 is always set, bit 0 indicates jump without stitch, bit 2 indicates thread change stop, and bit 4 end of pattern stop; bits 1, 3, 5, 6 appear unused.
The compression uses a sliding window of past uncompressed values to source repeated strings of bytes, instead of outputting the string of bytes, a pointer (and length) back to an earlier version of the string is (coded and) output. A feature of this is, a self repeating string can be copied before it is fully output, for example, if 'ABC' has just been output and there is now a string 'ABCABCABC', the length would be 9, but the pointer would only point back 3 bytes (to the 'A') as the 'ABC' is copied to the output, they become available to copy again.
When there are no bytes to copy in the recent output, the new byte has to be output as is, a length of zero (or less than 3) indicates verbatim byte and no pointer. This is encoded as values 0 to 255 indicate verbatim byte and no pointer, values 256 or greater indicate length, but the true length is the value minus 256 plus 3.
How to Decompress
Allocate an output buffer of enough bytes to hold the uncompressed data, access is needed to some of the past decoded bytes. Either fetch bytes and add them to the output, or fetch length and pointer, then copy past bytes to the output. Repeat until output buffer is full.
Pseudo-code of Sliding Window Decompression:
Let uncompressed := new vector[0 to uncompressed-size - 1] of byte
-- allocate output buffer
Let count := 0 -- the number of bytes output, index to output buffer
While count < uncompressed-size Do
Let code := Fetch-Byte-or-Length
If code < 256 Then -- is byte
Let uncompressed[count] := code
Let count := count + 1
Else
Let length := code - 256 + 3 -- min length is 3
Let pointer := Fetch-Pointer
For length Times Do
If count < uncompressed-size Then -- not buffer overrun
If count > pointer Then -- not buffer under-run
Let uncompressed[count] :=
uncompressed[count - pointer - 1]
Else
Let uncompressed[count] := 0
End if
End if
Let count := count + 1
End for
End if
End while
Huffman Coding of Bytes or Lengths and Pointers
The sliding-window byte/length and pointer codes are packaged, the first 16 bits gives the package length in sliding-window codes.
To reduce the bit length of the byte or length symbols and the pointers, they are Huffman coded to reduce the length of the most frequent values. The Huffman tree is passed to the compressed output by sending the length of the Huffman code for each symbol in the symbol-set, run of zero encoding is used to fill in the length table of the many unused codes (of zero length). For the byte/length symbol Huffman code lengths, these are themselves Huffman coded.
The number of Huffman code byte/length symbol lengths is given in the next 5 bits (value 0 to 31, but actually limited to 19). If the number is zero then there is only one symbol which is contained in the next 5 bits and its code length is zero. After the third symbol, the next 2 bits provide a 0 to 3 length run of zeros. After the 'number' of symbols the rest of the lengths are set to zero.
Each symbol is processed thus, the next 3 bits give a value 0 to 7; the values 0 to 6 are used directly, but if the value is 7, then a series of next bits, up to 13. The series ends after the first 0 value bit or after the 13th bit, each 1 value bit increments the (length) value (giving a range 7 to 20); I suspect any value over 18 is an error, so a maximum of eleven 1 bits followed by a twelfth bit of 0. The value 0 to 18 is used to set the length of the next code.
A (virtual) Huffman tree is built using these 19 ordered length and symbol values. The next variable length codes can be used to uniquely identify a value.
The next 9 bits give the number byte/length symbols defined (and is typically 511 since the end code is coded as 510 - a bug?), again a 'number' of zero implies the next 9 bits give the only code used and its Huffman code does not need to be transmitted. For 'number' of times, a variable length Huffman code appear in the next bits and gives a value 0 to 18.
There are up to 19 symbols used to compress the byte/length symbol lengths. The symbol values 0 to 2 allow for a run of zeros: 0 implies run-length of one, 1 implies 3 plus the value of the next 4 bits, and 2 implies 20 plus the value of the next 9 bits. The remaining 16 values give the Huffman code length 1 to 16 bits (that is symbol value minus 2).
A (virtual) Huffman tree is built using these 511 ordered length and symbol values.
There are 511 byte/length symbols, byte ranges from 0 to 255 and usable length ranges from 3 to 256, where a length of 257 marks the end of the compressed data.
The pointer values are Huffman encoded and appear next, similar to above, the next 5 bits gives the number pointer symbols ranging from 0 to 19 (or 15). Again, 0 implies there is only one symbol and it is in the next 5 bits. Otherwise, the next 3 bits give a code length 0 to 6, and 7 again is used to increment length by each 1 bit until a 0 bit. The end of the table is filled with 0.
Fetch Byte or Length
Use the next variable length code to find the value of the Byte or Length, by descending the Huffman tree.
Fetch Pointer
Use the next variable length code to find the value of the Pointer, by descending the Huffman tree.
Huffman Style Processing
Initialize
Let NT = 19 Let NC = 511 Let NP = 15 Let package-remaining := 0 -- symbol/length code-length code-lengths Let T-len := new vector[0 to NT - 1] of (0 to 16) -- symbol/length code-lengths Let C-len := new vector[0 to NC - 1] of (0 to 16) -- pointer code-lengths Let P-len := new vector[0 to NP - 1] of (0 to 16)
Decode Sliding-window Byte/Length
-- prepare Huffman decoding trees If package-remaining = 0 Then Let package-remaining := Get-Bits( 16 ) Let T-tree := Build-T-tree Let C-tree := Build-C-tree( T-tree ) Let P-tree := Build-P-tree End if Let package-remaining := package-remaining - 1 Let byte-length := Huff-Decode( C-tree ) Return byte-length
Decode Sliding-window Pointer
Let pointer := Huff-Decode( P-tree ) If pointer <> 0 Then Let pointer := pointer - 1 Let pointer := 2^pointer + Get-Bits( pointer ) End if Return pointer
Build T tree
T-tree.Delete-All
Let n := Get-Bits( 5 )
Let n := Min( n, NT )
If n = 0 Then
Let c := Get-Bits( 5 )
T-tree.Append( length := 0, bits := 0, symbol := c )
Else
Let i := 0
While i < n Do
Let c := Get-Bits( 3 )
If c = 7 Then
Let b := Get-Bits( 1 )
While b = 1 Do
Let c := c + 1
Let b := Get-Bits( 1 )
End while
End if
Let T-len[i] := c
Let i := i + 1
If i = 3 Then
Let c := Get-Bits( 2 )
While c > 0 Do
Let T-len[i] := 0
Let i := i + 1
Let c := c - 1
End while
End if
End while
While i < NT Do
Let T-len[i] := 0
Let i := i + 1
End while
Let b := 0
For len := 1 to 16 Do
Let b := b * 2 -- shift a 0 into LSB
For c := 0 to NT-1 Do
If T-len[c] = len Then
T-tree.Append( length := len, bits := b, symbol := c )
Let b := b + 1
End If
End for
End for
End If
Return T-tree
Build C tree ( T-tree )
C-tree.Delete-All
Let n := Get-Bits( 9 )
Let n := Min( n, NC )
If n = 0 Then
Let c := Get-Bits( 9 )
C-tree.Append( length := 0, bits := 0, symbol := c )
Else
Let i := 0
While i < n Do
Let c := Huff-Decode( T-tree )
If c <= 2 Then
If c = 0 Then
Let c := 1
Else If c = 1 Then
Let c := 3 + Get-Bits( 4 )
Else
Let c := 20 + Get-Bits( 9 )
End if
While c > 0 Do
Let C-len[i] := 0
Let i := i + 1
Let c := c - 1
End while
Else
Let C-len[i] := c - 2
Let i := i + 1
End if
End while
While i < NC Do
Let C-len[i] := 0
Let i := i + 1
End while
Let b := 0
For len := 1 to 16 Do
Let b := b * 2 -- shift a 0 into LSB
For c := 0 to NC-1 Do
If C-len[c] = len Then
C-tree.Append( length := len, bits := b, symbol := c )
Let b := b + 1
End If
End for
End for
End If
T-tree.Delete-All
Return C-tree
Build P tree
P-tree.Delete-All
Let n := Get-Bits( 5 )
Let n := Min( n, NP )
If n = 0 Then
Let c := Get-Bits( 5 )
P-tree.Append( length := 0, bits := 0, symbol := c )
Else
Let i := 0
While i < n Do
Let c := Get-Bits( 3 )
If c = 7 Then
Let b := Get-Bits( 1 )
While b = 1 Do
Let c := c + 1
Let b := Get-Bits( 1 )
End while
End if
Let P-len[i] := c
Let i := i + 1
End while
While i < NP Do
Let P-len[i] := 0
Let i := i + 1
End while
Let b := 0
For len := 1 to 16 Do
Let b := b * 2 -- shift a 0 into LSB
For c := 0 to NP-1 Do
If P-len[c] = len Then
P-tree.Append( length := len, bits := b, symbol := c )
Let b := b + 1
End If
End for
End for
End If
Return P-tree
Huff Decode ( tree )
Let sym := 0
Let b := 0 -- bits
Let i := 0 -- current tree entry
Let n := tree.length
Let l := 0 -- code length
For i := 0 to n - 1
If l < tree[i].length Then
Let b := b * 2^(tree[i].length - l)
Let b := b + Get-Bits( tree[i].length - l )
Let l := tree[i].length
End if
If tree[i].length = l And tree[i].bits = b Then
Let sym := tree[i].symbol
Exit for
End if
End for
Return sym