Here's an interim solution. I'm working on writing and reading binary huffman encoded files. There is code here which isn't used yet by the demo code. I figured that the time had come to publish my approach to building the tree, and encoding and decoding. The current code is more pedagogical than useful since the 'compressed' data is represented by a string representation of the bit string. There are three files attached. huffman.rb is a module and some core extensions. huffman_demo1.rb runs the example from the quiz, dumping the tree and various computed data. Then 'compresses' and 'decompresses' the sample string. huffman_wikipedia.rb is an earlier version which uses the algorithm in the wikipedia article. A few remarks. I started out using the wikipedia algorithm. Once I got that working, I decided that it would be better to use a priority queue instead of two separate queues. For this I used Brian Schroeder's PriorityQueue gem. To handle the problem of padding to a given word length, I add an artificial 'character' to the tree with an occurrence count of 0. This should ensure that it doesn't take a shorter encoding away from a character which actually occurs in the input data. In order to decode the compressed data, I use a regex which is a simple alternation of all of the string representations of the codes, anchored at the begining of the string. Since the huffman encoding algorithm ensures that no code is a prefix of any other code, this regex can be used to find the code at the head of the input stream. So walking through a string representation of a huffman encoded bit stream can be done by matching this pattern against the string, the match will be the code. After this match, the string can be replaced by the string remaining after the match. Extending this to binary data will involve making an input stream class which reads the binary data and converts it into reasonably sized binary string representations, and an output stream which accumulates the string representation of the codes and writes them as binary as enough data is produced. The huffman.rb file contains these classes in an embrionic form. -- Rick DeNatale My blog on Ruby http://talklikeaduck.denhaven2.com/
on 2007-05-15 00:38
on 2007-05-15 01:06
"Rick DeNatale" <email@example.com> writes: > Extending this to binary data will involve making an input stream > class which reads the binary data and converts it into reasonably > sized binary string representations, and an output stream which > accumulates the string representation of the codes and writes them as > binary as enough data is produced. The huffman.rb file contains these > classes in an embrionic form. Note that I did this with just a few methods, that yielded strings of zeros and ones to other methods. The structure was like this: encoding: encode called encode_bits, passing in the input io stream encode_bits would yield strings of 0s and 1s into a block that was inside encode, which used pack to squish them into binary bytes. decoding: decode called decode_bits, which yielded strings of 0s and 1s that were read from the (binary) input - unpack was used to make those strings. I admit that I don't really like the separation of the logic here - it's not really parallel between encoding and decoding, for instance - but this works and didn't require that much code.
on 2012-04-20 17:26
The Vlsi design implementation of huffman encoding from scratch to complete with verilog codes and vhdl codes, systematic diagram can be found in the link below, here http://blog.ektel.com.np/2012/03/vlsi-architecture... I thought this view is also helpful for possible cross reference between ruby language and hdl language