User talk:Graeme E. Smith

Welcome to Wikibooks, Graeme E. Smith! You may want to start with the First steps tutorial. Wikibooks is for freely-licensed, collaboratively-developed textbooks. You don't need technical skills in order to contribute here. Be bold in contributing, and assume good faith about the intentions of others. Remember, this is a wiki, so you're allowed to change just about anything, and changes can be made easily. Come introduce yourself to everyone, and let us know what interests you.

If you're coming here from other Wikimedia projects, you should read our primer for Wikimedians to get up to speed quickly. Thanks, DavidCary (talk) 04:17, 26 April 2009 (UTC) (P.S. Would you like to provide feedback on this message?)
 * See the Wikibooks help pages for common issues.
 * Remember, every edit is saved, so if you make mistakes, you can revert to an earlier version if needed.
 * Get help from the community in the Reading room or in our IRC channel.
 * You cannot upload an image until you have been a member for at least 4 days. If your upload is tagged with a media-policy template, please read the template message, as it explains the violation of our media policy. Please be sure to provide the required information: a license tag and source citation are always required, and fair use images require a rationale. Get help in the user assistance room.
 * Please fill in the edit summary and preview your edits before saving.
 * Sign your name on discussion pages by typing ~~~~
 * User scripts can make many tasks easier. Look at the Gadgets tab of my preferences; check off the boxes for the scripts you want, and hit save!
 * Please make sure you follow our naming policy when naming modules.
 * Need to rename a page? Use the move tab (it only becomes available once your account is 4 days old; until then, ask for help).
 * To get a page deleted, add the appropriate deletion template to the top of the page.
 * If something you wrote was deleted, please read the deletion policy, and check the deletion log to find out why. Also check the VFD archives if applicable. You can request undeletion at WB:VFU, or ask the administrator who deleted the page.

Thank you for adding so much good stuff to Evolution of Operating Systems Designs. Have you heard about the Operating System Design wikibook, the Consciousness Studies wikibook, the Artificial Intelligence wikibook, and other wikis that discuss artificial intelligence (there were 4 such wikis last time I looked)? --DavidCary (talk) 04:17, 26 April 2009 (UTC)


 * No, I wasn't aware of the other wikis or the books you referenced. Thanks for the links. There are still a few things I think I could do for the Evolution of Operating Systems Designs book when I get the time.--Graeme E. Smith (talk) 05:00, 26 April 2009 (UTC)

Sorry for the spelling
It was late last night when I made the contributions. I'll make an effort to correct my use of plurals. Thanks for your input.

Are you up to date on new compression algorithms and their characteristics? I'm implementing a network application that will operate mostly over the Internet, and I'm trying to select which compression to implement in my protocol. It has to be quick to compress, handle the full range of the connection (some sort of shared dictionary would be possible, since I'm implementing both ends), compress in bursts of ~512 bytes, have a low system footprint, and be able to deal with already-compressed data. My idea is to use some variation of LZO (Lempel-Ziv-Oberhumer) if I can get a C/C++ implementation that is BSD (or even LGPL) licensed; if not, I'll probably use the 7z LZMA SDK. I would appreciate any info you may provide on the subject. --Panic (talk) 22:07, 3 December 2009 (UTC)


 * No problemo, these kinds of things sometimes stick out, so I change them when I see them.


 * The first question I see is: what kind of storage load do you want on the client? Anything that replaces transmitted data with data stored on the client will increase throughput, but it may also limit speed, since you can't control an Internet client's load factor. In other words, you have to balance the compression factor against the client load if speed is an issue, and you indicate that it might be.
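To make that trade-off concrete, here is a small sketch (Python's zlib is used purely for illustration, since the actual implementation would be C/C++; the sample data is invented): the compression level is one knob that trades client CPU cost against output size.

```python
import zlib

# Illustrative sketch: zlib's compression level trades client CPU load
# against output size (sample data invented for this example).
data = b"GET /resource HTTP/1.1\r\nHost: example.net\r\n\r\n" * 64

fast = zlib.compress(data, 1)  # cheap for a loaded client, larger output
best = zlib.compress(data, 9)  # more CPU work, smaller output

print(len(data), len(fast), len(best))
```

On a loaded client you might pick the fast level and accept the larger payload; the server side can afford the expensive level.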


 * The next question that needs to be asked is: what type of data are you sending? (No need to tell me, but you want to model your data), do a little profiling, and so on, so that you know what your natural redundancy factor is. For instance, if you are sending a lot of compressed data as your raw data, you might already have lost considerable redundancy during that compression step. This means the compression mechanism you use for the general data stream will probably not have as easy a time compressing it again.
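One simple way to do that profiling is to measure byte-level Shannon entropy. A sketch in Python (the helper name is made up for illustration): redundant text scores well below 8 bits per byte, while already-compressed output scores noticeably higher, leaving little for a second pass.

```python
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 = no byte-level redundancy)."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

text = b"the quick brown fox jumps over the lazy dog " * 20
already_compressed = zlib.compress(text, 9)

# Redundant text scores low; compressed output scores much higher,
# so a second compression pass has little left to work with.
print(byte_entropy(text), byte_entropy(already_compressed))
```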


 * The next problem is compression ratio versus the minimal data needed to recover the file. Most compression schemes require that a residual is kept, or that a record is built that tells how to recreate the data. 7z, for instance, claims that it can achieve 7 times the compression ratio of regular zip files, but you might want to decompress first before compressing in 7z if it doesn't deal well with the residual data. I have seen the 7z advertisements, but it has not been clear to me whether the 7-times performance was based on a zip-file input. You would have to get into the actual scientific literature on the scheme to find that out, and I haven't bothered.
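The recompression point can be demonstrated in a few lines (Python's zlib and lzma standing in for zip and 7z respectively; the raw data is invented): running a second codec over the first codec's output typically grows it, while the same codec applied directly to the raw data does far better.

```python
import lzma
import zlib

raw = b"sensor=42;status=OK;" * 256      # redundant raw data (invented)

once = zlib.compress(raw, 9)             # first codec, standing in for zip
twice = lzma.compress(once)              # second codec over the first's output
direct = lzma.compress(raw)              # second codec straight on the raw data

# Recompressing compressed output typically *grows* it (container
# overhead, no redundancy left), while the raw data compresses well.
print(len(raw), len(once), len(twice), len(direct))
```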


 * One of the problems with allowing the data stream to include zipped files is that the only information you have on the original data is its zip scheme. It might be possible to tailor a specific compression mechanism to the actual nature of the data; custom data-compression schemes can often achieve a 9-fold improvement over zip files, because they can work with a topic-specific dictionary. Obviously you can't do that if you are using a pre-zipped data file, because the content is masked by the compression results. Unzipping and rezipping a file, however, will tend to make your program a limited-time offer, since any long-term change in compression efficiency will result in obsolescence of your unzipping component.
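The topic-specific-dictionary idea has a direct analogue in zlib's preset-dictionary support, which fits the "I'm implementing both ends" situation above. A sketch (the dictionary and message contents are invented for illustration): both ends prime the same dictionary, so common protocol substrings compress to short back-references.

```python
import zlib

# Both ends share this dictionary of common protocol substrings
# (contents invented for illustration).
shared_dict = b'{"type":"update","channel":"","payload":"","seq":}'

msg = b'{"type":"update","channel":"news","payload":"hi","seq":17}'

plain = zlib.compress(msg, 9)            # no dictionary

comp = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS, zdict=shared_dict)
with_dict = comp.compress(msg) + comp.flush()

decomp = zlib.decompressobj(zlib.MAX_WBITS, zdict=shared_dict)
roundtrip = decomp.decompress(with_dict) + decomp.flush()

print(len(plain), len(with_dict))        # dictionary version is smaller
```

For short ~512-byte bursts this matters a lot, since without a dictionary there is barely any history for the compressor to match against.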


 * Another thing you might want to look into is a server-side compression scheme. If you have a powerful enough server, you might be able to run a more complex compression scheme that would be slow on a PC but, because of parallelism or something similar, runs faster on the server. This asymmetric compression means that you can tune the compression algorithm on the server, and the client just needs to follow the decompression mechanism. Consider that some compression schemes work better than others for some types of data. PKZIP used to sense the data type and use a different method to compress and decompress, giving you a list of the methods as it decompressed. Extend this a little further and think speculative compression: you compress the same file in multiple compression formats, decompress each again, and compare the output to the original file. If there is an error, ignore that technique; if not, pick the technique that gave you the best compression. You might be surprised to find that the technique with the best generalized compression ratio does not always win.
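A minimal sketch of that speculative idea (Python for brevity, with zlib, bz2, and lzma standing in for the candidate formats; the codec table and function names are made up): compress with each codec, verify the round trip against the original, and keep the smallest verified result.

```python
import bz2
import lzma
import zlib

# Candidate codecs: (compress, decompress) pairs (names invented).
CODECS = {
    "zlib": (lambda d: zlib.compress(d, 9), zlib.decompress),
    "bz2":  (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

def best_compression(data: bytes):
    """Compress with every codec, verify the round trip, keep the smallest."""
    verified = {}
    for name, (comp, decomp) in CODECS.items():
        packed = comp(data)
        if decomp(packed) == data:       # ignore any codec that fails verification
            verified[name] = packed
    winner = min(verified, key=lambda n: len(verified[n]))
    return winner, verified[winner]

sample = b"abcabcabc" * 100
name, packed = best_compression(sample)
print(name, len(packed))
```

On short repetitive inputs like this, the codec with the best general-purpose ratio often loses to a lighter one, because container overhead dominates, which is exactly the surprise described above.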


 * Of course, my personal favorite is entropy injection. Essentially, instead of a dictionary, you have a large table of selected seeds for random-number generation. You compress the file, then speculatively inject different random numbers (XOR?) from each separate seed and compress again. The idea is that injecting entropy might actually increase the order, thus giving you a better chance to compress the previous compression scheme's result. It is based on one of Wolfram's experiments with class 4 cellular automata and, as far as I know, has never been tested.


 * Since the random-number bases would be common, coding could be limited to some sort of selection code for the base, some indication of which random number or range of random numbers to use, and which version of compression successfully outperformed the others. The trick with this type of coding is to make the record long enough that, when its residual is added to the coding string, it is still reduced enough in size to give a positive compression, even after entropy injection.--Graeme E. Smith (talk) 23:26, 3 December 2009 (UTC)
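For what it's worth, the entropy-injection idea can be sketched as a toy in Python (zlib as the backend compressor and Python's random module as a stand-in for the shared seed table; all names are invented, and as noted above the technique itself is untested): XOR the data with the stream from each candidate seed, compress each variant, and keep the seed that wins, with "no injection" as a baseline so the result is never worse.

```python
import random
import zlib

def xor_stream(data: bytes, seed: int) -> bytes:
    """XOR data with a reproducible pseudo-random stream; self-inverse."""
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in data)

def inject_and_compress(data: bytes, seeds):
    """Try each shared seed; keep (seed, packed) with the smallest output.
    A seed of None means no injection, so the baseline is never beaten."""
    best = (None, zlib.compress(data, 9))
    for seed in seeds:
        packed = zlib.compress(xor_stream(data, seed), 9)
        if len(packed) < len(best[1]):
            best = (seed, packed)
    return best

def recover(seed, packed):
    data = zlib.decompress(packed)
    return data if seed is None else xor_stream(data, seed)

sample = bytes(range(256)) * 4
seed, packed = inject_and_compress(sample, seeds=range(8))
print(seed, len(packed))
```

Only the winning seed index and the compressed payload need to be transmitted, matching the short selection-code record described above; on most inputs the None baseline wins, which is consistent with the technique being unproven.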