Calvinist Information Theory, redux

This is a follow-up to my previous post looking at some of the "information theory expertise" over at Triablogue. Peter Pike is willing to do what other T-Bloggers like Hays are not, and has wandered into an area where real, demonstrable knowledge can be applied to his claims and ideas. And as this post may show, there's a good reason Hays and others remain safely out of the reach of knowledge accountability -- Peter went off on a subject he has shown he lacks even a basic grasp of, and has now gone on to demonstrate clearly that he's not just confused, but obstinate and incorrigible in clinging to and amplifying his confusion. Think of this post as "What if someone took the time to really examine Triablogue 'knowledge' when they ventured into a subject where they were accountable to real knowledge?".

Readers looking for discussion of the typical (a)theological subjects can go ahead and skip this post. I think this post is worthwhile as a matter of diligence, as a "for the record" review that documents in some detail the kind of hostility to real knowledge the Triabloggers, or at least Peter Pike, are given to display. It's also useful to note Peter's aversion to correction and to getting things right. He's received more than enough feedback, from several people, not just me, to understand that he's dug himself into a hole here. Often enough, Pike and the Triabloggers just get conspicuously quiet when good criticisms are leveled, or else they obfuscate and prevaricate. Typically, this kind of feedback would go right into the combox for the post over at Triablogue, but the Triabloggers do not suffer criticism like this on their blog, so I'll burn a slot in the post queue here at DC in order to provide some documentation of the errors and problems with Information Theory According to Peter Pike™.

Peter Pike, 8/14/2008 8:41 PM
I will add one thing for people who are interested. T-Stone's example of a 1,000 x 1,000 random pixel grid is only "maximally complex" if there are no repetitions at all. That is, there must be 1,000,000 pixels each with their own unique color. If there are any repetitions, compression can occur.
This is incorrect, as just a little experience with compression will demonstrate. When converted to bits (binary), any string beyond just a handful of bits will have repetitions. Repetitions are not automatic candidates for compression, as many repetitions are more expensive to record and store as instances of a pattern than to just leave the repetition in the datastream untouched. This means that Pike's statement, "If there are any repetitions, compression can occur," is false.

Here's a quick example. If we examine the string:

[A] = "11001100"

We can see that it contains a number of repetitions. For example, [11] [00] [11] [00] is one grouping of repetitions, 1 [100] 1 [100] would be another, and [1100]*2 would be another.
Pike supposes these repetitions are all opportunities for compression, but none of them actually produces a smaller program when compressed than the program that simply emits the uncompressed string. The reason is that the "bookkeeping" for small repetitive patterns is more costly than the uncompressed pattern itself. It's only when the repeating patterns become significant in length and the number of those repetitions is large that we can effect some amount of compression. A short string of English text takes more space to "compress" the few words that repeat than it saves. A long string of English text is likely to have a number of words that are long (>4 characters) and occur many times (the word 'molecule' might appear 300 times in a long article about biology, for example).
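To make the bookkeeping point concrete, here's a minimal sketch of my own (nothing from Pike's comments; the field widths in it are assumptions, chosen to be as small as plausible) that tallies the bits for string [A]:

#include <iostream>
#include <string>

int main() {
    // String [A] from above: 8 raw bits.
    std::string a = "11001100";
    std::size_t raw_bits = a.size();                 // 8 bits

    // A bare-bones "compressed" form still has to record the pattern
    // itself, the pattern's length, and the repeat count. The field
    // widths below are assumptions, and generous ones at that.
    std::size_t pattern_bits = 4;                    // the literal pattern "1100"
    std::size_t length_bits  = 3;                    // "the pattern is 4 bits long"
    std::size_t count_bits   = 3;                    // "repeat it 2 times"
    std::size_t encoded_bits = pattern_bits + length_bits + count_bits;

    std::cout << "raw: " << raw_bits << " bits, 'compressed': "
              << encoded_bits << " bits\n";          // 8 vs 10 -- a net loss
}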
For example, suppose that the random pixels are simply 1 or 0, that is the pixel is either on or off (i.e. "white" or "black"). Now suppose that one section of the string shows:

0100010011101011

We can easily compress this by saying "00" = A and "01" = B and "10" = C and "11" = D. Thus, the above string of numbers can be compressed to:

BABADCCD

This is a simplistic examination. In fact, given that we have 26 letters to choose from, there is no reason that we have to stick with only two-digit numbers. We can use four-digit numbers sections easily enough represented with just 16 letters.
"Simplistic" is a charitable description of this examination. Here, Peter shows he is unfamiliar with how binary representation of data works. He begins with a binary string "0100010011101011", which he supposes might be compressed to "BABADCCD". But "0100010011101011" is just 16 bits long, and "BABADCCD", at least on byte-strings is 64 bits, as every character maps to a byte, which consumes 8 bits (8 char * 8 bits/char == 64 bits). So, on a conventional platform, Pike's "compressed" string has gotten FOUR TIMES LARGER, just in terms of representation. He hasn't even addressed the logic needed to provided the mapping between patterns ("00") and symbols ("A"). Even in a conceptual machine where a "pikebyte" is only 2 bits long (enough to allow for the symbols "A", "B", "C", and "D"), he still has produced a "compressed" version of the source string that is 16 bits, exactly the size of his "uncompressed" string (8 char * 2 bits/char == 16 bits).

All of this inflation he has introduced without providing the logic used to effect the compression/decompression -- logic which is not free, and which makes his "compressed" program that much bigger than the uncompressed program that simply echoes the string out as it is.
Of course, this level of compression immediately begs the question: is it really less complex to have various symbols represent sequences of other symbols? That is, while the string "BABADCCD" is shorter than the string "0100010011101011", the program needed to convert "BABADCCD" is more complex than simply reading "0100010011101011".
As above, the string "BABADCCD" is NOT shorter than "0100010011101011" (binary, pixels that are 1 or 0). Pike doesn't understand bits and bytes. The bit pattern "01000001" and the ASCII character 'A' take up the same amount of storage -- 8 bits in both cases; the letter just is that bit pattern. To Pike's credit here, now that his goofs have been pointed out, he is at least grabbing the idea -- from whatever web page he's using to catch up on this topic -- that the compression logic matters in terms of complexity.
This also brings to mind the question: if complexity in information is reduced due to compression where certain symbols stand for a certain combination of other symbols, then we can theoretically define a symbol as a long combination of other symbols. In other words, we can say:

"*" = "0100010011101011"

Now we can transmit all of the above with a simple character: "*"
Here Peter Pike is invoking magic for us, or something supernatural at least. There is no free lunch with information. One can assign a symbol to point to a referent string (like "0100010011101011"), but there's a cost incurred in associating the symbol and its referent, and the symbol is never more than a symbol. If we construct a couple lines of code like this:

#include <iostream>
const char* a = "0100010011101011";  // 'a' is just a name pointing at the same 16 characters
int main() { std::cout << a; }       // prints the full, uncompressed string

We haven't shrunk the string "0100010011101011" at all. All we have done is assign an alias for it, a symbol that makes it easier for humans to conceptualize what is going on. But in terms of compression, nothing is achieved. Transmitting "*", in quotes as Pike has it, just produces an asterisk character on the output. In the code above, sending the variable a to std::cout just dumps our original string as output. Precisely nothing is achieved by Peter's stratagem. His magic is a bit of self-hoaxing.
As we continue in this vein, obviously constructing a symbol code for all the iterations of a 1,000 x 1,000 grid will be immense; but the output will be greatly condensed. No matter which iteration comes about, it is represented by one symbol.
Creating symbols for a given output does not itself achieve compression. It is only when the repetitions in the data are so large and frequent that the symbol table plus the mapping logic is smaller than that data that compression is achieved. Additionally, Pike gives us the clue that he believes he can cheat with external symbol tables, pre-calculating all the possible permutations for the matrix and simply transmitting a "permutation ID" as an elegant way to realize dramatic (like 10,000x) compression. The creation and storage of the symbol table counts against the size (and thus the algorithmic complexity) of the program. Pike here fails to understand the constraints the compression must operate within, and the costs it must incur in implementing compression heuristics.
But this brings to bear more issues beyond simply the complexity of the information itself. The program of compression (which, in order to be useful, must be able to decompress data as well) can be infinitely more complex than the output; and vice versa. But if we're merely looking at the information present and not considering the compression algorithms at all, then "*" is obviously less complex than "BABADCCD" which is itself less complex than "0100010011101011".
Confused gobbledygook. Perhaps Peter can tell us how he can start with "*" and produce "0100010011101011" without incurring the overhead of "0100010011101011"?
Some people want to stop thinking at that level and not consider any further. Others don't. You can decide whether it's important or not.
Ahh, the irony!

Peter Pike 8/14/2008 9:12 PM
Idahoev said:
---
There will always be repetitions in any string of sufficient length
---

Exactly my point.
That was not the same point. Idahoev was correctly pointing out that repetitions occur even in strings that cannot be compressed any further. Pike supposes that repetitions -- like the same bit value occurring twice in a row (or any two-bit combination, as it happens) -- imply compressibility. Idahoev points out that the inevitable repetitions in any string beyond a few bits are all the indicator Peter needs to see that he's mistaken. Peter takes this correction as a vindication of what he's been saying all along.
You further said:
---
...nonetheless you cannot create a compressor that will successfully compress random numbers. This is because you compress by replacing repeated sequences with shorter sequences, and then adding those shorter sequences to a symbol table.
(The symbol table's size must be considered part of the compressed string's size, otherwise, all you've done is destroy the original string because you can't uncompress.)
---

And that's where you're wrong.

Consider the example once again. If you only have two digits (a 1 and a 0) for a 1,000,000 long string, you'll find a lot of repetitions. Even if the 1's and 0's are generated randomly. And those repetitions are sufficient enough that you can include a symbol table with your data.

In a "pure" random sample, you'd have approximately the same number of "00", "01", "10", and "11" sequences. Thus, you could replace them all with the four symbols "A", "B", "C", and "D". Your string (which was 1,000,000 bits long) is now 500,000 bits long. Are you seriously claiming that the symbol table will be 500,000 bits in size?
Peter has obviously never tried this before or worked with data compression algorithms in any meaningful way, as casual interaction with the problem would arrest thoughts like this immediately. Pike is incredulous that Idahoev might suppose the symbol table could take up a full 500,000 bits. Why would that be? Because Pike supposes that this random string can trivially be compressed in half... for free! How does Peter get to an estimate of 50% compression? Well, apparently he thinks he can trade "00" for "A" and realize a 2:1 compression. Two characters for one, right?

Wrong.

"00" is binary, {0 | 1} remember, so "00" takes up just two bits of space. For conventional computing platforms, "A" consumes EIGHT BITS, and even if he restricts his alphabet to just {A | B | C | D}, all he's done is reached parity for the original string (1 char "A" * 2 bits/char == 2 bits), and he's still got to add on the symbol table mapping logic in his program, which will unavoidably make his "compressed" version of the program bigger than the uncompressed version. The best he can hope to do with his plan here is to produce a program just a little bigger than the uncompressed program.
You said:
---
But you can't do that. The amount of information
(=complexity) in a string is defined as the smallest algorithm which
can produce that string.
---

Which is exactly my point. Dawkins (and by extention, T-Stone) only have the universe, which would be the "output" of the whole algorithm. They have no idea how complex or how simple something need be to produce that output. They don't even
think about that. Information theory can say nothing about God because we don't have all the sufficient knowledge available to us. What it can say is whether natural processes can account for the information we see present.

Thanks for venting your views. Now perhaps you could actually read what I wrote before responding.
That was exactly Pike's point, that complexity is defined as the smallest algorithm which can produce that string? Hah. As for the universe, information theory cannot tell us whether natural processes can account for the information we see around us. Information theory is a mathematical model, and doesn't necessarily reflect anything about the physical world. We can use it profitably to measure complexity all around us, and there is an unimaginable amount of complexity to comprehend in the universe. But none of that tells us whether natural processes can give rise to the information and complexity we observe and measure around us. This whole discussion started because Vox Day thought he had real knowledge about complexity after doing a drive-by read on fractals at Wikipedia. Pike knows even less about the subject than Vox Day, as he is ably demonstrating here, and none of it is grounded in real understanding; rather, it is anchored in the curious idea that Sierpinski triangles are "infinitely complex".

Peter Pike 8/14/2008 9:52 PM

Next, Pike moves on to respond to comments from "nihilist"...
Nihilist,

First, I assumed what T-Stone meant by "maximal randomness" was the impossibility of the string to be compressed. Given that assumption, then my statements about randomness come into play.
Above, Pike said this:
FWIW, I disagree with T-Stone’s version of information and complexity. And despite what his post would lead you to believe, the idea that “maximal randomness = maximal complexity” is not true for all information theories."
It's not just my post(s) that will lead you to believe that the maximally complex string is the random string; it's the theories and proofs themselves, from Shannon, Kolmogorov, and others. Moreover, algorithmic complexity defines a "random" string as one that is shorter than any program that can produce it, given a reference platform like a Turing machine. That means that "random" is defined as "incompressible" in the theory.

Even if Peter cannot understand the conceptual basis for this, there's a simple exercise that will tease out the absurdity of Pike's claim. If a random string is compressible by 50%, as he suggested above, then a compression pass on random string A will produce a shorter (by half) output string A'. But if any string with repetitions can be compressed, we can just repeat the process until we have compressed any arbitrarily long string down to nothing at all. Magic! Take A' and compress it down to half its size, producing A'', and take A'' and compress that down by half to yield A'''. Pretty soon, you've got all the world's data neatly compressed into Pike's Magic Asterisk ("*"). Clearly, there must be a floor for the compressibility of any given input, and that floor is what information theory and algorithmic complexity focus on.
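For readers who want the counting argument behind that floor, here's a small illustration of my own that runs the pigeonhole numbers: for every length n there are 2^n distinct strings but only 2^n - 1 strings of any shorter length, so no lossless scheme can shrink every input, let alone shrink every input over and over:

#include <cstdint>
#include <iostream>

int main() {
    // Pigeonhole check: a lossless compressor must map distinct inputs to
    // distinct outputs, but there are always fewer shorter strings available
    // than there are strings of length n.
    for (int n = 1; n <= 20; ++n) {
        std::uint64_t strings_of_length_n = 1ULL << n;        // 2^n
        std::uint64_t shorter_strings     = (1ULL << n) - 1;  // 2^0 + ... + 2^(n-1)
        std::cout << "n=" << n << ": " << strings_of_length_n
                  << " strings, only " << shorter_strings
                  << " shorter descriptions to go around\n";
    }
}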
Again, consider the example of the 1,000,000 bits. Random data will tend to have an equal number of "00", "01", "10", and "11" number sequences. When you have 1,000,000 bits, that means you'll have roughly 250,000 of each. (As anyone who's rolled dice know, these percentages rarely happen in the real world.)

You can compress those to halve the length of the string. Even including the symbol table will not increase the string length back to its original size.
Here, Pike reiterates his error in clear terms, just to make sure he has no wiggle room in denying it later. "You can compress those to halve the length of the string," he says, referring to a million random bits. His understanding of the concepts here is such that he supposes two bits (which we represent as "00") are compressible by 50% into "A". But "A" isn't representable in fewer than two bits in an alphabet that has four symbols. These are rudimentary errors for this subject.
Therefore, it follows that a "maximally random" string can only be a set length, and if it cannot be compressed it must have specific rules in place as to what is in the string. Therefore, the "random" strings are not so random after all in order to be "maximally complex."
Information theory has shown that the opposite of what Pike claims here is true. There is no "set length" for a "maximally random" string, and it is precisely the complete absence of rules governing the contents of the string that makes it maximally random. Rules are what enable compression! That's why, for example, we can render the "infinitely complex" output of a Sierpinski triangle or a Mandelbrot set with just a handful of lines of code -- the code implements the simple rules that govern the output of each. Randomness is "without rule", which is what makes it information-rich. Rules represent certainty, and information is measured as the reduction of uncertainty.
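And that "handful of lines of code" isn't a figure of speech. Here's one well-known construction of the Sierpinski pattern (my own sketch, not anything from the Triablogue thread): a cell is filled exactly when the bitwise AND of its row and column index is zero:

#include <iostream>

int main() {
    // Sierpinski triangle via the bitwise-AND rule: the entire "infinitely
    // complex" pattern falls out of one simple rule.
    const int n = 32;
    for (int row = 0; row < n; ++row) {
        for (int col = 0; col < n; ++col)
            std::cout << ((row & col) ? ' ' : '*');
        std::cout << '\n';
    }
}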
Now there are random aspects to it. I'm not denying that. Rather, as the string gets longer, in order to maintain its maximal complexity it must lose its randomness.
This is precisely the opposite of what information theory and Kolmogorov complexity demonstrate. Here's a short quote from Nick Szabo's tutorial on Algorithmic Information Theory that captures the idea (Peter, I invite you to have a read of the whole thing):
A truly random string is not significantly compressible; its description length is within a constant offset of its length. Formally we say K(x) = Theta(|x|), which means "K(x) grows as fast as the length of x".
(emphasis mine)
Let's use lower case letters for an example. Suppose our string starts with:

"jq"

The next letter can be completely random.
What does it mean for a letter to be "completely random"? This makes no sense at all. If the next letter is random, it can be any one of the possible letters. Another "j then q" is, prima facie, no more or less random than any other letter pair. Randomness obtains from the unpredictability of the arriving input, and has nothing to do with whether it repeats or not. A fair coin flipped 20 times is exactly as likely to produce "00000000000000000000" as it is "10010110010101110010", even though the first string is trivially describable programmatically as a 20-iteration for loop echoing a 0 to the output, and the latter is not. Some random strings happen, statistically, to coincide with strings that have simple rules. But a maximally complex string has the fewest rules and the least structure. In any case, the idea that the next letter is or is not "completely random" based on the letters that came before it is nonsense.
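The coin-flip point is easy to check numerically; this little sketch of mine just evaluates the probability of each specific 20-flip sequence:

#include <cmath>
#include <iostream>

int main() {
    // Every *specific* 20-flip sequence from a fair coin has the same
    // probability, whether it looks "patterned" or "random" to us.
    double p = std::pow(0.5, 20);   // about 1 in 1,048,576
    std::cout << "P(\"00000000000000000000\") = " << p << '\n';
    std::cout << "P(\"10010110010101110010\") = " << p << '\n';
}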
It can even be another "j." But if it is another "j" then a "q" CANNOT follow, else it can be compressed (say with a "J").

Now short strings can indeed run into problems with the symbol table, but even if we say the individual portions of the symbol table are twice the size of each bit of the string, then because there are only 26 lower case letters in the alphabet, random letters will result in letter "pairs" every (1/26) x (1/26), or 1/676 times. Thus, even having a text with only, say, 2,000 randomly generated characters will provide sufficent number of repeats for you to be able to compress the string. And that's even if you're using an arbitrary table and you're not tailor-making a compression program (like you would for language). You know that the pairing of "jq" will occur roughly ever 676 times, so that substitution will reduce the original length of the (sufficiently long) string even if that's the only substitution you make.
ASCII text is compressible due to its lavish use of bits in representation. If our random string only draws from the lower-case, 26-letter alphabet, you only need five bits per letter, where ASCII characters are 8 bits, meaning you can realize more than 35% compression right off the bat if you are beginning with ASCII representations. To avoid "illusory" compression (take a text file with randomly generated letters in it and you will see that the file can always be shrunk with gzip, due to the wasted extra bits for each ASCII character), use random bits. ASCII strings are never maximally random, no matter what letters the string contains, due to the wasted bits in the format.
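For anyone who wants that repacking arithmetic spelled out, a quick sketch of mine:

#include <cmath>
#include <iostream>

int main() {
    // A 26-letter alphabet needs ceil(log2(26)) = 5 bits per symbol; storing
    // those same letters as 8-bit ASCII wastes 3 bits per character. Squeezing
    // out that waste is repacking, not a statement about the string's content.
    double bits_needed = std::ceil(std::log2(26.0));   // 5
    double ascii_bits  = 8.0;
    std::cout << "savings from repacking alone: "
              << (1.0 - bits_needed / ascii_bits) * 100.0 << "%\n";   // 37.5%
}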

In any case, the mistake operating in this paragraph from Pike is the belief that he can replace "jq", or any other two-letter combination, with symbolic metadata that takes up less space in capturing it. For example, if we have the following string:

[D] = "kujqopjqbnlsjqjqlmze"

We can identify four instances of the pattern "jq" which are candidates for Pikian compression. But the compression requires encoding, and this is what thwarts the compression of two-letter patterns. For example, we might use digits as our indirections into our symbol-referent patterns. Here, we might use "1" to indicate the substitution for "jq", meaning any time we encounter a "1" we should replace it with "jq":

[E] = "ku1op1bnls11lmze"

[E] is 4 bytes shorter than [D], but you cannot represent a symbol table or the substitution coding in the space of those 4 bytes. If we suppose that we can overcome this with larger strings, we do realize bigger savings, but only by preserving the non-random statistical distribution of the letters. If, instead of a 20-character string, we multiplied [D] by a factor of 10, we would have a 200-character string with 40 instances of "jq" that we might replace. That's a lot better savings -- 40 bytes you might do a little something with. But this only happens because the source string is NOT STATISTICALLY RANDOM, having 40% of its letter pairs matching "jq", many deviations above the roughly 1-in-676 rate we would expect statistically. This is precisely the insight Kolmogorov, Solomonoff and Chaitin all arrived at concerning the information in a string: the more statistically random a string is, the less compressible it becomes.
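Counting the "jq" occurrences in [D] programmatically (again, a sketch of my own, nothing more) shows how thin the margin is before the symbol table is even paid for:

#include <iostream>
#include <string>

int main() {
    std::string d = "kujqopjqbnlsjqjqlmze";   // string [D] from above

    // Count non-overlapping occurrences of "jq".
    std::size_t count = 0;
    for (std::size_t pos = 0; (pos = d.find("jq", pos)) != std::string::npos; pos += 2)
        ++count;

    // Each "jq" (2 chars) becomes "1" (1 char): one character saved per hit.
    std::cout << count << " occurrences, " << count
              << " characters saved -- with the table and decoder still unpaid\n";
}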

You can increase the complexity of the symbol table to be any finite value greater than the length of the string, and it will still be possible to compress that string if the string is longer. Only if you claim that the complexity of the symbol table is infinitely large will you be unable to compress the string in this manner.
This is one of the few paragraphs Peter offers in the whole thread that is not demonstrably incorrect. But in this case, that is only because it's incomprehensible. Peter, what is the "complexity of the symbol table", since you are clearly not using the accepted Kolmogorov-Chaitin definition of the term? This paragraph isn't coherent enough to even qualify as "wrong".

Therefore, a "maximally random" that is "maximally complex" can only have a finite number of repetitions (depending on the number of symbols the string itself has and the symbol table that would accompany it). Even adding in the symbol table's information, there is still a finite length that a "maximally random" string can be before it must obey rules (i.e., no longer be random) and still maintain it's "maximal complexity."
Peter doesn't identify a "maximally random" what here, but assuming he means "maximally random string", it is safe to say that a finite string does, indeed, contain a finite number of repetitions of any given pattern. So we can give Peter a second nod for the thread here. But while correct, it's irrelevant, and tells us nothing about the complexity, or lack thereof, of any given string. Peter also offers a novel theorem here, namely that a string with high statistical randomness must at some point begin to "obey rules". Peter doesn't tell us what these rules are, or when they kick in, or where we might anticipate this "kick in" of the "rules". Curiously, strings beyond this length are both "no longer random" and "maximally complex". If Peter is onto anything but spewing BS here, he's got ideas that will make him world famous in mathematical circles.
Finally, consider the odds that a string of random characters would reach it's maximal complexity by purely random means. The vast majority of random strings (unless you keep them really short) will have those repetitions enabling compression. It is far more likely that a random process will generate those strings than that one will generate a "maximally complex" string.
Finally! The maths that obtain here show just the opposite. Highly random strings do not contain the kinds of patterns and frequencies that lend themselves to effective compression -- this is the very definition of random! Peter hasn't grasped this yet, but he's demonstrably wrong based on the very terms he's using: if a given string can be compressed to any significant degree, it's definitionally not a highly random string.
It is far easier for an intelligent being to create a maximally complex string than for a maximally complex string to occur randomly.
It's quite difficult for a human to create a statistically random string. Go ahead and try it sometime, Peter. Sources of randomness abound all around us in nature, and tapping into one of these sources provides an easy, reliable, instantaneous supply of random input (for example, circuit shot noise or radioisotope decay). Computers can easily create pseudo-random strings that pass all the statistical tests for random distribution, far faster than any human can make up strings he supposes are random -- strings which, as it happens, very often fail statistical evaluations for randomness.
But again, I predicated this on the belief that T-Stone accepted "maximal complexity" as the inability to compress the string.

And FWIW, I didn't invent this theory, but I don't remember where I read it either. If you must have a name for it, you can try to Google it. I've written it out for you so I don't feel the need to spend any more time on it.
There's no theory here to claim invention of, and nothing coherent enough even to try to blame on someone else, some other source. Peter says he's written it out for us, yet the reader is left with nothing at all that will help determine the complexity or information content of a given string according to Calvinist Information Theory, for lack of a better, or Peter-supplied, name for it.

Peter Pike 8/14/2008 10:07 PM
BTW, just to clear up one other thing that may cause confusion, since Idahoev said:
---
You can only successfully compress when some sequences occur more frequently than others, so that you encode the frequent ones with short strings and the infrequent ones with long strings, and your net size of message + symbol table decreases. It doesn't work on random strings.
---

This is not accurate at all. And because I like T-Stone so much, I'll use the Wiki article to demonstrate it.

Wiki says:
---
...[T]he following string:

25.888888888

...can be compressed as:

25.[9]8

Interpreted as, "twenty five point 9 eights", the original string is perfectly recreated, just written in a smaller form.
---
Idahoev makes a noble attempt to get through, but fails. He explains that compression is available "when some sequences occur more frequently than others", and points out that this "doesn't work on random strings". So, in response to this, Peter pulls out an example string ("25.888888888") that is precisely the kind of string Idahoev told him is compressible, and which is ridiculously far from a statistically random distribution of its symbols. It's as if Peter thinks "25.888888888" has a highly random distribution of symbols drawn from the space of digits (and apparently the '.' character). The string in question has the digit "8" occurring in 9 of its 12 positions, making it close to the theoretically least statistically random string available for that length ("888888888888", or any 12-digit string consisting of a single repeated symbol).

Peter tells Idahoev that his claims are "not accurate at all", then goes on to provide examples that affirm precisely what Idahoev said. How would Peter write this string "in a smaller form": "439806931752"? That would be a better test of what Idahoev was telling him.
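A toy run-length encoder, my own sketch in the spirit of the Wiki example Peter quotes above, makes the contrast plain: it shrinks "25.888888888" nicely and does nothing at all for a digit string with no runs in it:

#include <iostream>
#include <string>

// Toy run-length encoder: a run of n identical characters (n >= 4) is
// written as "[n]c"; shorter runs are copied through unchanged.
std::string rle(const std::string& s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ) {
        std::size_t j = i;
        while (j < s.size() && s[j] == s[i]) ++j;   // find the end of the run
        std::size_t run = j - i;
        if (run >= 4)
            out += "[" + std::to_string(run) + "]" + s[i];
        else
            out.append(run, s[i]);                  // too short to pay off
        i = j;
    }
    return out;
}

int main() {
    std::cout << rle("25.888888888") << '\n';   // "25.[9]8"   -- shrinks
    std::cout << rle("439806931752") << '\n';   // unchanged   -- nothing to encode
}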
In the same way, random data (especially binary data that only has 2 digits to use) will form "runs" that can be compressed in this manner. Suppose you get the sequence: 1000000000.

That can be compressed (as per T-Stone's beloved Wiki!) as: 1[9]0.

So even ignoring everything else with the symbol table, Wiki agrees with me that random data can be compressed so long as there are "runs" in it. And with "pure" random data...there will be runs.
"1000000000" is not a string that registers high for statistical randomness. By definition, if a string can be compressed into a program that is smaller than the echo of the string itself, it's not maximally random. So, algorithmic information theory provides a mathematical basis for identifying and measuring the randomness(complexity) of a string. There will be repetitions for any string of more than a few symbols, but it is the nature of randomness, proven my the maths of the theory, that the repetitions and patterns contained are not compressible, even in principle. No way, no how. Maximally random strings defy compression, tautologically, as that is what we mean when we assign the label "maximally complex" or "maximally random". If the reader consults the Wikipedia article Peter linked to, it will be seen that the strings in question are NOT presented as random strings. These are "best case" non-random strings, the opposite configuration of strings we are focusing on in terms of complexity and incompressibility.

Peter Pike 8/15/2008 8:09 AM
As I was dropping off to sleep last night, I think I discovered where the problem in understanding my position is coming in for folks like Idahoev. Idahoev would be correct if we were forced to use the same symbol "alphabet" to compress data as we used to create it. While this restriction is necessary to get computers to communicate with each other, it's an artificial restriction when it comes to information as a whole.
Information Theory and Algorithmic Complexity assume symmetric representations on both sides. It doesn't matter what alphabet gets used, as all data gets rendered as binary -- a stream of bits -- ultimately anyway. Alphabets and other symbols are just useful mnemonics for human conceptualization. Peter's computer doesn't store the letters "Peter" as letters on his hard disk when he saves his name in a document. What gets stored is "0101000001100101011101000110010101110010", and even that is a human-friendly set of symbols representing alternate magnetic states. Everything gets crushed down to bits, ultimately.
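Anyone with a compiler can watch that crushing happen; this throwaway snippet of mine dumps exactly the bit string quoted above:

#include <bitset>
#include <iostream>
#include <string>

int main() {
    // What actually gets stored when "Peter" is saved: a stream of bits.
    std::string name = "Peter";
    for (unsigned char c : name)
        std::cout << std::bitset<8>(c);
    std::cout << '\n';   // 0101000001100101011101000110010101110010
}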
As a quick example, suppose that Adam writes a program and compiles it so it's in binary form. He wants to send it to Bill, but Bill lacks a compiler, an internet connection, and a portable storage device. He does, however, have the ability to type 1s and 0s into a program on his computer and have it save it as a binary file.

Adam wants to send the program to Bill so he can have it too, but when he prints out the binary file (as 1s and 0s) it is much too long to mail inexpensively. Therefore, he compresses it using the compression technique I already showed (i.e. "00" = "A", etc.), and drops that compressed form in the mail. Bill gets it, converts it back to binary, inputs it into his computer and has the program.

Now this technique is impossible for computers (although it may be possible for a quantum computer) since they can only function in binary. However, binary is very inefficient for humans, which is why we don't speak binary. The English alphabet serves as a meta-alphabet for the binary alphabet, and you can compress the binary alphabet in the English meta-alphabet (and then transmit it in the meta-alphabet before converting back to the binary alphabet).
This has been dealt with adequately above. Peter is confused about the binary representation of data, and supposes that "00" (binary) is bigger in terms of storage than "A" (ASCII), or "A" (2-bit symbols {A|B|C|D}). The missing idea for Peter is that when you have more than two symbols ({0|1}) in your symbol set, you need more than one bit to represent each symbol. The lower-case 26-letter alphabet requires 5 bits per letter, in other words. With 6 bits you can represent the 26 letters in both cases plus the digits 0-9, and so on. As your symbol space gets larger, the space you need to allocate for each symbol grows. This is what Peter does not understand, and it is what defeats his compression strategies (if what he were saying were true, he'd be worth billions for his earthshaking innovations!).
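The relationship is simply ceil(log2(N)) bits per symbol for an N-symbol alphabet; here's a quick sketch of mine tabulating it for the alphabet sizes that have come up in this thread:

#include <cmath>
#include <iostream>

int main() {
    // Bits needed per symbol for a few alphabet sizes: ceil(log2(N)).
    int sizes[] = {2, 4, 26, 62, 256};
    for (int n : sizes)
        std::cout << n << " symbols -> "
                  << static_cast<int>(std::ceil(std::log2(static_cast<double>(n))))
                  << " bits each\n";
}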
If you keep that in mind (the difference between the alphabet and the meta-alphabet) it should help you understand my original point. Even if random strings cannot be compressed using the alphabet, some of them can still be compressed using a meta-alphabet. However, my argument is that there are still some strings that remain that are incompressible in both the alphabet and the meta-alphabet, and those would be the correct definition of "maximally complex." And because those strings must obey certain rules, the "maximally random" string is not the "maximally complex" string.
This is completely confused. Peter ends up no more clued-in than he began. What are the "certain rules" that a maximally complex string must obey, Peter? If they are certain, please enumerate them. Here is a way for Peter to refute the whole of what I've been saying, along with the rest of the world that relies on Shannon, Kolmogorov, Solomonoff and Chaitin: if Peter can enumerate and demonstrate the "rules" that make a string maximally complex and yet NOT maximally random, he will have shown me up and launched a revolution in information theory -- Famous by Friday, as they say. As it is, I predict that these rules will remain unenumerated and undemonstrated, and we will not see a revolution in information theory, because Peter is just making all this up as he goes, and covering as best he is able when his blunders are pointed out. Strings with the attributes that defy further compression (small patterns, fairly and unpredictably distributed) are the conceptual basis for the term "random" in information theory and algorithmic complexity. What makes a string maximally complex/random is its *lack* of rules -- or the single rule that no rules apply which could be used as the basis for further compression.

-Touchstone

3 comments:

zilch said...

Thank you, touchstone, for the clear and accessible explanation. I'm afraid, based on my own experience at Triablogue, that your fisking will just be ignored: they have the Truth already, and whatever doesn't fit into their little Bible-shaped box will simply be disregarded. But for those of us interested in how the real world works, it's nice to have our intuitions präzisiert (made more precise).

Touchstone said...

zilch,

I'm under no illusions about the reception of this stuff over at Triablogue. Just a case of 'keeping on it', documenting a pattern of BSing over there that is all too common. It's a pain to document it all, as they can spew it typically much faster than it can be properly exposed for what it is. But in this case, the errors are straightforward and can be demonstrated relatively easily.

-TS

MH said...

Its so nice to see them completely mess up a subject I know more about than is typical on debunking christianity. I work on AI as well as with networking both of which touch compression.

They're like children groping around in the dark. Like all other subjects they dont even try to understand the existing science.