Taming the Serpent

Words, Music, and Information

by Edward M. Wysocki, Jr.

How many of you would expect to find an article on Information Theory in Astounding?

Actually, there were four such articles by J. J. Coupling. These were “Chance Remarks” (October 1949), “Ergodic Prediction” (February 1950), “Science for Art’s Sake” (November 1950), and “Don’t Write: Telegraph!” (March 1952).

“J. J. Coupling” was the name that John Robinson Pierce (1910 – 2002) used for all but one of his articles that appeared in Astounding and Analog. Pierce also wrote a number of short works of science fiction. All of his fiction appeared with some variation of John R. Pierce as the author.

Just who was Pierce? All of his degrees were from the California Institute of Technology, and all were in Electrical Engineering: B.S. in 1933, M.S. in 1934, and Ph.D. in 1936. After he received his Ph.D., Pierce went to work at Bell Labs. He retired from Bell Labs in 1971, and returned to Caltech as a member of the engineering faculty. He became professor emeritus of engineering in 1980.

In 1980, Pierce retired a second time. From 1980 to 1983, he assumed the part-time post of chief technologist at the Jet Propulsion Laboratory (JPL). In 1983, he moved to Stanford as visiting professor of music associated with the Stanford Center for Computer Research in Music and Acoustics (CCRMA).

Now for a look at Claude Elwood Shannon (1916 – 2001). He attended the University of Michigan from which he graduated in 1936 with two bachelor’s degrees. One was in Electrical Engineering and the other was in Mathematics. In the spring of 1936, Shannon noticed a card posted on a bulletin board. It was an offer to come to MIT as a master’s student and as an assistant on the Differential Analyzer of Vannevar Bush.

Bush had joined the Department of Electrical Engineering at MIT in 1919. By 1932, he had been appointed a vice president of MIT and then Dean of the MIT School of Engineering. By the time that the United States had entered World War II, Bush had been made director of the Office of Scientific Research and Development.

The differential analyzer was an analog computer that was used to solve complex differential equations by mechanical means. Such systems were also built at other universities and laboratories. A short film segment showing the differential analyzer built at UCLA appeared as part of the design process for the rocket in the 1950 film Destination Moon. The same segment also appeared in the 1951 film When Worlds Collide. In this film, it was first referred to as “DA,” which required an explanation that the abbreviation stood for differential analyzer.

The differential analyzer was a complex system that had to be reconfigured from one problem to the next. Imagine having to rebuild your digital computer to solve a new problem instead of simply loading and running a new program. Shannon worked with the complex collection of switches and relays that made the reconfiguration task easier to perform.

Designing complex switching circuits was in those days a combination of intuition with trial and error. In addition, the final design was not necessarily the most efficient solution. The answer that came to Shannon was that switching circuits could be described by Boolean algebra. When the relationship between the inputs and outputs of a circuit was described by a complex Boolean expression, basic rules could be used to simplify the expression and reduce the number of switches and relays needed to perform the same function. This approach, which was described in his master’s thesis, “A Symbolic Analysis of Relay and Switching Circuits,” is the basis for all digital design.
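For readers who would like to see the idea in action, here is a minimal Python sketch. The three-input network is a hypothetical example of my own choosing, not one taken from Shannon's thesis. It checks, by trying every combination of inputs, that a circuit needing six contacts behaves exactly like one needing only three.

```python
from itertools import product

def original(a, b, c):
    # Hypothetical relay network: three parallel branches of series contacts
    return (a and b) or (a and not b) or (b and c)

def simplified(a, b, c):
    # Boolean algebra: (a AND b) OR (a AND NOT b) = a,
    # so the whole network reduces to a OR (b AND c)
    return a or (b and c)

def equivalent(f, g, n_inputs):
    """Return True if f and g agree on every combination of inputs."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product([False, True], repeat=n_inputs)
    )

print(equivalent(original, simplified, 3))
```

The exhaustive check stands in for the algebraic proof. It is practical only for small circuits, which is exactly why rules for simplifying the expression directly were so valuable.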

Then it was necessary to find a suitable topic for his doctoral dissertation. The suggestion by Bush was a departure from his previous work – genetics. Shannon’s period of study at the Cold Spring Harbor Laboratory led to his dissertation “An Algebra for Theoretical Genetics.”

Shannon received a National Research Fellowship in 1940 that permitted him to spend a year at the Institute for Advanced Study at Princeton, NJ. Before beginning at Princeton, he spent the summer at Bell Labs. He discovered that his master’s degree work on the design of digital circuits was being used at the Labs.

Following his time at Princeton, Shannon returned to Bell Labs, where he was to remain until 1956. One of Shannon’s close friends at Bell Labs was John Pierce. During the war, Shannon was involved in the development of fire control systems and then with cryptography. The work in cryptography is recognized as the basis for his work in communication theory.

Shannon’s key paper, “A Mathematical Theory of Communication,” appeared in the July and October 1948 issues of the Bell System Technical Journal. The topics presented by Pierce in his four Astounding articles are derived from Shannon’s work.

Shannon remained at Bell Labs until 1956, when he returned to MIT. He remained a member of the MIT faculty until 1978.

My presentation of the life and accomplishments of Claude Shannon has been, of necessity, quite brief. If you wish to learn more, I would strongly suggest one of my references, A Mind at Play: How Claude Shannon Invented the Information Age by Jimmy Soni and Rob Goodman.

Now let us consider each of the four articles.

Pierce pointed out at the beginning of “Chance Remarks” that there was much material contained in Shannon’s 1948 paper, far too much for a brief summary in a magazine article. He therefore focused on certain sections of the paper. The first was “The Discrete Source of Information,” which deals with the statistical nature of English.

It is reasonable to assume that someone who works for a company that transmits written or spoken English would be interested in communications. What a message means in such a situation, if anything, is of no concern. It might be total gibberish, but the objective is to ensure that what is received is the correct gibberish.

You can transmit any combination of letters in a message. Of all of the possible combinations of letters, most would not be recognizable as English. “Chance Remarks” looked at various means of generating sequences of letters and words that may or may not be seen as proper English.

Pierce started with Shannon’s statement that written English is redundant. This means that more symbols are transmitted than are actually needed. We can discard letters and still have a message that we can read and understand. One such example is “MST PPL HV LTTL DFFCLTY N RDNG THS SNTNC.” By various methods, Shannon estimated that English is about 50% redundant. This led him to an interesting observation regarding crossword puzzles.
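The shortened sentence can be reproduced with a few lines of Python. This is only a sketch: the full sentence is my reconstruction of what the abbreviated one stands for, and dropping vowels is merely one crude way of discarding symbols, not the method Shannon used to arrive at his estimate.

```python
def drop_vowels(text):
    """Remove the vowels A, E, I, O, U, keeping consonants, Y, and spaces."""
    return "".join(ch for ch in text if ch.upper() not in "AEIOU")

# Assumed full form of the example sentence
full = "MOST PEOPLE HAVE LITTLE DIFFICULTY IN READING THIS SENTENCE"
short = drop_vowels(full)

print(short)
print(f"{len(short)} of {len(full)} characters kept")
```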

If there were zero redundancy in English, any sequence of letters, such as RXKHRJFFJUJ, would be a reasonable word. Any two-dimensional arrangement of letters would be a crossword puzzle. If the redundancy were too high, there would be less freedom of choice in the sequence of letters, and it would not be possible to construct crossword puzzles. The amount of redundancy in English is just at the level that allows the construction of complex crossword puzzles.

Pierce then presented examples taken from the section “The Series of Approximations to English” in Shannon’s paper. I will only present two of them here.

The first example involved randomly selecting among 27 symbols (the alphabet plus a space). This generated “XFOML RXKHRJFFJUJ ZLPWCFWKCYJ FFJEYVKCQSGHYD QPAAMKBZAACIBZLHJQD.” All of the symbols are independent, and no attempt was made to respect the letter frequencies of English text. Successive examples introduced conditions regarding the selection of the next letter or word.
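This first scheme is easy to try for yourself. The sketch below assumes nothing more than Python's standard random number generator. The seed is arbitrary, and the output will not match Shannon's string, which was produced by other means.

```python
import math
import random
import string

SYMBOLS = string.ascii_uppercase + " "  # 26 letters plus the space = 27 symbols

def zero_order_text(length, rng):
    """Pick each symbol independently and with equal probability."""
    return "".join(rng.choice(SYMBOLS) for _ in range(length))

rng = random.Random(1948)  # arbitrary seed so the run is repeatable
print(zero_order_text(60, rng))

# With 27 equally likely symbols, each one carries log2(27) bits
print(round(math.log2(len(SYMBOLS)), 2))
```

The second printed value, a little under 5 bits per symbol, is the most information a 27-symbol alphabet can carry per symbol. Real English, being redundant, carries less.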

The last of Shannon’s examples required the use of a novel. Two words were chosen at random. Then the next occurrence of the second word was found. The word following it was placed in the generated text. Then the next occurrence of that word was found. This process was repeated to give the result “THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.”
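The same procedure can be sketched in Python. Instead of leafing through a novel, the sketch records in advance every word that follows each word in the text and then picks one of those followers at random, which amounts to the same thing statistically. The short "novel" here is a made-up stand-in; any long text could be substituted.

```python
import random
from collections import defaultdict

def build_successors(words):
    """Map each word to the list of words that follow it in the text."""
    successors = defaultdict(list)
    for current, following in zip(words, words[1:]):
        successors[current].append(following)
    return successors

def generate(words, length, rng):
    """Start at a random word, then repeatedly pick a random recorded successor."""
    successors = build_successors(words)
    word = rng.choice(words[:-1])  # any word that has at least one successor
    output = [word]
    while len(output) < length:
        choices = successors.get(word)
        if not choices:  # dead end: the word appears only at the very end
            break
        word = rng.choice(choices)
        output.append(word)
    return output

# Made-up stand-in for the novel
novel = ("the head of the class told the time and the problem of the day "
         "to the writer of the letters and the writer told the class").upper().split()

print(" ".join(generate(novel, 20, random.Random(1948))))
```

Every pair of adjacent words in the output occurs somewhere in the source text, yet the passage as a whole wanders, just as Shannon's example does.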

The remainder of the article was concerned with techniques developed by Pierce. He wondered if more elaborate statistical methods could rule out word combinations that did not make sense. By this approach, he hoped to improve upon Shannon’s last example.

But what are the additional statistical methods and how could they be employed? Pierce observed that such statistics must reside in the human brain. One approach was to show only the last three words of a passage to someone and ask for the next word. There still remains an element of chance in such a process. The choice among possible acceptable words would vary from person to person.

Pierce started with “When the morning” and asked 21 people in turn to suggest the next word to follow the last three. His result was “When the morning broke after an orgy of abandon he said her head shook quickly vertically aligned in a sequence of words follows what.” As we might expect, only short segments of the result appear to make sense. If this passage had been written by one person, it might be said that the person’s mind had wandered.

This leaves open the question of the construction of coherent written English. It would seem that a definite purpose exists in the writer’s mind. Or, as Pierce proposed, is the writer unconsciously making use of some long-range statistical rules?

One result of further experiments by Pierce was “It happened one frosty look of trees waving gracefully against the wall.” The appearance of this sentence led Pierce to speculate about art. One might argue how much of art lies in the work of the artist. In the examples presented, particularly the last, there is no artist. There is only the likelihood that words will occur in a certain order. Does the reader understand them or have an aesthetic appreciation? We will return to the question of art in the discussion of one of the other articles.

The second article, “Ergodic Prediction,” is the shortest of the four – just over a thousand words. It is related to the first article in that it is concerned with the generation of sequences of words. The word ergodic is not one that is in everyday use, so I feel that I must briefly discuss it. Pierce referred to scientific prophecy and noted that it was only possible to say what might happen on the basis of past behavior. By assuming that the future statistics of a process are the same as its past statistics, we have a basis for prophecy or prediction. Such a process is called ergodic. There are more rigorous definitions, but there is no need to introduce them here. Pierce then referred to Shannon’s 1948 paper and noted that to some degree written English can be regarded as an ergodic process.

All of Pierce’s discussion to this point formed the introduction to the main part of his article. How seriously one should consider the remaining 60% of the article should be clear from its date of April 1.

It is a report concerned with a weapon developed by the Nazis during the war. The Nazi scientist making the report was Dr. Hagen Krankheit, who gained entry to the United States after the war by posing as a rocket engineer.

The weapon, known as the Müllabfuhrwortmaschine, was supposed to be a means of automatically generating propaganda. The similarity in appearance to digital computers such as ENIAC and MANIAC was noted. It was suggested that the original idea may have been stolen by the Russians.

The process was demonstrated by randomly selecting cards from different decks labeled “entities” and “operators.” We can recognize that such a process is related to the various schemes described in “Chance Remarks” in attempts to generate English text. The results of those schemes were not understandable. How was a successful result supposedly accomplished in the weapon?

Dr. Krankheit said that a great deal of labor had been involved, “but had been made easier by the fact that propaganda does not have to make sense as long as it achieves its objective.”

A government committee spokesman denied the existence of such a machine in the United States. A quoted section of his comments, however, sounded suspiciously like the examples generated by Dr. Krankheit. The same may be said of comments by a Russian spokesman who denied that his country would use such a device, but concluded that its true inventor was “an as yet unnamed Russian scientist.”

There is one interesting addition to my description of this article. In A Mind at Play, one of the notes at the end of Chapter 16 refers to an unpublished spoof by Shannon. The note mentions Dr. Krankheit, the full German name of the weapon, and the same examples of generated propaganda. This makes it perfectly clear that Soni and Goodman were incorrect in calling the spoof unpublished, as it is found in the pages of Astounding. Pierce stated that he obtained the material from “a man who is interested in cybernetics, communication theory and prediction.” Although he did not say that the material came from Shannon, it is obvious that is who he meant.

In “Science for Art’s Sake” (SFAS), Pierce began by repeating some of what he said in “Chance Remarks” about Shannon’s work regarding the nature of written English. His next step was to extend this approach to the arts, specifically music. Three short compositions were presented, with a very brief explanation of how they were generated. These were labeled Random I, II and III. A longer piece called Canon I was also shown with no explanation at all of how it was generated, other than that it involved rolling a die.

This is the piece “Random II” that appeared in SFAS.

In the course of my research for this article, I encountered a Ph.D. dissertation that included a discussion of Pierce and SFAS. It also included “Random II.” The dissertation was “The Computational Attitude in Music Theory” by Eamonn Bell at Columbia University in 2019. Dr. Bell is now an Assistant Professor in the Department of Computer Science at Durham University.

Here we have a dissertation that referred to an article in Astounding. There have probably been a number of dissertations concerned with science fiction from Astounding or Analog. How many other than Bell’s have cited a work of nonfiction that appeared in this magazine? Also, the usual case would be for an article to refer to a discovery or theory from someone’s dissertation, not the other way around.

The section of the dissertation that discussed SFAS was titled “Three Pioneers.” The first part was not concerned only with Pierce, but also with Shannon. This was not Claude Shannon, however, but his wife Mary Elizabeth “Betty” Shannon (1922 – 2017). With a degree in mathematics from the New Jersey College for Women (now part of Rutgers University), she went to work at Bell Labs in 1944 where she was employed as a “computer.” Her boss was John Pierce. He introduced Betty to Claude in 1948 and they were married in March 1949.

The work that described their experiments in music was a Bell Labs Technical Memorandum from November 1949, “Composing Music by a Stochastic Process” by J. R. Pierce and Mary E. Shannon. Discussions of her life and this memo have commented that it was unusual for a woman to get her name on such a report, even if she had done the calculations on which it was based.

I need to explain a stochastic process. The word stochastic derives from a Greek word στόχος meaning aim or guess. A strict definition of a stochastic process is “a collection of random variables that is indexed by some mathematical set.” The index set could be the integers interpreted as a time line. The random variable at each point in time is determined by a process that could be as simple as flipping a coin or rolling a die. This example is only one of many types of stochastic processes. The selection of letters as described in “Chance Remarks” was also a stochastic process.

With regard to the composition of music, the simplest approach would involve numbering the 12 notes within an octave. One could then roll a 12-sided die to determine the next note to appear. Would the result be pleasing to listen to? Probably not. Something different was required.
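This simplest approach takes only a few lines of Python. The note names and the seed are my own choices for illustration. This is the naive scheme described above, not the method of the Pierce and Shannon memo.

```python
import random

# The 12 notes within an octave, numbered 1 through 12 by position
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def random_melody(length, rng):
    """Roll a fair 12-sided die once per note; each roll ignores what came before."""
    return [NOTES[rng.randint(1, 12) - 1] for _ in range(length)]

print(" ".join(random_melody(16, random.Random(1949))))
```

Because each roll is independent of the one before, nothing in the process favors the small steps and familiar progressions that the ear expects.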

As the Bell Labs memo appeared about a year before SFAS, I was naturally curious as to any connection. It is a bit difficult to get your hands on a 72-year-old Bell Labs internal memo. After much fruitless search, I was able to get into contact with Dr. Bell, who kindly provided me with a copy.

What was in the memo? The first 7 pages explained the various approaches used to generate music. The next 19 pages consisted of “catalogs” of chords. The final three pages showed 5 short pieces of music composed according to the different approaches. The connection with SFAS was quickly established as 3 of these 5 pieces appeared in SFAS as Random I, II and III.

Three approaches for generating music were described. The first involved a catalog of 68 chords, where the method used to select the next chord was described as using a table of four-figure random numbers. This catalog was not included in the memo. It was used to generate what appeared in SFAS as Random I.

The next approach used a long catalog of 260 chords. Appendix I of the memo contained tables of numbers representing these chords. The method for generating the next chord involved the generation of random numbers and the use of a complex set of rules. One piece was generated by this method, but was not repeated in SFAS.

A third approach used the same method as the second, but with a shorter catalog of only 59 chords. This was a subset of the longer catalog and appeared as Appendix II. This approach was used to generate the remaining three pieces. Two of these appeared in SFAS as Random II and Random III.

For those with an interest in what such compositions sound like, a piano performance of Random II appears as “Music by Chance” at https://www.youtube.com/watch?v=nEKLH-X5jCk.

Another article by Pierce on the subject of science and art is “Portrait of the Machine as a Young Artist,” which appeared in the June 1965 issue of Playboy.

And now for the fourth article, “Don’t Write: Telegraph!” It differs from the other three that I have discussed. The subject was communication over interplanetary and interstellar distances. This subject seems more suitable to a science fiction magazine than the statistics of language and the means of generating music by a stochastic process.

Most of the article looked at the technical details connected with such long-distance communication. It was only at the end of the article that Pierce referred to the work of Shannon.

In the discussion of such long-distance communications, it should be noted that only one such action had been taken prior to the appearance of this article. Project Diana had the objective of bouncing a radar signal off of the Moon and receiving the reflected signal. This was accomplished in January 1946. The experiment showed that signals could penetrate the ionosphere. Without such a capability, it would not be possible to communicate by radio with any bodies in space.

Pierce began by discussing the problem of communication on Earth. At the time, the best system for long-distance communication was by microwaves. Such signals travel in straight lines, so the curvature of the Earth requires a repeater about every thirty miles. The biggest problem is that each repeater in the system will introduce noise into the signal. There is also fading of the signal and other problems due to transmission through the atmosphere.

In contrast, the only problem affecting communication beyond Earth is distance. Pierce considered three distances: to the Moon, to Mars, and to the stars.

The first factor is that there will always be divergence of the beam, with the received power decreasing as the square of the distance. Other technical factors to be considered in such communications include the noise received along with the signal, the noise generated by the receiver, sizes of both the transmitting and receiving antennas, and the bandwidth of the signal being transmitted.
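To get a feeling for what the square of the distance means, the following Python sketch compares the three cases. The distances are round figures that I have assumed for illustration; they are not the values used by Pierce, and everything other than distance is ignored.

```python
import math

# Approximate distances in kilometers (assumed round figures, not Pierce's values)
KM_PER_LIGHT_YEAR = 9.461e12
DISTANCES_KM = {
    "Moon": 3.84e5,
    "Mars (closest approach)": 5.6e7,
    "Alpha Centauri": 4.37 * KM_PER_LIGHT_YEAR,
}

def extra_loss_db(distance_km, reference_km):
    """Extra power loss in decibels relative to a reference distance.

    Received power falls as 1 / distance**2, so the power ratio is
    (distance / reference)**2, which is 20 * log10(distance / reference) dB.
    """
    return 20 * math.log10(distance_km / reference_km)

moon = DISTANCES_KM["Moon"]
for name, distance in DISTANCES_KM.items():
    print(f"{name}: {extra_loss_db(distance, moon):.1f} dB weaker than from the Moon")
```

With these assumed figures, a signal from Mars at closest approach arrives roughly 43 dB weaker than the same signal from the Moon, and one from Alpha Centauri roughly 160 dB weaker. Every 10 dB is a factor of ten in power.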

I will not present here any of Pierce’s assumptions regarding these technical factors or the results of his calculations. He showed that communications with the Moon or Mars present no great technical problems. Of course, we know this as historical fact on the basis of what has been accomplished in the 70 years since the article appeared. The result that he obtained regarding communication with Alpha Centauri indicated the need for an impractical amount of transmitter power.

Pierce ended his article by presenting Shannon’s most important result. Here he referred not to Shannon’s paper, but to the book The Mathematical Theory of Communication by Claude Shannon and Warren Weaver.

Shannon’s approach was to convert any message into binary digits (bits). His objective was to determine the rate at which information can be reliably transmitted, measured in bits per second. Consider the formula

C = B log2 ((PS + PN) / PN) = B log2 (1 + (PS / PN))

where C is the rate in bits per second, B is the bandwidth of the system in cycles per second, PS is the signal power, and PN is the noise power. The value C is known as the channel capacity. I am using C, to reflect modern usage, rather than the letter used by Pierce in his article.

This formula is a correction to what appeared in Pierce’s article, which had the term (PS – PN). In a case presented by Pierce, what happens if PS is equal to PN? With the corrected formula, we have log2(2) or 1, which implies that C = B. With the formula as presented in the article, we would have log2(0), which is undefined.
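The comparison is easy to check numerically. In this Python sketch, the bandwidth of 3000 cycles per second is a hypothetical value, and the second function is my reading of the printed form, with (PS – PN) substituted for (PS + PN).

```python
import math

def capacity(bandwidth_hz, signal_power, noise_power):
    """Channel capacity in bits per second: C = B * log2(1 + PS / PN)."""
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

def capacity_as_printed(bandwidth_hz, signal_power, noise_power):
    """Assumed printed form, with (PS - PN) in place of (PS + PN)."""
    return bandwidth_hz * math.log2((signal_power - noise_power) / noise_power)

B = 3000.0  # hypothetical bandwidth in cycles per second

print(capacity(B, 1.0, 1.0))  # PS equal to PN gives C = B

try:
    capacity_as_printed(B, 1.0, 1.0)
except ValueError as err:
    # log2(0) is undefined, so Python refuses to compute it
    print("printed form fails:", err)
```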

There are different formulas for C in different situations. This particular formula applies to the case known as additive white Gaussian noise (AWGN). “White” implies uniform power across the frequency band. “Gaussian” implies a normal distribution in the time domain. AWGN does not account for many of the signal problems associated with communications on Earth, but is useful for modeling space communications.

What does the value of C tell us?

Assume that you wish to transmit information at the rate of R bits per second. If R < C, it is possible to encode the information such that the probability of error may be made arbitrarily small. What happens if R > C? In such cases, there is a bound on how small the probability of error may be made.

To obtain a suitably small level of error requires that the information must be properly encoded. In error correcting codes, redundant bits are added to the original pattern of bits. How many bits are added and how their values are computed varies from code to code. Calculations performed on the received message involving both the original and redundant bits allow any errors to be detected and corrected.
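One of the simplest codes of this kind is the Hamming (7,4) code, which adds three redundant bits to every four data bits and can correct any single flipped bit. I offer the Python sketch below only as an illustration of the principle. It is not a code discussed by Pierce, and it is far simpler than the codes used on modern spacecraft.

```python
def encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode(received):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(received)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three checks, read as a binary number, give the position of the error
    error_position = s1 + 2 * s2 + 4 * s3  # 0 means no error detected
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

message = [1, 0, 1, 1]
sent = encode(message)
damaged = list(sent)
damaged[4] ^= 1  # noise flips one bit in transit
print(decode(damaged) == message)
```

The price of this protection is that seven bits must be sent for every four bits of information, which is the kind of trade between redundancy and reliability that Shannon's result puts a limit on.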

As an example of what may be achieved, consider the New Horizons spacecraft that provided the images of Pluto and other Kuiper Belt objects. The images were encoded using Turbo codes, first developed in the 1990s, which permit communication at very close to the maximum channel capacity for a given noise level. Given the output power of the spacecraft’s transmitter and the great distance involved, the data rate from New Horizons to provide reliable communication was only 1 to 2 Kbps.

Although Shannon’s work was presented by Pierce in the context of interplanetary and interstellar communication, his results affect everyone today. The basis of our entire world is digital, something that could not have been foreseen by Shannon when he presented his discoveries in 1948. Without his work showing that digital communications could be made to operate with an acceptably small error level, would others have gone through the stages of development that have given us the devices to create, manipulate, store, and manage our massive flow of digital information?

And with such an effect on our world, how many people even know who Claude Shannon was?

SOURCES:

A useful guide to the interpretation of Shannon’s 1948 paper was a book by John R. Pierce. This was An Introduction to Information Theory: Symbols, Signals & Noise. Published in 1980, it is a revised version of a book published in 1961. The musical score labelled “Random II” in “Science for Art’s Sake” originally appeared in the Pierce/Shannon 1949 technical memo. It was included here by permission of Nokia Corporation and AT&T Archives. My source for information on “Betty” Shannon was “On ‘Composing Music by a Stochastic Process’: From Computers that are Human to Composers that are Not Human” by Haizi Yu and Lav R. Varshney in the December 2017 issue of IEEE Information Theory Society Newsletter.
