Friday, June 14, 2019

Flag Day Post: Self-Ref and Self-Rep

Table of Contents

1. Introduction
2. Pappas Problem of the Day
3. The 490's
4. The 500's
5. The 510's
6. The 520's
7. The 530's
8. The 540's
9. Traditionalists: Fraction Multiplication
10. Conclusion

Introduction

Let me begin by welcoming you to our first summer post, and happy Flag Day! It's a national observance, but one of the lesser known ones. It marks the day in 1777 that the Continental Congress adopted the national flag, with thirteen stripes and only thirteen stars.

As I promised you earlier, today I am making one last post about a Hofstadter Chapter. I choose to present Chapter 16. Why did I choose this chapter? Well, it's the longest Chapter in the book, and thus I obviously didn't describe it completely in my May 1st post.

Notice that the last several Chapters of the book are all long, and if I really wanted to, I could devote four more posts to Chapters 17-20. At the very least, I could cover Chapters 18-19, on artificial intelligence, since I combined those two in last month's post. But as I mentioned earlier, I don't want our spring side-along reading book to slide into summer. Technically, the summer solstice hasn't occurred yet, so we can still read Chapter 16 today. But there will be no more time for any more Chapters after today.

Chapter 16 is preceded by Dialogue 16, "Edifying Thoughts of a Tobacco Smoker." I feel that I covered this Dialogue well enough in my May 1st post, and so today's post is only on the Chapter.

This Chapter spans 54 pages in my edition of the text -- pages 495-548. I decided to divide today's post into sections corresponding to pages of the book -- the 490's, 500's, 510's, and so on. This is probably how we should have originally read the book -- ten pages a day, so that we're not rushing or skipping so often. But then the reading still would have dragged into summer.

Pappas Problem of the Day

Today on her Mathematics Calendar 2019, Theoni Pappas writes:

Find AC + BD in feet.

(Here is the given info from the diagram: In quadrilateral ABCD, its midpoint quadrilateral has sides of lengths 1.5 yd. and 30 in.)

Midpoint quadrilaterals are mentioned in the U of Chicago text but only in an exercise. They are emphasized more in other Geometry texts. Anyway, the important thing to note here is that each side of a midpoint quadrilateral is exactly half as long as the diagonal of the original quadrilateral that it parallels. (Notice that the Midpoint Connector Theorem, which is highlighted in the U of Chicago text, can be used to prove the properties of midpoint quadrilaterals.)

Thus AC = 3 yards and BD = 60 inches. We are asked to find the sum of the diagonals in feet, and so we must use dimensional analysis. Three yards is 9 feet and 60 inches is 5 feet. Therefore the desired sum is 14 feet -- and of course, today's date is the fourteenth.
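
For anyone who wants to check the dimensional analysis, here is a minimal Python sketch of the same arithmetic. The constant and variable names are my own, and I'm assuming (as above) that the 1.5 yd side goes with AC and the 30 in. side with BD; the sum comes out the same either way.

```python
FEET_PER_YARD = 3
INCHES_PER_FOOT = 12

# Each diagonal is twice the midpoint-quadrilateral side parallel to it.
ac_feet = 2 * 1.5 * FEET_PER_YARD      # 1.5 yd side -> 3 yd diagonal -> 9 ft
bd_feet = 2 * 30 / INCHES_PER_FOOT     # 30 in side -> 60 in diagonal -> 5 ft
total_feet = ac_feet + bd_feet

print(total_feet)  # 14.0
```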


The 490's


Chapter 16 of Douglas Hofstadter's Godel, Escher, Bach is called "Self-Ref and Self-Rep." Here's how it begins:

"In this chapter, we will look at some of the mechanisms which create self-reference in various contexts, and compare them to the mechanisms which allow some kinds of systems to reproduce themselves. Some remarkable and beautiful parallels between these mechanisms will come to light."

Oh, so "Self-Ref and Self-Rep" means "Self-Reference and Self-Replication." Let's start with some implicitly and explicitly self-referential sentences:

(1) This sentence contains five words.
(2) This sentence is meaningless because it is self-referential.
(3) This sentence no verb.
(4) This sentence is false. (Epimenides paradox)
(5) The sentence I am now writing is the sentence you are now reading.

The author imagines telling sentence (4) to a child:

"They may say, 'What sentence is false?' and it may take a bit of persistence to get across the idea that the sentence is talking about itself. The whole idea is a little mind-boggling at first."

And so Hofstadter provides us with some diagrams. The first is "This sentence is false," but written in the shape of Epimenides committing suicide. The next is an iceberg, with "Epimenides sentence" above the waterline and "cognitive processes required for understanding the self-reference in the Epimenides sentence" below it. He asks whether it's possible to make something like (1) above into a self-ref without using the phrase "this sentence":

"This is actually possible, but only if you are willing to entertain infinitely long sentences, such as The sentence "The sentence "The sentence "The sentence ... etc., etc. ... is infinitely long" is infinitely long" is infinitely long" is infinitely long. But this cannot work for finite sentences."

The author now reminds us of the Quine sentence: "yields falsehood when preceded by its own quotation" yields falsehood when preceded by its own quotation.

"This resembles a floating cake of soap more than it resembles an iceberg. The self-reference of this sentence is achieved in a more direct way than in the Epimenides paradox: less hidden processing is needed."

And sure enough, the next diagram is of this bar of soap, with "Quine sentence" above the water line and "process required for understanding" below it.

Now Hofstadter shows us a self-reproducing program, written in the BlooP language. Such a program is also called a quine, and the author reverses the name and calls his program "eniuq":

DEFINE PROCEDURE "ENIUQ" [TEMPLATE]: PRINT [TEMPLATE, LEFT-BRACKET, QUOTE-MARK, TEMPLATE, QUOTE-MARK, RIGHT-BRACKET, PERIOD].

ENIUQ
['DEFINE PROCEDURE "ENIUQ" [TEMPLATE]: PRINT [TEMPLATE, LEFT-BRACKET, QUOTE-MARK, TEMPLATE, QUOTE-MARK, RIGHT-BRACKET, PERIOD].
ENIUQ'].

"It is important to realize that the character string which appears in quotes in the last three lines of the program above -- that is, the value of TEMPLATE -- is never interpreted as a sequence of instructions."

Anyway, here is a link to some quines written in other languages:

https://cs.lmu.edu/~ray/notes/quineprograms/
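
To see the ENIUQ idea in a language you can actually run, here is a small sketch assuming Python 3. The two-line quine stored in QUINE_SOURCE is a well-known pattern, not Hofstadter's program, and the helper name run_and_capture is my own. As with TEMPLATE, the string s is only ever treated as data: it is formatted and printed, never interpreted as instructions.

```python
import contextlib
import io

# The text of a two-line Python quine. The string s plays the role of
# TEMPLATE: it is printed once "quoted" (via %r) and once as plain text.
QUINE_SOURCE = "s = 's = %r\\nprint(s %% s)'\nprint(s % s)\n"


def run_and_capture(source):
    """Execute Python source text and return whatever it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


# A quine's output is its own source text.
print(run_and_capture(QUINE_SOURCE) == QUINE_SOURCE)  # True
```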

"Before we call something a self-rep, we want to have the feeling that, to the maximum extent possible, it explicitly contains the directions for copying itself."

By the way, here is another link to some programs suggested by this Hofstadter chapter:

https://www.bamsoftware.com/hacks/geb/index.html

One of them is listed for pages 498-499 of the book. It is a crab program -- a program that reproduces itself backwards -- written in BlooP. The author at the link writes:

Here is a crab program in the BLooP-like language. It assumes the existence of a REVERSE procedure but not any string concatenation operators such as those used by the Python version. You can see how similar it is to Hofstadter’s ENIUQ.
DEFINE PROCEDURE ''CRAB'' [TEMPLATE]: PRINT [PERIOD, RIGHT-BRACKET,
QUOTE-MARK, REVERSE [TEMPLATE], QUOTE-MARK, LEFT-BRACKET, REVERSE [TEMPLATE]].
CRAB['DEFINE PROCEDURE ''CRAB'' [TEMPLATE]: PRINT [PERIOD, RIGHT-BRACKET,
QUOTE-MARK, REVERSE [TEMPLATE], QUOTE-MARK, LEFT-BRACKET, REVERSE [TEMPLATE]].
CRAB'].
The program would produce this output when executed.
.]'BARC
.]]ETALPMET[ ESREVER ,TEKCARB-TFEL ,KRAM-ETOUQ ,]ETALPMET[ ESREVER ,KRAM-ETOUQ
,TEKCARB-THGIR ,DOIREP[ TNIRP :]ETALPMET[ ''BARC'' ERUDECORP ENIFED'[BARC
.]]ETALPMET[ ESREVER ,TEKCARB-TFEL ,KRAM-ETOUQ ,]ETALPMET[ ESREVER ,KRAM-ETOUQ
,TEKCARB-THGIR ,DOIREP[ TNIRP :]ETALPMET[ ''BARC'' ERUDECORP ENIFED
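
The reason this works can be checked with a short Python sketch. This is not the Python version from the link; it only mimics what the BlooP-like PRINT statement would emit, with a hypothetical function named crab and with the line breaks inside the program simplified. Reversing a string that is built from pieces is the same as reversing each piece and listing the pieces in the opposite order, which is exactly what the CRAB procedure does.

```python
def crab(template):
    """Mimic CRAB: emit the pieces of the program text back to front."""
    backwards = template[::-1]  # stands in for REVERSE [TEMPLATE]
    # PERIOD, RIGHT-BRACKET, QUOTE-MARK, reversed template,
    # QUOTE-MARK, LEFT-BRACKET, reversed template
    return "." + "]" + "'" + backwards + "'" + "[" + backwards


# Assumption: the program text is the template, then the same template
# again inside brackets and quote marks, then a period.
TEMPLATE = (
    "DEFINE PROCEDURE ''CRAB'' [TEMPLATE]: PRINT [PERIOD, RIGHT-BRACKET, "
    "QUOTE-MARK, REVERSE [TEMPLATE], QUOTE-MARK, LEFT-BRACKET, "
    "REVERSE [TEMPLATE]].\nCRAB"
)
PROGRAM = TEMPLATE + "['" + TEMPLATE + "']."

print(crab(TEMPLATE) == PROGRAM[::-1])  # True
```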

The 500's

"To be sure, explicitness is a matter of degree; nonetheless there is an intuitive borderline on one side of which we perceive true self-directed self-reproduction, and on the other side of which we merely see copying being carried out by an inflexible and autonomous copying machine."

Hofstadter asks, "What is a copy?" (The U of Chicago text also discusses a definition of "copy" when trying to define "congruence.") He gives an example of a self-reproducing song, which a nickelodeon will play if someone presses buttons 11-U:

Put another nickel in, in the nickelodeon,
All I want is 11-U, and music, music, music.

"Some readers might enjoy thinking about how to write such a program in the BlooP-like language above, using the given self-rep as a model."

The author at the above link already does this. Meanwhile, Hofstadter's next example is where Epimenides straddles the channel:

"est une expression qui, quand elle est precedee de sa traduction, mise entre guillemets, dans la langue provenant de l'autre cote de la Manche, cree une faussete" is an expression which, when it is preceded by its translation, placed in quotation marks, into the language originating on the other side of the Channel, yields a falsehood.

"If the notion of 'self-rep by retrograde motion' (i.e., a program which writes itself out backwards) is reminiscent of a crab canon, the notion of 'self-rep by translation' is no less reminiscent of a canon which involves a transposition of the theme into another key. The idea of printing out a translation instead of an exact copy of the original program may seem pointless."

Hofstadter's next example involves a program that prints out its own Godel number. He alludes to this Godelian self-reference in one of his earlier Dialogues, Sonata for Unaccompanied Achilles -- where Achilles is mapped to the violin in one of Bach's sonatas:

"And this mapping is left, of course, for the reader to notice. Yet even if the reader does not notice it, the mapping is still there, and the Dialogue is still a self-ref."

The author's next example involves a self-rep by augmentation -- this is a program that calls itself, but runs at half the speed (so that each loop takes twice as long to run). Another example is a Kimian Self-Rep, referring to programmer Scott Kim.

I actually tried to do the Kimian Self-Rep in Mocha. The idea is to type in an error message, such as ?SN ERROR, and have Mocha print back the same error message. But this doesn't work because, as it turns out, ?SN ERROR is actually a line of code that contains no error! The question mark ? is an abbreviation for PRINT, so the line becomes PRINT SN, where SN is the name of a variable. If it hasn't been initialized, then Mocha assumes that SN=0.

"That is the obverse side of the coin: 'What is the original?' This can best be explained by referring to some examples."

And here are his examples of self-reps:

(1) a program which, when interpreted by some interpreter running on some computer, prints itself out;
(2) a program which, when interpreted by some interpreter running on some computer, prints itself out along with a complete copy of the interpreter (which, after all, is also a program);
(3) a program which, when interpreted by some interpreter running on some computer, not only prints itself out along with a complete copy of the interpreter, but also directs a mechanical assembly process in which a second computer identical to the one on which the interpreter and program are running, is put together.

At this point, Hofstadter jumps into something he calls Typogenetics. It stands for "typographical genetics" and is an artificial solitaire game based on the structure of DNA.

"I have intended in Typogenetics only to give an intuition for those processes centered on the celebrated Central Dogma of Molecular Biology, enunciated by Francis Crick (one of the co-discoverers of the double-helix structure of DNA)."

DNA => RNA => proteins

The game of Typogenetics involves typographical manipulation on sequences of letters. There are four letters involved:

A C G T

Arbitrary sequences of them are called strands. Thus, some strands are:

GGGG
ATTACCA
CATCATCATCAT

"Thus, there are four kinds of enzyme -- those which prefer A, those which prefer C, etc. Given the sequence of operations which an enzyme performs, you can figure out which letter it prefers, but for now I'll just give them without explanation."

Here is the author's example of an enzyme:

(1) Delete the unit to which the enzyme is bound (and then bind to the next unit to the right).
(2) Move one unit to the right.
(3) Insert a T (to the immediate right of this unit)

This enzyme happens to like to bind to A initially. And here's the author's example of a strand:

ACA

If the enzyme binds to the left A and starts working, it becomes CAT. Here is his next example:

(1) Search for the nearest pyrimidine to the right of this unit.
(2) Go into Copy mode.
(3) Search for the nearest purine to the right of this unit.
(4) Cut the strand here (viz. to the right of the present unit).

According to Hofstadter, here "pyrimidine" and "purine" have the same definitions as in DNA, as does "Copy mode," which involves complementary base pairing:

"The complements are shown below. You can perhaps remember this molecular pairing scheme by recalling that Achilles is paired with the Tortoise, and the Crab with his Genes."

purines complement pyrimidines
A          <=>              T
G          <=>              C

This enzyme will act on the following strand:

CAAAGAGAATCCTCTTTGAT

Here is the result:

         AGGAGAAAC
CAAAGAGAATCCTCTTTG

AT
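
Here is a small Python sketch of what Copy mode produces in this example. The function name copy_strand is my own. The enzyme is in Copy mode while it walks across the segment TCCTCTTTG, so each of those bases gets its complement written above it; since the copy sits upside down, reading it as an ordinary strand flips the order.

```python
# Complementary base pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(segment):
    """Return the Copy-mode strand, read in its own left-to-right direction."""
    paired = "".join(COMPLEMENT[base] for base in segment)
    return paired[::-1]  # the copy is upside down, so reading it reverses it


print(copy_strand("TCCTCTTTG"))  # CAAAGAGGA
```

Read backwards, CAAAGAGGA is the AGGAGAAAC shown above the original strand.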

"If the 'switch' command is given, but there is no complementary base where the enzyme is bound at that instant, then the enzyme just detaches itself from the strand, and its job is done."

At this point, the author introduces 15 types of commands, called "amino acids":

cut -- cut strand(s)
del -- delete a base from a strand
swi -- switch enzyme to other strand
mvr -- move one unit to the right
mvl -- move one unit to the left
cop -- turn on Copy mode
off -- turn off Copy mode
ina -- insert A to the right of this unit
inc -- insert C to the right of this unit
ing -- insert G to the right of this unit
int -- insert T to the right of this unit
rpy -- search for the nearest pyrimidine to the right
rpu -- search for the nearest purine to the right
lpy -- search for the nearest pyrimidine to the left
lpu -- search for the nearest purine to the left
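
To get a feel for these commands, here is a rough Python sketch of an enzyme acting on a single strand. It is only a subset: cut, swi, cop, and off are left out, since they need a second strand. The function name run_enzyme is hypothetical, and I'm assuming that after an insertion the enzyme ends up bound to the new unit, and that an enzyme which runs off either end of the strand simply stops.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def run_enzyme(commands, strand, start):
    """Apply single-strand commands, starting bound to index `start`."""
    units = list(strand)
    pos = start
    for cmd in commands:
        if not 0 <= pos < len(units):
            break  # the enzyme has fallen off the strand
        if cmd == "del":
            del units[pos]  # the next unit to the right slides into place
        elif cmd == "mvr":
            pos += 1
        elif cmd == "mvl":
            pos -= 1
        elif cmd in ("ina", "inc", "ing", "int"):
            units.insert(pos + 1, cmd[2].upper())
            pos += 1  # assumption: the enzyme binds to the inserted unit
        elif cmd in ("rpy", "rpu", "lpy", "lpu"):
            step = 1 if cmd[0] == "r" else -1
            wanted = PYRIMIDINES if cmd.endswith("py") else PURINES
            pos += step
            while 0 <= pos < len(units) and units[pos] not in wanted:
                pos += step
        else:
            raise ValueError(f"unsupported command: {cmd}")
    return "".join(units)


# The earlier example: delete, move right, insert T, bound to the left A.
print(run_enzyme(["del", "mvr", "int"], "ACA", 0))  # CAT
```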

"Let us write down an arbitrary enzyme and an arbitrary strand and see how the enzyme acts on the strand."

rpu - inc - cop - mvr - mvl - swi - lpu - int

TAGATCCAGTCCATCGA

Starting with the enzyme bound to the middle G, the result is:

ATG, and TAGATCCAGTCCACATCGA

where the first strand is the one produced in Copy mode, turned right side up.

"Thus, the strands themselves will dictate the operations which will be performed on them, and those operations will in turn produce new strands which will dictate further enzymes, etc. etc.!"

The 510's

"This is mixing levels with a vengeance! Think, for the sake of comparison, how different the MU-puzzle would have been if each new theorem produced could have been turned into a new rule of inference by means of some code."

Hofstadter now shows the following Typogenetic Code, where two bases code for an amino acid:

First Base   Second Base   Amino Acid
A            A             (punctuation)
A            C             cut (s)
A            G             del (s)
A            T             swi (r)
C            A             mvr (s)
C            C             mvl (s)
C            G             cop (r)
C            T             off (l)
G            A             ina (s)
G            C             inc (r)
G            G             ing (r)
G            T             int (l)
T            A             rpy (r)
T            C             rpu (l)
T            G             lpy (l)
T            T             lpu (l)

For example, we convert:

TAGATCCAGTCCACATCGA

into

rpy-ina-rpu-mvr-int-mvl-cut-swi-cop.

"By its primary structure is meant its amino acid sequence. By its tertiary structure is meant the way it likes to 'fold up.'"

For example, the previous enzyme can be folded up (s=straight, r=right, l=left -- think Logo) as:

cop
 ^
swi <= cut <= mvl <= int
                      ^
                     mvr
                      ^
       rpy => ina => rpu

The author assumes that the first segment is always =>. Then the direction of the last arrow gives the binding preference of the enzyme (=> is A, ^ is C, v is G, <= is T).

"If we do so, then the last segment determines the binding-preference, as shown in the figure. So in our case, we have an enzyme which likes the letter C."

Hofstadter now defines punctuation, genes, and ribosomes:

CGGATACTAAACCGA

codes for two enzymes:

cop - ina - rpy - off and cut - cop

The punctuation AA divides the strand into two genes. The ribosome is the mechanism that reads strands and produces enzymes -- the player of the game.
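
Here is a minimal Python sketch of the ribosome, together with the folding rule described earlier. The names ribosome and binding_preference are my own, and this is just one way to code it up. I'm assuming that an unpaired base at the end of a strand is ignored, and that only the interior amino acids bend the chain, since the first segment always points right. The second strand used below is CGGATACTAAACCGA, whose duplets spell out the two enzymes listed above.

```python
# Typogenetic Code: duplet -> (amino acid, fold letter). AA is punctuation.
CODE = {
    "AC": ("cut", "s"), "AG": ("del", "s"), "AT": ("swi", "r"),
    "CA": ("mvr", "s"), "CC": ("mvl", "s"), "CG": ("cop", "r"),
    "CT": ("off", "l"), "GA": ("ina", "s"), "GC": ("inc", "r"),
    "GG": ("ing", "r"), "GT": ("int", "l"), "TA": ("rpy", "r"),
    "TC": ("rpu", "l"), "TG": ("lpy", "l"), "TT": ("lpu", "l"),
}
KINK = dict(CODE.values())  # amino acid -> "s", "r", or "l"

# Headings in counterclockwise order, each with the letter it prefers.
HEADINGS = [("right", "A"), ("up", "C"), ("left", "T"), ("down", "G")]


def ribosome(strand):
    """Translate a strand into a list of enzymes (lists of amino acids)."""
    enzymes, current = [], []
    for i in range(0, len(strand) - 1, 2):  # a leftover final base is ignored
        duplet = strand[i:i + 2]
        if duplet == "AA":  # punctuation closes the current gene
            if current:
                enzymes.append(current)
            current = []
        else:
            current.append(CODE[duplet][0])
    if current:
        enzymes.append(current)
    return enzymes


def binding_preference(enzyme):
    """Follow the kinks and map the final heading to a letter."""
    heading = 0  # the first segment points right
    for amino in enzyme[1:-1]:
        if KINK[amino] == "l":
            heading = (heading + 1) % 4  # left turn: counterclockwise
        elif KINK[amino] == "r":
            heading = (heading - 1) % 4  # right turn: clockwise
    return HEADINGS[heading][1]


enzyme = ribosome("TAGATCCAGTCCACATCGA")[0]
print("-".join(enzyme))            # rpy-ina-rpu-mvr-int-mvl-cut-swi-cop
print(binding_preference(enzyme))  # C
print(ribosome("CGGATACTAAACCGA"))
```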

We are now on page 512. The link above gives some more examples for the game of Typogenetics -- the puzzle is to find a Typogenetical self-rep:

https://www.bamsoftware.com/hacks/geb/index.html

The hardest challenge was the Typogenetical self-rep on page 512:
...it would be most interesting to devise a self-replicating strand. This would mean something along the following lines. A single strand is written down. A ribosome acts on it, to produce any or all of the enzymes which are coded for in the strand. Then the enzymes are brought into contact with the original strand, and allowed to work on it. This yields a set of “daughter strands”. The daughter strands themselves pass through the ribosomes, to yield a second generation of enzymes, which act on the daughter strands; and the cycle goes on and on. This can go on for any number of stages; the hope is that eventually, among the strands which are present at some point, there will be found two copies of the original strand (one of the copies may be, in fact, the original strand).

I spent fruitless hours with pencil and paper, thinking there must be a short, self-evident solution. To cut to the chase, I wrote a program that found these seven self-reps: CGTATCTCCG, CGTATCTCTG, CGTCTCTAAG, CGTCTCTAGG, CGTTTCTTTG, CGTTTTTCTG, and CGTTTTTTTG.

There are more examples of self-rep at the link above. Returning to Hofstadter, his next big idea is the Central Dogma of Typogenetics:

enzymes => strands (typographical manipulation)
strands => enzymes (translation of ribosomes)

He makes analogies between Typogenetics and other formal systems, such as MIU:

rules of inference => strings (typographical manipulation)

"Similarly for TNT, and all formal systems. However, we have seen that in TNT, levels are mixed, in another sense."

Here he's referring to TNT's ability to make statements about itself. At this point, the author compares the strange loops in TNT to real genetics (not our simplified version). In real DNA, for example, the bases are:

purines:
A: adenine
G: guanine

pyrimidines:
C: cytosine
T: thymine

There are several pictures here of the actual molecular structure of these four bases.

"It is the bases which are responsible for the peculiar kind of pairing which takes place between strands."

A picture of the DNA double helix is placed here.

"Single-stranded DNA does not exhibit this kind of coiling, for it is a consequence of the base-pairing. As was mentioned above, in many cells, DNA, the ruler of the cell, dwells in its private 'throne room': the nucleus of the cell."

The nucleus then communicates with the cytoplasm via messenger RNA, or mRNA. Here, mRNA is like DNA except that U, uracil, replaces T, thymine. Then mRNA is transcribed from DNA:

    DNA: ........CGTAAATCAAGTCA........ (template)
mRNA: ........GCAUUUAGUUCAGU........ ("copy")
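
Transcription as shown here is just base-by-base complementing, with U standing in where T would otherwise appear. A minimal Python sketch, with a function name of my own choosing:

```python
# DNA template base -> mRNA base (U pairs with A in place of T).
DNA_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template):
    """Return the mRNA "copy" of a DNA template strand."""
    return "".join(DNA_TO_MRNA[base] for base in template)


print(transcribe("CGTAAATCAAGTCA"))  # GCAUUUAGUUCAGU
```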

"Enzymes belong to the general category of biomolecules called proteins, and the job of ribosomes is to make all proteins, not just enzymes."

Here's a list of the real amino acids:

ala -- alanine
arg -- arginine
asn -- asparagine
asp -- aspartic acid
cys -- cysteine
gln -- glutamine
glu -- glutamic acid
gly -- glycine
his -- histidine
ile -- isoleucine
leu -- leucine
lys -- lysine
met -- methionine
phe -- phenylalanine
pro -- proline
ser -- serine
thr -- threonine
trp -- tryptophan
tyr -- tyrosine
val -- valine

The Genetic Code converts triplets of bases (codons) into amino acids. Unlike Typogenetics, where two bases suffice, three bases are required. This chart is more complex, so I won't post it.

"It could be said that this process of translation is at the very heart of all of life, and there are many mysteries connected with it."

Therefore according to the author, CUAGAU would be divided as CUA-GAU, not Cu Ag Au. (Okay, Hofstadter, I get your little joke here -- Cu Ag Au is "copper, silver, gold.")
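
Dividing a strand into triplets is easy to sketch in Python. The function codons and its frame parameter are my own names; frame is simply how many bases to skip before the first triplet, and any incomplete triplet at the end is dropped.

```python
def codons(mrna, frame=0):
    """Split mRNA into complete triplets, starting at offset `frame`."""
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


print(codons("CUAGAU"))     # ['CUA', 'GAU']
print(codons("CUAGAU", 1))  # ['UAG']
```

Shifting the starting point by one base gives a completely different reading of the same letters.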

"In fact, it is one of the outstanding problems of contemporary molecular biology to figure out some rules by which the tertiary structure of a protein can be predicted if only its primary structure is known."

The 520's

"Another discrepancy between Typogenetics and true genetics -- and this is probably the most serious of all -- is this: whereas in Typogenetics, each component amino acid of an enzyme is responsible for some specific 'piece of the action,' in real enzymes, individual amino acids cannot be assigned such clear roles. It is the tertiary structure as a whole which determines the mode in which an enzyme will function; there is no way one can say, 'This amino acid's presence means that such-and-such operation will get performed.'"

Here Hofstadter shows a picture of the structure of myoglobin, deduced from high-resolution X-ray data.

"It is still possible in principle to write a computer program which takes as input the primary structure of a protein, and firstly determines its tertiary structure, and secondly determines the function of the enzyme."

At this point, the author describes transfer RNA and ribosomes. He points out that the Genetic Code is stored in the DNA itself:

"When a new codon of mRNA clicks into position in the ribosome's 'playing head,' the ribosome reaches out into the cytoplasm and latches onto a clover whose anticodon is complementary to the mRNA codon."

Here there is a picture of a section of mRNA passing through a ribosome.

"Of course it is no accident that 'clovers' carry the proper amino acids, for they have all been manufactured according to precise instructions emanating from the 'throne room.' The real name for such a clover is transfer RNA."

Now the author writes about punctuation and the reading frame. There are three codons for punctuation here, UAA, UAG, UGA. Yet we can't fully tell where the genes actually start and end:

"There is even one gene contained entirely inside another! This is accomplished by having the reading frames of the two genes shifted relative to each other, by exactly one unit."

The author now makes an analogy with proteins, art, and music. All of these contain several different levels of structure and meaning:

"The four levels of primary, secondary, tertiary, and quaternary structure can also be compared to the four levels of the MU-picture in the Prelude, Ant Fugue."

Here there is a picture of a polyribosome, where one strand of mRNA passes through one ribosome after another.

"The corresponding image in music is a rather fanciful but amusing scenario: several different copyists are all at work simultaneously, each one of them copying the same original manuscript from a clef which flutists cannot read into a clef which they can read."

Here there is a picture of an even more complex scheme in mRNA, similar to two-part music. The author moves on to describe protein function and where enzymes bind to other molecules:

"This location is called its active site, and any molecule which gets bound there is called a substrate. Enzymes may have more than one active site, and more than one substrate."

But there is a need for a sufficiently strong support system:

"Now it is futile to hope that a strand of DNA in isolation could be a self-rep; for in order for those potential proteins to be pulled out of the DNA, there must not only be ribosomes, but also RNA polymerase, which makes the mRNA that gets transported to the ribosomes."

The 530's

"And so we have to begin by assuming a kind of 'minimal support system' just sufficiently strong that it allows transcription and translation to be carried out."

According to Hofstadter, DNA self-reps in two steps:

(1) unravel the two strands from each other;
(2) "mate" a new strand to each of the two new single strands.

Three enzymes are required here:

"The precision three-enzyme machine proceeds in careful fashion all the way down the length of the DNA molecule, until the whole thing has been peeled apart and simultaneously replicated, so that there are now two copies of it. Note that in the enzymatic action on the DNA strands, the fact that information is stored in the DNA is just plain irrelevant; the enzymes are merely carrying out their symbol-shunting functions, just like rules of inference in the MIU-system."

He now returns to computers and whether they can be used to find levels of meaning in DNA:

"The output of such a pseudo-epigenesis program would be a high-level description of the phenotype. There is another (extremely faint) possibility: that we could learn to read the phenotype off the genotype without doing an isomorphic simulation of the physical process of epigenesis, but by finding some simpler sort of decoding mechanism."

I assume this is a joke, but according to the author, the following is a section of the DNA of Felis catus (the common house cat):

...CATCATCATCATCATCATCATCATCATCAT...

Now Hofstadter describes the Central Dogmap -- the main analogy of the chapter. He mentions a couple of analogies -- one between molecular biology and mathematical logic, and the other between molecular bio and the Contracrostipunctus Dialogue (broken records and whatnot).

"The mapping from one onto the other is laid out in the figure and the following chart, which together constitute the Central Dogmap. Note the base-pairing of A and T (Arithmetization and Translation), as well as of G and C (Godel and Crick)."

Here is the first analogy:

DOGMA I (Molecular Biology) <=> DOGMA II (Mathematical Logic)
strands of DNA <=> strings of TNT
strands of mRNA <=> statements of N
proteins <=> statements of meta-TNT
proteins which act on proteins <=> statements about statements of meta-TNT
proteins which act on proteins which act on proteins <=> statements about statements about statements of meta-TNT
transcription (DNA=>RNA) <=> interpretation (TNT=>N)
Translation (RNA=>proteins) <=> Arithmetization (N=>meta-TNT)
Crick <=> Godel
Genetic Code (arbitrary convention) <=> Godel Code (arbitrary convention)
codon (triplet of bases) <=> codon (triplet of digits)
amino acid <=> quoted symbol of TNT used in meta-TNT
self-reproduction <=> self-reference
sufficiently strong cellular support system to permit self-rep <=> sufficiently powerful arithmetical formal system to permit self-ref

The author completes this analogy by introducing the Godel Code:

(odd) 1 <=> A (purine)
(even) 2 <=> C (pyrimidine)
(odd) 3 <=> G (purine)
(even) 6 <=> U (pyrimidine)

There is even another chart for the Godel Code, where three symbols code for an "amino acid," or symbol of TNT.

"One can therefore draw parallels between all three systems:"

(1) formal systems and strings;
(2) cells and strands of DNA;
(3) record players and records.

"In the following chart, the mapping between systems 2 and 3 is explained carefully."

And here is the second analogy:

Contracrostipunctus <=> Molecular Biology
phonograph <=> cell
"Perfect" phonograph <=> "Perfect" cell
record <=> strand of DNA
record playable by a given phonograph <=> strand of DNA reproducible by a given cell
record unplayable by that phonograph <=> strand of DNA unreproducible by that cell
process of converting record grooves into sounds <=> process of transcription of DNA onto mRNA
sounds produced by record player <=> strands of messenger RNA
translation of sounds into vibrations of phonograph <=> translation of mRNA into proteins
mapping from external sounds onto vibrations of phonograph <=> Genetic Code (mapping from mRNA triplets onto amino acids)
breaking of phonograph <=> destruction of the cell
Title of song specially tailored for Record Player X: "I Cannot Be Played on Record Player X" <=> High-level interpretation of DNA strand specially tailored for Cell X: "I Cannot Be Replicated by Cell X"
"Imperfect" Record Player <=> Cell for which there exists at least one DNA strand which it cannot reproduce
"Todel's Theorem": "There always exists an unplayable record, given a particular phonograph." <=> Immunity Theorem: "There always exists an unreproducible DNA strand, given a particular cell."

He restates the analog of Godel's Theorem for cells:

It is always possible to design a strand of DNA which, if injected into a cell, would, upon being transcribed, cause such proteins to be manufactured as would destroy the cell (or the DNA), and thus result in the non-reproduction of that DNA.

"This conjures up a somewhat droll scenario, at least if taken in light of evolution: an invading species of virus enters a cell by some surreptitious means, and then carefully ensures the manufacture of proteins which will have the effect of destroying the virus itself!"

Here there is a picture of the T4 bacterial virus (or "phage") and the E. coli bacterium:

"Thus the phage commits 'rape' on a tiny scale. What actually happens when the viral DNA enters a cell?"

And according to another picture, viral infection begins.

"The sequence of actions directed by the T4 phage has been carefully studied, and is more or less as follows."

0 min. Injection of viral DNA.
1 min. Breakdown of host DNA.
5 min. Replication of viral DNA begins.
8 min. Initiation of production of structural proteins which will form the "bodies" of new phages.
13 min. First complete replica of T4 invader is produced.
25 min. Lysozyme (a protein) attacks host cell wall, breaking open the bacterium, and the "bicentuplets" (200 copies of the virus) emerge.

The 540's

"Thus, when a T4 phage invades an E. coli cell, after the brief span of about twenty-four or twenty-five minutes, the cell has been completely subverted, and breaks open."

The virus uses recognition, disguises, and labeling to be successful:

"But if the host cell has some special mechanisms for examining whether DNA is labeled or not, then the label may make all the difference in the world."

Hofstadter compares viruses to Henkin sentences in TNT. We start with something like:

Ea: Ea': <TNT-PROOF-PAIR{a,a'}^ARITHMOQUINE{a",a'}>

"Now by arithmoquining this very uncle, you get a Henkin sentence. (By the way, can you spot how this sentence differs from ~G?)"

Ea: Ea': <TNT-PROOF-PAIR{a,a'}^ARITHMOQUINE{SSS...SSS0/a",a'}>

where there are h S's, with h the Godel number of the original "uncle." Both Henkin sentences and viruses undergo self-assembly.

"Not only viruses, but also some organelles -- such as ribosomes -- assemble themselves. Sometimes, enzymes may be needed -- but in such cases, they are recruited from the host cell, and enslaved."

There are two outstanding problems: differentiation and morphogenesis. How does a complex organism have so many different types of cells?

"How are homing instincts built into the brain of a bird, or hunting instincts into the brain of a dog? In short, how is it that merely by dictating which proteins are to be produced in cells, DNA exercises such spectacularly precise control over the exact structure and function of macroscopic living objects?"

At this point the author describes feedback and feedforward (to control the amount of a certain substance in a cell). Repressors and inducers regulate how much of an enzyme a cell produces:

"The effect of the successful repression of an operon is that a whole series of genes is prevented from being transcribed, which means that a whole set of related enzymes remains unsynthesized."

The author compares feedback to strange loops:

"A hypothesis like this could account for the phenomenal differences between cells in different organs of the body of a human being. The process by which one initial cell replicates over and over, giving rise to a myriad of differentiated cells with specialized functions, can be likened to the spread of a chain letter from person to person, in which each new participant is asked to propagate the message faithfully, but also to add some extra personal touch."

He gives a mathematical example of differentiation, where cells can reproduce identically except with a different value of N, and these are used to calculate 1 - 1/3 + 1/5 - 1/7 + 1/9 ... = pi/4.
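To see how slowly that series approaches pi/4, here is a minimal Python sketch. It is not Hofstadter's own program (his version hands each term to a different "cell" with its own value of N); it simply adds the terms in order, and the function name leibniz_partial_sum is my own choice for illustration.

```python
import math

def leibniz_partial_sum(terms):
    """Add up the first `terms` terms of 1 - 1/3 + 1/5 - 1/7 + ..."""
    total = 0.0
    for n in range(terms):
        # The n-th term (starting at n = 0) is (-1)^n / (2n + 1).
        total += (-1) ** n / (2 * n + 1)
    return total

# Compare a few partial sums against pi/4.
for terms in (10, 1000, 100000):
    approx = leibniz_partial_sum(terms)
    print(terms, approx, abs(approx - math.pi / 4))
```

The gap between the partial sum and pi/4 shrinks only in proportion to 1/terms, so it takes many terms to get even a few correct digits.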

"I hope that the descriptions of processes such as labeling, self-assembly, differentiation, morphogenesis, as well as transcription and translation, have helped to convey some notion of the immensely complex system which is a cell -- an information-processing system with some strikingly novel features."

He summarizes how level mixing occurs in the cell, and compares it to a computer program (where we have high-level language, assembly language, and so on):

"What we have seen is that nature feels quite comfortable in mixing levels which we tend to see as quite distinct."

Hofstadter concludes the chapter by asking, where does the Genetic Code come from? In other words, what is the origin of life?

"And perhaps experiencing that sense of wonder and awe is more satisfying than having an answer -- at least for a while."

Traditionalists: Fraction Multiplication

With this post being so long, there's not much room for traditionalists here. But today, Barry Garelick makes his first post in over a month, so let me at least link to it:

https://traditionalmath.wordpress.com/2019/06/14/misunderstandings-about-understanding-dept/

What do we mean by “understanding” in math? I gave a talk about this at the researchED conference in Vancouver. I have included an excerpt from my talk, and added some commentary at the very end which is designed 1) to further elucidate the issues and 2) to infuriate those who disagree with my conclusions.

OK, so I can tell already that this will be the usual complaint about the Common Core requiring students to demonstrate understanding.

I find the following part of Garelick's post interesting:

Many of us math teachers do in fact teach the conceptual understanding that goes along with an algorithm or problem solving procedure. But there is a difference in how novices learn compared to how experts do. Requiring novices to retrieve understanding can cause cognitive overload. Anyone who has worked with children knows that they are anxious to be able to solve the problem, and despite all the explanations one provides, they grab on to the procedure. The common retort about such behavior is that such behavior comes about because math is taught as “answer getting”. But as students acquire expertise and progress from novice to expert levels, they have more stored knowledge upon which to draw. Experts bundle knowledge around important concepts called “neural links” which one develops in part through “deliberate practice”.

Since this is still a Hofstadter post, we can view this paragraph in light of his book. This bundling of knowledge and "neural links" remind me of the symbols in the brain that Hofstadter describes in Chapter 11, "Brains and Thoughts."

But most of Garelick's post is complaining about having to draw pictures to show understanding of fraction multiplication. Traditionalists say that students shouldn't have to draw the pictures -- they should just follow the algorithm and multiply the numerators and denominators.

Here's why I disagree with Garelick -- students get algorithms mixed up. For example, how many times has Garelick seen students add the denominators in fraction addition problems? If multiplication were the only fraction operation that students had to learn, I'd agree with Garelick. But students must learn how to add fractions too.

I believe that by drawing pictures, students can see why 1/2 + 1/3 is 5/6 and not 1/5, 2/5, or any fraction with a 5 in the denominator. The idea is that by doing enough of these problems, students will see a problem like 1/2 + 1/3 and avoid trying to add the denominators. When that happens, they no longer need to draw any pictures.
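The same comparison can be checked with exact arithmetic. This is only a sketch using Python's standard fractions module; the helper add_denominators_mistake is a made-up name that models the error of adding numerators and denominators separately.

```python
from fractions import Fraction

def add_denominators_mistake(a, b):
    """Model the common error: add the tops and the bottoms separately."""
    return Fraction(a.numerator + b.numerator, a.denominator + b.denominator)

half, third = Fraction(1, 2), Fraction(1, 3)

correct = half + third                         # converts to sixths first
wrong = add_denominators_mistake(half, third)  # (1 + 1)/(2 + 3)

print(correct)       # 5/6
print(wrong)         # 2/5
print(wrong < half)  # True
```

The last line is the kind of sanity check a picture makes obvious: the mistaken "sum" 2/5 is smaller than 1/2, one of the two pieces being added.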

Students who try to add the denominators in a fraction addition problem will never be able to solve that same college placement exam problem that Garelick posted once again.

There's only one comment here worth responding to:

Chester Draws:
I like to ask the proponents of the area model to do ones with improper fractions, variables and negatives. What does the box for x/2 x -5/4 look like?
Since that is where we need to get to, why would we go down a dead end path on the way, only to do it the traditional way in the end anyway?

OK, let's try the addition problem x/2 + (-5/4). A student tries to add the denominators and gives something over 6 as the answer. How would Draws or Garelick convince this student that adding the denominators is wrong?

With the area model, it's easier to see why 1/2 + 1/4 is 3/4, and why halves added to fourths do not make sixths. Of course we can't show x/2 + (-5/4) directly using area, but once again, after the students learn that they must convert halves to fourths before adding them to fourths, they'd be ready to add algebraic fractions.
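For reference, here is one way the algebraic sum works out once halves are converted to fourths. This is my own worked version of the problem above, not something taken from Draws's comment or Garelick's post:

```latex
\frac{x}{2} + \left(-\frac{5}{4}\right)
  = \frac{2x}{4} - \frac{5}{4}
  = \frac{2x - 5}{4}
```

The denominator stays 4 throughout, just as 1/2 + 1/4 becomes 2/4 + 1/4 = 3/4 in the numerical case.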

I'm willing to compromise with Draws and Garelick here. We drop the area model for multiplication provided that we keep it for fraction addition, where many more mistakes occur because the students have trouble remembering the standard algorithm.

Conclusion

I've learned a lot about molecular biology in today's Hofstadter chapter. Even devoting an entire post to this chapter, I still feel that I've left out or glossed over a lot. There is much more in this chapter than what an eighth grader is expected to learn about DNA under the Next Generation Science Standards.

But we're now finally done with our side-along reading of Hofstadter -- at least all that I'm going to post about it on the blog. Once again, this was our spring reading book, and I won't make my next post until it's officially summer.
