Monday, July 6, 2009

how can genes possibly do all they do?

There are a little over 3 billion base pairs of DNA in the human genome. The four possibilities for each one (A C G T) can be encoded in two bits, so that's 750 Megabytes of information, and the genome has lots of common sequences and repeating sections so you can compress it down a lot. It amounts to as much data as there is on a music CD, far less than on a DVD. That is completely ridiculous!
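For anyone who wants to check that arithmetic, here is a minimal sketch. It rounds the genome to exactly 3 billion base pairs, and the CD and DVD capacities are round nominal figures I'm assuming for comparison, not numbers from any spec sheet.

```python
# Back-of-the-envelope check of the genome-size arithmetic.
BASE_PAIRS = 3_000_000_000   # "a little over 3 billion", rounded down
BITS_PER_BASE = 2            # four possibilities (A, C, G, T) fit in 2 bits

# Assumed nominal capacities, for comparison only.
CD_MEGABYTES = 700
DVD_MEGABYTES = 4700


def genome_megabytes(base_pairs=BASE_PAIRS, bits_per_base=BITS_PER_BASE):
    """Return the uncompressed size in megabytes (1 MB = 10**6 bytes)."""
    total_bits = base_pairs * bits_per_base
    return total_bits / 8 / 10**6


size = genome_megabytes()
print(f"genome: {size:.0f} MB, CD: {CD_MEGABYTES} MB, DVD: {DVD_MEGABYTES} MB")
```

The uncompressed figure lands a little above the assumed CD capacity and far below the DVD one, which is the comparison the paragraph is making before any compression.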

Those genes not only lay out the biochemistry of individual cells, they also are blueprints for the physical arrangement of our bodies. When you go to a doctor's office and see the pink medical illustrations of something like your eye or intestine or ear, and notice that every single nook and layer has a name, somehow your genes result in that structure, and thousands upon thousands of other structures in your body. Just the CAD program for the shape of the bones in your inner ear would take up many megabytes. The design and operating manual for the eye would be thousands of pages. You could accept this if each cell had specialized blueprints, but every one of 10 trillion cells in your body carries the identical DNA. How can one compact set of instructions make tens of thousands of complicated parts!!

But wait, there's more. The same genome also codes for innate behaviors. Bees dancing in a certain way. My dog biting the ankle to herd cattle. The staple of popular magazine thought: males who fool around perpetuate their DNA better so their cheatin' genes win. And the mother of all behaviors, human language. Now this would be plausible if humans had a monster additional section appended to our genome, an extra hundred million DNA pairs for altruism and brain development and tool-building and grammar and so on, but the chimpanzee genome is 95% identical to the human genome. No way!!!

This overloading is insane. It must mean that the genetic tweak to have, say, a tendency to Asperger's must also affect lower-level things, and that not all variations are possible, and that the command & control explanation of the genome, "the DNA says build this way, operate this way, develop this way, behave this way", must be insufficient. I tried to put this absurdity to gene fan Matt Ridley at a talk, but he brushed it off; he maintains it's all in the genes. But in our DNA there are only "20,000–25,000 protein-coding genes, far fewer than had been expected before its sequencing. In fact, only about 1.5% of the genome codes for proteins." Huh??!

A long time ago I went to an Ask a Scientist talk by Dr. Terrence Deacon. It was dizzyingly complex and hard to follow, but he acknowledged the essential craziness. Instead of throwing up his hands and expressing disbelief as I'm doing, he (and presumably other scientists) reasons out what the mismatch between a CD's worth of information and the outcome must imply.

Very roughly speaking, our DNA can't possibly code for this complexity and yet the tiny differences in our DNA resulted in it. Therefore the complexity has to be emergent somehow. There has to be some interplay with the environment that emerges over time, at every level — with the chemistry in the cell, with the differentiating cells nearby, with the other synapses in the brain, with the other animals in the group, and now in humans, with language. As we evolve, these external factors co-evolve with us, and the system as a whole reliably produces the complexity. Some of those ideas are presented in this interview. Heady stuff, but it seems undeniable. Meanwhile, you can go to Ensembl.org and walk the genome of various animals; here are the first million base pairs on chromosome 1 in us, export as text to get the CCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCCTAACCCTAACCCTAACCCTA goodness.
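To make the "two bits per base" idea concrete, here is a rough sketch that packs the snippet of chromosome 1 quoted above into bytes, four bases per byte. The particular letter-to-bits assignment and the function names `pack` and `unpack` are my own choices for illustration; this is not how Ensembl or any real genome file format stores sequence.

```python
# Pack A/C/G/T text into 2 bits per base, four bases per byte.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}  # arbitrary assignment (an assumption)
LETTERS = "ACGT"                          # index i is the letter for code i


def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


def unpack(data, n_bases):
    """Recover the first n_bases letters from packed bytes."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(LETTERS[(byte >> shift) & 3])
    return "".join(letters[:n_bases])


# The stretch of chromosome 1 quoted in the post.
SEQ = "CCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCCTAACCCTAACCCTAACCCTA"

packed = pack(SEQ)
print(len(SEQ), "letters ->", len(packed), "bytes")
assert unpack(packed, len(SEQ)) == SEQ  # round trip loses nothing
```

The text export spends a whole byte on each letter, so packing alone shrinks it to a quarter of the size, and that is before taking any advantage of how repetitive a stretch like this is.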



  • A hacker thinks about influenza and DNA as data. "So it takes about 25 kilobits — 3.2 kbytes — of data to code for a virus that has a non-trivial chance of killing a human"

    By Blogger skierpage, at September 03, 2009 5:00 AM  

  • Given our present state of ignorance:

    Hi, found your page through Wiki on an article about Monsoon Speakers.

    I'd like to ask you a question about that page, but thought I'd jot a thought about this blog on DNA and fetal development. I was wondering whether, when you said "emergent", you meant dialectical or something more mundane. For example, cells have long been known to communicate with each other through chemical messengers.

    The efficiency is achieved through the ruthless process of Natural Selection and billions of years. If mankind can purposely go from football-sized computers to the quantum limits of silicon in a few decades, maybe evolution can do something similar on a scale too large to comprehend. To mention "emergence" is merely to replace ignorance with speculation.

    Anyway, I found a pair of Monsoon speakers at my local Goodwill (!) and need more info than I've been able to find online. Could you please contact me so I could pick your brain??


    By Blogger Les, at October 10, 2009 10:52 PM  

  • Complexity in structures and behavior doesn't sound like "a method of argument". It does sound like the way complex systems and patterns arise out of a multiplicity of relatively simple interactions, even though that's just giving a name to a process we don't understand yet.

    To mention "emergence" is merely to replace ignorance with speculation.
    Well, it's to acknowledge that DNA is nothing like computers. The simple rules of circuit design precisely describe how circuits behave and I can see how an arbitrarily complex program maps to digital logic just by running it in a debugger. We can see how a cell transcribes the DNA in a gene into a protein, but we can't observe or simulate DNA making a cell, or the iris in my eye, or a talking human being, even though that's what DNA somehow does.

    By Blogger skierpage, at October 11, 2009 12:25 AM  
