A mighty rickroll, and the warrior gene

Cariaso subjected me to a bit of a high-tech rickroll yesterday with his helpful link to information about the SNP I shared. It wasn’t until he prodded me to actually click the link in his comment that I realized he had sent people to rs9332964 rather than the one I posted, rs3094315. The latter is fairly boring, notable mostly for sitting at one end of the biggest chromosome (and for being the first line in my raw data from 23andme). The former, well, it appears to be correlated with a little syndrome called MICROPENIS.

Thanks Mike. Way to keep it classy.

However, the point is well taken: That’s the sort of thing that most people wouldn’t choose to share about themselves. It’s also the sort of thing that a sensitive parent wouldn’t share about their kid, thus his comments about heritability. The saying is that genes load the gun, the environment pulls the trigger – but that’s a bit subtle for the teenage crowd. “Dude, I googled you, and look what I learned about your dad!” I can see it happening in the not too distant future.

It gets worse, though, because we have no idea how the vast majority of the genome works. We’ve got the first tiny glimmers, mostly from brutally crude population correlations – but you could make a case that sharing a data point with no known effect is far riskier than sharing one whose effect is already known. It’s roughly akin to posting a picture of yourself in front of a green screen. It just makes it all too simple for someone to photoshop you into whatever situation they choose, should the desire ever arise.

So, maybe a more interesting data point is in order. The MAOA gene has been in the news lately. It’s been nicknamed the “warrior” gene, since it seems to be correlated with aggressive responses to provocation. The PNAS paper is pretty cool, just from the abstract. I mean, how often do you see collaborations between economists, biologists, and psychologists? The authors put it well: they “address an individual’s willingness to pay to punish others.” Also, they use hot sauce. I love hot sauce.

MAOA is on the X chromosome, and 23andme provides data for it. Here’s mine:

# rsid chromosome position genotype
rs3788862 X 43402308 A
rs6520893 X 43404120 A
rs1465108 X 43423153 A
rs909525 X 43438146 C
rs3027397 X 43442025 A
rs2283724 X 43444520 G
rs1800464 X 43456141 A
rs1800659 X 43459113 G
rs6323 X 43475980 G
rs2235186 X 43480372 A
rs2072743 X 43484465 T
rs979606 X 43486086 C
rs979605 X 43486307 A
rs1137070 X 43488335 T

Note that this data looks different from yesterday’s, since I have only one copy of the X chromosome. Rather than the pair of letters I had for yesterday’s SNP, I have only one letter at each location. Also potentially of interest: this “gene” is actually a long-ish series of locations on a chromosome, with the SNPs above spanning about 86,000 base pairs. This pushes up against my favorite question in bioinformatics: “What is a gene?” There are a number of competing possible answers – and they’re all correct to some extent.
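For anyone who wants to check those two observations against the rows above, here is a minimal sketch in Python. It assumes the rows are whitespace-separated in the order rsid, chromosome, position, genotype, as in the excerpt; the function and variable names are my own, not anything 23andme provides.

```python
# Rows copied from the excerpt above: rsid, chromosome, position, genotype.
RAW = """\
# rsid chromosome position genotype
rs3788862 X 43402308 A
rs6520893 X 43404120 A
rs1465108 X 43423153 A
rs909525 X 43438146 C
rs3027397 X 43442025 A
rs2283724 X 43444520 G
rs1800464 X 43456141 A
rs1800659 X 43459113 G
rs6323 X 43475980 G
rs2235186 X 43480372 A
rs2072743 X 43484465 T
rs979606 X 43486086 C
rs979605 X 43486307 A
rs1137070 X 43488335 T
"""


def parse(raw):
    """Turn raw text into (rsid, chromosome, position, genotype) tuples."""
    rows = []
    for line in raw.splitlines():
        # Skip blank lines and the commented header line.
        if not line.strip() or line.startswith("#"):
            continue
        rsid, chrom, pos, genotype = line.split()
        rows.append((rsid, chrom, int(pos), genotype))
    return rows


rows = parse(RAW)
positions = [pos for _, _, pos, _ in rows]

# Distance between the first and last SNP listed, in reference coordinates.
span = max(positions) - min(positions)

# One X chromosome means one letter per SNP instead of a pair.
single_copy = all(len(genotype) == 1 for *_, genotype in rows)

print(len(rows), span, single_copy)
```

The span here is only the distance between the outermost SNPs in my data, which is a rough stand-in for the size of the region rather than a definition of where the gene starts and ends.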

In this case, I just want to know which population I would have been in, if I had been in the study. That’s cocktail party bioinformatics: “Hey, did you read about study X? Which kind are you?”

Unfortunately, 23andme falls down for me here. SNPedia puts it well:

the variation that has been most studied consists of a 30 base-pair variable number tandem repeat (VNTR) located in the promoter region of the gene. Alleles with 3.5 and 4 repeats are 2-10 times more productive than the allele with 3 repeats. Several studies have shown an association between the 3-repeat allele and neuropsychiatric conditions such as alcoholism, antisocial personality, impulsivity, and poor reaction to stress.

Sounds fascinating and important! However, while I can tell which base pair I have at 14 locations out of those 86,000 – what I really want to know is how many times a particular substring repeated itself along the way. None of the genetic tests I’ve taken will (directly) give me this information.
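To make the difference concrete, here is a toy sketch of what counting repeats involves. The motif below is a made-up 7-letter string, not the real 30 base-pair MAOA repeat, and the flanking letters are invented too. The point is only that the count comes from a contiguous stretch of sequence, which single-letter SNP calls don't provide.

```python
import re

# Hypothetical motif for illustration only; NOT the actual MAOA repeat unit.
MOTIF = "GATTACA"


def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    # Each match is one maximal run of consecutive whole copies of the motif.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Two invented stretches of sequence with different repeat counts.
three_repeats = "TTC" + MOTIF * 3 + "GGA"
four_repeats = "TTC" + MOTIF * 4 + "GGA"

print(longest_tandem_run(three_repeats, MOTIF))
print(longest_tandem_run(four_repeats, MOTIF))
```

This sketch only counts whole copies, so it would not distinguish something like the 3.5-repeat allele mentioned in the quote; handling partial copies would need more care.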

The truly astute, but novice, reader might point out that there are very specific coordinates on those base pairs in my data. If this repetition is happening in between SNP locations, they might ask, couldn’t we somehow use those coordinates to figure it out? The answer is “that’s not how it works.” The coordinates refer to the reference human genome, so everyone’s result for a given SNP is reported at the same position, however many repeats sit nearby in their own DNA. They’re mile posts, not measurements.
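Here is a toy sketch of the mile-post idea. Everything in it is invented for illustration: the motif, the two marker bases, and the tiny "chromosome" are hypothetical. It shows that a report giving each SNP's reference coordinate and observed base looks identical whether the person carries three repeats or four.

```python
# Hypothetical repeat unit; NOT the actual MAOA sequence.
MOTIF = "GATTACA"


def build(repeats):
    """Toy chromosome: marker base 'A', a repeat region, then marker base 'T'."""
    return "A" + MOTIF * repeats + "T"


# The reference genome is fixed once (here, arbitrarily, with three repeats),
# and each SNP is assigned its coordinate on that reference.
reference = build(3)
REF_COORDS = {"snp_left": 0, "snp_right": len(reference) - 1}


def genotype_report(repeats):
    """Report (name, reference coordinate, observed base) for each marker."""
    individual = build(repeats)
    return [
        ("snp_left", REF_COORDS["snp_left"], individual[0]),
        ("snp_right", REF_COORDS["snp_right"], individual[-1]),
    ]


# The individual's actual sequence length differs with repeat count...
print(len(build(3)), len(build(4)))

# ...but the reports are the same, so the coordinates reveal nothing about it.
print(genotype_report(3) == genotype_report(4))
```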

Anyway, in this case, I don’t have the data to tell which population I’m in – but I can tell you from experience that I have a strong built-in desire to punish the wrong-doers – and a little extra hit of the hot sauce sounds like a great way to do it.


