BIO/CSC295 2011F, Class 21: Phylogenetics (2)

Overview:
* Web Exploration, Revisited.
* Discussion of Krings et al.
* Project Elevator Talks.

Admin:
* Due Tuesday: Short reviews of your colleagues' proposals.
* EC for Sunday's ISO Food Bazaar (sold out)
* EC for Singers and Community Choir (7:30 p.m. Friday in Sebring-Lewis)
* EC for Orchestra Concert 2pm on Sunday (Songs of Sorrow, plus more)
* EC for FreeSounds concert - Showvember - 9 or 9:30, or something like that in Gardner. See CB play music

Web Exploration
* Mutation in humans of the RecQ gene causes premature aging
* Why do we see "RecQ-like" in lots of genomes?
    * Not just variation within multiple subjects in the species
    * But rather multiple copies/versions within the same genome - at different loci
    * There are three or four RecQ genes (on different chromosomes)
    * Genes get duplicated
        * Sometimes from whole genome duplication
        * And then divergence
* When you do a RecQ search, it can be important to establish which version of the gene you're looking at.
* When doing phylogeny, you need to make sure that you're dealing with the true ortholog and not the derivatives
    * But it's hard to figure out which one is the true ortholog
    * The true ortholog maintains essentially the same function
    * Figuring out function is not obvious from looking at the sequence
    * Lazy Sam would just include all of them, but that might be a good starting point.
* Evidence of danger of GMO foods - Mice got DNA from rice! - That was a joke.
    * If we were seriously trying to do evolutionary relationships, we would work with a larger population (and probably more similar things)
    * And we see the danger of non-orthologous genes
* GIGO - You need the right collection of genes to look at
* An important moral of bioinformatics
    * Biologists using computational tools without understanding them: Bad
    * Computer scientists using biological data without understanding them: Also Bad

Moving on to Krings et al.
* What is the overall objective of the paper: Did Neandertals contribute DNA to humans?
    * Their conclusion: No, they did not.
* Technique for coming to that "controversial" conclusion?
    * Quantitative PCR on mtDNA - Mitochondrial DNA
        * We have lots of mitochondria in each cell, rather than one nucleus
        * Passed from mother to children
        * We believe that they are bacterial in origin
        * Mitochondria have their own genomes
        * Creation of zygote: fusion of sperm and egg. Sperm mitochondria get broken down in that creation
        * The mitochondria were there in the oocyte
    * The use of mtDNA is relevant because it means you need an unbroken matrilineal line.
        * If at any point that line is broken, you wouldn't see contributions
        * So it may not say as much about nuclear DNA
    * So why look at mitochondria?
        * There's more of it - It's easier to do, particularly for fossils
        * You don't want to use a lot of your samples - they are not abundant
    * Are all mitochondria in your cells identical?
        * There is a possibility of heteroplasmy - not contamination, but can skew PCR results
    * Contamination is also a big issue
        * How do you amplify? PCR
        * They used human primers - Hmm ... might lead to more contamination
    * How do you deal with these problems?
        * Replication - in the same lab, at multiple labs
        * Controls
        * Clean rooms
    * How do you know that the result is Neandertal?
        * PCR makes mistakes! And certainly late 1990s polymerase makes mistakes
        * Degraded template is likely to increase errors
    * Getting better results
        * Shorter sequences
        * Overlapping sequences
        * Multiple samples - Clone, then sequence the clone
            * Errors get amplified
        * Multiple runs, sequence multiple times from each run
* How do you interpret the data (Figure 2)?
* What critiques?
    * A fairly small segment, compared to the full genome
* How did they go from Figure 2 to Figure 4?
    * They build a new primer based on what they know
* Analysis: Compare humans-vs-humans, humans-vs-Neandertal, humans-vs-chimps (chimps-vs-Neandertals?)
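A minimal sketch of the kind of comparison named in the Analysis bullet, assuming the sequences have already been aligned to the same length. The function names and the toy sequences are made up for illustration; they are not the paper's data, and this is not claimed to be the paper's exact procedure.

```python
from itertools import combinations, product


def pairwise_differences(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def within_group(seqs):
    """Difference counts for every unordered pair inside one group."""
    return [pairwise_differences(a, b) for a, b in combinations(seqs, 2)]


def between_groups(group_a, group_b):
    """Difference counts for every pair drawn from two different groups."""
    return [pairwise_differences(a, b) for a, b in product(group_a, group_b)]


def mean(values):
    return sum(values) / len(values)


# Toy aligned sequences, invented for this example (NOT real mtDNA).
humans = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"]
neandertal = ["ACGAACGTTC"]
chimps = ["TCGAACCTTG", "TCGAACCTTC"]

if __name__ == "__main__":
    print("human-human:     ", mean(within_group(humans)))
    print("human-neandertal:", mean(between_groups(humans, neandertal)))
    print("human-chimp:     ", mean(between_groups(humans, chimps)))
```

The interesting question is where the human-vs-Neandertal counts fall relative to the human-vs-human and human-vs-chimp counts. With only one Neandertal sequence there is no within-Neandertal distribution to compare against.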
* What would have happened if we had more Neandertals?
* Are you persuaded?
    * Well, this isn't the only evidence; it is supported by other evidence (e.g., morphological evidence)
    * Cell is not typically a human evolution journal
    * Is this published, in part, because it's headline-grabbing?
    * The methods make it more meaningful - What the caveats are, how they addressed issues - They show a high level of care.
    * You may not be persuaded by their analysis of the correctness of PCR
    * Is it reproducible? Well, you could take a different fossil and do the same thing.
    * Partial validation from another laboratory
* Remember: Every paper has limits and flaws; as a reader, you need to recognize the possible caveats
* A coming attraction: the 2010 paper (Science 2010)
    * Whole Neandertal genome! (From new sequencing technology.)
    * 99.5% similarity to humans
    * And it looks like we have some Neandertal DNA in our genomes - Occurred in the Middle East prior to migrations to Europe and Asia

Project Elevator Talks
* T: Michael, Kevin, BenSpears, James
    * Reviews by Jonah, Jay, Ian, Guadalupe
* U: Jonah, Bill, Bonnie, Logan
    * Reviews by Radhika, BenSuh, Nancy, Abraham
* V: Jay, Radhika, Marcus, Noelle
    * Reviews by Chris, Pelle, Karissa, Michael
* W: Ian, BenSuh, Chris, David, Bozo
    * Reviews by Katherine, Dilan, Kevin, Bill
* Y: Guadalupe, Nancy, Pelle, Katherine
    * Reviews by Bogdan, BenSpears, Bonnie, Marcus
* Z: Abraham, Karissa, Dilan, Bogdan
    * Reviews by James, Logan, Noelle, Bozo, David
* Each group gets three minutes to present.
* Everyone else gets three-to-five minutes to comment.
* We distribute proposals for further review.
* T
    * Look at functionally similar groups of proteins and use that to find other proteins of that functional group
    * E.g., transcriptional regulators from mice, yeast, bacteria
    * Align with ClustalW
    * Scoring method for applying to sequences of the genome
* U
    * Neurodegenerative disease
    * There's a highly conserved structure (polyQ)
    * Modify protein finding algorithm to find that structure
    * Characterize region around the structure
    * Look at "normal" patients to see if they might develop this region
    * Hypothesis: It's not the structure, but some surrounding region
    * Pretty ambitious
    * What is new? A different approach to the same issue
* V
    * Sense of touch
    * Considering sense of pain: nociceptors
    * Interesting biology of receptor
    * Goal: Find in genome
    * They have databases of these in mice and rats
    * Success will depend on current biological sequence knowledge
    * Test case: Run on already sequenced
    * Then run on sequenced but not analyzed
    * Perhaps similar to Chou-Fasman
    * How are you going to deal with differences? Just conserved regions.
* W
    * Phylogeny looking at cDNA or protein or both
    * Homologous and functionally similar sequences
    * Mostly incorporating preexisting algorithms, but one big addition
    * Some weighting depending on conservation - affects scoring
    * How much does this vary from current algorithms? Need to check the literature.
* Y
    * Trying to identify novel transcription factors
    * Gather some number of sequences - What DNA binding motif does it exhibit?
    * Do something CF-esque to identify binding motifs - Relationship between motif and DNA sequence it binds to
    * Might then identify DNA sequences that get bound
    * Expandable, depending on how well early things go
    * Start by finding 90ish sequences; 50 for building the algorithm, testing on additional 40; then work on uncharacterized or putative transcription factors
    * Transcription factors are wobbly, so this is difficult
    * Vida says "There's a lot going on ..."
    * They've found a nice database.
    * Using 3D database for more information
* Z
    * Taking advantage of contact at UNC Pharma
    * Identified protein that binds to 8mer
    * Also a splicing promoter
    * Question of which 8mer we should bind to to get the splicing we want
    * Need an appropriate scoring metric - one of their challenges
    * Want to minimize off-target effects
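
One naive way to make Group Z's scoring question concrete: rank each candidate 8mer in a target region by how many exact matches it has in other sequences, treating fewer matches as fewer potential off-target sites. Everything here (the function names, the exact-match criterion, the toy sequences) is an assumption for illustration, not the group's proposed metric; a real metric would presumably also account for near matches and binding strength.

```python
def count_occurrences(kmer, sequence):
    """Count occurrences of kmer in sequence, including overlapping ones."""
    k = len(kmer)
    return sum(
        1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == kmer
    )


def rank_candidate_kmers(target_region, background_seqs, k=8):
    """Rank every k-mer in target_region by exact matches in the background.

    Returns (kmer, off_target_count) pairs, fewest off-target matches first;
    ties are broken alphabetically so the ordering is deterministic.
    """
    scores = {}
    for i in range(len(target_region) - k + 1):
        kmer = target_region[i:i + k]
        scores[kmer] = sum(count_occurrences(kmer, s) for s in background_seqs)
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))


# Toy sequences, invented for this example.
target = "ACGTACGTAA"
background = ["TTACGTACGTTT", "CGTACGTACG"]

if __name__ == "__main__":
    for kmer, hits in rank_candidate_kmers(target, background):
        print(kmer, hits)
```

Sorting by count and then by the 8mer itself keeps the ranking stable when several candidates tie, which is likely with exact matching on short backgrounds.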