BIO/CSC295 2009F, Class 19: Structure Prediction (2)

Admin:
* Pop-Tarts!
* Don't forget the Medical Informatics talk at 4:30 today in 3821. Refreshments at 4:15 in the CS Lounge.
* Don't forget the Destination ImagiNation Fun Run/Walk this weekend!
* Look ahead to the Rights and the Environment symposium next week.
* Today's programming project will be your next homework assignment.
    + Due Thursday
* We anticipate returning your exams some Tuesday.
    + Sorry for the delay.
* Next reading from the literature: Krings et al. 1997.
    + It's on P'Web.
    + Read and respond for Tuesday.
* Singers, 4:30 p.m. Sunday, S-L hall

Overview:
* Discussion: Picking independent projects.
* About the projects
* Ideas
* Timetable
* Form of the proposal
* Programming project: 7.5: Chou-Fasman.

About the projects
* Groups of three or four students
    + At least one biologist (or biochemist)
    + At least one non-biologist
* Pick an interesting problem in bioinformatics to work on
    + Requires biological data [probably from NCBI]
    + Requires computation
        - Can use a pre-built computational tool
        - Must use some tool you build yourselves: WRITE CODE!
    + Use the tool to do some analysis
    + Requires a writeup
        - What was your hypothesis?
        - What did the tool help you learn?
        - Etc.
    + Requires a proposal

Questions
* Do we have to use existing algorithms?
    + You may use algorithms we've implemented already
    + You may use known algorithms that need to be implemented
    + A variant of the Kellis algorithm on a set of data you find
    + You may extend/change one of the algorithms we've implemented
    + You may create your own algorithm
    + Gene finder: Pick a set of similar genes. Analyze the prefixes and suffixes of those genes for common patterns. Write an algorithm that looks for those suffixes and prefixes to find genes.

Project Timetable
* Thursday, 12 November: Turn in project proposals
* Tuesday, 17 November: Turn in peer reviews of project proposals
* Thursday, 19 November: Short (five-minute) overviews of proposals for visitor
* Tuesday, 24 November: Status reports; in-class time for working on projects
* Thursday, 26 November: Thanksgiving
* Tuesday, 1 December: Status reports; in-class time for working on projects
* Thursday, 3 December: PRESENTATIONS
* Tuesday, 8 December: PAPERS + Code
* Thursday, 10 December: Wrapup

Vida says "Get together and talk about cool ideas."
* Look at some of the followup projects from the book that we didn't do!

Project 7-5: Chou-Fasman
* Find secondary structure in proteins
    + Alpha helices
    + Beta strands / sheets
    + Hairpin turns
* Statistical approach
    + "We've looked at lots of known protein structures and identified the probability of each AA appearing in each structure"
    + For a new protein, we find sequences of high probability

Your goal(s)
* Implement the algorithm
    + Some infrastructure built already
* Test it carefully
    + Find 3 proteins and test your algorithm
        - How many did it predict?
        - How many were actually alpha helix?
        - How many did it miss?
        - Did it find the entire helix (or predict one too long)?
* Improvements: find 3 and describe them.
    + How do you find the improvements? Literature search!

Link to code in today's outline...
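
Sketch: helix prediction
* To make the statistical approach concrete, here is a minimal Python sketch of the alpha-helix half of a Chou-Fasman-style predictor. It is NOT the infrastructure from today's outline and not the book's exact procedure.
* Assumptions (check all of these against the book):
    + The P_ALPHA values are the commonly tabulated Chou-Fasman helix propensities, scaled so that 1.0 is average.
    + The window sizes and cutoffs (6-residue nucleus, 4 formers, 4-residue extension window, 1.00 cutoff) follow a common simplified description.
    + The names P_ALPHA, find_helices, and merge are made up for illustration.
    + Beta strands, turns, and the special handling of proline are left out.

```python
# Helix propensities P(a), scaled so that 1.0 is "average".
# ASSUMPTION: commonly tabulated values; verify against your textbook's table.
P_ALPHA = {
    "A": 1.42, "R": 0.98, "N": 0.67, "D": 1.01, "C": 0.70,
    "Q": 1.11, "E": 1.51, "G": 0.57, "H": 1.00, "I": 1.08,
    "L": 1.21, "K": 1.14, "M": 1.45, "F": 1.13, "P": 0.57,
    "S": 0.77, "T": 0.83, "W": 1.08, "Y": 0.69, "V": 1.06,
}

NUCLEUS = 6           # window size used to look for a helix nucleus
MIN_FORMERS = 4       # residues in that window that must favor helix
FORMER_CUTOFF = 1.00  # ASSUMPTION: P(a) above this counts as a helix former
EXTEND = 4            # window size used when growing the helix
EXTEND_CUTOFF = 1.00  # stop growing when the window average drops below this


def mean_propensity(seq, start, end):
    """Average P(a) over seq[start:end]. Raises KeyError on unknown letters."""
    window = seq[start:end]
    return sum(P_ALPHA[aa] for aa in window) / len(window)


def merge(regions):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def find_helices(seq):
    """Return predicted helices as half-open (start, end) index pairs."""
    seq = seq.upper()
    n = len(seq)
    regions = []
    for i in range(n - NUCLEUS + 1):
        window = seq[i:i + NUCLEUS]
        formers = sum(1 for aa in window if P_ALPHA[aa] > FORMER_CUTOFF)
        if formers < MIN_FORMERS:
            continue  # not a nucleus
        start, end = i, i + NUCLEUS
        # Grow left while the 4-residue window at the left edge stays helical.
        while start > 0 and mean_propensity(seq, start - 1, start - 1 + EXTEND) >= EXTEND_CUTOFF:
            start -= 1
        # Grow right while the 4-residue window at the right edge stays helical.
        while end < n and mean_propensity(seq, end + 1 - EXTEND, end + 1) >= EXTEND_CUTOFF:
            end += 1
        regions.append((start, end))
    return merge(regions)
```

* Usage: find_helices("GPGPGPAEMLAEMLGPGPGP") returns a list of (start, end) pairs. Note that a nucleus window can include non-formers at its edges, so this sketch tends to predict helices a little too long. That is one place to look for improvements.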
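
Sketch: scoring a prediction
* The testing questions (how many predicted, how many actually helix, how many missed, too long or too short) can be answered by comparing predicted helix intervals against annotated ones.
* Assumptions:
    + Both lists hold half-open (start, end) residue indices.
    + The function names score_prediction and helices_found are made up for illustration.
    + The intervals in the usage note are invented, not real protein data.

```python
def residues(intervals):
    """Expand half-open (start, end) intervals into a set of residue indices."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end))
    return covered


def score_prediction(predicted, actual):
    """Per-residue counts comparing predicted helix residues with annotated ones."""
    pred = residues(predicted)
    true = residues(actual)
    return {
        "predicted": len(pred),              # residues the algorithm called helix
        "actual": len(true),                 # residues annotated as helix
        "correct": len(pred & true),         # predicted and really helix
        "missed": len(true - pred),          # helix residues not predicted
        "overpredicted": len(pred - true),   # predicted but not helix (too long)
    }


def helices_found(predicted, actual):
    """Count annotated helices that overlap at least one predicted helix."""
    return sum(
        1
        for a_start, a_end in actual
        if any(p_start < a_end and a_start < p_end for p_start, p_end in predicted)
    )
```

* Usage: with the invented intervals predicted = [(4, 16)] and actual = [(6, 14), (20, 26)], score_prediction reports 12 predicted residues, 14 actual, 8 correct, 6 missed, and 4 overpredicted, and helices_found reports 1 of the 2 annotated helices.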