Protein Computer lab PDF

Title Protein Computer lab
Course Biological Chemistry
Institution Hunter College CUNY

Summary

Protein Computer Lab Assignment Biological Chemistry Lab...


Description

Computer Lab: Protein Structure

This is a one-day lab designed to demonstrate how protein structures are formed and how the structure of a protein is related to its function. Many proteins, such as enzymes, can lose their ability to function if their structure is altered by a change in temperature, pH or solvent. This is called protein denaturation, and you can observe it in egg albumen proteins when an egg is cracked into a hot pan: the clear albumen proteins are denatured, lose their structures and turn into the tangled mess of the semi-solid egg white. If functionally favorable conditions are restored, some proteins, such as RNase, can refold into their active form. Most proteins cannot do this; they remain permanently denatured and no longer perform their functions (e.g. on cooling, egg albumen proteins stay denatured and do not go back into solution). Some proteins, when they are first made, require helper proteins called chaperones in order to achieve their correct functional conformations. Chaperones also fix misfolded proteins in the cell and play a role in protein misfolding diseases such as the neurodegenerative diseases Alzheimer's, Parkinson's and Huntington's, in which misfolded proteins accumulate in the brain.

There are four levels of structure that determine a protein's final structure. The first level (primary structure) is the protein's sequence of amino acids. The second level (secondary structure) is composed of structures such as α-helices, β-sheets and random coils. The third level (tertiary structure) denotes the arrangement of areas of related secondary structures to form compact globules called domains. The fourth level (quaternary structure) is created by the association of separate folded polypeptide chains into a final protein molecule.

The structures of protein molecules are determined by combining data from several techniques. The primary structure (peptide sequence) is determined by protein sequencing. The amount of each secondary structure can be estimated by circular dichroism (CD), which utilizes the differential absorption of circularly polarized light by α-helices, β-sheets and random coils. The position of secondary structures in a peptide can be predicted by computer software that gives the probability of a given peptide sequence forming α-helices, β-sheets or random coils. The overall tertiary and quaternary structure must be solved by X-ray crystallography and/or nuclear magnetic resonance (NMR). For these techniques to work, the protein must first be highly purified.

For X-ray crystallography, the protein is carefully crystallized under exacting conditions in order to get a perfect crystal. The crystal is then mounted on a stand and rotated while being subjected to an X-ray beam. The X-rays are diffracted off the crystal's inner structures and onto a piece of X-ray film. The diffracted X-rays form a pattern of spots on the film (see below). The pattern is interpreted by mathematical analysis (Fourier transform) in order to determine the structure of the protein. The first protein structure to be determined by this method was myoglobin in 1957 by John Kendrew.

X-ray diffraction pattern (actual size) of yeast phenylalanine tRNA, courtesy of Gary Quigley, PhD, Hunter College

Once a protein's structure has been determined, the data generated are submitted to the Protein Data Bank (PDB). The data are compiled into a file in which the spatial position of each atom in the protein is specified by three-dimensional coordinates (x, y, and z). This file can be obtained from the PDB and downloaded onto a computer, where a structure modeling program can interpret the data and produce a three-dimensional model of the molecule on the computer screen. The model can be rotated and viewed in various modes such as space-fill, stick model, or ribbons, with various features of the model highlighted in different colors. In this lab you will explore the PDB data files for some protein molecules, experiment with the display of this data using the DS Visualizer display program (Accelrys) and interpret the structures that you find in order to answer some questions about the proteins.
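To make the idea of "a file of x, y, z coordinates" concrete, here is a minimal Python sketch that reads atom positions from PDB-format text. It assumes the fixed-column layout of ATOM and HETATM records used by the PDB text format; the `Atom` type, the `parse_atoms` function and the two sample records are made up for illustration and do not come from the lab handout or from a real PDB entry.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    name: str     # atom name, e.g. "CA" for an alpha carbon
    residue: str  # three-letter residue name, e.g. "GLY"
    chain: str    # chain identifier, e.g. "A"
    x: float      # coordinates in angstroms
    y: float
    z: float


def parse_atoms(pdb_text: str) -> list[Atom]:
    """Collect atom positions from ATOM and HETATM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append(Atom(
                name=line[12:16].strip(),     # columns 13-16
                residue=line[17:20].strip(),  # columns 18-20
                chain=line[21],               # column 22
                x=float(line[30:38]),         # columns 31-38
                y=float(line[38:46]),         # columns 39-46
                z=float(line[46:54]),         # columns 47-54
            ))
    return atoms


# Two invented records, used only to exercise the parser.
SAMPLE = (
    "ATOM      1  N   GLY A   1      11.104   6.134  -6.504  1.00  0.00\n"
    "ATOM      2  CA  GLY A   1      11.639   6.071  -5.147  1.00  0.00\n"
)

for atom in parse_atoms(SAMPLE):
    print(atom.name, atom.residue, atom.chain, atom.x, atom.y, atom.z)
```

A display program such as the one used in this lab does essentially this first step before drawing the atoms as spheres, sticks or ribbons.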

Protein Structure Computer Lab

1. Access the Internet by double clicking the Internet Explorer icon on the computer screen.
2. The Brookhaven Protein Data Bank (PDB) should be the homepage. If it is not, access it in the favorites as RCSB Protein Data Bank or by typing the URL location: http://www.rcsb.org/.
3. In the white box at the top of the PDB page (to the right of “SEARCH | All Categories:”), type in the name of a protein from the table on the next page. Other search suggestions will appear in a text box. If the item appears under “PDB TEXT”, you can click on it; otherwise, click the search button on the right (it looks like a magnifying glass).
4. Record in the table below the number of “structure hits” for that molecule from the tab at the top left of the page. If there are additional search characteristics needed, go down to the “Query Refinements” box and click on the appropriate items. Try to narrow down the number of entries that you will need to look through to a reasonable number.
5. Look through the titles of the molecules and pick one that meets all the requirements needed and is not a mutant structure. On the worksheet below, write down the PDB I.D. number (the number on the left above the thumbnail picture of the structure). Also, record the PDB title along with the release date. Select the structure by clicking the I.D. number or the title.
6. If this is the molecule you wish to download, click “Download Files” (in the upper right next to the PDB number). From the dropdown list, select “PDB File (Text)”. Click on “save” file, click on the “student data” folder, name the file, and then click “save”.
7. Repeat steps 3 through 6 with each of the molecules listed in the table below. Also, pick any protein you are interested in (that is not on the list) and record its information and structure date.
8. Once you have the seven molecules, open the “student data” folder and click on one of the saved files; it should open in the program DS Viewer.

Using the various display options in the “View” menu, you should be able to observe the structure of the proteins you have saved and answer the questions listed below. You can also view the structures in 3D stereo by clicking “stereo” and then “hardware” in the “View” menu. It will look like two images of the molecule superimposed, but when you put on the shuttering eyewear the molecule will appear to be three dimensional.
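The download in step 6 can also be scripted. The following is only a sketch under stated assumptions: it assumes RCSB serves PDB-format text files at addresses of the form `https://files.rcsb.org/download/<ID>.pdb`, which is not mentioned in the handout, and the function names `pdb_url` and `download_pdb` are made up for illustration. The folder name mirrors the “student data” folder from the steps above.

```python
import urllib.request
from pathlib import Path

# Assumption (not from the handout): address pattern for PDB-format files.
DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id: str) -> str:
    """Build the download address for a four-character PDB I.D."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB I.D.: {pdb_id!r}")
    return DOWNLOAD_URL.format(pdb_id=pdb_id)


def download_pdb(pdb_id: str, folder: str = "student data") -> Path:
    """Save one structure file into the folder and return its path."""
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    path = target / f"{pdb_id.strip().upper()}.pdb"
    # Needs network access; raises an error if the request fails.
    urllib.request.urlretrieve(pdb_url(pdb_id), str(path))
    return path


# Building the address needs no network access.
print(pdb_url("3cd4"))
```

Calling `download_pdb("3CD4")` would then save the file as `student data/3CD4.pdb`, ready to open in the display program.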

Name: Srinjoy Goswami, Section #10, Date 12/12/20

protein name | # of structures | refined search characteristics | PDB file number | deposit date | PDB file title
CD4 | 1219 | human, x-ray | 3CD4 | 1992 | Refinement and analysis of the first two domains of human CD4
Trypsin | 1635 | human, x-ray | 2RA3 | 2007 | Human Cationic trypsin complexed with bovine pancreatic trypsin inhibitor (BPTI)
Hemoglobin | 603 | human, x-ray | 1FDH | 1976 | Structure of Human Foetal Deoxyhaemoglobin
Cytochrome C | 2246 | yeast, isozyme 1, reduced | 2CYP | 1984 | Crystal Structure of Yeast Cytochrome C Peroxidase refined at 1.7 angstroms resolution
Aldolase | 408 | human | 1XFB | 2004 | Human Brain Fructose 1,6 (bis)Phosphate Aldolase (C isozyme)
Lambda repressor | 9976 | | 1LRP | 1983 | Comparison of the structures of CRO and Lambda Repressor Proteins from Bacteriophage Lambda
Ferritin | 111 | human, x-ray | 2FHA | 1997 | Human H chain Ferritin

1) Which of the proteins is a DNA binding protein?
Lambda repressor. Cytochrome C is also a sequence-specific DNA-binding protein according to a 2004 paper on yeast. Citation: Bhatnagar A, Raghavendra PR, Kranthi BV, Rangarajan PN. Yeast cytochrome c is a sequence-specific DNA-binding protein. Biochem Biophys Res Commun. 2004 Sep 3;321(4):900-4. doi: 10.1016/j.bbrc.2004.07.044. PMID: 15358111.

2) Which proteins are composed primarily of β-sheet?
CD4 and Trypsin

3) Which proteins are composed primarily of α-helix?
Hemoglobin, Cytochrome C, Aldolase, Lambda repressor, Ferritin

4) Which of the proteins is a "beta-barrel" protein?
CD4

5) What kinds of information are in the 3-D structure that you cannot learn from the protein sequence?
The 3-D structure shows things that the sequence reveals only partly or not at all, such as the types of bonding that occur. The amino acid sequence gives some idea of the possible interactions, but the full story of how the side groups interact cannot be understood without the 3-D structure.

6) Can you predict the 3-D structure from the sequence? Why or why not?
The sequence can suggest potential interactions: for example, if there is a valine, there is going to be a hydrophobic interaction somewhere. But no, the 3-D structure cannot be fully predicted from the sequence alone.

7) How many structures are contained in the PDB? How many proteins have been sequenced? What percentage of the proteins that have been sequenced have known structures?
171,916 structures are contained in the PDB. Of these, 168,293 are protein structures. 186,482,096 proteins have been sequenced.

(168,293 / 186,482,096) * 100 = 0.09%
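The percentage above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the two counts are the ones recorded in the answer to question 7, and the function name `percent_with_structure` is made up for illustration.

```python
def percent_with_structure(structures: int, sequenced: int) -> float:
    """Percentage of sequenced proteins that have a known structure."""
    return structures / sequenced * 100


# Counts as recorded in the worksheet answer to question 7.
pct = percent_with_structure(168_293, 186_482_096)
print(f"{pct:.2f}%")  # prints 0.09%
```

Both counts grow over time, so the result applies only to the date on which the numbers were recorded.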

8) Outline what you would need to know to answer the following question: of all the proteins that exist on earth at this time, what fraction of them have a known structure? Take into consideration that many proteins in an organism are splice variants and that mutations of the wild type can occur in individuals. Also, do you think it is now, or would it ever be, possible to answer this question, and what would have to be done to get the answer? Can an estimation be made, and what factors would need to be estimated? Discuss these questions in class before submitting your answer.
As calculated in question 7, roughly 0.09% of the proteins that have been sequenced have known structures. However, even the accuracy of that number is debatable. I think the way to approach this would be to find every single organism, take a sample and sequence it, do the same with known organisms, and account for splice variants and mutations. Even though advances in technology such as Google’s AI technology make this possible in theory, the sheer scope of what needs to be done makes it unlikely.
