NEWTON, Ask A Scientist!
Amino Acid Sequencing
Name: Ben I.
Status: Student
Age: 15
Location: N/A
Country: N/A
Date: 2001

How do biologists determine the amino acid sequence of a protein molecule?

It is NOT simple...but it is a whole lot easier than it was for the first group, who figured out the 51 amino acids that make up insulin. Today, one of the first steps is figuring out how many chains are in the protein. We do this by finding out how many ends (called the C and N termini) are in a single protein. Then we have to break any disulfide bonds that cross-link the protein chains. Then we separate the protein chains by gel filtration. Before we find the actual sequence, we like to know how many of each kind of amino acid are in each chain. This is done by a machine that separates the amino acids based on their ionic charges. Once all this is done, there are a number of ways to figure out the sequence. An "old-fashioned" way is to chop up (cleave) the protein with certain enzymes and then analyze the fragments for overlapping stretches of sequence. Mass spectrometry is also used.
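The overlap idea can be sketched in a few lines of Python. This is only a toy illustration, not a description of any real laboratory software: the peptide, the two sets of fragments, and the function names are all made up for the example, and the fragments are assumed to be already sequenced and error-free.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(fragments):
    """Greedily merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # A fragment that sits entirely inside another one adds nothing here.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j and overlap(a, b) > best_n:
                    best_n, best_i, best_j = overlap(a, b), i, j
        if best_n == 0:
            break  # nothing overlaps any more; return the pieces we have
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical fragments from cutting the same made-up peptide two ways.
digest_a = ["ALK", "GFR", "TWMSEYK", "DV"]
digest_b = ["ALKGF", "RTW", "MSEY", "KDV"]
result = assemble(digest_a + digest_b)
print(result)
```

Because the two digests cut at different places, each fragment from one digest bridges a cut in the other, and that is what lets the pieces be put in order. In this toy some of the overlaps are only one residue long, which would be weak evidence with a real protein.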

Peter Faletra Ph.D.
Assistant Director
Science Education
Office of Science
Department of Energy

Ben, there are two ways to go about this. In the "old days," one would purify as much of the protein as possible and then subject it to a series of chemical reactions. Each round of reactions would break off the amino acid at one end of the chain, and then scientists would use a machine to determine which of the 20 possible amino acids it was. Then the cycle would start again: break off the new end amino acid, identify it, and so on. The process has been largely automated, but it is still technically challenging. You need a good preparation of pure protein, and even then you might only be able to identify about a dozen amino acids from that end of the chain, since you take losses at every step of the procedure.
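To see why losses at every step limit how far you can read, here is a rough Python sketch of the cycle. The numbers are assumptions chosen only for illustration (90% of the sample surviving each cycle, and a signal that becomes unreadable below 25% of the starting amount), and the peptide is made up.

```python
def stepwise_read(peptide, step_yield=0.90, min_signal=0.25):
    """Remove and identify one end residue per cycle until the signal is too weak.

    step_yield and min_signal are illustrative assumptions, not measured values.
    """
    signal = 1.0  # fraction of the starting sample still usable
    called = []
    for residue in peptide:
        next_signal = signal * step_yield  # every cycle loses some material
        if next_signal < min_signal:
            break  # too little left to identify the next residue
        signal = next_signal
        called.append(residue)
    return "".join(called), signal


toy_peptide = "ALKGFRTWMSEYKDVHQNLG"  # made-up 20-residue chain
called, signal = stepwise_read(toy_peptide)
print(len(called), "residues read:", called)
```

With these assumed numbers the read stops after 13 of the 20 residues, since the usable fraction shrinks as 0.9 multiplied by itself once per cycle. A better yield per cycle or a cleaner sample would push the limit further out.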

Nowadays, the sequence is often not obtained directly from the protein. Rather, the PREDICTED amino acid sequence is determined by sequencing the DNA that encodes the protein. This is actually the easier way, using DNA cloning technology. The gene is cloned, and the order of the bases (the "rungs" of the DNA ladder) is determined by a series of reactions that is (sort of) similar in concept to the way proteins are sequenced. When you have the DNA sequence, you look for the information that encodes the protein; it's sort of like reading the Morse code and printing out the coded words. Bear in mind that this information allows you to _predict_ the sequence of the protein, but on rare occasions the actual protein sequence may differ (technical disclaimer).
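The "reading the code" step can be sketched in Python using the standard genetic code, in which each group of three DNA bases (a codon) stands for one amino acid or a stop signal. This is a simplified sketch: the function name and the example DNA string are made up, and it assumes the input contains only A, C, G, and T, that the coding region starts at the first ATG, and that there are no interruptions in the coding sequence.

```python
from itertools import product

# Standard genetic code, with codons listed in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def predict_protein(dna):
    """Translate from the first ATG to the first stop codon ('*')."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon ends the protein
        protein.append(amino_acid)
    return "".join(protein)


example_dna = "GGATGGGCATTGTGGAACAGTAACC"  # made-up sequence
print(predict_protein(example_dna))  # prints MGIVEQ
```

The output is in one-letter amino acid codes (M for methionine, G for glycine, and so on). As the answer says, this is only a prediction: the sketch knows nothing about what the cell does to the message or the protein afterward.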

Incidentally, a nice gentleman named Fred Sanger was instrumental in working out the first method to sequence the actual protein. For this he was awarded the Nobel Prize. Many years later, he was instrumental in working out the way we presently sequence DNA; for this, he was awarded his second Nobel Prize!

Paul Mahoney, Ph.D.


NEWTON is an electronic community for Science, Math, and Computer Science K-12 Educators, sponsored and operated by Argonne National Laboratory's Educational Programs, Andrew Skipor, Ph.D., Head of Educational Programs.

For assistance with NEWTON, contact a System Operator at Argonne's Educational Programs:

Educational Programs
Building 360
9700 S. Cass Ave.
Argonne, Illinois
60439-4845, USA
Update: June 2012