The First Sequence: Fred Sanger and Insulin
Antony O. W. Stretton, Department of Zoology, University of Wisconsin, Madison, Wisconsin 53706
Fred Sanger is an amazingly modest man, and his own retrospective, written after he retired, a delightful prefatory chapter for the Annual Review of Biochemistry, is called "Sequences, sequences, and sequences" (SANGER 1988).
Especially now, with the human genome largely finished, it is almost impossible to imagine a world without sequences of proteins and of nucleic acids. The fact that it has been only 50 years since Sanger showed that there was such a thing as the unique amino acid sequence of a protein seems amazing from where we now stand; sequences are such a dominant part of the world we work in today. Amazing, but perhaps not surprising, when we think of, say, the change in the power of computers that has occurred over just about the same time period. (I would love to be around, marbles intact, 50 years from now, to find out how the brain really works.)
Before Fred's work, it was already known that different proteins had different amino acid compositions, different biological activities, and different physical properties and that genes had an important role in controlling them. But in a world of biochemistry dominated by the role of enzymes in intermediary metabolism, it was not at all clear how molecules as large as proteins could be synthesized; the idea that proteins were stochastic molecules, with a sort of "center of gravity" of structure but with appreciable microheterogeneity, was taken seriously. This is the paradigm that Fred's results shifted.
This essay is a celebration of his first triumph: the first complete amino acid sequence determination of a protein, the B chain of insulin, published just over 50 years ago. Although I was not involved in the work in any way, I was scientifically aware enough to feel its contemporary impact. In 1957, Vernon Ingram, who at the time was working in the MRC Unit for the Study of the Molecular Structure of Biological Systems in the Cavendish Laboratory at Cambridge (later this morphed into the Laboratory of Molecular Biology), took me on as a research student. Vernon had just shown that sickle-cell hemoglobin differed from normal hemoglobin by a single amino acid substitution, the first characterization of the molecular consequences of mutation on proteins.
Another profound lesson I learned was that completely different scientific styles can be equally successful and valid and that personality has little to do with success. Again, the contrast between Fred and Francis is vast. After as little as half a minute with Francis, you know he has an exceptional mind, but with Fred it is much more subtle. He was brought up as a Quaker, which probably has a lot to do with his low-key and quiet demeanor, and during World War II he was a conscientious objector. In his 1988 article he writes, "Of the three main activities involved in scientific research, thinking, talking, and doing, I much prefer the last and am probably best at it. I am all right at the thinking, but not much good at the talking" and "Unlike most of my scientific colleagues, I was not academically brilliant," and this after he had won two Nobel Prizes! So much for academic brilliance.
Another unusual aspect of Fred's character is his ability to pick and nurture people. I first got to know Fred indirectly through his graduate student Mike Naughton. Vernon moved to MIT in 1958 and took me with him, and Mike joined Vernon's lab as a post-doc during my second year there. Mike and I became good friends, and I learned a lot of Fred's techniques from Mike, so in some way I feel like one of Fred's scientific grandchildren. Mike is a big, gentle Irishman from the West, who loved to sing and tell awful jokes. He had been a schoolteacher, and after doing his National Service in the Royal Air Force, he joined Fred as a technician. Soon he was transformed into Fred's Ph.D. student, working also with Brian Hartley on a beautiful piece of work showing that the sequences around the active sites of the pancreatic serine proteases are identical.
I will quote extensively from the introduction to Fred's 1952 review, because it sets the stage beautifully for what was happening, a reevaluation of the nature of proteins. Fred wrote:
It has frequently been suggested that proteins may not be pure chemical entities but may consist of mixtures of closely related substances with no absolute unique structure. The chemical results obtained so far suggest that this is not the case, and that a protein is really a single chemical substance, each molecule of one protein being identical to every other molecule of the same pure protein.
Another earlier model ...
These results [the insulin sequence] would imply an absolute specificity for the mechanisms responsible for protein synthesis and this should be taken into account when considering such mechanisms.
He concludes,
It is certain that proteins are extremely complex molecules but they are no longer completely beyond the reach of the chemist, so that we may expect to see in the near future considerable advances in our knowledge of the chemistry of these substances which are the essence of living matter.
Remember that this was written only one year after Fred and Hans Tuppy had solved the structure of the B chain of insulin. The accuracy of protein synthesis remained an issue for many years after that.
The previous review of the covalent structure of proteins had been written by Synge in 1943 (SYNGE 1943).
... up to that time [1943] only a few simple peptides had been clearly identified from proteins by the classical and rather laborious methods of organic chemistry and Synge concluded that "the main obstacle to progress in the study of protein structure by the methods of organic chemistry is inadequacy of technique!" Probably the greatest advance that has been made recently in this field was the development by MARTIN and SYNGE (1941) of the entirely new technique of partition chromatography. The great problem in peptide chemistry has always been to find methods of fractionating the extremely complex mixtures produced by the partial degradation of a protein. Older methods of fractional crystallization and precipitation with various reagents were as a rule inadequate to deal with these mixtures, and countercurrent methods of high resolving power, which could fractionate non-volatile, water-soluble substances, were needed. Partition chromatography, especially in the form of paper chromatography (CONSDEN et al. 1944), is such a method, so that it has already been possible to identify as breakdown products of proteins more peptides using this technique than had previously been identified by the classical methods of organic chemistry. During the last few years, work in this field has centered largely on the development of methods, so that this review will be more a consideration of techniques and their uses than a discussion of results, which are still rather few.
| N-terminal sequences of insulin |
|---|
One of the main reasons Sanger chose insulin for this work is that it was one of the few proteins available in pure form, and it was available in gram quantities because of its medical importance. At the time, the physical chemical evidence suggested a molecular weight of about 12,000. Fred invented the N-terminal labeling method using 1:2:4 fluorodinitrobenzene (FDNB), which reacts with amino groups under mild conditions that avoid degradation of the polypeptide chain. After complete acid hydrolysis of the dinitrophenyl (DNP)-protein, the DNP groups remain attached to the N-terminal amino acid and can be isolated and identified. Fred showed that there were four N-terminal residues per 12K insulin molecule, two of which were glycine and two phenylalanine.
The lack of tryptophan was particularly fortunate, because it degrades upon acid hydrolysis, and one of the most important methods Fred used to get at the structure of insulin, with great success, was partial acid hydrolysis, which splits the peptide bonds almost randomly (more about that later). In fact, the first amino acid sequences of insulin came from partial acid hydrolysis of DNP-labeled A and B fractions. Some DNP peptides can be extracted into ethyl acetate from acid solution and then separated by silica gel columns; since DNP compounds are usually yellow, this was real "chroma"-tography. For the B fraction, these peptides turned out to be the DNP-labeled N-terminal Phe followed by one or more other amino acids. Fred identified the DNP-amino acid and the other amino acids in the peptide after complete acid hydrolysis and then assembled the N-terminal sequence Phe.Val.Asp.Glu. Among the peptides that contained Asp or Glu, several different peptides had the same amino acid composition, so he concluded that both Asp and Glu were amidated in the original sequence.
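The ladder logic described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not a record of Fred's actual data: the peptide list is made up to be consistent with the sequence Phe.Val.Asp.Glu quoted in the text, and the function name `order_from_nested_compositions` is hypothetical. Each DNP peptide is known only by its amino acid composition after complete hydrolysis, so the order is inferred from which residue is new at each step.

```python
from collections import Counter


def order_from_nested_compositions(n_terminal, compositions):
    """Infer residue order from N-terminal peptides known only by composition.

    n_terminal   -- the DNP-labeled residue shared by every peptide
    compositions -- one list of residues (order unknown) per peptide,
                    excluding the DNP-labeled residue itself
    """
    ordered = [n_terminal]
    previous = Counter()
    # Work from the shortest peptide to the longest.
    ladder = sorted((Counter(c) for c in compositions),
                    key=lambda c: sum(c.values()))
    for comp in ladder:
        extra = comp - previous      # residues gained at this step
        missing = previous - comp    # residues that should still be present
        if missing or sum(extra.values()) != 1:
            raise ValueError("peptides do not form a one-residue ladder")
        ordered.append(next(iter(extra)))
        previous = comp
    return ordered


# Illustrative compositions only (assumed, not taken from the 1949 article).
peptides = [["Val", "Asp", "Glu"], ["Val"], ["Val", "Asp"]]
sequence = order_from_nested_compositions("Phe", peptides)
print(".".join(sequence))
```

The sketch deliberately refuses to guess when a step of the ladder is missing, which mirrors the practical problem: without a peptide for every length, composition alone cannot fix the order.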
Other DNP peptides were not extracted into ethyl acetate. They were peptides derived from internal sequences surrounding Lys, to which DNP was linked by the amino group on the side chain. Since they all had free N-terminal amino groups (liberated from internal peptide bonds by partial acid hydrolysis), they were positively charged in acid, which explains why they did not extract into organic solvents. Amino acid analysis and relabeling with FDNB to determine the end groups gave the internal sequence Thr.Pro.Lys.Ala. The A fraction yielded the N-terminal sequence Gly.Ileu.Val.Glu.Glu. This article ...
| The complete sequence of the B chain |
|---|
The B chain was tackled first.
The completion of the sequence depended on the use of specific proteases (trypsin, chymotrypsin, and pepsin) to produce larger fragments.
The main advantage of protease digestion is that trypsin and chymotrypsin cleave at few sites, but do so relatively completely, so the complexity of the mixture to be separated is low. The protein is chopped into neat chunks that can be isolated and characterized by the same methods.
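A rough way to see why the mixture is so much simpler is to compare an idealized tryptic digest with the number of fragments that random cleavage could in principle produce. The sketch below makes several assumptions that are not in the essay: the one-letter bovine B-chain sequence is supplied here for illustration, `digest` is a hypothetical helper, and trypsin is reduced to the textbook rule of complete cleavage on the C-terminal side of every Lys (K) or Arg (R), ignoring known exceptions.

```python
# Assumption: bovine insulin B chain in one-letter code, supplied for
# illustration; the essay itself does not print the full sequence.
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA"


def digest(sequence, cleave_after):
    """Split a sequence on the C-terminal side of each residue in cleave_after.

    Idealized: every site is cut, and cut completely.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder
    return fragments


# Simplified trypsin rule: cut after Lys (K) or Arg (R).
tryptic = digest(B_CHAIN, "KR")

# Upper bound on the distinct contiguous fragments that cleavage at
# arbitrary peptide bonds could generate from a chain of n residues.
n = len(B_CHAIN)
max_random_fragments = n * (n + 1) // 2

print(tryptic)
print(len(tryptic), "tryptic fragments vs up to", max_random_fragments)
```

Under these assumptions the 30-residue chain yields three tryptic fragments, against an upper bound of 465 contiguous fragments for cleavage at arbitrary bonds. Real partial acid hydrolysates are far from uniform, so the bound overstates the practical mixture, but the contrast in scale is the point.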
Fred's initial approach to sequencing by random cleavage was too difficult for larger proteins, mainly because of the problems of fractionation of complex mixtures. At the time this work was done, the Edman sequential degradation technique had already been described (EDMAN 1950).
| The A chain |
|---|
Two years later, ...
It would thus seem that no general conclusions can be drawn from these results concerning the general principles which govern the arrangement of the amino-acid residues in protein chains. In fact, it would seem more probable that there are no such principles, but that each protein has its own unique arrangement; an arrangement which endows it with its particular properties and specificities and fits it for the function that it performs in nature.
Yes! This is the conclusion that is so monumental, and had so much influence on the rise of molecular biology (sine qua non).
| The -S-S- bonds |
|---|
The A chain and B chain by themselves are inactive physiologically. Sanger and his colleagues Ryle, Smith, and Kitai went on to finish another functionally very important piece of covalent chemistry: the three disulfide bonds in the unoxidized molecule. They had to develop new methods, because the disulfide bonds rearrange under some of the conditions they used to break peptide bonds. They discovered that adding thiol blocking reagents like N-ethylmaleimide prevented the exchange under the slightly alkaline conditions used for digestion with pancreatic proteases. One of the interchain disulfide bonds was easily identified, but the A chain includes two adjacent cysteines, and they found no protease that could cleave the peptide bond between them. Further experiments showed that rearrangement under acid conditions was inhibited by added thiols, so they were able to proceed with their original sequencing techniques by partial acid hydrolysis. However, the separation of partial acid hydrolysis products was now even more challenging because the mixtures were inevitably more complex (two different random cleavage products joined together by the disulfide bridges). By now, paper ionophoresis had been added to the fractionation methods, and, by using different pH conditions and combining the electrophoretic separations with paper chromatography in various solvent systems, they found the necessary fragments that gave the unambiguous assignments of the disulfide bonds (RYLE et al. 1955).
Two overall comments: first, the methods used in this sequence determination were largely nonquantitative. Amino acid analysis was done by comparing the ninhydrin staining intensities by eye with standards, and with short peptides that is good enough. A little later, Moore and Stein and their colleagues developed methods for separating peptides on ion exchange columns.
Second, on a very different level, those of us who did sequencing using the original methods would probably make a really interesting epidemiological group to study, because we were heavily exposed to a variety of organic solvents in the separations on paper, both by ionophoresis and chromatography. For ionophoresis, the paper was immersed in toluene, which acted as a coolant (it was cooled with circulating water in coils immersed in the toluene), and it was hard to avoid getting it on your hands as you manipulated the paper. Later, toluene was replaced by Varsol, a refined, high-boiling-point petroleum fraction. The most commonly used buffers contained pyridine and were really stinky. The mixture to be separated was loaded onto the paper in a small volume and dried with a hair dryer; then the buffer was applied to the dry paper, which was spread out on a glass plate. The art was in wetting the paper on each side so that the solvent fronts met at the origin at exactly the same time, and this needed a steady hand and was best not done right after coffee or tea time. This concentrated the sample into a very narrow line, if you did it right, but also left you breathing pyridine for quite a while. The solvents for paper chromatography were also pretty pungent, and on the occasions (thankfully rare) when I used hexane/formic acid, my nose would bleed, and I would have a headache until the next day. In those days we were quite ignorant of the dangers of this sort of exposure.
It is also fascinating to see the evolution of the methodology in the series of articles that describe the covalent structure of insulin. We tend to recognize Fred's development of the FDNB method of N-terminal labeling as crucial, and indeed it was, but his exploration of chemical and enzymatic cleavage methods and his use of new fractionation methods for mixtures of peptides were also critical for his success. Paper chromatography was a relatively new technique that he and his colleagues used extensively, but even two-dimensional chromatography was not adequate for the complete separation of the complex mixtures generated by partial acid hydrolysis, so they used various prefractionation methods, including absorption onto charcoal to selectively remove peptides containing aromatic amino acids, batch ionophoresis in solution or in silica gels, and ion exchange. Relatively late in the game ...
Fred Sanger's stunning, startling, mind-expanding 1951 articles (with Hans Tuppy) on the sequence of the B chain of insulin deserve a huge worldwide Jubilee celebration, particularly among geneticists! The linearity of genetic maps was already well known, and a few years later Seymour ...
| ACKNOWLEDGMENTS |
|---|
I am very grateful to Philippa Claude for her insightful criticisms and suggestions; as always, her judgment is penetrating and accurate.
| LITERATURE CITED |
|---|
BENZER, S., 1955 Fine structure of a genetic region in bacteriophage. Proc. Natl. Acad. Sci. USA 41:344-354.
BERGMANN, M. and C. NIEMANN, 1938 On the structure of silk fibroin. J. Biol. Chem. 122:577-596.
BERGMANN, M. and J. S. FRUTON, 1941 The specificity of proteinases. Adv. Enzymol. 1:63-98.
CONSDEN, R., A. H. GORDON, and R. L. M. SYNGE, 1944 Qualitative analysis of proteins: a partition chromatographic method using paper. Biochem. J. 38:224-232.
CRICK, F. H. C., 1958 On protein synthesis. Symp. Soc. Exp. Biol. 12:138-163.
DOVE, W. F., 1987 Paradox found. Genetics 115:217-218.
EDMAN, P., 1950 Method for determination of the amino acid sequence in peptides. Acta Chem. Scand. 4:283-293.
HARFENIST, E. J. and L. C. CRAIG, 1952 The molecular weight of insulin. J. Am. Chem. Soc. 74:3087-3089.
HARTLEY, B. S., M. A. NAUGHTON, and F. SANGER, 1959 The amino acid sequence around the reactive serine of elastase. Biochim. Biophys. Acta 34:243-244.
HIRS, C. H. W., S. MOORE, and W. H. STEIN, 1960 The sequence of the amino acid residues in performic acid-oxidized ribonuclease. J. Biol. Chem. 235:633-647.
INGRAM, V. M., 1956 A specific chemical difference between the globins of normal human and sickle-cell anaemia haemoglobin. Nature 178:792-794.
INGRAM, V. M., 1957 Gene mutations in human haemoglobin: the chemical difference between normal and sickle cell haemoglobin. Nature 180:326-328.
MARTIN, A. J. P. and R. L. M. SYNGE, 1941 A new form of chromatography employing two liquid phases. Biochem. J. 35:1358-1368.
NEEL, J. V., 1949 The inheritance of sickle cell anemia. Science 110:64-66.
PAULING, L., H. A. ITANO, S. J. SINGER, and I. C. WELLS, 1949 Sickle cell anemia, a molecular disease. Science 110:543-548.
RYLE, A. P., F. SANGER, L. F. SMITH, and R. KITAI, 1955 The disulphide bonds of insulin. Biochem. J. 60:542-556.
SANGER, F., 1945 The free amino groups of insulin. Biochem. J. 39:507-515.
SANGER, F., 1949 The terminal peptides of insulin. Biochem. J. 45:563-574.
SANGER, F., 1952 The arrangement of amino acids in proteins. Adv. Protein Chem. 7:1-66.
SANGER, F., 1988 Sequences, sequences, and sequences. Annu. Rev. Biochem. 57:1-28.
SANGER, F. and E. O. P. THOMPSON, 1953a The amino-acid sequence in the glycyl chain of insulin. 1. The investigation of lower peptides from partial hydrolysates. Biochem. J. 53:353-366.
SANGER, F. and E. O. P. THOMPSON, 1953b The amino-acid sequence in the glycyl chain of insulin. 2. The investigation of peptides from enzymic hydrolysates. Biochem. J. 53:366-374.
SANGER, F. and H. TUPPY, 1951a The amino-acid sequence in the phenylalanyl chain of insulin. 1. The identification of lower peptides from partial hydrolysates. Biochem. J. 49:463-481.
SANGER, F. and H. TUPPY, 1951b The amino-acid sequence in the phenylalanyl chain of insulin. 2. The investigation of peptides from enzymic hydrolysates. Biochem. J. 49:481-490.
SPACKMAN, D. H., W. H. STEIN, and S. MOORE, 1958 Automatic recording apparatus for use in the chromatography of amino acids. Anal. Chem. 30:1190-1209.
SYNGE, R. L. M., 1943 Partial hydrolysis products derived from proteins and their significance for protein structure. Chem. Rev. 32:135-172.