Sickle-Cell Anemia Hemoglobin: The Molecular Biology of the First “Molecular Disease”—The Crucial Importance of Serendipity
Vernon M. Ingram

In the spring of 1952 I sent 32 letters to England, trying to find a job in academic research. I was finishing my second postdoctoral year in the United States, working in Joseph Fruton's Biochemistry Department at Yale University. My topic was peptide chemistry, in particular the development of a novel end-group method by reductive methylation of the N-terminal α-amino group, followed by paper chromatography of the hydrolysate to detect a “basic” derivative of the end amino acid. The method worked and was published (Ingram 1950). To my knowledge, my method was never used by anybody, but of course I did not know this at the time! I was an organic chemistry Ph.D. with an interest in animal physiology developed through an excellent undergraduate course by Alistair Graham at Birkbeck College, London University, during the war years. My entrance to modern biology was, and still is, through the chemistry of proteins and peptides. Before Yale, I had spent a year with Moses Kunitz learning how to purify a yeast enzyme.1 Then I had that year at Yale with peptides. Now I was ready for new things.

I had become quite discouraged by my lack of success in finding a job back in England. After all, 3000 miles was and still is a long distance when you are job hunting. Quite by chance a new postdoc appeared just then in the Fruton lab. He was Herbert “Freddie” Gutfreund, straight from Cambridge, England. He knew well the Medical Research Council (MRC) Unit2 at the Cavendish Laboratory, where Max Perutz was director, and told me that Perutz, a protein X-ray crystallographer of note, needed a protein biochemist to place a “heavy atom” in a specific location in his hemoglobin (Hb) crystals. This approach would enable him to determine the phases of most of the X-ray reflections, the big problem in X-ray crystallography at that time. Freddie said “do apply” and I did, not expecting much. To my surprise, I was accepted for September 1952 and I was happily on my way back to England.

The laboratories of the MRC Unit were modest, but modern. We had one large biochemistry room for about four people. Next door was a large office for Francis Crick and, later, for Jim Watson and Sydney Brenner. The suite also contained a modest office for Max Perutz and John Kendrew, both X-ray crystallographers working on the structure of hemoglobin and myoglobin, respectively. There also was a well-equipped machine shop where Tony Broad developed the rotating anode X-ray source, one basis for the supremacy of the laboratory; the much more powerful X-ray beam produced greatly reduced needed exposures of the vulnerable protein crystals. The X-ray cameras were in the basement, but the films on which the X-ray reflections were recorded had to be developed on the top floor (no elevators). My first impression of Francis Crick was of a research student rapidly oscillating between the basement and the top floor! The laboratory was, however, a close and congenial environment [Sidney Altman (2003) has written a Perspectives of the latter-day MRC-LMB]. The success of Kendrew and Perutz's X-ray studies soon brought a number of visitors, some, like Brenner, very long lived. We became overcrowded, always a good sign of a productive lab. Within a few years (in 1956?) we moved into a refurbished bicycle shed in the courtyard of the Cavendish laboratory, leaving the X-ray machines where they were. The new environment, although drafty, provided much-needed expansion space for protein chemistry, microbiology, and the large group of people reading X-ray films manually. Visitors from spacious labs in the United States found this a quaint place. However, it was in the bicycle shed that I first produced the early “fingerprints” of sickle-cell hemoglobin peptides, with the help of Leslie Barnett and Rita Fishpool, the technicians.


I had long finished the project I was originally given by Max Perutz: to insert a single heavy atom into a unique position in the Hb molecule and then crystallize that derivative. This turned out to be straightforward, since the asymmetric Hb half-molecule, αβ, has only a single reactive sulfhydryl-cysteine side chain. This was readily reacted with p-chloromercuribenzoate. This heavy atom Hb derivative allowed Perutz and his student David Green to proceed to produce the first three-dimensional projection of a protein molecule (Green et al. 1954). This left me at a loose end. I became interested in the connection, if any, between the heme group of hemoglobin and any amino acid side chains. It was considered possible that the heme group might be covalently attached. Accordingly, I started preparing and characterizing tryptic hemoglobin peptides that contained the highly visible heme group. Nothing much came of this, because, as was realized later from the X-ray structure, there was no specific covalent link to a particular peptide. Instead, the precise location of the heme group in hemoglobin and myoglobin depended on noncovalent interactions with several specific amino acid side chains.

Serendipity appeared again! Just at that time a very interesting visitor appeared at the MRC Unit: Tony Allison, bringing with him samples of sickle-cell anemia hemoglobin. Following Jim Neel's demonstration that sickle-cell anemia is caused by homozygosity for a sickle-cell gene, Tony had elegantly shown that the frequency for this gene, lethal when homozygous in an African population, was so surprisingly high because heterozygosity conferred protection against the endemic malarial parasite Plasmodium falciparum. The exciting story of this important discovery is described by Tony Allison in the previous issue of Genetics (Allison 2004). There were two reasons for Tony's coming to the MRC Unit: Max Perutz was interested in sickle-cell anemia hemoglobin, and he had the best X-ray equipment in the field. Perutz et al. (1951) had shown earlier that deoxygenated sickle-cell anemia Hb was much less soluble and formed “one-dimensional” crystals, leading to the well-known distortion of red cells in the disease; Tony wanted to put such “crystals/aggregates” of deoxygenated sickle-cell Hb into an X-ray beam and study the 3-D structure of these formations. As Allison (2004) describes, Linus Pauling had called the anemia a molecular disease and had shown with his colleagues that the Hb protein carried a chemical change, which was manifested as a charge difference for the whole protein, seen by Tiselius electrophoresis (Pauling et al. 1949). But was it one amino acid that was different, or two or three, or a whole group? Ordinary amino acid analysis at that time was too imprecise to decide. After all, it was only a very few years earlier that Sanger and Tuppy (1951) had convinced the world that a protein (insulin) was composed of a chain of amino acids, covalently linked, with a unique and defined amino acid sequence (see Stretton 2002). Things were moving fast in protein chemistry in those exciting days!

Since I was already developing methods for characterizing large peptide fragments of proteins—and of hemoglobin and myoglobin in particular—Perutz and Crick suggested that I use these methods on sickle-cell anemia Hb and compare it with normal human hemoglobin (mine). I was able to use the remaining samples of sickle-cell anemia Hb left behind by Tony Allison, who had by then moved on. He had tried to take X-ray pictures of deoxygenated sickle-cell hemoglobin, a very difficult technical problem, because the hemoglobin to be mounted in capillaries had to be kept reduced; otherwise the “crystals” would redissolve. I had available the crucial abnormal protein on which to use the new concepts and techniques of Sanger (Sanger and Tuppy 1951; Sanger and Thompson 1952), which I was adapting to the much larger peptides I was preparing to examine. Hemoglobin, even the αβ half molecule, is 10-fold larger than either peptide chain of insulin. Sanger cleverly fitted together the amino acid sequences of a large number of very short overlapping peptides. To do the same with the much larger hemoglobin would have been a Herculean task. I needed to characterize larger peptide fragments, such as might be obtained by proteolytic digestion with trypsin, which gave some 26 peptides. In addition, I decided in the first instance not to attempt a full amino acid sequence of the whole protein, but to look first at the chemical behavior of these tryptic peptides and to sequence only those that became interesting in the sense of showing a difference between wild type and sickle-cell hemoglobin. In the event, that strategy proved to be enough to pinpoint a difference in chemical behavior and therefore in chemical structure. Sanger, dealing with very short peptides, was able to separate them cleanly by paper chromatography, then the most up-to-date method, using various solvents. 
My goal was twofold: first, to find a peptide fragment that showed an electrophoretic difference, as had the whole protein, and second, to show that the rest of the protein was likely to be the same, at least by the methods used. As so often experienced in molecular biology, we were doing chemistry! These considerations lay behind my evolving the method of “fingerprinting,” i.e., characterizing each peptide by its position on a two-dimensional map, a sheet of “blotting paper” (retold in Ingram 1989). I would digest with trypsin the two samples of protein, wild type and sickle-cell mutant, and then spot the resulting mixture onto a sheet of this paper moistened with buffer at pH 6.4 (near the isoelectric point for the whole protein). In stage 1, water-cooled electrophoresis under glass plates distributed groups of peptides with similar charge densities along a straight line. Stage 2 was partition chromatography at right angles, originally in a butanol:acetic acid mixture, which would resolve differences involving uncharged amino acid side chains. The resulting map or fingerprint of colorless peptides was “developed” by spraying with ninhydrin reagent to develop the purple color due to reaction with the α-amino group of each peptide (and also our fingers!). Figure 1 shows this result with the improved technique developed later by Corrado Baglioni (1961), a postdoctoral fellow. In these early days we argued that the differences seen for only one peptide were the only ones in the protein and that the rest of the sequence was normal. Actually, this conclusion was only as good as our techniques, which were still quite primitive. It remained for full amino acid sequencing of the HbS and wild-type HbA peptide chains by others to prove that our initial assumptions were correct. We were fortunate in that respect. With this caveat, we decided to go on and examine as many examples of other (inherited) abnormal hemoglobinopathies as we could. 
It happened that Hermann Lehmann of St. Bartholomew's Hospital in London, a doctor and good friend of Max Perutz's, was very interested in the epidemiology of sickle-cell anemia and related diseases (Perutz and Lehmann 1968). He soon moved to Cambridge. His freezer was stocked with an extensive collection of abnormal hemoglobin samples collected by him on his travels or supplied by his network of friends and colleagues. Without his support, much of our early and later work could not have been done. He became a constant source of inspiration and constructive criticism.
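
The logic of the fingerprinting comparison described above can be illustrated with a small in silico sketch. It is not the laboratory procedure: it assumes a simplified trypsin rule (cleave after every lysine or arginine), a crude charge count that ignores histidine and the terminal groups, and only the first 30 residues of the β-chain. The Glu→Val substitution at β6 used for HbS is the well-known result and is supplied here as an assumption, since it is not spelled out in the text. All function names are hypothetical.

```python
# Simplified model of comparing tryptic peptides from two hemoglobins.

def tryptic_digest(sequence):
    """Cut after every K or R (simplified rule; ignores the proline exception)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder, if any
    return peptides


def net_charge(peptide):
    """Crude side-chain charge near neutral pH: K/R = +1, D/E = -1."""
    positive = sum(1 for r in peptide if r in "KR")
    negative = sum(1 for r in peptide if r in "DE")
    return positive - negative


def fingerprint_difference(seq_a, seq_b):
    """Return the peptides found in only one of the two digests."""
    digest_a, digest_b = tryptic_digest(seq_a), tryptic_digest(seq_b)
    only_a = [p for p in digest_a if p not in digest_b]
    only_b = [p for p in digest_b if p not in digest_a]
    return only_a, only_b


# First 30 residues of the normal human beta chain (one-letter code).
BETA_A = "VHLTPEEKSAVTALWGKVNVDEVGGEALGR"
# Sickle-cell beta chain: residue 6 (index 5) changed from E to V (assumed here).
BETA_S = BETA_A[:5] + "V" + BETA_A[6:]

if __name__ == "__main__":
    only_a, only_s = fingerprint_difference(BETA_A, BETA_S)
    for label, peptides in (("HbA only", only_a), ("HbS only", only_s)):
        for p in peptides:
            print(label, p, "net charge", net_charge(p))
```

In this toy digest every peptide but the N-terminal one is shared, and the two differing peptides (VHLTPEEK and VHLTPVEK) differ by one unit of charge, which is the kind of difference the electrophoresis step of a fingerprint is designed to reveal.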

Figure 1. Fingerprints of hemoglobins A and S (improved method); photograph of ninhydrin-positive peptide spots on filter paper (Baglioni 1961).

A very important phase of this work was the ability to replicate these fingerprints of HbS with five different samples of unrelated sickle-cell anemia patients before we could publish our work. Hermann Lehmann supplied these samples. Very soon I was able to hire Leslie Barnett and Rita Fishpool to help with the increasing demands for more and more fingerprints of new and exciting abnormal human hemoglobins. The isolation of interesting tryptic peptides from these hemoglobins led to amino acid analyses and Edman stepwise degradation to figure out the amino acid sequence and thereby the missense mutations and amino acid substitutions of the early abnormal hemoglobins. Two early graduate students joined me at the MRC Unit, John Hunt, now a professor in Hawaii, and Tony Stretton, now a professor at Madison. There was a “temporary interruption” when I went for a sabbatical leave to the Biology Department at the Massachusetts Institute of Technology (MIT) in the “other” Cambridge, where I have stayed for 45 years! Other graduate students and postdocs joined me there. The early work in England with sickle-cell anemia and with hemoglobin C and E diseases was later followed at MIT with more and more complicated inherited hemoglobinopathies and thalassemias. A whole new world of fascinating research had opened up! The driving force was the realization that we were chemically exploring the mechanisms of Mendelian inheritance and the evolution of gene clusters.


The important conclusion from our earliest experiments was that a simple Mendelian trait resulted in the substitution of just one amino acid. This realization occurred just at the time when Crick and Brenner were figuring out the fundamental properties of the genetic code for an amino acid sequence. Our finding of a single amino acid substitution made impossible some coding schemes, then popular in the days before DNA sequencing, that involved an overlapping triplet code. Such a code was made impossible because in that scheme a single base substitution in DNA would affect not one but several adjacent amino acids, depending upon the particular coding number and pattern of overlap.
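
The argument against overlapping codes can be made concrete with a short sketch. It assumes an arbitrary 12-base string and models only the reading pattern, not any real codon table; the function names are hypothetical. A step of 3 reads non-overlapping triplets, and a step of 1 reads fully overlapping triplets.

```python
# Count how many triplets a single base substitution alters under
# different reading patterns (step 3 = non-overlapping, step 1 = fully overlapping).

def triplets(dna, step):
    """Read successive triplets, advancing `step` bases each time."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, step)]


def triplets_changed(dna, position, new_base, step):
    """Indices of the triplets altered by substituting one base."""
    mutated = dna[:position] + new_base + dna[position + 1:]
    before, after = triplets(dna, step), triplets(mutated, step)
    return [i for i, (x, y) in enumerate(zip(before, after)) if x != y]


DNA = "ACGTTGCAAGCT"  # arbitrary illustrative sequence, not a real gene

if __name__ == "__main__":
    # Substitute the G at index 5 with an A.
    print("non-overlapping:", triplets_changed(DNA, 5, "A", step=3))
    print("fully overlapping:", triplets_changed(DNA, 5, "A", step=1))
```

With non-overlapping reading the substitution alters a single triplet, whereas with fully overlapping reading it alters three adjacent triplets. A single amino acid replacement, as observed in HbS, is what the non-overlapping pattern predicts.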


The abnormal hemoglobin of sickle-cell anemia, HbS, was the best-known and most-studied example of the protein mutants in the group of human inherited diseases known as the hemoglobinopathies. We now looked at others. My graduate students, John Hunt and Tony Stretton, and visiting scholars Makio Murayama, Corrado Baglioni, Seymour Benzer, Barbara Bowman, and I applied our methods to other abnormal hemoglobins, for example, HbC (Hunt), HbD (Benzer, Murayama), HbE (Hunt), HbLepore (Baglioni), etc. (Figure 2). We also concluded that the δ-chain of the minor human hemoglobin, HbA2, was different in 10 places in the sequence of its β-like δ-peptide chains (Ingram and Stretton 1961, 1962a,b). This made it necessary to formulate the adult hemoglobins as HbA = α2β2^A, HbS = α2β2^S, and HbA2 = α2δ2. It was known that the normal minor hemoglobin HbA2 (2.5%) doubled to 5% in amount in certain thalassemias. Moreover, in sickle-cell anemia, the HbA2 component, although minor, was clearly normal in electrophoretic behavior. A different gene, the δ-gene, was involved in controlling the “β-like” peptide chain of HbA2. To make hemoglobin, one needed the activity of α-chain genes for both HbA and HbA2. In turn, this realization led to the notion that there is one α-chain gene and at least a pair of “β-chain” genes, the β-chain gene itself and the δ-chain gene, each acting independently. Normally, equivalent numbers of α- and β-chains combine to form the α2β2-molecule, and any excessive β-chains are degraded. Much less δ-chain is made, combining with α-chains to form HbA2. A new concept of the thalassemias developed, in which a defective β-chain gene made little if any β-chain, leading to an anemia and incompletely filled red cells. In this circumstance, HbA2 is made as usual, but since there is less HbA, there will be a higher proportion of HbA2 in β-thalassemia.

Figure 2. Amino acid sequences of the N-terminal peptides of the β-chain of hemoglobins A, S, C, and G San José; the last should not be confused with hemoglobin G Philadelphia, which is an α-chain mutation.

Quite quickly the amino acid substitutions were seen to demonstrate the molecular basis at the phenotypic level of allelism, heterozygosity, and homozygosity. They illustrated codominance, evolutionary changes after gene duplication, and unequal crossing over. Through fingerprinting analysis of amino acid sequences of hemoglobins, Corrado Baglioni showed unequal crossing over in the abnormal HbLepore, which contains a β-like peptide chain that was part β-chain and part δ-chain (Baglioni 1962). Eventually, different types of mutations were found in the hemoglobins by other groups: readthrough mutations and deletions. The latter were particularly important in explaining certain common thalassemias. The so-called Mediterranean thalassemias, which were defective in making β-chains, were often found to involve deletions. They also were far more frequent than expected, owing to resistance to malaria. The Asian thalassemias, which are α-chain defects, are very frequent and also involve deletions. Thus, the molecular basis of a wide range of hemoglobinopathies could be explained by studying amino acid sequences of their hemoglobins, originally with the fingerprinting techniques, but soon through the use of more sophisticated methods of peptide chemistry.

Less debilitating diseases than sickle-cell anemia, the HbC and HbE hemoglobinopathies involve single amino acid missense mutations. They also are much more frequent than expected, but the mechanism of balanced polymorphism, which presumably exists, is less well understood. At the other end of the spectrum, the rare abnormal hemoglobins HbD Los Angeles and HbD Punjab have an identical amino acid substitution. Is this a case of two independent but identical missense mutations, or is it due to a traveler a long time ago (Figure 3)?

As far as sickle-cell anemia is concerned, it was thought at one time that perhaps an unexpectedly high rate of mutation might cause the high HbS allele frequency. One study showed 2 children among some 300 with full-blown sickle-cell anemia whose mothers were not sickle-cell anemia carriers, i.e., were not heterozygous for the mutation. This would indicate an extraordinarily high mutation rate. It later turned out that child exchanges between mothers were quite frequent in that particular population and were not officially documented.

Plotting the restricted geographical distribution of the rather frequent HbC mutation (β6Glu→Lys), one can deduce a discrete local area of origin for this allele. In this case, a single mutational event is postulated.

In last month's Perspectives, Tony Allison (2004) detailed the fascinating story of how he documented the role of malaria in the balanced polymorphism model that explains the high frequency for the disease, so lethal when homozygous, at least in Africa where it originated. Here is another illustration, reported by J. H. P. Jonxis (1965). Some 300 years ago, Dutch settlers brought many slaves from the malarial West Coast of Africa to the island of Curaçao and to the neighboring mainland of Suriname in South America. While the mainland of Suriname was swampland, infested with malaria, the mountainous island of Curaçao has always been free of malaria. In 1965, many generations later, Jonxis reported that the frequency of the HbS allele was only 6.5% on Curaçao, but 18.5% on the mainland. In contrast, the frequency of the HbC allele was 5.8 and 6.0%, respectively, indicating that the two populations came from the same pool. The study shows that malaria greatly increased the frequency of sickle-cell anemia. Curiously, it seemed to have little effect on the frequency of the HbC allele. Presumably there is some other explanation for the relatively high frequency of the HbC allele in West Africa.


Mendelian genetics can account for the inheritance patterns of all the abnormal human hemoglobins. Sometimes two genes for different abnormal hemoglobins affecting both α- and β-globin chains are present in the same individual (Baglioni and Ingram 1961). We were able to explain why one individual's red cells had no fewer than three abnormal hemoglobins in addition to the wild-type HbA. Electrophoresis showed four bands with different mobilities. One, dubbed HbX, moved much more slowly in the electric field than any previously observed Hb. Might it contain two amino acid substitutions in one globin gene? That was certainly possible, but if that mutation occurred in the same gene as, say, HbS, then only three Hb bands should appear. Isolation of the four Hbs and fingerprint analysis solved the problem, together with pedigree analysis. Electrophoresis showed that the individual had wild-type HbA, more slowly moving HbG Philadelphia, still slower HbC, and extremely slow HbX. By this time, much more sequence information was available from other laboratories. Combining their findings with our fingerprint results, we could show that the peptide chain constitutions were: HbA = α2β2; HbG Philadelphia = α2^G-Philadelphia β2 (containing α68 Asn→Lys); HbC = α2β2^C (containing β6 Glu→Lys); HbX = α2^G-Philadelphia β2^C. This was clearly a case of double heterozygosity.
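
The counting behind the four bands can be sketched as follows. This is a simplified illustration, not the original analysis: it assumes that only symmetrical tetramers (two identical α-chains with two identical β-chains) are resolved as separate species, and the function name and allele labels are hypothetical.

```python
# Enumerate the hemoglobin species expected from the alleles carried
# at the alpha-chain and beta-chain loci of one individual.

def expected_hemoglobins(alpha_alleles, beta_alleles):
    """List one tetramer per (alpha allele, beta allele) combination."""
    species = []
    for a in dict.fromkeys(alpha_alleles):      # unique alleles, order kept
        for b in dict.fromkeys(beta_alleles):
            species.append(f"alpha2({a}) beta2({b})")
    return species


if __name__ == "__main__":
    # Heterozygous at both loci: four species (HbA, HbC, HbG Philadelphia, HbX).
    for hb in expected_hemoglobins(("A", "G-Philadelphia"), ("A", "C")):
        print(hb)
    # Heterozygous at the beta locus only: two species.
    print(len(expected_hemoglobins(("A", "A"), ("A", "C"))))
```

Under this assumption, an individual heterozygous at both loci yields four species, matching the four electrophoretic bands, while heterozygosity at one locus alone yields only two.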

Figure 3. Diagrammatic summary of the abnormalities in human hemoglobin (Ingram 1963), as known in 1962.

We also needed to know for sure which of the two different peptide chains in hemoglobin, the α- or the β-chain, was involved in the sickle-cell mutation. Perhaps both were involved, since at that time the amino acid sequences of the two chains were not known. Much to our relief, it turned out that only the β-chain carried the sickle-cell mutation. We could now with confidence reformulate the Beadle and Tatum hypothesis of “one gene-one enzyme activity” to state “one gene-one peptide chain.” Our finding of a single mutation changing one amino acid greatly strengthened the concept of a direct relationship between the linear genetic molecule (DNA, whose structure had just been elucidated) and the linear protein peptide chain (just established by Sanger for insulin). A good geneticist now also had to be a good biochemist!

Protein chemists now began to define the peptide chains of all the human hemoglobins: the adult α-, β-, and δ-chains, the fetal γ-chains, and the embryonic ϵ- and ζ-chains. This was necessary to assign missense mutations to particular globin chain genes. The work was spearheaded by Walter Schroeder at Cal Tech and by Lyman Craig, Bob Hill, and Bill Koenigsberg at the Rockefeller Institute. Eventually it was taken up by many others. The knowledge of these peptide chains was instrumental in unraveling the structure of the human (and animal) globin gene “clusters”: the α-chain gene cluster containing the α- and ζ-genes and the β-chain gene cluster containing the β-, δ-, γ-, and ϵ-genes. These gene clusters are still used to study gene expression and factors that activate globin genes during development.

Figure 4. Evolution of the hemoglobin chains. The point in time of a gene duplication is indicated by a solid circle (Ingram 1963).


Our analysis of the amino acid sequences of adult and fetal human hemoglobin peptide chains led us to postulate an evolutionary scheme for the genes that control the peptide chains. Added to this understanding was the coemerging understanding of the close sequence and structural relationship between the monomeric myoglobin and the four-chain hemoglobins. We used a parsimonious model of sequence evolution (Figure 4) in which we simply put next to each other in evolution those genes encoding peptide chains with the fewest amino acid differences. We postulated that evolution occurred via gene duplication, itself not a new idea, followed by independent evolution of the resulting “daughter” genes (Lewis 2003). This evolution was not quite independent, however, since α-like and β-like genes could evolve only within the confines of the concept that they must remain able to form α2X2 tetramers. This was essential to preserve the cooperative interaction between subunits that allows the very advantageous sigmoid oxygenation curve. The scheme was well received and adapted to many other evolutionary situations.


The sickle-cell mutation causes disease in children and adults, but not in the fetus, since it affects only adult β-chains, which are not expressed in embryonic or fetal red blood cells. The disease makes itself felt after ∼2 years of age. A mutation in the α-chain, on the other hand, would affect hemoglobins in the mid- and late fetal stages as well as in the adult stages of human development, since the α-chain is common to all the human hemoglobins at all stages. Tony Stretton and I thought to bring some order to the confusion that existed among the descriptions of the various types of thalassemia. This is a widespread group of inherited hemoglobinopathies in which there are only normal hemoglobin peptide chains, but a gross deficiency in certain hemoglobins or, more precisely, in one or another hemoglobin peptide chain. We postulated that the thalassemias could be divided into α- and β-thalassemias (Ingram and Stretton 1959). Since α-chains are common to all hemoglobins, a thalassemic mutation affecting the production of α-chains would cause an anemia at all stages of development. On the other hand, a β-thalassemia would affect only adults. Through the more recent work of Bernard Forget, Arthur Bank, Stuart Levy, David Weatherall, and others, the molecular biology of the thalassemias has become clearer. The powerful techniques of DNA cloning and sequencing were essential in this work. For example, certain thalassemias illustrate the effects of large and small deletions.

Humans might at first appear to be poor subjects for the study of molecular genetics, since planned crosses are not popular! In addition, the generation time is very long. Yet it was in the human system that the “one gene-one peptide chain” idea was first proved. Also in humans, we were able to discover that a mutation so small as to affect only a single amino acid in a protein is not only possible, but also quite common. Charles Yanofsky's elegant and elaborate studies on the tryptophan synthase gene of Escherichia coli soon followed. Thus, the new ideas in human genetics applied also to the much more accessible bacterial system.


Sickle-cell anemia remains a very significant disease, especially among African Americans. The discovery of the molecular basis of the disease in the 1950s, the significant single amino acid substitution, was of limited benefit to the patient population, except that it allowed for the development of a prenatal diagnostic based on the change in DNA sequence in position β6 of the β-globin gene. One might have thought that a homozygous sickle-cell fetus would provide a blood specimen that by simple electrophoresis would disclose the genotype of that fetus. However, getting a blood specimen at that stage is difficult. More importantly, the fetus makes hemoglobin F, the fetal hemoglobin (HbF), that is unaffected by the mutation. Hemoglobin S is the adult type and is not produced until ∼2 years of age. The DNA sample obtained from amniocentesis, on the other hand, is clearly diagnostic, and that is helpful for genetic counseling. It is not a therapy and not a cure, however. Entirely different approaches were needed to find a therapy; that took a long time, because they were not based on the molecular biology of HbS.

To date the best treatment for sickle-cell anemia appears to be the use of hydroxyurea (Bunn 1997). This antineoplastic drug increases the proportion of HbF for sickle-cell anemia patients. HbF is known to be “a very potent inhibitor of the polymerization of deoxyhemoglobin S” (Bunn 1997). How hydroxyurea accomplishes this is not clear. Moreover, the additional HbF may be restricted to a subset of red cells, known as “HbF cells.” How this fact helps reduce the polymerization of HbS-containing red cells is a mystery. However, the treatment works at least to some extent and brings relief from the very painful sickle-cell anemia crises. Fortunately, hydroxyurea is very well tolerated and has few side effects. It can be taken for years and is therefore useful. The drug may have other useful effects in patients. Another angle: “Increasing attention is being paid to bone marrow transplantations as a cure [sic] for sickle cell disease” (Bunn 1997). The “increased HbF notion” as treatment arose in the 1980s as a serendipitous observation in Heller's laboratory of inducing fetal hemoglobin production in adult anemic baboons (DeSimone et al. 1982). Serendipity again!

This writer feels that without a lot of “dumb” luck, the sickle-cell mutation in hemoglobin would not have been pinned down, at least not at that time by us. No doubt it would have been figured out by somebody sometime. The story leaves one with a warm feeling toward “luck”!


  • 1 Kunitz and Northrop were among the first to purify and crystallize protein enzymes.

  • 2 Officially, the Medical Research Council Unit for the Determination of the Structure of Biological Molecules—rather a mouthful.

  • Serendipity: the faculty or phenomenon of finding valuable or agreeable things not sought for.

    Merriam-Webster's Collegiate Dictionary