## Abstract

Wilhelm Weinberg (1862–1937) is a largely forgotten pioneer of human and medical genetics. His name is linked with that of the English mathematician G. H. Hardy in the Hardy–Weinberg law, pervasive in textbooks on population genetics since it expresses stability over generations of zygote frequencies *AA*, *Aa*, *aa* under random mating. One of Weinberg’s signal contributions, in an article whose centenary we celebrate, was to verify that Mendel’s segregation law still held in the setting of human heredity, contrary to the then-prevailing view of William Bateson (1861–1926), the leading Mendelian geneticist of the time. Specifically, Weinberg verified that the proportion of recessive offspring genotypes *aa* in human parental crossings *Aa* × *Aa* (that is, the segregation ratio for such a setting) was indeed ¼. We focus in a nontechnical way on his procedure, called the simple sib method, and on the heated controversy with Felix Bernstein (1878–1956) in the 1920s and 1930s over work stimulated by Weinberg’s article.

More than a decade after the rediscovery of Mendelism, there was controversy over whether *human* inheritance actually followed Mendel’s laws. Wilhelm Weinberg (1862–1937) was a prominent member of the German “school” of genetics in the first third of the 20th century. In his article of 1912 Weinberg gave as stimulus Bateson’s remark that *human* data seemed not to be in accord with Mendel’s principles.

William Bateson (1861–1926) is often called the founder of genetics, a term that he is said to have coined and a discipline for whose acceptance at the University of Cambridge he was primarily responsible. He quickly became the leading promoter of Mendelism after its rediscovery, not least on account of his book *Mendel’s Principles of Heredity* (Bateson 1909; Fisher 1952).

The first sentence of Weinberg (1912) reads in translation: “In Bateson’s book *Mendel’s Principles of Heredity* it is stated repeatedly that the distribution of recessive and dominant types are not in accord with the classical numbers.” While he was critical of Bateson, with Mendel in mind, Weinberg was motivated by Bateson to carry research into inheritance in humans to a new level. Weinberg (1912) is his successful attempt to adapt Mendel’s experimental approach to human data, where there were no experimental data, and thereby to verify that Mendel’s principles still held.

In the first decade of the 20th century when the implications of Mendel’s experiments were being realized, Weinberg had an established medical practice in Stuttgart. At about that time he started to produce an impressive list of publications, since his activities and contacts as a medical practitioner gave him access to medical data and provided a stimulus to develop genetic and statistical concepts. His “Hardy–Weinberg law” article, Weinberg (1908), was a study of the genetics of twinning in humans (Stark 2006). Edwards (2008) begins with tribute to Weinberg’s comprehensive formulation of the Hardy–Weinberg law.

Dunn (1965), who is more appreciative of the contributions of Weinberg than many writers, relied on Stern’s (1962) survey of Weinberg’s research, including the comment: “In 1912, he [Weinberg] developed the methods of correcting expectations for Mendelian segregation from human pedigree data under different kinds of ascertainment applied to data from small families.” But he went on to say: “Weinberg’s highly original work failed to interest his contemporaries, who were not prepared to find in algebra and probability theory the keys to the fields of population genetics and human genetics.”

Our object is to give an outline of Weinberg (1912) and some details of related work and contributors. The next section gives some background to Weinberg’s article. Then there are sections on Weinberg, on Felix Bernstein, on Weinberg’s simple sib method, on Bernstein’s and Berwald’s response to it, and on a landmark article by Fisher, followed by our conclusion.

## Context and Background

Behind the approach of Weinberg (1912) is the idea that according to Mendelian principles and a recessive mode of inheritance, a cross of heterozygotes, *Aa* × *Aa*, analogous to Mendel’s self-pollination, is expected to produce recessive offspring *aa* with frequency ¼. However, in a medical setting, a rare trait such as Friedreich’s ataxia (an autosomal recessive inherited disease that causes progressive damage to the nervous system) indicates (*ascertains*) a pair of heterozygotic parents only when an affected child is born. That is, a heterozygote cross, *Aa* × *Aa*, in the parents is detected only when an affected child, *aa*, is born, so a study of the sibs produced by such a cross is conditional on there being at least one affected child.

To estimate the proportion *p* of affected offspring from the population of *Aa* × *Aa* parental crossings, the simple proportion of affected offspring of all offspring in the selected families would give a *biased* estimate, since these families already have at least one affected offspring and therefore do not form a representative sample of all *Aa* × *Aa* parental crossings. Weinberg’s problem was to make statistical compensation for this *ascertainment bias* from data containing sib numbers only for such ascertained families (that is, containing at least one affected sib), with the intention of verifying that indeed *p* = ¼, according to Mendelian principles *even in a human genetics setting*, contrary to Bateson’s thinking.
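The size of this bias can be made concrete with a small calculation. The following Python sketch assumes, purely for illustration, that every *Aa* × *Aa* family of a fixed size *s* with at least one affected child is equally likely to enter the sample and that children segregate independently (binomially); neither assumption is spelled out in the text, and the function name `naive_expected_proportion` is hypothetical. It enumerates the binomial distribution to obtain the expected value of the uncorrected proportion.

```python
from fractions import Fraction
from math import comb


def naive_expected_proportion(s, p):
    """Expected fraction of affected children in a sibship of size s,
    given that the sibship contains at least one affected child."""
    q = 1 - p
    # Binomial probabilities of r affected children, r = 1..s
    # (families with r = 0 are never ascertained).
    probs = {r: comb(s, r) * p**r * q**(s - r) for r in range(1, s + 1)}
    prob_ascertained = sum(probs.values())
    expected_affected = sum(r * pr for r, pr in probs.items()) / prob_ascertained
    return expected_affected / s


p = Fraction(1, 4)  # Mendelian segregation ratio for Aa x Aa
for s in (2, 4, 6, 10):
    print(s, float(naive_expected_proportion(s, p)))
```

Under these assumptions the uncorrected proportion for two-child families has expected value 4/7, far above ¼; the excess shrinks as *s* grows but never vanishes.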

The affected child is referred to as a *proband* or *propositus*. This term, introduced by Weinberg, became ubiquitous in subsequent ascertainment studies. The most useful definition of the term proband is contentious but generally it is applied to the affected individual, or individuals, who cause the family or sibship to be brought into a study.

The symbol *p* is referred to variously as *segregation proportion* or *segregation ratio*. The investigations of Weinberg and Bernstein are in nature early manifestations of *segregation analysis*. Khoury *et al.* (1993, p. 233) give the following definition: “In medical genetics, classic segregation analysis has focused almost exclusively on estimating the proportion of affected and nonaffected offspring in sibships, and the hypotheses of interest were whether or not this proportion was compatible with the expectations of simple Mendelian models.”

Segregation ratios are an important part of the stock-in-trade of genetic counselors, one of whose functions is the calculation of genetic risks, such as the risk of recurrence of disorders in families.

Weinberg’s (1912) achievement was therefore to verify segregation in human genetics by clever compensation for ascertainment bias. He was thus a pioneer in the application of Mendelian principles to human well being, but has not been adequately honored in this respect.

Part of the reason is that he was a poor publicist and poor expositor of his own work. Another reason was the language barrier: Weinberg published exclusively in German. Some idea of the activity of those who published mainly in this language in the early decades of the 20th century can be gained from Baur *et al.* (1931), an English version of the third edition of *Menschliche Erblichkeitslehre und Rassenhygiene*, first published in 1921 (Baur *et al.* 1921). About 500 pages of more than 700 were written by the third author Fritz Lenz, including a substantial section on methodology. Chapter XI “Methods for the Study of Human Heredity” (pp. 495–561), by Lenz, then Professor of Racial Hygiene at the University of Munich, contains a number of references and shows muted respect for Weinberg. It is evident from the references in this book that the German authors were well aware of what was published in English.

The reverse was not the case, due to the general inability of native English speakers to read German, and also perhaps because German journals on heredity were obsessed with racial issues, and articles were sometimes published in disorganized format and style. Until the note of Stern (1943), it was little known that Weinberg (1908) had introduced the formula for the distribution of genotypes in a randomly mating population. Before that note, what we know today as the Hardy–Weinberg law was called Hardy’s law, based on Hardy (1908).

Felix Bernstein (1878–1956) was another important member of the German school. His training was in mathematics rather than medicine, but he achieved fame for having resolved the genetics of the A, B, AB, and O blood groups, discovered by Landsteiner who also published in German (Gottlieb 1998; Bernstein 1925). Familiarity with the ABO system is essential for every medical student and was so when it was believed that the human genome had 48 chromosomes (Crew 1947). About 10 years after Weinberg’s article of 1912 Bernstein began to take an interest in segregation analysis of the kind started by Weinberg.

In his ABO article (1925), Bernstein referred to Weinberg’s article and commented that his own confirmatory analyses of his ABO model were *a priori* in nature; that is, they *tested a hypothesis* about the value of a parameter, whereas Weinberg’s formula led to the *estimation* of a parameter, specifically *p*. Despite this realization, much of the later conflict between the Weinberg and Bernstein “camps” was due to confusing the two types of statistical analysis. Bernstein’s contribution to segregation analysis went almost entirely unnoticed by later, especially English-language, writers, even though one of his striking and influential insights was later attributed to Haldane, who himself had in fact acknowledged Bernstein as its source.

## Wilhelm Weinberg, 1862–1937

Weinberg was born in Stuttgart on December 25, 1862, and was an outstanding student at the Gymnasium (high school), presumably with a special interest in mathematics. He studied medicine in Tübingen, Berlin, and Munich, receiving his M.D. in 1886. He returned to Stuttgart in 1889 and remained there, practicing as a private obstetrician–gynecologist, until he retired to Tübingen, where he spent his last few years. He died there on November 27, 1937.

For the reader’s convenience we extract some detail from within the article of Crow (1999), in *Perspectives* in this journal, who, in describing the achievement of Weinberg based on Stern (1962), begins: “With the techniques now available for the study of human genetics, it is hard to imagine how difficult and limited the subject was at a time when only superficial phenotypes were observed…” and then continues:

[A]mong other things in his busy life, he delivered 3500 babies. Somehow he managed to fit into this crowded schedule time to write papers, many of them long and full of carefully analyzed data.… He wrote more than 160… He worked alone and had neither students nor colleagues. Indeed he appears to have had few friends. He remained outside the circle of geneticists. In his writings he was often argumentative and abusive. His criticisms were pointed and often personal. He clearly felt that he was not being properly recognized…yet he was benevolent and clearly had a strong social conscience and sense of justice.

Weinberg was half Jewish. It is not clear as to whether he was affected by the Nazi steps against Jews, which began in 1933, although Früh (1996) believes this to be the case. Shortly after Weinberg’s death Kallmann (1938) published a sympathetic obituary, and Stern (1962) later published another in this journal. Früh (1996) is a summary of Weinberg’s lifework as a public doctor and population geneticist.

Crow (1999) says: “Weinberg was the first to recognize the problem of ascertainment bias” and it is correction for ascertainment bias that is the central theme of this article.

## Felix Bernstein, 1878–1956

Bernstein was born in Halle on February 24, 1878, and spent his childhood and youth there. One of the central theorems of set theory was proved by him as an 18-year-old. After his schooling in Halle, he became a student of fine arts in Pisa. Having been persuaded to study mathematics, he moved to Göttingen, where he was one of the great David Hilbert’s first doctoral students, completing his doctorate in 1901. After spending some years as a *Privatdozent* teaching university mathematics (primarily pure mathematics) in Halle, he was appointed to Göttingen in 1911 as Associate Professor of Mathematical Statistics and Insurance Mathematics. He was a close and lifelong friend of Albert Einstein from about this time.

Bernstein was an original and creative mathematician in a number of areas, not least in probability theory. Frewer (1981) writes that before Bernstein, mathematical statistics was not of great significance as a subject of study in Germany. Mathematical statistics at the time had a biometric motivation and nature, due to the interests of the founder of the English Biometric School, Karl Pearson (1857–1936), and the then direction of his journal, *Biometrika*, which, like *Genetics*, continues to thrive. Consequently, an Institute of Mathematical Statistics was founded at Göttingen University, already world famous for its mathematics, at the end of World War I in 1918, with Bernstein as its director from 1921. Crow (1993) remarks: “Bernstein and Weinberg were the two leaders in developing techniques for human genetic study, not only in Germany but throughout the world.”

Presumably during Bernstein’s biostatistics course at Göttingen, with a German and Mendelian direction in mind, he observed in studying Weinberg’s estimator of *p* that it was important to complete Weinberg’s method by finding a measure of *precision* of this estimator. A subsequent publication by Bernstein’s student, Berwald (1924), on this topic ignited a controversy, based on misunderstanding on all sides.

The founding of Richard von Mises’s Institute for Applied Mathematics in Berlin also occurred in about 1918. Unsurprisingly, a rivalry with Bernstein was to develop. Von Mises was processing editorially an article of another of Bernstein’s students on Weinberg’s method (it appeared as Von Behr 1927), so Weinberg appealed to von Mises for mathematical help in the controversy. This is recorded at the end of Weinberg (1928), a long article that was received on December 22, 1926. It is occasionally cited, with less than adequate comprehension, as Weinberg’s *magnum opus* on ascertainment. Von Mises (1930) launched a well-founded defense of Weinberg, but his biting criticism of Bernstein is not to the point. Bernstein’s abrupt and emotional writing style in his own articles is not a little to blame. Yet Bernstein’s (1929) own *magnum opus* in a biometric context, a disciplined and tightly written work with plentiful citation of contemporary English-language work including articles in *Genetics*, remains totally unknown to English-language writers.

Bernstein’s activity in biometrics in Germany lasted up to his departure for a visit to the United States in 1932. Being of a prominent Jewish family, he saw the signs in Nazi intentions toward Jews and never returned to Germany permanently. He spent a year at Columbia University and had the agreement of the university that they would offer him a permanent position at the end of the year. The university reneged on the agreement to offer him a permanent post, however, which led later to a comment from the great English population geneticist and mathematical statistician R. A. Fisher, “But Bernstein, why did you not come to England? In England, a handshake from a gentleman is as good as a signed contract.”

Crow (1993) and Schappacher (2005) give further biographical and technical detail. In comparison with other German–Jewish émigré mathematicians, especially von Mises, Bernstein’s career in the United States was less successful (Siegmund-Schultze 2009).

Bernstein died in Zurich on December 3, 1956.

## The Simple Sib Method

To fix ideas, we consider a rare recessive disorder and suppose that *N* sibships have been assembled, with the total number of children $s_i$ and the number of affected $r_i$ in the *i*th one noted. The trait albinism has often been used as an illustration. Suppose that we want to make an estimate of the segregation ratio *p*, that is, without assuming that it is ¼. However, because the families have been ascertained only if at least one child in each is affected, taking the quotient of total number affected over the total number of children would yield an upwardly biased estimate of *p*. Weinberg (1912) proposed the estimate

$$\hat{p} = \frac{\sum_{i=1}^{N} r_i (r_i - 1)}{\sum_{i=1}^{N} r_i (s_i - 1)} \tag{1}$$

The simple sib method [*die einfache Geschwistermethode*] uses Equation (1) to *estimate the segregation ratio*, that is, the proportion of affected individuals within the population, by using a sample of sibships with at least one affected individual.

Von Mises (1930) gives an illuminating justification of Equation 1. In English translation: “If one asks all the sick children [in a family of size *s* with *r* sick children] what is the number of their sick sibs and adds all the answers one obtains *r(r − 1)*. Further, if one asks all the normal children what is the number of their sick sibs, the number comes to *(s − r)r*, so that the combined number comes to *r(r − 1) + r(s − r) = r(s − 1)*.”
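The counting argument translates directly into a few lines of code. The Python sketch below is illustrative rather than a reproduction of Weinberg’s calculations: the function names are hypothetical, and the sample consists only of the three six-child sibships (with 3, 2, and 2 affected) that the text later quotes from Lundborg’s data, not the full table. It also checks, under the added assumption of binomial segregation within sibships, that the ratio of the expected numerator to the expected denominator equals *p*.

```python
from fractions import Fraction
from math import comb


def simple_sib_estimate(sibships):
    """Simple sib estimate from (s, r) pairs: s children, r of them affected."""
    # Affected sibs, counted once for every affected child: r(r - 1).
    numerator = sum(r * (r - 1) for s, r in sibships)
    # All sibs, counted once for every affected child: r(s - 1).
    denominator = sum(r * (s - 1) for s, r in sibships)
    return Fraction(numerator, denominator)


def ratio_of_expectations(s, p):
    """E[r(r-1)] / E[r(s-1)] for r ~ Binomial(s, p), with s >= 2.
    Families with r = 0 contribute nothing to either sum."""
    q = 1 - p
    probs = [comb(s, r) * p**r * q**(s - r) for r in range(s + 1)]
    num = sum(pr * r * (r - 1) for r, pr in enumerate(probs))
    den = sum(pr * r * (s - 1) for r, pr in enumerate(probs))
    return num / den


six_child_sibships = [(6, 3), (6, 2), (6, 2)]  # illustrative subset only
estimate = simple_sib_estimate(six_child_sibships)
naive = Fraction(sum(r for s, r in six_child_sibships),
                 sum(s for s, r in six_child_sibships))
print(estimate, naive)  # 2/7 7/18
```

For this small subset the simple sib estimate is 2/7, against an uncorrected proportion of 7/18. Because unascertained families (with no affected child) add zero to both sums, the correction needs no knowledge of how many such families exist.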

The first sentence of Weinberg (1912) reads in translation:

“In Bateson’s book *Mendel’s Principles of Heredity* it is stated repeatedly that the distribution of recessive and dominant types are not in accord with the classical numbers.”

Weinberg then discusses difficulties encountered in studying human heredity, offers some suggestions, and proceeds to his own approach with the intention of proving Bateson wrong.

Weinberg’s first example is of the disorder myoclonic epilepsy using data of Lundborg (1903, 1913):

For the data in the table below Weinberg notes that the “uncorrected” estimate, the simple proportion of recessive children among all children in families with at least one recessive, is, as expected, an overestimate since it does not take account of the fact that some couples produce no affected children. The estimate using (1) is given in Equation (2).

Clearly the small amount of data would not be sufficient to draw a firm conclusion about inheritance, but it might serve to stimulate further interest in the question. In the case of recessive inheritance, particularly so in the case of rare conditions, human geneticists soon learned to look for a relatively high rate of consanguineous parentage (Lenz 1919). Crew (1947, p. 106) stated that the frequency of cousin marriages in “our society” was 0.5%. For very rare conditions it would be more like 5 or 10%.

Weinberg (1912) tries to justify the need for his corrected estimation procedures in two mathematical appendices, but these beat about the bush. There is also a verbal justification, but it is convoluted and bears out the belief that, even in German, he was not a good publicist of his own work. Further, Weinberg (1912) did not know how to address, for his estimator, what later became the conventional statistical properties to consider, such as bias and standard error. The best of the English-language accounts of these two appendices is in Bailey (1961, Sect. 14.321), but is still somewhat inaccurate.

While Weinberg’s formula produced an estimate of the segregation ratio *p*, efforts to elucidate the genetic basis of various traits turned also to tests of hypothesis of, for example, autosomal recessive inheritance. A value, such as *p* = ¼, was hypothesized, and the observed counts of affected offspring in sibships were compared with those expected after making due allowance for the method of ascertainment. Apert (1914), following Weinberg’s suggestion, applied this procedure to data on albinism that Karl Pearson and colleagues had published in the *Treasury of Human Inheritance* series (Pearson *et al.* 1911–1913). For an assessment of the value of the *Treasury* series of publications under successive editors Karl Pearson, R. A. Fisher, and Lionel Penrose, see Harper (2005). These procedures became known as *a priori* methods. Lenz also used these and, in Baur *et al.* (1931, p. 509), argues in their favor while giving faint praise to Weinberg. Hogben (1931) used the *a priori* method in analyzing several traits including albinism using Pearson’s *Treasury* data.

## Bernstein and Berwald on Weinberg’s Work. And, Once Again, the Language Barrier

Bernstein observed in studying Weinberg’s estimator (1) that it was important to complete Weinberg’s method by finding a measure of *precision* (“Genauigkeit”) of this estimator of *p*. One of the students of Bernstein’s biostatistics course at Göttingen, F. R. Berwald, was familiar with the extensive studies on the properties of such estimators and eventually arrived, as published in Berwald (1924), at the formula for mean square error of Weinberg’s estimator.

From the mid-1920s also, there was considerable misunderstanding in attempts to compare Weinberg’s *einfache Geschwistermethode* and Bernstein’s *apriorische Methode*, in spite of the valiant effort of Just (1930). In Bernstein (1929), his magnum opus in a biometric context, Part 2, Chapter IV, Section 20, is entitled *Die Korrektur der Auslesewirkung nach rezessiven Kindern* (“Correction for selection effect of recessive children”). It finally shows clearly what Bernstein actually means by his *apriorische Methode*, illustrated by his analysis of Lundborg’s (1903, 1913) data. He calculates E*R*, the expected number of recessives in a sibship of size *s*, which is given by

$$\mathrm{E}R = \frac{sp}{1 - q^{s}} \tag{3}$$

where *q* = 1 − *p*, at *p* = ¼, for each sibship size *s* = 2 to 10. He then enters the relevant quantities for each family size *s* in Lundborg’s data and sums over all the family sizes *s* to obtain 54, as above, for total sibs; sums over all recessive sibs to obtain 17, as above; and sums all the corresponding expected values to obtain 16.666. On the basis of some further calculations to assess variability, he then declares the agreement between the expected value 16.666 and the obtained number 17 to be good, thus verifying the hypothesis that *p* = ¼. We stress the point that Weinberg’s *einfache Geschwistermethode* and Bernstein’s *apriorische Methode* were not, and are not, directly comparable. As noted above, the first is an *estimation method* for *p* and the second is a *method of hypothesis testing* for a presupposed value of *p*.
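The expectations used in the *a priori* method are easy to tabulate. The Python sketch below takes the expected number of affected children in a sibship of size *s* that contains at least one affected child to be *sp*/(1 − *q*<sup>*s*</sup>), which follows from binomial segregation; the function names and the three-sibship collection are hypothetical and are not Lundborg’s data, and Bernstein’s further calculations on variability are not shown.

```python
from fractions import Fraction


def expected_affected(s, p):
    """Expected number of affected children among s, given at least one affected."""
    q = 1 - p
    return s * p / (1 - q**s)


def expected_total(sibship_sizes, p):
    """Sum of the conditional expectations over a collection of sibships."""
    return sum(expected_affected(s, p) for s in sibship_sizes)


p = Fraction(1, 4)  # hypothesized segregation ratio
for s in range(2, 11):
    print(s, round(float(expected_affected(s, p)), 3))

# Hypothetical collection: three sibships of six children each.
print(float(expected_total([6, 6, 6], p)))
```

The test then sets the summed expectation beside the observed number of affected children; a formal judgment of agreement also needs a measure of variability.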

This misunderstanding was exacerbated by Bernstein’s development of an *alternative estimator* to Weinberg’s (1), for *p*. Bernstein’s estimation procedure was called the *direkte Methode* by Berwald (1924) and by Weinberg (1926), who expressed a claim to priority. It is based, when one considers fixed family size *s*, on (3). For selected families of *s* siblings, at least one of which (in each family) is recessive, E*R* in (3) is replaced by the average number of affected sibs per selected family. Bernstein’s estimator is the solution *p* of the resulting polynomial equation. For example, in Lundborg’s data above there are three families with six sibs each, so *s =* 6, and E*R* would be replaced by (3 + 2 + 2)/3 = 2.333. The relevant solution of (3) can then be calculated.
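For the six-sib example just given, the equation to be solved is 6*p*/(1 − (1 − *p*)<sup>6</sup>) = 7/3. The Python sketch below solves it numerically; bisection is a choice made here for simplicity and is not claimed to be Bernstein’s own procedure, and the function names are hypothetical. It relies on the left-hand side increasing from 1 to *s* as *p* runs from 0 to 1, so that a unique solution exists whenever the mean number affected lies strictly between 1 and *s*.

```python
def expected_affected(s, p):
    """Expected number of affected children among s, given at least one affected."""
    return s * p / (1.0 - (1.0 - p) ** s)


def direct_estimate(s, mean_affected, tol=1e-12):
    """Solve expected_affected(s, p) = mean_affected for p by bisection."""
    if not 1.0 < mean_affected < s:
        raise ValueError("mean number affected must lie strictly between 1 and s")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The expectation increases with p, so move the bracket accordingly.
        if expected_affected(s, mid) < mean_affected:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Three six-child families with 3, 2, and 2 affected children.
p_hat = direct_estimate(6, (3 + 2 + 2) / 3)
print(round(p_hat, 4))
```

With only three families the resulting value (between 0.36 and 0.37) is of course very imprecise; the sketch is meant only to show the mechanics for a single family size.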

Calculations on mean square error in Berwald (1924) were misinterpreted by both Bernstein and Berwald as asserting that Bernstein’s estimator for *p* was statistically more efficient than Weinberg’s. Weinberg (1926) is a rather incoherent response in the same journal to Berwald (1924). Weinberg (1928) is cited by English-language authors as his definitive work on ascertainment, although it is focused on defense against Bernstein and Berwald.

Eventually it turned out that Bernstein’s alternative estimator (Bernstein 1931a,b) was the maximum likelihood estimate (MLE) of *p* and thus optimally efficient. Haldane (1932) acknowledges that the MLE of *p* was derived by Bernstein (1931b). Thus Bernstein should be given credit for introducing the MLE, contrary to the views of all subsequent authors of the English tradition such as Li (1993), and Bailey (1961), who, like Irwin (1936), pays peripheral homage to Weinberg, but does not mention either Berwald or Bernstein. Presumably, the language barrier prevented proper pursuit of attribution.

As is clear from its title, the main theme of Crow (1999) was the language barrier that delayed recognition of Weinberg’s statement of the Hardy–Weinberg law. He says that, comparably, Gustave Malécot did not receive adequate recognition because most of his research was published in French. Stark and Seneta (2012) find that Sergei N. Bernstein’s articles on Mendel’s first law, published in French and Russian, remained largely unknown. Similarly, Stark and Seneta (2011) find that a report, published in Russian, detailing repetition of some of Mendel’s experiments attracted no interest outside Russia, even though it was contested there during the Lysenko era, and by no less than the great mathematician Andrei Kolmogorov.

## R. A. Fisher

R. A. Fisher (1890–1962), J. B. S. Haldane, and Lancelot Hogben began to take an interest in segregation analysis about 20 years after Weinberg’s 1912 article. Although Fisher (1934) was not the first contribution of the trio, it is central for our purposes. It has the magisterial character typical of many of Fisher’s articles, but only one explicit reference (Haldane 1932), and even so, no explicit mention of either Weinberg or Bernstein. Thinking presumably mainly of Weinberg, in the introductory section Fisher writes: “Thus in the German literature two methods of estimating the frequencies of recognisable conditions, usually rare defects, in the sibships in which they occur, have long been discussed.” We point out that Fisher gives a heuristic justification of Weinberg’s basic formula without giving his name or clearly acknowledging the existence of the formula.

In his article, Fisher (1934, Sect. III) states: “The Sib method has a wider validity than the Proband method [Section II] and rests on a very simple argument. Ascertainment is supposed to be not through families, but, as is usual in actual practice, through detection of the exceptional individuals themselves. For each of these in the record we may ask how many normal and how many defective sibs (whole brothers or sisters) they possess. Then, taking the original case as proof that the family is one capable of containing albinos, the proportion in which these are produced is judged from the incidence among the remaining members of the family. If more than one albino in the same family are independently ascertained, the family will be counted repeatedly in the aggregate.”

In Fisher (1934, Sect. II) he uses the theoretical frequencies, assuming segregation ratio ¼, to make his points. It appears that Fisher called his Section III “The Sib Method” because the explanation given is actually a justification of Weinberg’s simple sib method.

The greatest living statistician, C. R. Rao (1989), then remarks: “In a classical article, Fisher (1934) demonstrated the need for such an adjustment in specification depending on the way data are ascertained. The author [Rao] had extended the basic ideas of Fisher in Rao (1965) and developed the theory of what are called weighted distributions as a method of adjustment applicable to many situations.” In a 1965 article C.R. Rao revisited the analyses by Haldane (1938) and Hogben (1931) of data on albinism collected by Pearson. Thus have Weinberg (1912) and Bernstein (1931a, 1931b) disappeared from statistical historiography.

## Conclusion

Confining attention to Weinberg (1912), our main conclusion is that Weinberg has not received adequate credit. Part of the reason is that he did not develop his method fully and did not promote it effectively. However, he was in the vanguard of efforts to exploit Mendelism to advance human and medical genetics. For example Weinberg (1908) was a notable contribution to the genetics of twinning in man (Stark 2006). A few writers, such as the late James Crow, did appreciate his work (Crow 1999). Edwards (2004) covered the period 1918–1939, before which Weinberg had done his best work. We do not agree with the view of Edwards that most of human genetics as it existed before the 1950s was created in connection with the blood groups. This overlooks the large amount of medical genetics done in Europe and the United States as well as in the British Isles.

The reputation of Weinberg suffered from the language divide. His lack of mathematical skills and tendency to verbosity did not help his cause.

## Footnotes

*Communicating editor: A. S. Wilkins*

- Copyright © 2013 by the Genetics Society of America