
Edited by: Mikhail Lebedev, Duke University, United States

Reviewed by: Frank Pollick, University of Glasgow, United Kingdom; Luca Francesco Ticini, University of Manchester, United Kingdom

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Our past studies have led us to divide sensory experiences, including aesthetic ones derived from sensory sources, into two broad categories: biological and artifactual. The aesthetic experience of biological beauty is dictated by inherited brain concepts, which are resistant to change even in spite of extensive experience. The experience of artifactual beauty on the other hand is determined by post-natally acquired concepts, which are modifiable throughout life by exposure to different experiences (Zeki,

In an earlier study (Zeki et al.,

We therefore wanted to add to our previous study of the experience of mathematical beauty by analyzing our results further, with the following questions in mind: what was the degree of variability in the ratings given to equations that had been rated as beautiful and did that variability differ in any significant way from the variability in the “non-beautiful” ratings that had been assigned to other equations? Our only hypothesis in this regard was that, if mathematical beauty belongs to the biological category, then there should be significantly less variability among equations given high ratings than among others. We indeed found this to be the case, which reinforced our view that mathematical beauty belongs to the category of biological beauty, for reasons which we have explored before (Zeki et al.,

A full description of the subjects and methods used to rate mathematical equations is given in Zeki et al. (

In brief, 15 mathematicians (three of them female, aged 22–32 years), all post-graduate students or post-doctoral fellows in mathematics, took part in the study. Each was given the 60 mathematical equations to study at leisure and to rate, on a scale of −5 (ugly) to +5 (beautiful), according to the aesthetic experience aroused in them. Subsequent to a brain scanning experiment, to determine the brain areas in which activity correlates with the experience of mathematical beauty (the results of which are reported in Zeki et al.,

For comparison, we asked 12 controls (i.e., non-mathematicians) to give beauty and understanding ratings to the same equations, exactly as for the mathematicians (see Zeki et al.,

The five equations (out of 60) given the top ratings (on a scale of −5 to +5) are shown in the upper section while those given the lowest ratings are shown below.

| Equation no. | Equation | Description | Mean beauty rating |
| --- | --- | --- | --- |
| 1 | 1 + e^{iπ} = 0 | Euler's identity links five fundamental mathematical constants with three basic arithmetic operations each occurring once. | 3.6000 |
| 2 | cos^{2}θ + sin^{2}θ = 1 | The Pythagorean identity, which states that for any angle, the square of the sine plus the square of the cosine is 1. | 3.2667 |
| 54 | | Cauchy-Riemann equations are a system of two partial differential equations which must be satisfied if a complex function is complex differentiable. | 3.1333 |
| 5 | e^{ix} = cos x + i sin x | Identity between exponential and trigonometric functions derivable from Euler's formula for complex analysis. | 3.0000 |
| 6 | | Definite Gaussian integral, ubiquitous in mathematical physics. | 2.9333 |
| . . . | | | |
| 39 | \|∅\| = 0 | The cardinality of the empty set is zero. | −0.4000 |
| 51 | U^{C} = ∅ | The complement of the universal set is the empty set. | −0.5333 |
| 15 | 1729 = 1^{3} + 12^{3} = 9^{3} + 10^{3} | The smallest number expressible as the sum of two cubes in two different ways. | −1.1333 |
| 28 | 3^{2} + 4^{2} = 5^{2} | Pythagoras' theorem for a 3:4:5 triangle. | −1.1333 |
| 14 | | Equation expressing the inverse value of π as an infinite sum. | −1.8000 |

In our statistical analyses of the results we use the following notations:

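Let x_{ij} denote the beauty rating that the i^{th} subject gives to the j^{th} formula; let u_{ij} denote individual i's understanding rating of the j^{th} formula; let m_{j} and s_{j} denote the mean and standard deviation of the beauty ratings of the j^{th} formula across subjects, respectively; let μ_{j} and σ_{j} denote the mean and standard deviation of the understanding ratings of the j^{th} formula across subjects, respectively.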

We undertook the following statistical analyses on the ratings.

We first normalized the beauty rating scores for each subject, following which the ratings from each subject were centered at 0 with a standard deviation of 1. The intra-class (between subject) correlation coefficient (ICC) for ratings across subjects became −0.02, indicating that there was no tendency for subjects to systematically give all equations higher or lower ratings (see Figure

We calculated the m-BR and the sd-BR (i.e., the mean and standard deviation of the normalized beauty ratings) as well as the m-UR and sd-UR (mean and standard deviation of the normalized understanding ratings), for each formula across subjects. This gave 60 m-BR values with 60 corresponding sd-BR values, and 60 m-UR values with 60 corresponding sd-UR values. Although the range for beauty ratings was from −5 to 5, and that for understanding was from 0 to 3, the ranges for m-BR and m-UR, after normalization, are (−1.36, 1.07) and (−1.31, 1.08), respectively. Therefore, we removed, through normalization, confounding effects that can be caused by a difference in the ratings' original scales. For simplicity, data analyses were all conducted using normalized data, and we therefore omit the term normalized below.
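The normalization and per-formula summaries described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy ratings are hypothetical, and the choice of population standard deviation for the per-subject step and sample standard deviation for the per-formula step is a guess, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev, stdev


def normalize_by_subject(ratings):
    """Z-score each subject's ratings; ratings[i][j] is subject i's rating of formula j."""
    normalized = []
    for row in ratings:
        mu = mean(row)
        sd = pstdev(row)  # assumption: population SD (the source does not specify)
        if sd == 0:
            raise ValueError("a subject gave identical ratings to every formula")
        normalized.append([(x - mu) / sd for x in row])
    return normalized


def per_formula_summary(normalized):
    """Return (means, standard deviations) across subjects for each formula."""
    columns = list(zip(*normalized))  # one tuple of ratings per formula
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # assumption: sample SD across subjects
    return means, sds


# Made-up toy data (not from the study): 3 subjects x 4 formulas, scale -5..5
toy = [
    [5, 3, -1, -4],
    [4, 4, 0, -5],
    [2, 1, -2, -3],
]
z = normalize_by_subject(toy)
m_br, sd_br = per_formula_summary(z)  # analogues of the m-BR and sd-BR values
```

After this step every subject's ratings have mean 0 and standard deviation 1, so a subject who habitually rates high or low no longer shifts the per-formula means.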

We plotted the m-BR values against the sd-BR values for both mathematicians and non-mathematicians (see Figure ^{−5}). In simpler terms, consensus among our sample of 15 mathematicians was higher for the equations rated as beautiful than for those rated as not beautiful: ratings of the latter varied more from one subject to the next. This is the primary finding reported here.

Left: A plot of the mean pre-scan beauty rating (m-BR) for each equation against the standard deviation (sd-BR) of the ratings given to each equation for mathematicians (Pearson ^{−5}). Right: The same plot for non-mathematicians (Pearson

In contrast, the graph relating the m-BR to the sd-BR for the controls (non-mathematicians) (right in Figure

Although there is a significant relationship between m-BR and sd-BR, there also exists a possible confound since we know of (and might reasonably expect) a positive correlation between the mean beauty ratings and the mean understanding ratings of the equations. Thus, the relationship between m-BR and sd-BR might primarily be due to the understanding rating rather than the beauty rating. Figure

We investigate first the linear relationship between the m-BR and the sd-BR across equations. Formally, consider the equation

s_{j} = β_{0} + β_{1}m_{j} + ε_{j}   (1)

where m_{j} and s_{j} refer to the m-BR and the sd-BR of the j^{th} equation, β_{0} and β_{1} are parameters for the intercept and slope, and ε_{j} indicates the residual term (i.e., the information not explained). Our data show that the estimate for β_{1}, or β̂_{1}, is −0.21 (p of order 10^{−5}). In other words, if a formula is on average rated one point higher than a second formula, the standard deviation of the ratings across subjects (which quantifies the disagreement among subjects) for the former formula is 0.21 units less than the standard deviation of the ratings for the latter formula. More simply stated, where a mathematical formula receives a higher beauty rating, there is less variation in the rating among individuals.
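A simple linear regression of the sd-BR on the m-BR, as in Model (1), can be fitted by ordinary least squares. The sketch below uses only the Python standard library; the function name and the toy data are hypothetical, and the toy numbers are constructed to lie on an exact line rather than taken from the study.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Ordinary least squares for y = b0 + b1 * x + residual."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return intercept, slope, residuals


# Made-up toy values (not the study's data): sd_br = 1.0 - 0.2 * m_br exactly
m_br = [-1.0, -0.5, 0.0, 0.5, 1.0]
sd_br = [1.2, 1.1, 1.0, 0.9, 0.8]
b0, b1, res = fit_simple_ols(m_br, sd_br)
# b1 is the estimated slope: the change in sd-BR per one-unit increase in m-BR
```

A negative slope, as in this toy example, means that formulas with higher mean beauty ratings show less disagreement among raters.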

Although a low sd-BR is significantly associated with a high m-BR (the more beautifully rated formulae have smaller standard deviations), it remained possible that this association is confounded by subjects' understanding of the mathematical formulae. To check for confounds, we reran Model (1), where the regressor m_{j} (the m-BR of formula j) was replaced by μ_{j} (the m-UR of formula

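So far, we have shown that the m-UR was not significantly associated with the sd-BR; the possibility remains, however, that the m-UR and the m-BR may jointly be associated with the sd-BR. To test this, we extended Model (1) with the m-UR as a second regressor:

s_{j} = β_{0} + β_{1}m_{j} + β_{2}μ_{j} + ε_{j}   (2)

where m_{j} and s_{j} are the m-BR and the sd-BR of the j^{th} formula, μ_{j} denotes the m-UR of the j^{th} formula, and ε_{j} indicates the residual term. The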

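The F-statistic for comparing the two nested models is

F = [(RSS_{1} − RSS_{2}) / (df_{1} − df_{2})] / (RSS_{2} / df_{2})

where RSS_{1} and RSS_{2} denote residual sums of squares for Models (1) and (2), respectively, and df_{1} and df_{2} denote degrees of freedom (i.e., number of data points minus number of parameters) for Models (1) and (2), respectively (Fisher,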

Using this test, we show that adding the m-UR (mean understanding rating, μ) in Model (2) does not significantly reduce prediction errors in Model (1) (
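The nested-model comparison can be illustrated as follows. This is a minimal sketch, assuming the test is the standard F-test for nested linear models; the helper names and toy data are hypothetical and are not the study's values. It computes only the F-statistic, because the standard library has no F distribution; a p-value would need a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def residual_sum_of_squares(regressors, y):
    """Fit y on an intercept plus the given regressor columns; return the RSS."""
    n = len(y)
    design = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(design[i][p] * design[i][q] for i in range(n)) for q in range(k)]
           for p in range(k)]
    xty = [sum(design[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, design[i])) for i in range(n)]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def nested_f_statistic(rss1, rss2, df1, df2):
    """F = ((RSS1 - RSS2) / (df1 - df2)) / (RSS2 / df2); model 1 is nested in model 2."""
    return ((rss1 - rss2) / (df1 - df2)) / (rss2 / df2)


# Made-up toy data (not the study's values)
m_br = [-1.2, -0.8, -0.3, 0.0, 0.4, 0.7, 1.0, 1.1]
m_ur = [-1.0, -0.9, 0.1, -0.2, 0.5, 0.3, 0.9, 1.2]
sd_br = [1.25, 1.10, 1.08, 0.97, 0.95, 0.85, 0.82, 0.75]
n = len(sd_br)
rss1 = residual_sum_of_squares([m_br], sd_br)        # sd-BR on m-BR only
rss2 = residual_sum_of_squares([m_br, m_ur], sd_br)  # sd-BR on m-BR and m-UR
f_stat = nested_f_statistic(rss1, rss2, n - 2, n - 3)
```

With 60 equations, the degrees of freedom would be 58 for the one-regressor model and 57 for the two-regressor model. A small F-statistic indicates that the extra regressor reduces the residual error by little.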

Taken together, our analyses show that the mean beauty rating of formulas in a population is significantly and negatively associated with the standard deviation of the beauty ratings. Specifically, a one-unit increase in the mean rating of a formula is associated with a 0.21-unit decrease in its standard deviation. Further, this association is neither due to, nor further explained by, one's understanding of the formulas, suggesting that there is a unified aesthetic appreciation of mathematics among individuals, and that such aesthetic appreciation is separate from one's understanding of the mathematical formulae.

A schematic representation of the association between mean beauty ratings (m-BR), mean understanding ratings (m-UR), the standard deviation of understanding ratings (sd-UR) and standard deviations of beauty ratings (sd-BR). The thickness of the connecting lines denotes the pairwise Pearson correlations between them. We show below that although there is an association between m-BR and m-UR, there is no association between m-UR and sd-BR. Moreover, adding m-UR to m-BR does not improve the established association between m-BR and sd-BR.

A graph relating the standard deviation of the pre-scan beauty ratings (sd-BR) against mean understanding ratings (m-UR) for each equation (Pearson

Since the beauty rating scales were from −5 to 5, it is possible that there was a ceiling effect for those equations that were rated either −5 or 5. In other words, were the scale changed to −10 to 10, some of those equations that were rated 5 and −5 on our scale of −5 to 5 may have been rated lower than −5 or higher than 5. If so, then the ceiling effect could have potentially reduced the variance of the highly (or lowly) rated equations, thus confounding our results. Although normalization reduces the ceiling effect, it could still be argued that the results would be different (and possibly insignificant) were the ratings on a different scale, even with normalization. We therefore conducted simulation studies to learn whether such a ceiling effect, even if it exists in our studies, would modify our conclusion.

The likelihood of a ceiling effect for ratings between −4 and 4 is low, because one can always choose a higher (i.e., 5) or lower (i.e., −5) rating. Thus, we focused on addressing the potential ceiling effect with regard to ratings of 5 and −5. The first set of simulations was conducted as follows: for any equation that was rated 5, we simulated a non-negative integer and added it to 5. Similarly, for any equation that was rated −5, we simulated another non-negative integer and subtracted it from −5. Each equation-specific integer

We repeated the above simulation 10,000 times, each time adding different randomly simulated non-negative integers to (or subtracting them from) the 5's (or −5's). The 95% confidence interval for the 10,000 Pearson correlations between the simulated sd-BR and m-BR ranged from −0.420 to −0.419, and the 95% confidence interval for the corresponding p-values had an upper bound of 1.6 × 10^{−3}, with a lower bound also of order 10^{−3} (see Figure

Taken together, even if there was a ceiling effect, the significant association between sd-BR and m-BR still exists; this conclusion is based on extensive simulations, where serious ceiling effects were considered for the original rating data.
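The simulation procedure can be sketched as below. Several details are assumptions rather than facts from the text: the Poisson mean (`lam=2.0`), one independent draw per extreme rating, the normalization estimators, and the randomly generated toy ratings, which stand in for the real data. The number of repetitions is reduced to 200 to keep the example quick.

```python
import math
import random
from statistics import mean, pstdev, stdev


def poisson_sample(rng, lam):
    """Draw a non-negative integer from a Poisson(lam) distribution (Knuth's method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_once(ratings, rng, lam, top=5, bottom=-5):
    """Extend ratings at the scale limits, renormalize, and correlate m-BR with sd-BR."""
    extended = [
        [r + poisson_sample(rng, lam) if r == top
         else r - poisson_sample(rng, lam) if r == bottom
         else r
         for r in row]
        for row in ratings
    ]
    normalized = []
    for row in extended:
        mu, sd = mean(row), pstdev(row)  # per-subject z-scoring
        normalized.append([(r - mu) / sd for r in row])
    columns = list(zip(*normalized))  # one tuple per formula
    m_br = [mean(col) for col in columns]
    sd_br = [stdev(col) for col in columns]
    return pearson_r(m_br, sd_br)


# Made-up toy ratings (random, not the study's data): 15 subjects x 20 formulas
rng = random.Random(0)
toy_ratings = [[rng.randint(-5, 5) for _ in range(20)] for _ in range(15)]
correlations = [simulate_once(toy_ratings, rng, lam=2.0) for _ in range(200)]
```

Because the toy ratings are random, the resulting correlations carry no meaning in themselves. The point is the mechanics: only ratings at the scale limits are perturbed, and the whole pipeline from normalization to correlation is rerun on each perturbed data set.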

The results from simulation studies. The histogram represents 10,000 correlations between the mean beauty ratings (m-BR) and the standard deviation of the beauty ratings (sd-BR), analyzed by Pearson's test. During each simulation, non-negative integers generated from a Poisson distribution were added to (or subtracted from) the original ratings of 5 (or −5). The 95% confidence interval for the distribution is (−0.420, −0.419).

That the experience of mathematical beauty (Zeki et al.,

We have in the past suggested that the classification of experiences, ranging from ordinary sensory ones, such as that of color, to aesthetic experiences, such as that of beauty, can be subdivided into two broad categories (Zeki, ^{1}

At the other end are experiences determined by acquired brain concepts, examples being those of man-made artifacts consisting of a variety of manufactured goods. The concepts underlying these experiences are acquired post-natally and are modifiable throughout post-natal life (Zeki,

This naturally raises the awkward question of whether the experience of mathematical beauty belongs in the biological or the artifactual category.

The experience of mathematical beauty is perhaps the most extreme aesthetic experience that is dependent upon culture and learning; those not versed in the language of mathematics cannot experience the beauty of a mathematical formulation. And yet, once the language of mathematics is mastered, the same formulae can be experienced as beautiful by mathematicians belonging to different races and cultures. Indeed, Paul Dirac coined the term “the principle of mathematical beauty” (Farmelo,

In what does the beauty of a mathematical formula lie? We gave thought to the possibility that the beauty ratings given to our mathematical equations had “low-level” sensory sources, such as curvatures, the position and number of elements, symmetry, and so on. Although this remains a remote possibility, we discount it and Table

Perhaps the most forceful way of accounting for the experience of mathematical beauty, and the one nearest to our belief, comes from Immanuel Kant on the one hand and Bertrand Russell on the other. Kant's views are opaque and difficult to understand, and his use of the term “intuition” especially vague. For an interpretation of what constitutes mathematical beauty for Kant we rely on Breitenbach's (

The implication of the statements made by Russell and others quoted above can be taken to mean that there is a biological basis to mathematical logic and, by extension, a biological basis to the experience of mathematical beauty. Our results are not inconsistent with such a supposition; but we are anxious to emphasize that they are merely suggestive in that direction and that we cannot assert, through them alone, that mathematical beauty is incontrovertibly biological in nature. Rather, we believe that our work, reported here, opens a new and useful discourse on the roots of mathematical beauty and how it can be studied and quantified.

The logical-deductive system of the brain, whatever its details, is inherited and is therefore similar in mathematicians who otherwise belong to different races and cultures. It is in this sense that mathematical beauty has its roots in a biologically inherited logical-deductive system that is similar for all brains. It is only by adhering to the rules of the brain's logical-deductive system that a formulation can gain universal assent and be found beautiful. Any departure from those rules would cost the formulation that universal assent. Implicit in our argument is that the experience of mathematical beauty, being the result of the application of the brain's logical-deductive system, is a demonstration that the logical-deductive system of mathematical brains, no matter what their cultural background may be, is the same. And since mathematical beauty, in our categorization, belongs to the biological category, it is not surprising that there is significantly less variability among mathematicians in rating mathematical equations as beautiful.

All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the University College London Ethics Committee for experiments with human participants.

SZ designed the project, analyzed the results with JR and wrote the paper. JR ran the experiments and analyzed the results, and contributed to the statistical section. OC undertook the statistical analyses.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

^{1}When we speak of similar color experiences, we restrict ourselves to saying that different humans do not differ when assigning different colored patches, surfaces or objects to chips belonging to different color categories. In saying so, we do not address the vexed question of color qualia.