
Edited by: Plamen Ch. Ivanov, Boston University, United States

Reviewed by: George Datseris, Max-Planck-Institute for Dynamics and Self-Organisation (MPG), Germany; Axel Hutt, Inria Nancy—Grand-Est Research Centre, France

This article was submitted to Dynamical Systems, a section of the journal Frontiers in Applied Mathematics and Statistics

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Tension-resolution patterns seem to play a dominant role in shaping our emotional experience of music. In traditional Western music, these patterns are mainly expressed through harmony and melody. However, many contemporary musical compositions employ sound materials lacking any perceivable pitch structure, rendering both of these compositional devices useless. Still, composers like Tristan Murail or Gérard Grisey manage to implement such patterns by manipulating spectral attributes like roughness and inharmonicity. However, to understand their music and that of the other proponents of so-called "spectral music," one has to eschew traditional categories like pitch, harmony, and tonality in favor of a lower-level, more general representation of sound—which, unfortunately, music-psychological research has been reluctant to do. In the present study, motivated by recent advances in music-theoretical and neuroscientific research into the closely related phenomenon of dissonance, we propose a neurodynamical model of musical tension based on a spectral representation of sound which reproduces existing empirical results on spectral correlates of tension. By virtue of being neurodynamical, the proposed model is generative in the sense that it can simulate responses to arbitrary sounds.

Music gives rise to some of the strongest emotional experiences in our lives. Even though the first surviving theoretical treatments of the power of music to move the soul were written in the fifth century B.C. [

Creating and resolving tension is an easy task for composers who follow the nineteenth century Western tradition (as, e.g., most “mainstream” composers of film music do); any standard textbook on harmony and voice leading provides them with plenty of recipes [e.g., [

Devoid of any perceivable pitch structure, the ferocious sound materials contemporary art music is at times so fond of can only be conceived in terms of loudness and timbre. This forces any composer seeking full control over these "beasts" to dive from the lofty heights of venerable musical abstractions like pitch, harmony, and tonality to the cold depths of spectral representations of sound. However, beauty emerges even from such depths; by careful manipulation of roughness and inharmonicity, composers like Tristan Murail or Gérard Grisey "tense" their audience no less than Richard Wagner does through his mastery of tonal harmony; indeed, the term "spectral music," used when referring to the music style pioneered by the former two composers [

As usual, music-psychological research somewhat lagged behind compositional practice; loud music has been shown to be perceived as more tense than soft music [

For standard Western musical intervals, roughness is a principal source of perceived dissonance in musical material, which thus gives rationale to mathematical models of musical dissonance [_{1} and _{2} the fundamental frequencies of the tones spanning the interval, _{1}, _{2}] with a pair of coprime natural numbers, Ω′ = [

The quantity above, called
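The periodicity-based account above rests on approximating an interval's frequency ratio by a pair of coprime natural numbers. The exact quantity used in the cited models is not legible in this copy, but the approximation step itself can be sketched in a few lines of Python (`Fraction.limit_denominator` and the `max_den` bound are our illustrative stand-ins, not the authors' formula):

```python
from fractions import Fraction

def coprime_approximation(f1, f2, max_den=16):
    """Approximate the frequency ratio f2/f1 by p/q with small coprime
    p, q (an illustrative stand-in for the elided formula in the text)."""
    frac = Fraction(f2 / f1).limit_denominator(max_den)
    return frac.numerator, frac.denominator

# A perfect fifth is captured by small numbers; an (irrational) tritone
# needs a much larger denominator -- one ingredient of its higher tension.
fifth = coprime_approximation(440.0, 660.0)               # -> (3, 2)
tritone = coprime_approximation(440.0, 440.0 * 2 ** 0.5)  # -> (17, 12)
```

Intervals whose ratios admit small coprime approximations have short common periods, in line with the periodicity-sensitivity of the auditory system discussed below.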

Motivated by the latter observation, we put forward a neurodynamical model of tension which is in line with the basic concepts of pitch perception of complex sounds and reproduces the results concerning the effect of roughness and inharmonicity reported in Farbood and Price [

Everyone interested in neurodynamical modeling faces the same basic dilemma: which model to use? For modeling perception of music, the most common choices are the leaky integrate-and-fire (LIF) model [

The pioneering work of the latter approach in our field is Large et al. [

Our choice of model class is motivated by the observation that the auditory system is sensitive to the periodicity of the signal (see section 1). A possible explanation for the observation is that the system comprises an array of oscillatory "detectors" with external auditory signal input; these can be viewed (and indeed physically seem to be) ordered tonotopically with respect to their eigenfrequency. Within this framework, eliciting a sustained oscillation in one of the oscillators represents detection of the corresponding period in the input signal. Or, on a continuous scale, the more sustained an oscillation is, the more "confident" the auditory system is that the input signal exhibits the corresponding period. In order for a stimulus to elicit an oscillation in a model belonging to our class of choice, it needs to destabilize the (originally stable) quiescent state; if this shift in stability is relatively small or intermittent, the oscillation will have a small or fluctuating amplitude.
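As a crude numerical illustration of this detection scenario (a single Hopf-type oscillator standing in for the model class derived below; all parameter values are arbitrary), driving the unit at its eigenfrequency sustains a much larger oscillation than driving it off-resonance:

```python
import cmath

def driven_hopf(omega0, omega_in, alpha=-0.05, force=0.1,
                dt=0.001, steps=100_000):
    """Euler-integrate z' = z*(alpha + i*omega0 - |z|^2) + F*exp(i*omega_in*t)
    and return the mean amplitude |z| over the second half of the run."""
    z = 0j
    acc, count = 0.0, 0
    for k in range(steps):
        t = k * dt
        z = z + dt * (z * complex(alpha - abs(z) ** 2, omega0)
                      + force * cmath.exp(1j * omega_in * t))
        if k >= steps // 2:
            acc += abs(z)
            count += 1
    return acc / count

on_resonance = driven_hopf(omega0=2.0, omega_in=2.0)   # sustained oscillation
off_resonance = driven_hopf(omega0=2.0, omega_in=3.1)  # stays near quiescence
```

With alpha slightly negative, the quiescent state is stable without input; only an input near the oscillator's eigenfrequency pushes it into a sizeable, sustained oscillation, mirroring the "confidence" reading above.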

To avoid introducing unnecessary complexity, we start building our model class of interest by considering the simplest possible model of an oscillator:

where x_{1}, y_{1} ∈ ℝ and ẋ_{1}, ẏ_{1} are their time derivatives. In matrix form:

Here, x_{1} and y_{1} could be interpreted as the amount of local inhibitory and excitatory synaptic activity, respectively, but the particular physiological interpretation of the variables is not important for our discussion. In line with the scenario outlined above, we want the oscillator to transition from a quiescent state (say, [x_{1}, y_{1}] = [0, 0]) to an oscillation when subject to an input having the oscillator's period (1 in this case). By definition, the spectrum of such an input consists of frequencies which are a subset of 1, 2, …; collecting them (with multiplicity) into [ω_{1}, ω_{2}, …, ω_{n}] = Ω = [1, 1, …, 1, 2, 2, …, 2, …], we define the input components _{i} as

where

Two more steps are needed in order to make the model class amenable to derivation of a normal form. First, we need to rewrite this non-autonomous system as an autonomous one. This is straightforward, since the input components [x_{i+1}(t), y_{i+1}(t)] can themselves be generated by linear oscillators appended to the state vector [x_{1}, y_{1}, x_{2}, y_{2}, …, x_{n+1}, y_{n+1}]:

Second, we expand the

where _{d}(_{d}(

Before delving into the actual derivation, a few remarks are in order. First, the idea of modeling the auditory system as a tonotopically arranged array (or rather a series of arrays) of oscillators is in fact not new [

As the first step of the derivation, we diagonalize the linear part of Equation (1) using the following two matrices:

where

The diagonalization defines a change of variables:

After this change, Equation (1) reads:

with _{ζ}(ζ) which is a function in the complex vector space corresponding to

When in normal form,

where _{1}, _{2}, …, _{n}], _{1}, _{2}, …, _{n}], and [

Analogously, the exponents in

(see Equation 2). Since

For conciseness, from now on, whenever

and, by equality of polynomials,

for all [

To broaden the class of systems covered by our normal form, we unfold Equation (3) using small parameters α ∈ ℝ and β ∈ ℝ^{n}:

This way, in addition to the models with Taylor expansion around the origin of the form (Equation 1), our class now includes models whose Taylor expansion around the origin has the form

The corresponding normal form then reads

Intuitively, the α parameter makes it possible to change the stability of the origin (see section 2.1) whereas the β_{k} parameters allow the model to resonate when the inputs are not exactly its harmonics. In fact, it might happen that two different inputs approximate the same harmonic. That is, their frequencies equal (1 + β_{i})ω_{i} and (1 + β_{j})ω_{j}, respectively, with ω_{i} = ω_{j}. This is the reason why we allowed for duplicate frequencies in the input frequency vector Ω (see the beginning of section 2).

It might appear that Equation (9) only models period detection in a full harmonic spectrum. However, we can model stimulation with any Ω^{harm} ⊂ Ω by removing the dependence on those z_{j} and w_{j} for which ω_{j} ∉ Ω^{harm} from the equations for ż_{1} and ẇ_{1} (assuming the dimension of the system is large enough to accommodate any stimulus of practical interest). Since the right-hand sides of the equations are polynomials, it suffices to zero out the coefficients of those terms containing nonzero powers of the offending z_{j} or w_{j}. This will turn out to be useful when extending the model to an array by making scaled copies of Equation (9); while changing the eigenfrequency by scaling the left-hand side, we can zero out coefficients as needed to reflect the changing relation between the eigenfrequency and the inputs.

It might be interesting to compare Equation (9) to [[

where

In Equation (10), the sum runs over all such vectors [_{>0}, _{≥0},

In this subsection, we analyse the normal form (Equation 9) derived in section 2. More precisely, we study the stability of the origin (z_{1} = w_{1} = 0). The choice of the origin as the focus of this section is motivated by our previous (arbitrary) choice of the origin as a "quiescent state" of the oscillator (see the beginning of section 2). The reason why we treat its

As the first step of the analysis, note that all solutions to Equation (9) are of the form

Consequently, using the simplified notation Ω = [1, 1, …, 1, 2, 2, …, 2, …] and β = [β_{1}, β_{2}, …, β_{n}] as above, and introducing ρ = [ρ_{1}, ρ_{2}, …, ρ_{n}], φ = [0, φ_{2}, …, φ_{n}], and Ω_{β} = Ω ◦ β = [ω_{1}β_{1}, …, ω_{n}β_{n}], we can drop the equations for ż_{2}, ż_{3}, …, ż_{n+1} and ẇ_{2}, ẇ_{3}, …, ẇ_{n+1} from Equation (9) and write

Introducing new coordinates relative to a rotating frame of reference e^{ιt},

and new parameters,

Equation (13) reduces to:

As we will see in section 3, under a rather generic restriction on Equations (16) and (17), the stability of

Assuming the origin is a fixed point of the system of Equations (16) and (17), its stability is determined by the Jacobian of the system evaluated at the origin:

where

In particular, the fixed point solution at the origin is stable, if all the eigenvalues of the Jacobian have negative real parts; while it is unstable if at least one eigenvalue of the Jacobian has positive real part. Apparently, without input (ρ = 0), the stability is solely determined by the matrix
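This criterion is straightforward to check numerically; in the sketch below, the 2×2 matrices are toy stand-ins (not the paper's Jacobian), built so that the sign of the α-like diagonal entry alone decides stability:

```python
import numpy as np

def is_stable(jacobian):
    """A fixed point is linearly stable iff every eigenvalue of the
    Jacobian evaluated there has a negative real part."""
    return bool(np.all(np.linalg.eigvals(jacobian).real < 0))

# Toy 2x2 blocks with eigenvalues alpha +/- i*omega:
omega = 1.0
J_stable = np.array([[-0.1, -omega], [omega, -0.1]])   # alpha < 0: stable
J_unstable = np.array([[0.1, -omega], [omega, 0.1]])   # alpha > 0: unstable
```

The same check applies unchanged to the higher-dimensional Jacobians arising with input, where the eigenvalues additionally fluctuate in time.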

With input, one can view the Jacobian as the matrix _{pq}. Thus, if we consider the neural auditory system as spontaneously possessing a stable fixed point for a given pitch-detector, i.e., its α < 0, only inputs with high amplitude ρ and/or spectral content giving rise to suitable solutions [1, 0,

Let us now assess in detail which monomials appear on the right-hand side of the reduced equations. Note that all solutions to Equation (18) correspond to non-negative integer linear combinations of a finite set of minimal solutions, i.e., they have the structure:

where _{i}, _{i}], equal to the

and

As noted above, we model stimulation with Ω^{harm} ⊂ Ω by zeroing out those monomials containing nonzero powers of z_{j} or w_{j} for which ω_{j} ∉ Ω^{harm}. Consequently, the coefficient _{kM} will be nonzero if and only if

where

(see Equations 18 and 20).

In this section, we show how the system of Equations (16) and (17) reacts to stimulation with complex tones varying in relative periodicity and inharmonicity. For the specific case of complex tones consisting of two harmonics, analytical treatment is feasible, as we are basically dealing with an interval comprising two pure tones. Let the frequency ratio of the two harmonics be approximated as Ω^{harm} = [i, j]; the frequencies (i.e., (k_{j} − m_{j})Ω_{β}) and the exponents of ρ (i.e., k_{j} + m_{j}) for this case are listed in the table below. Note that both the frequencies (in absolute value) and the exponents grow monotonically with _{kM} does not grow superexponentially with


| [1, 0] | [1, 0] | [2, 0] | 0 |
| [0, 1] | [0, 1] | [0, 2] | 0 |
| [ … | [0, … | [ … | (β_{i} − β_{j}) |
| [0, … | [ … | [ … | −(β_{i} − β_{j}) |

(Ω^{harm} = [i, j], i ≤ j)

Further, it can be shown that the frequencies above also grow (in absolute value) with the inharmonicity of the interval, the other factor in perception of musical tension considered here. Let f_{1} and f_{2} denote the lower and the higher frequency of the interval, respectively, that is,

(see Equation 12). Additionally, let

The inharmonicity of the interval [f_{1}, f_{2}] with respect to the fundamental frequency f_{0} is defined as its weighted Manhattan distance to the interval [i f_{0}, j f_{0}], comprising the i-th and j-th harmonics of f_{0}. The distance is weighted by the squared signal amplitudes and normalized by f_{0} and the sum of the squared signal amplitudes [
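One plausible reading of this verbal definition (the paper's own formula is not legible in this copy, so the exact form below is our assumption) computes, for a dyad [f_1, f_2] with amplitudes a_1, a_2 and harmonic template [i·f_0, j·f_0]:

```python
def inharmonicity(f1, f2, a1, a2, i, j, f0):
    """Weighted Manhattan distance of [f1, f2] to the harmonic interval
    [i*f0, j*f0], weighted by squared amplitudes and normalized by f0
    and the sum of squared amplitudes (our reconstruction, see text)."""
    w1, w2 = a1 ** 2, a2 ** 2
    return (w1 * abs(f1 - i * f0) + w2 * abs(f2 - j * f0)) / (f0 * (w1 + w2))

# An exactly harmonic dyad has zero inharmonicity; mistuning raises it.
exact = inharmonicity(200.0, 300.0, 1.0, 1.0, i=2, j=3, f0=100.0)     # 0.0
mistuned = inharmonicity(200.0, 310.0, 1.0, 1.0, i=2, j=3, f0=100.0)  # 0.05
```

The amplitude weighting ensures that mistuning a barely audible partial contributes little to the perceived inharmonicity.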

Indeed, the frequencies grow (in absolute value) with the inharmonicity of the interval. Consequently, noting that _{kM}, pure-tone intervals with lower relative periodicity and lower inharmonicity (i.e., those perceived as less tense) cause a higher-amplitude and slower fluctuation of the driven system eigenvalues around those of

Note that there is an ambiguity of approximation represented by a choice of f_{0} in Equation (22) so that the entire array essentially works as a pitch detector. Time traces from simulations of such an array are depicted in

Time traces from simulations of Equation (24) with a soft harmonic _{[0, 0]}) (see Equation 2); the amplitude _{1k}|.

Time traces from simulations of Equation (24) with a loud harmonic _{[0, 0]}) (see Equation 2); the amplitude _{1k}|.

The equations for the array were derived by applying the above restriction on linear terms to Equation (9), writing out the inputs (Equations 12, 16, 17), and using Equations (6) and (18),

then truncating the higher-order terms,

and, finally, setting

and scaling the time for convenience by the eigenfrequency, _{k}, which yields a parametrically-forced normal form for supercritical Andronov-Hopf bifurcation:

Here, (ω_{j})_{i} signifies a vector with ω_{j} at the

(see Equations 1, 8), where each element of Ω_{24TET} approximates the corresponding element of Ω as a power of the quarter-tone step 2^{1/24}. Ω and Ω_{24TET} are aligned in such a way that ω_{5} = ω_{24TET, 5} = 1 and hence β_{5} = 0. The oscillators (Equation 24) receive connections from a bank of input units—linear oscillators with eigenfrequencies spanning from _{0} to _{4} in quarter-tone steps. In accordance with Equation (8), each oscillator (Equation 24) is only connected to input units with frequencies (_{k}ω_{i}β_{i} in Equation 8, after scaling by _{k}) approximating its harmonics (in the above tuning) and, additionally, to frequencies up to 4 quarter-tones below and above these. In other words, it does not receive input of fixed, homogeneous strength from all input units, but rather receives (weighted) input only from input units with frequencies close to its first six approximate harmonics; the connectivity of each oscillator is thus effectively defined by a connectivity pattern or kernel consisting of six unimodal elementary Gaussian kernels (e^{−0.5l²}; _{0} in Equation 22) and input units and restricting the connectivity to (near-)harmonics, there remains no ambiguity in approximation of the input; each oscillator, as long as the input falls within the reach of its connectivity kernel, approximates the input in its own, unique way.
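The six-blob kernel can be sketched on a quarter-tone grid as follows (the grid span, the ±4-step truncation, and the rounding of harmonic positions to the nearest quarter-tone are our assumptions; the Gaussian blob shape e^{−0.5l²} and the six harmonics are from the text):

```python
import math

def connectivity_kernel(n_harmonics=6, halfwidth=4, n_steps=96):
    """Connection weights from input units (quarter-tone grid, index 0 =
    the oscillator's eigenfrequency) to one oscillator: a Gaussian blob
    exp(-0.5*l^2) around each of its first n_harmonics, truncated at
    +/- halfwidth quarter-tone steps."""
    kernel = [0.0] * (n_steps + 1)
    for h in range(1, n_harmonics + 1):
        center = round(24 * math.log2(h))  # harmonic h in quarter-tone steps
        for l in range(-halfwidth, halfwidth + 1):
            idx = center + l
            if 0 <= idx <= n_steps:
                kernel[idx] = max(kernel[idx], math.exp(-0.5 * l * l))
    return kernel

kernel = connectivity_kernel()  # peaks at steps 0, 24, 38, 48, 56, 62
```

Shifting this kernel along the grid yields the connectivity of oscillators with other eigenfrequencies.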

Connectivity of the oscillator with eigenfrequency _{1} (_{0} to _{1} and target the oscillator's “slots” corresponding to ρ_{1,1}, ρ_{1,2}, …, ρ_{1,9} in Equation (24); likewise, the second blob connects _{1}, …, _{2} to ρ_{1,10}, ρ_{1,11}, …, ρ_{1,18} etc. The connectivity of any other oscillator is obtained by shifting this kernel so that the center of the first blob is aligned with the oscillator's eigenfrequency.

All simulations were run from initial conditions

with

a parameter setting at which the fixed point z_{1k} = 0 is, without input, on the verge of losing stability. Three alternative inputs were applied, whose spectra can be seen in

Spectrum of the harmonic

As can be seen from

Trajectory of the _{1} oscillator with soft/low-amplitude input corresponding to

Minima and maxima of the oscillator amplitude traces from simulations with soft/low-amplitude input corresponding to

We propose that the absence of stable, unambiguous pitch detection, modeled as the absence of a pronounced amplitude peak in an array of oscillators, is a correlate of timbre-induced musical tension. In the class of oscillators we chose for populating the array, the amplitude of the limit cycle is determined by the stability of the origin; if the stability switches between a stable and an unstable regime fast enough, the amplitude does not have enough time to grow. We show that the frequency and magnitude of this switching depend on the inharmonicity and roughness of the input to the oscillator. Imagine such an oscillator is actually present in the brain; when subject to a tense (inharmonic and/or rough) stimulus, it will remain almost silent, leading to an "unclear," "unstable," "difficult to memorize" etc. percept (see
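The amplitude-suppression mechanism can be checked on a one-dimensional caricature of the oscillators used here, the amplitude equation r′ = r(α(t) − r²) (the square-wave α(t) and all numerical values are illustrative only):

```python
def amplitude_under_switching(period, alpha_max=0.5, dt=0.01, t_end=100.0):
    """Euler-integrate r' = r*(alpha(t) - r^2) with alpha(t) a square wave
    switching between +alpha_max and -alpha_max every period/2; period=inf
    keeps alpha fixed at +alpha_max (origin unstable throughout, so the
    amplitude settles on the limit cycle)."""
    r, t = 0.1, 0.0
    while t < t_end:
        if period == float("inf") or (t % period) < period / 2:
            alpha = alpha_max
        else:
            alpha = -alpha_max
        r += dt * r * (alpha - r ** 2)
        t += dt
    return r

steady = amplitude_under_switching(float("inf"))    # r -> sqrt(alpha_max)
flickering = amplitude_under_switching(period=1.0)  # amplitude stays small
```

Under fast switching, the growth during unstable half-periods is cancelled by the decay during stable ones, while the cubic term keeps damping; the oscillation never reaches the amplitude it attains when the origin stays unstable.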

Of course, tension is clearly not a one-dimensional phenomenon and different aspects of it could be related to different aspects of the underlying neurodynamics. For instance, in a nonlinear model like the one proposed here, loudness of the input is going to affect both the general amplitude of the oscillations and their temporal fluctuations—in a frequency-dependent manner, as our example simulations for two loudness levels suggest. We consider disentangling these not necessarily orthogonal dimensions of tension as a natural extension of the currently proposed modeling framework.

We have proposed a neurodynamical model of musical tension (see Equation 24) which reproduces existing empirical results on timbral correlates of tension, is consistent with neuroimaging findings [

Considering the simulation results reported above in more detail, we note that the overall increase in fluctuation of stability of the origin for the inharmonic and the rough input as compared to the harmonic one can be explained based on the analytical insights into the dynamics of a single oscillator obtained earlier. More precisely, the nearly-harmonic relations in the inharmonic and the rough input introduce oscillating terms into most of the oscillators' coupling functions; the increase of amplitude modulation is, in turn, accounted for by the fact that the amplitude of the stable limit cycle of Equation (24) is determined by the stability of the origin. The decrease of the peak amplitude is, for the inharmonic input, probably due to the connectivity; there are no exact harmonic relations in the input and hence no oscillator can align its connectivity kernel optimally with the input (see

As for our general approach, a few comments are in order. First, for the sake of simplicity, we chose a subclass of

Also for the sake of simplicity, we only considered relative periodicity and inharmonicity of pure-tone dyads. For general sounds, we would be dealing with the set of nonnegative solutions to a general linear Diophantine equation (Equation 18). To the best of our knowledge, the structure of the set (its minimum generators) can only be determined algorithmically [e.g., [
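To illustrate what "determined algorithmically" involves in the simplest setting, the minimal nonnegative solutions of a small homogeneous linear Diophantine equation can be brute-forced within a search box (the coefficients and the box size are arbitrary examples; practical implementations use the dedicated algorithms cited):

```python
from itertools import product

def minimal_solutions(coeffs, bound=6):
    """Minimal (componentwise) nonzero nonnegative integer solutions of
    sum(c_i * x_i) == 0, brute-forced over the box [0, bound]^n.
    A solution is minimal if no other solution is <= it componentwise."""
    sols = [x for x in product(range(bound + 1), repeat=len(coeffs))
            if any(x) and sum(c * v for c, v in zip(coeffs, x)) == 0]
    return sorted(s for s in sols
                  if not any(t != s and all(a <= b for a, b in zip(t, s))
                             for t in sols))

# Every solution of 2*x1 - 3*x2 == 0 is a multiple of the single
# minimal generator (3, 2); three-term equations already have several.
generators = minimal_solutions([2, -3])
```

Every nonnegative solution within the box is then a nonnegative integer combination of these generators, mirroring the structure used in section 2.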

Further, concerning the phenomenon wherein loud music is perceived as more tense than soft music [

Finally, even though the choice of spectral representation was motivated by our interest in contemporary art music, especially the so-called “spectral music,” the model presented here is applicable to any kind of music; indeed, even music composed with traditional categories in mind ends up being rendered as sound which can be fed into our model.

To conclude, mapping perception to neurodynamics is hard. However, from time to time, a favorable constellation of research sheds light on the underlying physiology. The fruitful concept of

The datasets generated for this study are available on request to the corresponding author.

JH and MH contributed to the conception, theoretical analysis, and design of the study, revised the manuscript, and read and approved the submitted version. MH implemented the simulations and visualizations and wrote the first draft of the manuscript.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

We thank Pavel Sanda and Hana Markova for reading the manuscript and providing helpful comments.