Q: What is microtonal music?

 A: There is no broad consensus on the meaning of the term
"microtonal".  It may help to look at three popular ways of using
the term:

1. literal definition - refers to music having melodic or
harmonic intervals smaller than 100 cents
This excludes music in temperaments like 7-ET or 10-ET, which is
sometimes called "macrotonal".

2. inclusive definition - 'anything other than 12-ET'
In this view, only solo instrumental music of piano, guitar, or
MIDI keyboard would qualify as NOT microtonal.  Some authors even
consider guitar and/or piano to be microtonal, due to note
bending and octave stretch, respectively.
This certainly makes the (valid) point that 12-ET is only an
abstraction and that real music gains a lot of its power and
subtlety by deviating from it.  However, it makes the term
"microtonal" almost meaningless by making it apply to everything.

3. exclusive definition - refers to intonation that is difficult
to approximate in 12-ET
This covers extended just intonation harmony (e.g. 11-limit),
nondiatonic music, nonoctave scales, etc.

It's worth noting that some authors avoid the term "microtonal"
and have suggested alternatives.  The most popular of these is
"xenharmonic", coined by Ivor Darreg to capture the exclusive
definition above:


 Q: I've read that in microtonal tunings, flats aren't equal to
sharps.  For instance, Bb isn't the same pitch as A#.  Why?

 A: Some microtonal notation systems don't even use letters for
note names, let alone have flats and sharps!  So let's just start
with standard Western music.

Around the year 1500, meantone temperament began to become the
standard intonation system in the West.  In meantone, sharps are
below flats (e.g. C# is lower in pitch than Db).  Here is a chain
of fifths:

Db - Ab - Eb - Bb - F - C - G - D - A - E - B - F# - C#

All sharps are to the right, and all flats are to the left.  Any
adjacent pair of notes has to be spelled as a true 5th (five
diatonic letter names apart, inclusive).  A typical meantone
fifth is about 697 cents.  That means C# will be 4879 cents, or
4 octaves and 79 cents, above C.  Db will be 115 cents above a C
three octaves below where we started.  Thus, C# is pitched
36 cents below Db in meantone.

By about the year 1900, 12-tone equal temperament had fully
replaced meantone in the West.  But we didn't switch notation
systems.  Hence, we still have the distinction between C# and Db
even though the pitches come out the same in 12-ET.

At the end of the day we can say that tuning systems based on a
chain of fifths can be notated with standard Western notation.
If the fifth is smaller than 700 cents, sharps will be below
flats.  If the fifth is exactly 700 cents, sharps = flats.
And, you guessed it, if the fifth is larger than 700 cents,
sharps come out ABOVE flats.
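
The arithmetic above is easy to check.  Here is a minimal Python
sketch; the function name sharp_minus_flat is made up for this
example, and it assumes a chain of equal-sized fifths with pure
1200-cent octaves unless told otherwise.

```python
def sharp_minus_flat(fifth_cents, octave_cents=1200.0):
    """Cents from Db up to C#, both reached from C by equal fifths."""
    # C# is 7 fifths up from C, brought down 4 octaves.
    c_sharp = 7 * fifth_cents - 4 * octave_cents
    # Db is 5 fifths down from C, brought up 3 octaves.
    d_flat = 3 * octave_cents - 5 * fifth_cents
    return c_sharp - d_flat

for fifth in (697.0, 700.0, 702.0):
    # negative: sharps below flats; zero: equal; positive: sharps above
    print(fifth, sharp_minus_flat(fifth))
```

A 697-cent fifth gives -36 cents, matching the meantone figure
worked out above.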


 Q: What is "just intonation"?

 A: Generally, it means tuning instruments so that pitches are
related to one another by simple ratios.  For example, A = 440 Hz
and E = 660 Hz are related by the ratio 3/2.  In equal
temperament, this A-E "perfect fifth" is very close to 3/2, but
differs by a precise amount so that the circle of fifths returns
us to the pitch we started on.
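
To put a number on "very close", here is a small Python sketch
(the helper name cents is just a convenience for this example).
It compares the just 3/2 with the 700-cent fifth of 12-ET, and
shows how far twelve just fifths overshoot seven octaves.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents."""
    return 1200.0 * math.log2(ratio)

just_fifth = cents(660 / 440)            # the A-E example, ratio 3/2
et_fifth = 700.0                         # 12-ET fifth
overshoot = 12 * just_fifth - 7 * 1200.0 # twelve just fifths vs 7 octaves

print(round(just_fifth - et_fifth, 3))
print(round(overshoot, 2))
```

Twelve 700-cent fifths make exactly 8400 cents, or seven octaves,
which is why the circle closes in equal temperament.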

There are three sources of disagreement over the precise meaning
of the term, however.  First, what qualifies as a simple ratio?
Is 19/16 simple?  Historically, only ratios that can be factored
with primes no greater than 5 were considered simple.
Contemporary authors tend to be more inclusive, but the phrase
"extended just intonation" is sometimes used to indicate that
primes greater than 5 are being allowed.

Second, what tuning accuracy is required?  If we have 440 Hz and
660.2 Hz, does it count as 3/2?  One answer is that if the tuning
error is being applied intentionally (in a systematic way), then
it should not be considered just intonation.  If instead the
error is a result of the limited accuracy of an instrument (and
is therefore somewhat random over the range of the instrument),
we can still consider the result to be just intonation.

Finally, which intervals should be considered when more than two
pitches are involved?  The classical "just intonation" scale has
the pitches: 1/1 9/8 5/4 4/3 3/2 5/3 15/8 2/1.  It contains
chords like 4/3 5/3 2/1, which simplify to 4:5:6 and which are
certainly just intonation chords.  But it also contains the chord
9/8 4/3 5/3, which simplifies to 27:32:40 and sounds quite harsh
if a consonant minor triad is expected.  Nevertheless, it is
often implied that whatever combination of pitches we pluck from
a scale like the one above, we are still in just intonation.
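
The simplification of a chord to whole numbers can be sketched in
Python as follows.  The function name simplify_chord is invented
for this example; it just clears denominators and divides out any
common factor.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def simplify_chord(pitches):
    """Express a list of Fractions as the smallest integers a:b:c..."""
    lcm_den = reduce(lambda a, b: a * b // gcd(a, b),
                     (p.denominator for p in pitches))
    ints = [int(p * lcm_den) for p in pitches]
    common = reduce(gcd, ints)
    return [i // common for i in ints]

major = [Fraction(4, 3), Fraction(5, 3), Fraction(2, 1)]
harsh = [Fraction(9, 8), Fraction(4, 3), Fraction(5, 3)]
print(simplify_chord(major))
print(simplify_chord(harsh))
```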

An alternative is to consider just intonation to be a property of
chords (simultaneous pitches) only.  Scales can then be discussed
in terms of how many justly intoned chords they contain, what
portion of their triads are justly intoned, etc.  Except for the
harmonic series itself (and only the first few octaves thereof),
all rational scales with more than a handful of pitches/octave
contain chords that fail classical tests of just intonation, such
as being easy to tune by ear.


 Q: I keep seeing stuff like "5-limit" or "9-limit".  What is
this nomenclature and why have I never heard of it before?

 A: Please see:

It may be new to you because music theory is almost always taught
under the assumption that you will forever and always be hearing,
performing, and composing exclusively in 12-ET.

 Q: Can I get a quick summary of how to calculate these?

 A: OK.  For a ratio n/d in lowest terms, to find its prime
limit, take the product n*d and factor it.  Then report the
largest prime you used in the factorization.

To find its odd limit, simply divide n by 2 until you can no
longer divide it without a remainder, then do the same for d.
Then report the larger of the two numbers left over.

To find the prime limit or odd limit of a list of ratios (such as
a scale), simply calculate it for each of them individually and
report the maximum.

To find the prime or odd limit of a chord, first compute its
table of dyads, e.g. for major triad C-E-G the dyads are C-E,
E-G, and C-G.  Then apply the procedure for a list of ratios
given above.
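
The procedures above translate fairly directly into Python.  This
is only a sketch; all function names are made up for the example,
and ratios are passed as Fractions (or strings like "15/8").

```python
from fractions import Fraction
from itertools import combinations

def largest_prime_factor(n):
    largest, p = 1, 2
    while p * p <= n:
        while n % p == 0:
            largest, n = p, n // p
        p += 1
    return max(largest, n) if n > 1 else largest

def prime_limit(ratio):
    r = Fraction(ratio)                  # Fraction keeps lowest terms
    return largest_prime_factor(r.numerator * r.denominator)

def odd_part(n):
    while n % 2 == 0:                    # divide by 2 while it goes evenly
        n //= 2
    return n

def odd_limit(ratio):
    r = Fraction(ratio)
    return max(odd_part(r.numerator), odd_part(r.denominator))

def chord_limits(pitches):
    """(prime limit, odd limit) of a chord, taken over all its dyads."""
    dyads = [Fraction(b) / Fraction(a) for a, b in combinations(pitches, 2)]
    return max(map(prime_limit, dyads)), max(map(odd_limit, dyads))

print(prime_limit("15/8"), odd_limit("15/8"))
print(chord_limits([Fraction(1), Fraction(5, 4), Fraction(3, 2)]))
```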


 Q: I see the terms "tuning", "temperament", and "scale" used
interchangeably.  Is there any distinction between them?

 A: There is no consensus distinction between these terms.  In
traditional Western keyboard practice, a "temperament" is usually
what defines how the keys are tuned, and a "scale" is a more
abstract thing, often thought of in terms of patterns of 2nds
(e.g. LLsLLLs for the diatonic scale) or subsets of whatever
temperament is used.

Microtonal music theory needs much more general terms, since the
scales and consonances being targeted are not taken as given.
Gene Ward Smith has proposed distinct definitions for each of
these terms in order to make discourse about microtonal music
more precise.  His definitions have seen adoption on the tuning
lists, but that adoption falls short of consensus.  Here they are:

scale- An ordered list of intervals, which may be applied to a
given concert pitch to generate an ordered list of pitches that
can be used to tune an instrument.
Scala's scale file format (.scl extension) is an embodiment of
this definition.

temperament- A homomorphic mapping from just intonation to an
abstract free group of smaller rank.
This algebraic definition basically means that for every interval
in just intonation, which is composed of some combination of
primes, we express it instead as a combination of generators
(called the generators of the temperament) such that there are
fewer generators than there were primes.  For example, in 5-limit
just intonation every interval can be expressed as a product of
2s, 3s, and 5s.  In meantone temperament, every interval can be
expressed as a product of octaves and fifths.  5-limit JI is
therefore a rank 3 intonation, and meantone a rank 2 temperament.
In 12-ET, every interval can be expressed in semitones.  It is
therefore a rank 1 temperament.

tuning- A choice of intervals for the generators of a
temperament.
For example, 1200 cents & 697 cents is a good tuning for
meantone.  So is 1200 cents & 696 cents.  They lead to different
scales, but they are both good meantone tunings because they
produce good approximations of 5-limit JI.  Sometimes tunings are
named after the optimization they solve.  For example, "RMS
optimal meantone" refers to the tuning that minimizes the RMS
error from JI under the constraint of the meantone mapping.
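
To make "good approximation" concrete, here is a rough Python
sketch that scores a meantone tuning.  It assumes the usual
meantone mapping in terms of octaves and fifths (3 = one octave
plus one fifth, 5 = four fifths), and it measures unweighted
error on the primes 2, 3, and 5 only.  Published "RMS optimal"
tunings may use a different set of target intervals, so treat
the numbers as illustrative.

```python
import math

# (octaves, fifths) needed to reach each prime in meantone
MAPPING = {2: (1, 0), 3: (1, 1), 5: (0, 4)}

def tuning_errors(octave, fifth):
    """Cents error of each tempered prime against just intonation."""
    errors = {}
    for prime, (n_oct, n_fifth) in MAPPING.items():
        tempered = n_oct * octave + n_fifth * fifth
        errors[prime] = tempered - 1200.0 * math.log2(prime)
    return errors

def rms(values):
    values = list(values)
    return math.sqrt(sum(v * v for v in values) / len(values))

for fifth in (696.0, 697.0, 700.0):
    errs = tuning_errors(1200.0, fifth)
    print(fifth, {p: round(e, 2) for p, e in errs.items()},
          round(rms(errs.values()), 2))
```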


 Q: I've heard the terms "homophonic" and "monophonic".  What's
the difference?

 A:         # of voices   # of rhythms   harmony?

monophony *      1             1            N
heterophony     many          many          N
homophony       many           1            Y
polyphony       many          many          Y

 * Harry Partch called his music style "monophony", partly in
contradiction to the usual meaning, above.  His early work does
indeed feature the "intoning voice" in monophonic settings, but
he soon undertook the use of more complex textures.


 Q: I've come up with a simple formula to measure the consonance
of just intonation dyads.  Is it original?

 A: Probably not.  A multitude of consonance ranking formulas
have been proposed and investigated on this mailing list over the
years.  Of these, the product of numerator and denominator, often
written n*d, for a ratio n/d in lowest terms, has been found to
best agree with informal rankings by listeners as well as
published psychoacoustic data.  Of course, no such simple formula
can be perfect or apply in all contexts, but the n*d rule seems
to work reasonably well in typical musical settings.

The n*d rule is also called Tenney height, after James Tenney^1,
who is one of the theorists to propose it (along with Galileo
Galilei^2, Denny Genovese^3, and countless others).

Tenney height generalizes to geomean(a, b, c, ...) for a chord
a:b:c..., which gives sqrt(n*d) for dyads.
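
A minimal Python sketch of both measures follows.  The function
names are made up for this example, and chord terms are assumed
to be given in lowest terms already.

```python
from fractions import Fraction
from math import prod

def tenney_height(ratio):
    """The n*d rule for a ratio n/d in lowest terms."""
    r = Fraction(ratio)
    return r.numerator * r.denominator

def chord_height(*terms):
    """Geometric mean of the terms of a chord a:b:c..."""
    return prod(terms) ** (1.0 / len(terms))

# Rank some dyads from most to least consonant under the n*d rule.
ranked = sorted(["16/15", "9/8", "7/4", "5/4", "3/2"], key=tenney_height)
print(ranked)
print(chord_height(4, 5, 6), chord_height(10, 12, 15))
```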

^1 James Tenney (1983). John Cage and the Theory of Harmony,
Soundings 13: The Music of James Tenney, Ed. Peter Garland.
Santa Fe: SOUNDINGS Press, 1984.

^2 Galileo Galilei (1638). Discorsi e dimostrazioni matematiche
intorno a due nuove scienze attenenti alla mecanica ed i
movimenti locali. Leiden: Elsevier, 1638.
Trans. H. Crew and A. de Salvio as Dialogues concerning Two New
Sciences. New York: McGraw-Hill, 1963.

^3 Denny Genovese (1991). The Natural Harmonic Series as a
Practical Approach to Just Intonation. Unpublished Thesis, New
College of Florida.


 Q: Tenney height is a 'small whole numbers' rule.  Didn't
Plomp/Levelt/Sethares show that such rules are only valid for
timbres containing harmonic partials?

 A: No.  Plomp & Levelt tested simultaneities of two sine tones
only.  Had they tested 3-, 4-, or 5-tone interactions, they would
have discovered that 'major' chords are more consonant than
'minor' chords.  This difference is mild but well-known in the
case of the 5-limit triads, and quickly becomes overwhelming with
increasing harmonic limit (e.g. in 11-limit just intonation).
Sethares applies the P&L results to many-tone complexes, but does
so in a pairwise fashion.  No pairwise model can explain the
difference between major and minor chords, since they contain the
same intervals!

Moreover, timbres whose partials stray too far from the harmonic
series will cease to evoke any single sensation of pitch (unless
one partial is vastly louder than the others).  Matching timbre
to tuning to minimize critical band interactions is a powerful
technique that can remove beating from tempered music, but there
is only so much temperament error it can disguise.  Timbres
created by Bill Sethares for temperaments like 10-ET often sound
bell-like for this reason.


 Q: Critical band interactions are strongly supported by
physiological and psychoacoustic evidence.  What's the evidence
for an innate affinity towards simple rational intervals?

 A: Sounds with inharmonic spectra do not evoke well-resolved
pitches, as do sounds with harmonic spectra.  This "virtual
pitch" phenomenon has been studied in psychoacoustics:

As social animals, humans are highly adapted to extract
information from speech sounds.  Human vocal folds produce rich
spectra with perfectly harmonic overtones, and vowel sounds in
all natural languages are defined by selectively boosting or
cutting regions of those spectra by using the vocal tract as a
resonant filter:

Thus, a hearing system that is able to identify harmonic spectra
as single sources and continuously characterize their spectral
balance over time has high adaptive significance for humans.  It
is noted that Western tonal music produces spectra with similar
characteristics to that of speech.

Recent research on primates (using in vivo single-cell recording)
reveals "combination sensitive" neurons in the auditory cortex
that can detect when two frequencies resolved by the cochlea are
harmonically related:

Imaging of human subjects locates the work of pitch perception in
the analogous tissue:

Additionally, Tramo et al. cite evidence from subjects with
lesions in this area:


 Q: If Tenney height works, why do we need harmonic entropy?

 A: Tenney height applies only to rational numbers, and only to
simple ones at that (where the product n*d is below 100 or so).
Harmonic entropy is a smooth, differentiable function of interval
size, i.e. 301:200 and 3:2 should have similar entropies.

Near the simple rationals, harmonic entropy is proportional to
sqrt(n*d).  Triadic harmonic entropy has never been fully
computed, but we do know something about how it must turn out.
In 2002, Paul Erlich showed that the area of the Voronoi cells
of rational triads in 2-D triadspace is strongly correlated to
the generalized Tenney height of the triads:


 Q: Is there a 'harmonic entropy for dummies' somewhere?

 A: Harmonic entropy is based on the hypothesis that the brain
tries to interpret what it hears in terms of a harmonic series.
To do this it must be able to recognize rational intervals.  The
rational numbers are infinite in extent, but probably we have
evolved the ability to recognize only the simplest of them, since
the spectra of human voices have most of their energy in the
first several partials.

Pick any limit you like on the complexity of the rationals.  Plot
everything below it on a number line and you'll find that the
simpler the fraction, the greater the distance on the number line
between it and the next point you plotted.
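
You can try this yourself with a few lines of Python.  The sketch
below assumes the complexity measure is the product n*d (any
similar measure should show the same trend), and the function
names are invented for the example.

```python
from fractions import Fraction
from math import gcd, log2

def ratios_up_to(max_product, lo=1, hi=2):
    """All ratios n/d in lowest terms with n*d <= max_product in [lo, hi]."""
    found = []
    for d in range(1, max_product + 1):
        for n in range(d, max_product // d + 1):
            if gcd(n, d) == 1 and lo <= Fraction(n, d) <= hi:
                found.append(Fraction(n, d))
    return sorted(found)

def neighbour_gaps(ratios):
    """Cents between the two neighbours of each interior ratio."""
    cents = [1200 * log2(float(r)) for r in ratios]
    return {ratios[i]: cents[i + 1] - cents[i - 1]
            for i in range(1, len(ratios) - 1)}

ratios = ratios_up_to(70)
gaps = neighbour_gaps(ratios)
for r in (Fraction(3, 2), Fraction(7, 5)):
    print(r, round(gaps[r], 1))
```

With this limit, 3/2 has much more room around it than 7/5 does.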

Now the ear, like any measuring device, will have noise in its
output.  If a 440 Hz sine tone is coming in your ear, the signal
arriving in the brain will fluctuate in a range around 440 Hz.
How does the brain assign a ratio to a pair of these noisy
signals?  Let's say the ear is presented with a pair of sine
tones 697 cents apart.  We can draw a Gaussian representing the
noise in the measurement of these tones, above the number line
where we plotted our recognizable fractions:
         . .
      _.'   '._
4/3      3/2     5/3

Clearly, it's a 3/2.   But what if it looked like this?

         .    .
      .'       '.
    .'           '.
 _.'               '._
4/3  7/5   3/2     5/3

OK, let's cut the Gaussian up into slices above each of the
possible ratios:

         .    .
      .' :     '.
    .'   :      :'.
 _.:     :      :  '._
4/3  7/5   3/2     5/3

Now we can compute the area of each slice, and the slice with the
largest area should win -- looks like 3/2 again.

Harmonic entropy measures the ambiguity of this contest between
possible ratios.  In the first curve we drew there was no
contest, so the entropy would be zero.  In the second example,
7/5 gave 3/2 a run for its money, but 3/2 won by a decent margin.
We still think we are hearing 3/2.

But if the Gaussian were cut into several slices of similar area,
the contest would be harder to call.  In a political election
that is too close to call, the country has an uneasy feeling
until the new leader is determined.  Similarly, we feel uneasy
when the pitch or root of what we're hearing is undetermined.
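
Here is a toy version of the calculation in Python.  It is a
sketch under several assumptions that are mine, not part of the
description above: candidate ratios are those with n*d <= 70,
slice boundaries are placed at the midpoints (in cents) between
neighbouring candidates, and the Gaussian has a standard
deviation of 17 cents.  Other choices will change the numbers
but not the general picture.

```python
import math
from fractions import Fraction

def candidates(max_product=70, lo=0.9, hi=2.2):
    out = set()
    for d in range(1, max_product + 1):
        for n in range(1, max_product // d + 1):
            if math.gcd(n, d) == 1 and lo <= n / d <= hi:
                out.add(Fraction(n, d))
    return sorted(out)

def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def harmonic_entropy(interval_cents, sd=17.0, max_product=70):
    cents = [1200.0 * math.log2(float(r)) for r in candidates(max_product)]
    # Slice the number line at midpoints between neighbouring ratios.
    mids = [(a + b) / 2 for a, b in zip(cents, cents[1:])]
    edges = [-math.inf] + mids + [math.inf]
    entropy = 0.0
    for left, right in zip(edges, edges[1:]):
        # Area of the Gaussian lying above this ratio's slice.
        p = normal_cdf(right, interval_cents, sd) - normal_cdf(left, interval_cents, sd)
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy

for size in (702.0, 650.0):
    print(size, round(harmonic_entropy(size), 3))
```

An interval sitting on 3/2 comes out with entropy near zero,
while one halfway between 10/7 and 3/2 scores much higher.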


 Q: Which is the correct JI tuning for the minor triad: 10:12:15,
16:19:24, or 6:7:9?

 A: There is no single "correct" tuning -- it's ultimately a
matter of musical taste.  But we can examine the question using
speculative psychoacoustics...

When presented with an auditory stimulus, the brain will attempt
to assign it a pitch.  It does this by extracting harmonic
components from its spectrum.  The stimulus need not be periodic
for this to work, but even if it is, the hearing system is
limited in its ability to:
 1. perform the spectral analysis -- failure here results in
psychoacoustic roughness
 2. detect periodicity -- failure here results in high harmonic
entropy

For clean stimuli with periodicities in the range the brain is
good at detecting (such as 5-limit major triads), the extracted
pitch will be fairly unambiguous (sometimes it vacillates by an
octave or two).  For other stimuli, there may be several pitches
competing for the 'answer'.  Harmonic entropy measures this
ambiguity.

In the case of 10:12:15, the probability distribution for the
fundamental will tend to include the pitches 4, 5, 8, and 10.
If the tones of the chord are complex tones, the timbre will
influence their relative likelihoods of winning the contest
(since the hearing system considers all partials).  If 4 or 8
win, the brain is interpreting 10:12:15 literally, as a segment
of harmonics relatively high in the series.  It is somewhat more
likely that 5 or 10 will win; the brain hears 10:15 = 3:2 and
dismisses 12 as an artifact.

The same tradeoff exists in the case of 16:19:24, but here the
literal interpretation is less likely because it relies on even
higher harmonics.  However, this time both the literal and
'outer fifth only' interpretations point to the same pitch,
namely 16 (or 4 or 8).  Therefore, it dominates the probability
distribution and 16:19:24 tends to sound more stable than
10:12:15 (though its critical band roughness is greater).

Many listeners rate 6:7:9 as more consonant than either of the
chords just discussed.  But it has two things against it:
 1. 7:6 is getting close to the critical band -- even in the
middle of the piano, beating between some of its lower partials
can be heard, and this lends it a somewhat 'pinched' sound.
 2. It is found low enough in the harmonic series that its
virtual fundamental is more likely to be 4 (or 2) and the 7 less
likely to be dismissed.  This gives it the same disadvantage that
10:12:15 had to 16:19:24, only more so.  That its virtual
fundamental doesn't clearly coincide with its root may make it
less suitable for minor-key tonal harmony, which needs to be able
to resolve to a 'finished' minor tonic.

In summary, 6:7:9 is an excellent minor triad and can often be
used in performances of classical music.  But generally either
10:12:15 or 16:19:24 are to be preferred, because of their
greater tendency to evoke a fundamental that is the same as their
lowest tone (or octave extension thereof).
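
The bookkeeping behind this comparison can be sketched in Python.
This is not a probability model, just a check of which of the two
readings discussed above ("literal" and "outer fifth only") lands
on the same pitch class as the lowest tone.  The function names
are invented for the example.

```python
from math import gcd

def pitch_class(harmonic):
    """Strip factors of 2, so octave-related harmonics compare equal."""
    while harmonic % 2 == 0:
        harmonic //= 2
    return harmonic

def fundamental_readings(low, mid, high):
    """Does each reading of low:mid:high agree with the lowest tone?"""
    root = pitch_class(low)
    literal = pitch_class(1)                   # chord as harmonics of 1
    outer_fifth = pitch_class(gcd(low, high))  # middle tone ignored
    return {"literal": literal == root, "outer_fifth": outer_fifth == root}

for chord in ((10, 12, 15), (16, 19, 24), (6, 7, 9)):
    print(chord, fundamental_readings(*chord))
```

Only 16:19:24 has both readings agreeing with its lowest tone.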


 Q: Why did Bach favor minor keys?

 A: It may just have been aesthetic preference.  But we may also
speculate that counterpoint works better in minor keys, partly
because the consonance of the tonic chord is weaker, and
therefore the voices are more free and less likely to be heard as
overtones of it.


 Q: I was wondering about the progress of tunings over time.
AFAIK, the oldest records are of Pythagorean tuning, which is a
rank 2 system.  Many centuries later, there came meantone
temperament, which is also rank 2.  Then centuries again until
very recently, when a huge variety of tuning systems is suddenly
available.  Was all really quiet between these times?

 A: There are two intertwined histories here: the evolution of
just intonation harmony and the evolution of the temperaments
used to support it.

We know most musical traditions simply didn't employ harmony, at
least not in an organic fashion.  Outside of Europe and prior to
the 12th century, music was primarily monophonic or heterophonic.

Organum was the birth of 3-limit harmony, which is rank 2.
All really does seem quiet until about 250 years later, when the
5-limit (rank 3) sprang onto the scene:

Though Euler and others theorized about the 7-limit (rank 4),
tetradic harmony didn't emerge in practice until the late 19th
century, in the French impressionist and African-American forms.*
They treat the most consonant tetrads in 12-ET (approximations
of 8:10:12:15, 10:12:15:18, 4:5:6:7, 5:6:7:9 and others) on
roughly equal footing -- rank 4 down to rank 1.  Barbershop
quartet singing (1940s) is closer to a true 7-limit style, as it
favors the 4:5:6:7 chord and uses pure intonation, though it
often employs rank 2 adaptive JI to support diatonic melodies and
chord progressions.

Harry Partch is apparently the first musician to systematically
employ the 11-limit (rank 5).

As for temperament, certain African and Eastern traditions may
have been using 5-, 7-, and 12-ET from the time of the European
middle ages.  This is a reduction from rank 2 to rank 1, but as
already mentioned these forms did not have polyphonic harmony.
In Europe, the first extant description of a systematic tempering
of the 5ths dates from the early 1500s, about a century after the
5-limit was introduced in song.

I expect a number of theorists noticed non-meantone temperings of
the chain of 5ths (Newton ... Helmholtz), but the first thorough
inventory of linear temperaments with 5th-based generators I know
of is due to Bosanquet (1876).  Fokker was apparently the first
to understand tempering in general as reducing the rank of just
intonation.  Erv Wilson was apparently the first to explore
linear temperaments with non-5th generators.  Then came the recent
explosion you mention, which took place on this mailing list, in
which the full universe of temperaments was delineated (no pun
intended).  Paul Erlich, Paul Hahn, Graham Breed, Dave Keenan,
and Gene Ward Smith were especially instrumental in this between
1997 and 2006.  Most of these systems remain untested in musical
practice, though more than a few have been demonstrated.

 * I'm unaware of direct cross-pollination between these schools
prior to Gershwin.  It may have been taking place from the
outset, or it may be a case of parallel evolution.


 Q: The other day, somebody posted

> name: beatles
> comma: [19 -9 -2>  524288/492075
> mapping: <1 1  5]
>          <0 2 -9]
> TOP period: 1197.1
> TOP generator: 354.7
> MOS: 1+1, 1+2, 3+1, 3+4, 7+3, 10+7

and my eyes glazed over.  What does all this mean?

 A: It identifies a temperament called "beatles".  You may be
familiar with meantone temperament, which is generated by a chain
of approximate 3:2 "fifths" reduced by 2:1 "octaves".  These two
intervals are said to be the generator and period, respectively,
of meantone temperament.

Actually there is a range of sizes of approximate fifths and
octaves that will produce meantone.  You may have heard of
"quarter-comma meantone", which refers to the meantone tuning
obtained when octaves are pure and fifths are one quarter of a
syntonic comma flat.

Above, we are told that beatles temperament has a "TOP" period of
1197.1 cents (an approximate octave) and TOP generator of 354.7
cents (a neutral third).  TOP stands for Tenney OPtimal.  It's
an optimization procedure that's supposed to spit out the best
generator and period for any temperament.  TOP meantone has a
fifth of 697.6 cents and an octave of 1201.7 cents.

We know that in meantone, the pentatonic and diatonic scales are
usually preferred over the hexatonic or octatonic scales.  We
also know that 5 + 7 = 12, 7 + 12 = 19, and 12 + 19 = 31, and
that these are all equal temperaments which support meantone.
This isn't a coincidence.  It's caused by a recurrence relation
characteristic of meantone temperament.  Every temperament has a
recurrence relation that tells us which ETs, and scales within
those ETs, are likely to be fruitful.  The scales in these
patterns were dubbed MOS (Moments Of Symmetry) by Erv Wilson.*
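
One way to see the pattern is to stack the generator, reduce by
the period, and look for scale sizes that have exactly two step
sizes.  The Python sketch below does that; treating "two step
sizes" as the test for MOS is a simplification adopted for the
example, and the function name is made up.

```python
def mos_sizes(generator, period, max_notes):
    """Scale sizes whose generator chain has exactly two step sizes."""
    sizes = []
    for n in range(2, max_notes + 1):
        notes = sorted((k * generator) % period for k in range(n))
        upper = notes[1:] + [notes[0] + period]
        steps = {round(b - a, 6) for a, b in zip(notes, upper)}
        if len(steps) == 2:
            sizes.append(n)
    return sizes

# TOP meantone: fifth of 697.6 cents, octave of 1201.7 cents
print(mos_sizes(697.6, 1201.7, 31))
# TOP beatles: generator 354.7 cents, period 1197.1 cents
print(mos_sizes(354.7, 1197.1, 17))
```

For meantone the sizes run 2, 3, 5, 7, 12, 19, 31, and each size
from 5 onward is the sum of the two before it.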

You may know that the syntonic comma, 81/80, vanishes in meantone
temperament.  If you perform a chord progression that has a net
modulation of 81/80 in just intonation, it will bring you back to
unison in meantone.  Well, beatles temperament does the same
thing to 524288/492075.  That comma's a bit big as a fraction, so
it's easier to factor it as 2^19 * 3^-9 * 5^-2 and write it as
[19 -9 -2>.  This shorthand is borrowed from "bra-ket" notation in
physics, and is called a "monzo" after Joe Monzo, who promoted
the use of prime-factor notation.  The thing to remember is that
monzos represent JI intervals and they always point to the right.
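
Converting a monzo back to a fraction is a one-liner per prime.
A small Python sketch (monzo_to_ratio is a name invented here):

```python
from fractions import Fraction

PRIMES = (2, 3, 5, 7, 11, 13)

def monzo_to_ratio(monzo):
    """Multiply out prime exponents, e.g. [-4 4 -1> gives 81/80."""
    ratio = Fraction(1)
    for prime, exponent in zip(PRIMES, monzo):
        ratio *= Fraction(prime) ** exponent
    return ratio

print(monzo_to_ratio([19, -9, -2]))
print(monzo_to_ratio([-4, 4, -1]))
```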

Beatles is a 5-limit rank 2 temperament, and all 5-limit rank 2
temperaments can be uniquely identified by a single comma that
they temper out (things get more complicated above the 5-limit).

The mapping rows give us two vals for beatles -- this is another
way to uniquely identify it.  Vals are like monzos except they
always point to the left, and instead of representing JI
intervals in terms of prime factors, they represent tempered
intervals in terms of periods or generators.  In this case, to
get an octave (2/1) out of beatles we need to travel +1 period
and +0 generators.  In other words, the period of beatles is an
octave.  To get a 3/1, we must travel +1 period and +2
generators.  So the generator must be some kind of half-fifth.
Lo and behold, that agrees with the TOP period and generator we
looked at earlier.
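
Applying the mapping to a monzo is just a pair of dot products.
The Python sketch below (function names are made up) checks that
the beatles comma maps to zero periods and zero generators, and
works out the size of the tempered 3/2 from the TOP values quoted
above.

```python
BEATLES_MAP = ((1, 1, 5),    # periods needed for primes 2, 3, 5
               (0, 2, -9))   # generators needed for primes 2, 3, 5

def apply_mapping(mapping, monzo):
    """Return (periods, generators) for a JI interval given as a monzo."""
    return tuple(sum(v * m for v, m in zip(val, monzo)) for val in mapping)

def tempered_cents(mapping, monzo, period, generator):
    periods, generators = apply_mapping(mapping, monzo)
    return periods * period + generators * generator

comma = (19, -9, -2)         # 524288/492075
fifth = (-1, 1, 0)           # 3/2
print(apply_mapping(BEATLES_MAP, comma))
print(apply_mapping(BEATLES_MAP, fifth))
print(tempered_cents(BEATLES_MAP, fifth, 1197.1, 354.7))
```

The 3/2 comes out as two generators, which is the "half-fifth"
reading of the generator.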

For comparison, here's the entry for meantone:

> name: meantone
> comma: [-4 4 -1>  81/80
> mapping: <1  2  4]
>          <0 -1 -4]
> TOP period: 1201.7
> TOP generator: 504.1
> MOS: 1+1, 2+1, 2+3, 5+2, 7+5, 12+7, 19+12

 * Aspects of this concept were independently discovered by each
of the following theorists: Balzano, Carey, Clampitt, Clough,
Myhill, and Yasser, but only Yasser's work predated Wilson's and
none of them seem to have understood the phenomenon as deeply.


 Q: What is "meantone"?  Is 24-ET a meantone temperament?

 A: "Meantone" is the 5-limit rank 2 temperament with 81/80 in
its kernel.

24-ET, as a temperament, is rank 1.  So it's not meantone.  And
even if we made an exception, the meantone-like val you'd use in
24 has torsion so it's not a valid temperament anyway.

As a scale, however, there may be an acceptable meantone tuning
within 24-ET.  For a scale to support meantone temperament, it
must satisfy the map

< 1  2  4 ]
< 0 -1 -4 ]

And a good scale should satisfy it so that the most accurate
approximation in the scale for each prime is the one mapped.  In
24-ET, if we choose 24 steps & 10 steps as generators, the above
mapping reduces to

< 24 38 56 ]

That is, we get 700 cents for 3:2 and 400 cents for 5:4,
which are the best approximations available.  So as a scale,
24-ET supports meantone temperament.
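
The reduction can be checked with a short Python sketch.  The
function name et_val is invented for the example.  The common
factor it reports is what the torsion remark above refers to.

```python
from functools import reduce
from math import gcd

MEANTONE_MAP = ((1, 2, 4), (0, -1, -4))

def et_val(mapping, period_steps, generator_steps):
    """Steps of the ET assigned to primes 2, 3, 5 by the mapping."""
    periods, generators = mapping
    return [p * period_steps + g * generator_steps
            for p, g in zip(periods, generators)]

val = et_val(MEANTONE_MAP, 24, 10)
step = 1200 / 24                          # one step of 24-ET in cents
print(val)
print((val[1] - val[0]) * step)           # 3:2 in cents
print((val[2] - 2 * val[0]) * step)       # 5:4 in cents
print(reduce(gcd, val))                   # common factor of the val
```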


 Q: What are comma shift, comma drift, and comma pumps?

 A: Comma pumps are chord progressions that require either comma
drift or comma shifts to perform in strict just intonation.

Drift occurs when concert pitch is allowed to rise (or fall) each
time the pump repeats.  Shift is where, instead, common tones
between adjacent chords are adjusted to keep the concert pitch
stable, introducing tiny melodic steps.

In adaptive JI, the shifts are allowed to be irrational
(example: 1/4 syntonic comma).  It is essentially the comma shift
solution, but we are allowed to divide the shifts into even
smaller steps (which may be imperceptible) and distribute them
more widely.
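
Drift is easy to demonstrate numerically.  The Python sketch
below uses the progression C - Am - Dm - G - C as its pump; that
choice is mine, picked because it is a commonly cited example.
Each chord keeps its common tone with the previous one, so the
root moves by the pure intervals listed.

```python
import math
from fractions import Fraction

ROOT_MOVES = [Fraction(5, 6),   # C down a minor third to A
              Fraction(4, 3),   # A up a fourth to D
              Fraction(4, 3),   # D up a fourth to G
              Fraction(2, 3)]   # G down a fifth to C

def drift_after(passes):
    """Ratio of the final C to the starting C after repeating the pump."""
    pitch = Fraction(1)
    for _ in range(passes):
        for move in ROOT_MOVES:
            pitch *= move
    return pitch

for passes in (1, 4):
    ratio = drift_after(passes)
    print(passes, ratio, round(1200 * math.log2(float(ratio)), 1))
```

Each pass lowers the pitch by a syntonic comma (81/80).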


 Q: What's a "pun"?

 A: A pun is a surjection


The farther apart 3 & 4 are, the greater the pun.  In music
theory, the thing on the left is JI, the thing on the right is a
temperament, and 3 & 4 would be far apart in cents.  In comedy,
3 & 4 would be far apart in sense.  Nyark.


 Q: Why do good temperaments tend to temper out superparticular
commas?  Meantone tempers out 81/80, miracle 225/224, and so on.

 A: Temperaments are good if they provide many consonances with
few notes, without introducing a lot of tuning error.  Tempering
out a comma makes it a "unison vector", which cuts the infinite
prime-limit lattice into periodic sections.  The simpler a
comma's ratio, the closer it lies to the lattice's origin and the
narrower the periodic slices it will define.  So "few notes" is
satisfied by simple commas because they provide access to the
entire lattice via periodic slices or tiles containing fewer
lattice points.  Low tuning error, on the other hand, is
satisfied by commas that are small in magnitude, since it is
their magnitude that will vanish in the temperament.  It's
straightforward that superparticular ratios tend to have the
smallest magnitude for a given complexity.
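
A brute-force Python sketch illustrates the last point.  It
assumes complexity is measured by the product n*d, and it ignores
prime limits, so it is a simplification of the real situation.
The function names are invented for the example.

```python
import math
from fractions import Fraction

def cents(ratio):
    return 1200.0 * math.log2(float(ratio))

def smallest_comma(max_product):
    """Smallest ratio above 1/1 whose n*d does not exceed max_product."""
    best = None
    for d in range(1, math.isqrt(max_product) + 1):
        for n in range(d + 1, max_product // d + 1):
            if math.gcd(n, d) == 1:
                r = Fraction(n, d)
                if best is None or r < best:
                    best = r
                break  # larger n only gives larger ratios for this d
    return best

print(smallest_comma(81 * 80), round(cents(Fraction(81, 80)), 2))
print(smallest_comma(225 * 224), round(cents(Fraction(225, 224)), 2))
```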


 Q: I've played around with creating my own scales, but none of
them quite sound as 'simple' as the diatonic scale in 12-ET.
Why is that?

 A: There are two primary kinds of simplicity worth mentioning
when it comes to scales.

First, there's something about the total number of
intervals in a scale.  Usually one can just look at all the 2nds
of the scale (e.g. 9/8 10/9 etc.) and get a good idea of the
total number of unique intervals.  A scale with only one kind of
2nd can only have one kind of anything else (is an ET).  With two
kinds of 2nd the order matters; LsLsLs... gives only 2 kinds of
3rd, but LLssLLss gives three kinds of 3rd (2L, L+s, and 2s).
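
Counting interval kinds from a step pattern is easy to automate.
In the Python sketch below (the function name is made up), each
kind is recorded as a pair (number of L, number of s).

```python
def interval_kinds(pattern, span):
    """Distinct (L count, s count) pairs for intervals of `span` steps."""
    n = len(pattern)
    kinds = set()
    for start in range(n):
        steps = [pattern[(start + k) % n] for k in range(span)]
        kinds.add((steps.count("L"), steps.count("s")))
    return kinds

print(interval_kinds("LLssLLss", 2))   # kinds of 3rd
print(interval_kinds("LLsLLLs", 2))    # kinds of 3rd, diatonic scale
```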

Scales with fewer unique intervals seem to me to have a more
'coherent' sound, for lack of a better word.  Several authors on
the tuning list have noticed this over the years -- such scales
seem to be more 'singable'.  On the other hand, having a high
number of different intervals (as many JI scales do) also imparts
a unique sound that is certainly valid fodder for music-making.

The second kind of simplicity has to do with symmetry at the 3:2.
Octave-equivalent scales are fully "closed" at the octave.  This
means that if you sing a note an octave away from a pitch in the
scale you'll wind up on another pitch in the scale.  There's some
evidence that a similar effect may be important for the next most
powerful consonance after the octave, 3:2.  But no scale can be
fully closed with respect to both the octave and the 3:2 (this
would mean some number of fifths came out equal to some number
of octaves, which we know is impossible).  So octaves get first
dibs, then we do the best we can with 3:2s.  Pythagorean chains
will be optimal here.  After that, classical tetrachordal scale
construction gives good results.


 Q: What's "Rothenberg propriety"?  What's a "rank-order matrix"?

 A: David Rothenberg is a mathematician who published a series
of papers on the perception of melody and the construction of
musical scales in the 1970s.

He starts with the notion that a melody is not only a series of
pitches (e.g. E3, D3 etc) but also a series of intervals.  And
not only specific intervals (5:4, 9:8 etc) but also scalar
intervals (3rd, 2nd etc).  The scale is the language of the
melody, in other words.  Evidence for this notion includes the
fact that diatonically transposing a melody (e.g. to the
relative minor) preserves a great deal of its essence.

He further assumes that listeners do not carry out something
like: "I'm hearing a 5:4.  I know all 5:4s are 3rds, therefore I
am hearing a 3rd."  The process is assumed to be something closer
to: "This interval is smaller than the previous one. Therefore
if the previous interval was a 4th, this ought to be a 3rd."

R. claims that in order for this to work, the scale has to have
the property that no 2nd is larger than any 3rd, and so on.  Such
scales are called "proper".  Improper scales are predicted to be
unsuitable for diatonic-like music, and instead must be used
against a drone.

We can show all the intervals of a scale by listing all modes of
the scale on consecutive rows of a matrix, such that scale
degrees are shown in columns.  Doing this for the diatonic scale,
we indeed find that all 5:4s are in the 3rds column.  But since
only relative sizes matter in Rothenberg's regime, we can replace
each interval with its size rank among all the intervals in the
matrix.  You can see this in Scala by choosing:

	View > Interval Ranking Matrix.
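
For readers without Scala, the construction can be sketched in a
few lines of Python.  This is only an illustration: the function
names are made up, ranks are assigned densely starting from 1
(equal sizes share a rank), and Scala's own display may follow a
different convention.  Scales are lists of step sizes in ET
degrees.

```python
def interval_matrix(steps):
    """Row i is the mode starting on degree i; column k holds the
    size of the (k+1)-step interval above that mode's root."""
    n = len(steps)
    rows = []
    for i in range(n):
        size, row = 0, []
        for j in range(n - 1):
            size += steps[(i + j) % n]
            row.append(size)
        rows.append(row)
    return rows

def rank_order_matrix(steps):
    """Replace every interval by its size rank among all the
    distinct interval sizes in the matrix."""
    matrix = interval_matrix(steps)
    sizes = sorted({x for row in matrix for x in row})
    rank = {s: r for r, s in enumerate(sizes, start=1)}
    return [[rank[x] for x in row] for row in matrix]

diatonic_19 = [3, 3, 2, 3, 3, 3, 2]   # diatonic scale in 19-ET
diatonic_31 = [5, 5, 3, 5, 5, 5, 3]   # diatonic scale in 31-ET
diatonic_12 = [2, 2, 1, 2, 2, 2, 1]   # diatonic scale in 12-ET

same = rank_order_matrix(diatonic_19) == rank_order_matrix(diatonic_31)
print("19-ET and 31-ET diatonic share a matrix:", same)
```

The 12-ET diatonic gets a different matrix from these two,
because its augmented 4th and diminished 5th are the same size
and so tie in rank.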

Rothenberg claims that if two scales have the same rank-order
matrix, they will tend to be heard as different tunings of the
same thing.  Rank-order matrices thus function like scale
classes.  It's a powerful idea whose implications remain largely
unexplored.


 Q: I've read that melodies are easier to sing in scales of low
Rothenberg mean variety.  Is this true?

 A: There's a plausible reason why it may be true.  Humans are
generally able to recognize transposition.  As the number of
different intervals in a scale goes down, the number of excerpts
of a melody that are identical under transposition goes up.  In
12-ET, the phrase D-E-F-A-B-C consists of congruent sections
D-E-F and A-B-C.  In the 5-limit diatonic scale this symmetry is
broken (10/9 - 16/15 vs. 9/8 - 16/15).  Generally, it should be
easier to sing something if you've sung it before.
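
The D-E-F / A-B-C comparison can be verified with exact
fractions.  A small Python sketch, assuming the usual 5-limit
major scale on C (1/1 9/8 5/4 4/3 3/2 5/3 15/8 2/1) is the
"5-limit diatonic scale" meant here; the names JUST, ET12 and the
two helper functions are invented for this example.

```python
from fractions import Fraction as F

# Assumed 5-limit diatonic scale on C, as ratios above C.
JUST = {"C": F(1), "D": F(9, 8), "E": F(5, 4), "F": F(4, 3),
        "G": F(3, 2), "A": F(5, 3), "B": F(15, 8), "C'": F(2)}

# The same notes in 12-ET, as semitones above C.
ET12 = {"C": 0, "D": 2, "E": 4, "F": 5,
        "G": 7, "A": 9, "B": 11, "C'": 12}

def steps_just(names):
    """Ratios between consecutive notes of a phrase."""
    return [JUST[b] / JUST[a] for a, b in zip(names, names[1:])]

def steps_et(names):
    """Semitone steps between consecutive notes of a phrase."""
    return [ET12[b] - ET12[a] for a, b in zip(names, names[1:])]

lower, upper = ["D", "E", "F"], ["A", "B", "C'"]

print(steps_et(lower), steps_et(upper))      # identical in 12-ET
print(steps_just(lower), steps_just(upper))  # 10/9 vs. 9/8 first
```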


 Q: Why is MOS important?

 A: The simplest answer is that it may not be terribly important.
Nevertheless, the majority of music is made with MOS scales like
the pentatonic and diatonic/natural minor, rather than non-MOS
scales like the harmonic minor.  One explanation is that MOS may
just happen to have the 'right' amount of symmetry, for both
harmony and melody.

Approaching from the harmony side of things, MOS scales are
embodiments of rank 2 temperaments.  Given a just intonation of
rank n, we can temper down to rank n-1, n-2, ... and finally to
rank 1 (an equal temperament).  So we might ask, why is rank 2
important?

That raises the question, why is temperament important?  And
again, maybe it isn't.  But it is useful.  It reduces the number
of notes one needs to do certain things musically.  That makes
notation easier to think about, instruments easier to build and
play, and effects like puns possible.  Temperament also seems
quite natural in many cases... for example, choirs tend to
spontaneously ignore 81/80 when performing 5-limit homophony.
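
For anyone unsure where 81/80 comes from, here is the arithmetic
as a short Python sketch (variable names are illustrative).  Four
pure 3:2 fifths up and two octaves down give a major 3rd of
81/64; the 5-limit major 3rd is 5/4.  The two differ by 81/80,
and a temperament that ignores this difference can use one note
for both.

```python
from fractions import Fraction
from math import log2

# Four 3:2 fifths up, then two octaves (a factor of 4) down.
pythagorean_third = Fraction(3, 2) ** 4 / 4   # 81/64
just_third = Fraction(5, 4)

comma = pythagorean_third / just_third        # 81/80
comma_cents = 1200 * log2(comma)

print(comma, "is about", round(comma_cents, 1), "cents")
```

The difference is about 21.5 cents.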

There is thought to be a limit to the number of tones that can be
used in a melody.  George Miller's famous "seven, plus or minus
two" suggests a ceiling of about 9 notes/oct.  But ETs with fewer
than 10 notes don't get us very far in tuning accuracy.  The
obvious thing is to try higher-rank
systems, but then we start to lose the benefits of temperament
mentioned above.  Rank 2 seems to strike a balance.  5-limit JI
is itself rank 3, so rank 2 is the least degree of temperament
available without adding more primes.

One could draw a melody from a small scale and harmonize it
chromatically.  But aside from the fact that it can be hard to
keep the chromatic steps out of the melody, there are purely
melodic arguments why the small scale should be MOS anyway.

From the melody side of things, MOS scales are scales with
Myhill's property -- every interval class comes in exactly two
sizes.  Again, this seems to be a good amount of symmetry.  ETs
can sound too symmetrical.  It can be hard to keep one's place in
the scale, the modes tending to blend together.  Already though,
rank 3 scales can be hard to sing, because one has to articulate
different steps of similar size in a certain order.  Motivically,
it can be too *difficult* to lose one's place in the scale, to
establish a new mode...  Imagine failing to recognize a motif
because it appears in a mode too dissimilar from the one in which
it was first presented.  In the harmonic minor for instance, the
upper tetrachord is quite different from the lower one (which may
be why the melodic minor was developed).
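
Myhill's property is simple to test by machine.  The sketch below
is a hedged illustration in Python, with invented function names
and scales measured in 12-ET semitones.  It builds scales by
stacking a generator (here the fifth, 7 semitones) within a
period and checks whether every interval class comes in exactly
two sizes.

```python
def generated_scale(generator, period, n):
    """Stack `generator` n-1 times, reduce by `period`, sort, and
    return the resulting consecutive step sizes."""
    notes = sorted((i * generator) % period for i in range(n))
    return [b - a for a, b in zip(notes, notes[1:] + [notes[0] + period])]

def sizes_per_class(steps):
    """Map each interval class k to the set of sizes it takes."""
    n = len(steps)
    return {
        k: {sum(steps[(i + j) % n] for j in range(k)) for i in range(n)}
        for k in range(1, n)
    }

def has_myhill_property(steps):
    """True if every interval class has exactly two sizes."""
    return all(len(s) == 2 for s in sizes_per_class(steps).values())

for n in (5, 6, 7, 12):
    steps = generated_scale(7, 12, n)
    print(n, steps, has_myhill_property(steps))

harmonic_minor = [2, 1, 2, 2, 1, 3, 1]
print("harmonic minor:", has_myhill_property(harmonic_minor))
```

In this setup the 5- and 7-note chains pass, the 6-note chain
fails (three sizes of step), and the full 12-note chain fails for
the opposite reason: as an ET, each class has only one size.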

Most of the popular scales in common practice music that aren't
MOS are 'near-Myhill'.  The harmonic minor, neglected as it is,
still has only two sizes of 3rd/6th.  Likewise, the octatonic
scale is slightly MORE symmetrical than a MOS, since some
interval classes have only one size.  This leads to Rothenberg's
"mean variety", which is the average number of sizes per interval
class.  It appears that, for music using both tonal and modal
effects, values near 2 are best.
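
Mean variety as defined above can be computed directly.  A
minimal Python sketch follows.  It assumes the average is taken
over the interval classes from 2nds up to 7ths (or their
equivalents), leaving out the unison and the octave, and that
scales are given as 12-ET step patterns; mean_variety is a
made-up name.

```python
from fractions import Fraction

def mean_variety(steps):
    """Average number of distinct sizes per interval class,
    over classes 1 .. n-1 of an n-note periodic scale."""
    n = len(steps)
    counts = [
        len({sum(steps[(i + j) % n] for j in range(k)) for i in range(n)})
        for k in range(1, n)
    ]
    return Fraction(sum(counts), len(counts))

scales = {
    "diatonic": [2, 2, 1, 2, 2, 2, 1],
    "harmonic minor": [2, 1, 2, 2, 1, 3, 1],
    "octatonic": [1, 2, 1, 2, 1, 2, 1, 2],
    "12-ET chromatic": [1] * 12,
}

for name, steps in scales.items():
    print(name, mean_variety(steps))
```

Under these assumptions the diatonic scale comes out at exactly
2, the harmonic minor at 8/3, the octatonic at 11/7, and the
chromatic scale at 1.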