One of the best-known and most ambitious music programs for Linux
is the LilyPond score engraving
system. Unlike other typesetting software like Finale or Sibelius,
LilyPond is not a score editor, and it has no GUI — instead it
aims to start from a simple textual description of the music and turn
it into the highest possible quality output, automatically.
LilyPond is the result of several years of work by Han-Wen Nienhuys
and Jan Nieuwenhuizen. In this extensive interview, Linux Musician's
Chris Cannam talks to them about
recent and future directions for the project.
Chris: I recently found a file of music examples
I had printed out from LilyPond, probably in 1998. The LilyPond
printouts looked less professional than they would today, but many
of the capabilities of today's software were in place. What have you
been doing for the last six years?
Han-Wen: About five years ago we were working up
to release 1.0. Our target was to have a usable program that could
produce basic music notation, where we defined “basic” as
“whatever is in our set of simple test pieces”, and usable
was “will not dump core, mostly.”
We succeeded, but of course it didn't work very well
for things that weren't in our test-pieces. By that time, we were also
reaching the bounds of what was possible in our model of notation, an
object-oriented model, hard-coded in C++. So we decided to integrate
GUILE, the GNU project's Scheme interpreter, which was specifically
designed for extending programs. We spent the next two to three years
refactoring our C++ code into Scheme functions. This resulted in a
more flexible, more efficient and more maintainable program.
The second big change was catalyzed by an invitation to
join a workshop in Firenze, Italy, organized by Nicola Bernardini of
AGNULA fame, then director of Centro Tempo Reale. At the
workshop we met Nicola, a few top-notch engravers, and an editor for
Universal Edition, an Austrian publisher that does a lot of
contemporary music. We had the chance to discuss LilyPond with several
experts. On the one hand, we were thrilled that they took us
seriously, but on the other hand they pointed to several inadequacies
in our output. We arrived back home a great deal wiser.
We knew what “publication quality”
engraving meant, and were determined to perfect Lily into producing
that. Since we like hand-engraved music, we started reproducing simple
pieces in LilyPond and comparing the output side-by-side. By doing
close comparisons, we learned how music should really look, and we
fixed all the deficiencies that we found.
In anything that you write, there will always be a
neat, simple, small idea that is obscured by crufty implementation,
bad design or suboptimal algorithms. In my view, the real art of
programming is recognizing the neat idea, and being ruthless enough to
redo all the other bad bits. Since we're writing new code all the
time, we also have to keep refactoring everything, and this is how we
have spent the last few years: coding new stuff, and refactoring old
stuff.
We also did a lot with the documentation. Some of our
users complain about the current documentation, and they're probably
right, but what we have now is light-years ahead of the manual of a few
years ago.
Your website features an essay on
music typesetting that is quite critical of other software, with
an entertaining piece of bad typesetting from Finale. You make an
effort to explain that it isn't just an exceptional example —
but surely if programs like Finale and Sibelius are so widely used by
good musicians, they can't really be that bad?
The default output of Finale is indeed shockingly bad,
which is why almost all other vendors routinely compare their packages
to Finale. Of course, that's why we use it too. The default layout of
Sibelius is not very elegant, but at least it's usable. A Sibelius
sample would be a less entertaining and less convincing
demonstration.
A lot of readers misinterpret the Finale
example. Finale is a powerful package, and in the hands of a good
engraver — which is something different from a good musician
— it can produce very good scores. However, the friendly GUI is
misleading: you need a lot of time and expertise to get decent scores
from Finale. We've seen the engraver for the late Luciano Berio in
action, and he uses Finale exclusively, but only as a glorified
drawing program.
We recently
interviewed the music engraver Mike Mack Smith, who uses a package
called Amadeus. How do you feel LilyPond stands up against venerable
professional packages like Amadeus or SCORE technically?
For development purposes, we look at hand-engraved
scores. Those are our “gold standard”, so looking at SCORE
or Amadeus output is of no use to us. I do have a big binder with all
the sample printouts that I could lay my hands on, including Amadeus
and SCORE. In terms of default graphical quality LilyPond handily
beats both of them, but that's an unfair comparison. Like Finale,
Amadeus and SCORE are very powerful packages, but they do not mislead
users with a friendly GUI. They are aimed at professionals that need
to create perfect prints in a short time-span.
LilyPond is also focused on making perfect prints, but at the same
time, the software should do The Right Thing. Sometimes we have to
disappoint users, because implementing the Right Thing can take a long
time.
Speaking of doing the right thing, your essay on
typesetting contains an
example from LilyPond that I'd find unacceptable because it has an
arpeggio mark too close to a note. You'd expect that to be easy to
fix in a GUI program, because the arpeggio is a graphical object:
select it and move it. In LilyPond it's less obvious what to do. It
looks as though you're making things harder for yourselves by
concentrating on an impossible problem (knowing exactly what the user
means) in preference to a simpler, widely applicable one (making it
easier for the user to refine what they mean).
Concentrating on hard problems instead of simple ones is one of the
things that makes life interesting for me.
In LilyPond an arpeggio is also a graphical object, and
correcting this mistake is actually very similar to the GUI
approach: select the object, Arpeggio, and move it:
\override Arpeggio #'extra-offset = #'(0.5 . 0.0)
This moves the symbol half a staff space to the right,
in LilyPond 2.1 syntax.
The difference is not conceptual, but practical:
without a GUI, you have to figure out the exact textual command, and
where to insert it in the input file.
Is this an example of what you meant by disappointing
users — by making simple things a bit harder to do —
because implementing the Right Thing can take a long time?
I was going to say that it's not at all what I
meant, but on second thought, it's actually not such a bad example. I
wrote positioning code which assumes that the wiggle is left of the
chord. The correct fix is to use a more general positioning mechanism,
which I've now done. So now you can also tune the amount of space
between the chord and the arpeggio with
\override Arpeggio #'padding = #<..a number..>
This is a specific illustration of the general
situation: a user complains of something that isn't working well
(arpeggio positioning), and wants to have a quick and dirty fix
(moving objects about manually). From my perspective, hand-holding a
user and teaching him to do obscure manual tweaks is a bad idea: it
takes a lot of my time, and it doesn't help the next guy that runs
into the same problem. In effect, it wastes my time.
The proper solution is to fix this problem once and for all, make
the solution available to everyone else, and make sure that everyone
can also find it. In this case, it amounts to a small change in the
C++ sources. Often, the proper thing is writing or polishing
documentation. From my perspective, a user asking a question is an
indication that the documentation is still lacking.
Unfortunately some problems are either too specific for
me to spend a lot of time on, or too difficult to solve well. As
an example of the latter, the formatting of slurs is still far from
optimal in LilyPond. I expect that this module will have to be
rewritten, but that is a big long-term project, so I have to
disappoint users that complain of the shape of their slurs.
You place a lot of emphasis on matching existing
hand-engraved scores. Isn't it possible that rather than creating a
program that can do “correct” musical typesetting, you are
in fact training a program to specialise in imitating
Bärenreiter editions?
We like Bärenreiter especially,
but we have worked from Breitkopf & Härtel and Ed. Peters samples
as well. We wouldn't mind having a “Bärenreiter”
score generator, but in practice it's not really feasible. There are
compromises in the layout when it comes to selecting weights,
distances and glyph shapes, and these compromises vary from score to
score — I suppose it depends on who did the plate engraving. In
the end, we have to find our own compromise, such that LilyPond output
is consistent with itself in all typographical aspects. Regarding the
Solo Cello Suites (Bärenreiter BA 350, our engraving bible), I think
we have a slightly heavier look, which is probably related both to
personal preferences and our black note head, which is more elongated
and therefore heavier.
How far do you feel able to judge quality of output
separately from your own personal typographic preferences?
After several years of looking at scores from a
typographical perspective, Jan and I definitely feel
qualified. Our belief is reinforced when we talk to professionals:
they take us seriously, and share many of our views. In addition, we
have reached a point where we can sometimes spot subtle errors in
their engraving work.
Users may get fuzzy feelings from this knowledge, but
of course it doesn't really buy them much. That's why we try to
document as much typographical knowledge as possible. This
information is contained in comments to the program and font source
code, and especially in our regression-test
collection: we have a set of LilyPond source files that test every
aspect of the typesetting engine. The comments to those files document
our beliefs when it comes to proper typesetting.
Do you usually feel that when you've settled on something and
encoded it as your preferred test output, that decision is sound and
final?
Yes, I often feel like that, but after a while, users
always pop up with obscure examples where our approach fails. If
possible, I try to enhance Lily to deal with their complex cases
too.
So how open is musical typesetting to personal
tastes?
Jan: I think that a lot of personal taste goes into
tiny details. Professional engravers sometimes design their own font:
just enough to add their “fingerprint” to an engraved
score, but subtle and tasteful enough not to annoy the trained eye. I
think engravers agree on the “big things”, which is where
the challenge still is for notation software.
You drew your own font for LilyPond, didn't you? Which
symbols caused the most trouble?
Han-Wen: Yes, in a galaxy far away, and long
ago, LilyPond used the MusiXTeX font. But we were unable to get a
licence for the font that was as permissive as we needed for the rest
of LilyPond, so we started writing our own font. We started out with
the basics (note heads, accidentals), and gradually replaced all the
symbols over time.
I find the most elegant symbols the most difficult to
draw. In particular, I have put a lot of work into the flat symbol
and the G clef. When I prepare slides for a presentation on Lily, I
show some glyphs in magnification. Usually, I end up tweaking the
parameters of those glyphs. As you can guess, I'm still not satisfied
with them. For example, the G clef has a nice combination of poise and
flourish, but the bottom crook is still out of balance.
I expressly say we started “writing” a font
instead of “drawing”, since the font is also a program,
written in METAFONT, Donald Knuth's system for designing fonts. The
nice thing about METAFONT is that the font is parameterized, making it
quite easy to have slightly different shapes depending on the design
size. If you look closely at good music prints, then you will notice
that smaller print (i.e. smaller staff sizes) uses heavier staff
lines. Of course since the music font must match the staff size,
smaller fonts should be comparatively chubbier. With METAFONT we have
successfully implemented this in the LilyPond development series, and
I think that LilyPond is the only engraving system that sports a music
font in several design sizes.
Do you know of many people using LilyPond professionally?
Not many, but there are people who do the
odd paid engraving job. I estimate that it will take another few years
before we will start to see LilyPond scores from major publishers. We
do see that LilyPond use is rising everywhere, not only from our web
and email statistics, but also because I bump into more and more users
in Real Life, for example at orchestra rehearsals.
Your website says you're available to do paid work based on your
LilyPond expertise. Have you had much interest?
No, but I haven't solicited it actively.
What sort of work can you imagine people needing?
We added the remark [about paid work] to the FAQ
partly on impulse, and partly to see if it is possible to make a
part-time job out of LilyPond consultancy. I have to admit that we
haven't thought very deeply about what kind of services we could
deliver.
One advantage is that it means we have a valid argument
to ignore feature requests that aren't generally useful. In the past
we implemented things because they seemed “cool to have”:
some of our conversion utilities, and “easy notation”
note-heads (they have the name of the note printed inside the head).
After the work was done we might find that the person requesting it
had disappeared and wasn't so interested after all. Nowadays we would
request a fee for such things, and that would sift the pie-in-the-sky
dreamers from users with serious needs.
If my hypothetical small typesetting company tried out
LilyPond and needed some support or training — would you be
available to do that?
Typesetting companies have an obligation to deliver, so
you probably would not base a company on LilyPond unless you were sure
that you yourself were capable of making LilyPond deliver. Some users
do their engraving professionally in LilyPond only because they feel
sufficiently comfortable with the low-level commands for tuning
output.
As Mike Mack Smith explained, the engraving business
has tight margins and I suppose that individual engravers don't have
much money to spare for consultancy. From our point of view, it means
that we have to figure out who else would want to fund work on
LilyPond.
Broadly speaking, LilyPond offers an open-source/free
software solution to music document production, archival and analysis.
Parties that have interests in these areas are potential
sponsors. Music technology research groups might be interested in a
system for storing and producing musical documents; libraries might be
interested in infrastructure to build digital on-line libraries.
There are foundations to stimulate general participation in and
production of art: I imagine some of them might find the effect of
LilyPond on performing arts worthwhile. Big publishers can save money
if typesetting is done more efficiently, which LilyPond could do for
them. Since they have more money to invest in such long-term
projects, they might also be an interested party. In some far away
future, I hope that any or all of these institutions would fund
further work on Lily.
Most music being written and listened to today is not
distributed in written form, and a lot of (for example) hit songs make
pervasive use of things like samples and effects that can't be
recreated effectively from a written representation. And of course
music can be very easily distributed electronically at any stage of
completeness. Do you think traditional written music has any future
at all, other than to communicate with the past?
Your question touches upon another issue: most people, especially
non-musicians, see music as a product that takes the form of
electrically generated sound waves, stored on hard disks or shiny
discs, while to you and me, music is a way of expressing ourselves.
I think that the real question should be “Does
live performance have a future?” I think it has: making music is
deeply satisfying, all the more when done in front of an audience, and
listening to live music is also much better than to recorded music, if
only because it forces listeners to focus their attention.
Written music, i.e. sheet music, is crucial for all
music that does not have a simple structure — basically anything
besides light music — so I don't see written music going away.
In fact one of the motivations behind putting so much energy into
LilyPond is giving written music a better future. Some day far away,
LilyPond will be “done”, and then the Mutopia Project can really
take off. I hope that the availability of good software for publishing
music will lead to more music being accessible, and also to more newly
written music.
There is still a lot of work that can be done in that
area. No software caters for the needs of “classical”
composing. Of course there are various sequencers, and packages like
Finale and Sibelius (and of course, Rosegarden!), but they cater for
the relatively simple forms of light music.
Even in the age of computers, classical composers still
write music by filling stacks of note-paper with ideas and
fragments, and piecing those bits together into a full score. It's a
very laborious process, but computers cannot give them the same
overview as a bunch of paper fragments spread out over a desk would
do.
Are the needs of classical composing something that you
want LilyPond ultimately to address?
No, I don't think LilyPond will ever address that.
LilyPond is an exercise in reducing the musical input as far as
possible, and we have reduced the links between different music
fragments to their relationship in time: are two fragments played
sequentially, in parallel, are they repeats, are they condensed like
tuplets? This makes for a very concise and elegant format, but I
don't think it reflects how composers think of music.
To a composer, one fragment of music may have many relations to
another fragment. For example, a motive played by one instrument may be
a continuation of a melodic line in another instrument. At the same
time, the same fragment might have a function in the harmony, and be
thematically related to other motives.
In a music composition system, motives would be entered
separately, and connected with all these relations. A complete score
is just an enormous bunch of motives, connected in many ways, and it
could be visualized in many ways. A printable part is just one view
of the score: one where all motives for one instrument are strung
together.
I heard that you play a lot of modern classical music
in ensemble yourself.
Yes, I play French Horn in the Utrechts Blazers Ensemble,
which is a student wind ensemble dedicated to 20th-century
music. Recently I also joined the VU-Orchestra, a very good amateur
symphony orchestra in Amsterdam. They play late-romantic and modern
repertoire.
Aside from the packages that manipulate it, how far is classical
notation itself capable of catering to the needs of composers? Do you
play much music that demands particular specialised notations?
The stuff in the UBE is fairly “normal”, at least when it comes to
notation. Our conductor is a big fan of John Adams, Louis Andriessen
and Stravinsky, so we play a lot of music from them and their
students. These are eclectic composers: they blend many musical styles
(ranging from medieval hoketus via French baroque to boogie-woogie)
into new pieces. The pieces are largely based on traditional music so
they look fairly normal except for the frequent time signature
changes.
I think the weirdest thing I've ever played in the UBE
is “De Volharding” (“Perseverance”), a piece written by Louis
Andriessen for the inauguration of the ensemble “De Volharding”.
It's an archetypical minimal music piece, where everyone
in the ensemble plays ad libitum from a set of ostinato patterns. The
patterns slowly change over the course of minutes in a group
process.
I had to return the parts, but the following LilyPond
notation might give an idea of what was written. It had fragments
like
f16[ g f g] \bar ":|"
"repeat approx 200 times"
"change gradually into"
g16[ f g f]
The majority of the parts that we have to
play from are rental material, and not performed very often. If they're
not classics (like Stravinsky or Poulenc), the parts are written by
hand.
Getting back to the general question of new notation: people like to
point to funky, weird modern notation as a problem area for LilyPond,
but people seem to forget that weird notation is only necessary for
weird music. Most modern music has evolved from existing old music,
and so has its notation.
In any event, it is my personal opinion that we should do only one
step at a time. First we should have a good understanding of producing
traditional notation with computers. Only then are we in the
position to explore in which direction to improve or extend notation.
MusicXML recently made
the news when Recordare announced the release of version 1.0 of the
specification. What do you think of it?
It's nice that there is finally a format that is
supported by more than one package, but I am not terribly impressed.
In my opinion, any file format that claims to be universal should
have two properties: it should have an expressive structure, so other
formats can be expressed in it, and it
should be as lean as possible, so that converting from other formats
amounts to removing information. I think that MusicXML has neither property.
I have the utopian vision of a “universal”
music format. That would be a format capable of expressing all kinds
of written music while being suitable for machine manipulation. Such
a format must not have redundant information, as that gets in the way
of manipulation.
For example, in LilyPond you can define a music fragment,
frag = \notes { c'4 d'8 e' f'2 }
and shift it by a beat, doing
newfrag = \notes {
  s4 % shift by quarter note
  \frag
}
Many other formats, including MusicXML, define a
fragment of music as a list of measures, where each measure may
contain notes. This structure gets in the way of manipulation: when
you shift a fragment by a quarter note, the bar lines and beaming
change completely.
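The point about bar lines can be sketched outside LilyPond. The following Python fragment is an illustration only, not LilyPond's internal model: the names frag, shift and bar_positions are made up for the example, durations are fractions of a whole note, and None stands for a skip. The fragment is stored without any measures; bar numbers are derived from it afterwards.

```python
from fractions import Fraction

# A fragment is a flat list of (pitch, duration) pairs.
# Durations are fractions of a whole note; None marks a skip.
frag = [("c'", Fraction(1, 4)), ("d'", Fraction(1, 8)),
        ("e'", Fraction(1, 8)), ("f'", Fraction(1, 2))]


def shift(fragment, amount):
    """Delay a fragment by prepending a skip of the given length."""
    return [(None, amount)] + list(fragment)


def bar_positions(fragment, bar_length):
    """Derive (bar number, offset within bar, pitch) for every event."""
    positions = []
    now = Fraction(0)
    for pitch, duration in fragment:
        bar, offset = divmod(now, bar_length)
        positions.append((int(bar) + 1, offset, pitch))
        now += duration
    return positions


two_four = Fraction(1, 2)  # length of one bar in 2/4 time
before = bar_positions(frag, two_four)
after = bar_positions(shift(frag, Fraction(1, 4)), two_four)
```

In 2/4 time the d' starts at offset 1/4 of bar 1 in before, and on the downbeat of bar 2 in after. The shift itself added one element to the list; a format that stores measures explicitly would have had to move notes between measures instead.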
Lean data structures for flexibility is an example of
duality, and this concept is much more general. In object-oriented
programming, base classes always have fewer data members than derived
classes, and for that reason, one can perform more operations on
them. A mathematical example of duality is C^∞, the
space of infinitely smooth functions: it is smaller than
C^0, the space of continuous functions, and therefore, more
operations can be applied to C^∞.
Aside from theoretical aspects, a lean data structure
is practically useful. Converting from MusicXML to LilyPond is rather
easy: parse the XML, discard everything but pitches and durations, and
dump those to a .ly file. It does change the problem: the more you
remove from a data format, the more advanced the software has to
become to fill in the missing details. The big problem is not so much
defining the format, but writing the software to recreate the notation
for a piece of music.
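A minimal Python sketch of that conversion, under loud assumptions: it uses only the standard library's xml.etree.ElementTree, the names DURATIONS, lily_pitch, to_lilypond and SAMPLE are invented for the example, and every note is assumed to carry a MusicXML type element. Chords, ties, dots, tuplets and multiple voices are simply ignored, so this is nowhere near a real importer.

```python
import xml.etree.ElementTree as ET

# MusicXML note types mapped to LilyPond duration numbers (assumed subset).
DURATIONS = {"whole": "1", "half": "2", "quarter": "4",
             "eighth": "8", "16th": "16"}

# Suffixes for LilyPond's default (Dutch) note names.
ACCIDENTALS = {-2: "eses", -1: "es", 0: "", 1: "is", 2: "isis"}


def lily_pitch(step, alter, octave):
    """Spell a pitch in LilyPond's absolute octave notation."""
    name = step.lower() + ACCIDENTALS[alter]
    # MusicXML octave 3 is unmarked; higher octaves add ', lower add ,
    marks = "'" * (octave - 3) if octave >= 3 else "," * (3 - octave)
    return name + marks


def to_lilypond(xml_text):
    """Keep only pitches and durations; discard everything else."""
    tokens = []
    for note in ET.fromstring(xml_text).iter("note"):
        duration = DURATIONS[note.findtext("type")]
        pitch = note.find("pitch")
        if pitch is None:  # no pitch: treat the note as a rest
            tokens.append("r" + duration)
            continue
        tokens.append(lily_pitch(pitch.findtext("step"),
                                 int(pitch.findtext("alter") or 0),
                                 int(pitch.findtext("octave"))) + duration)
    return "{ " + " ".join(tokens) + " }"


SAMPLE = """<score-partwise><part id="P1"><measure number="1">
<note><pitch><step>C</step><octave>4</octave></pitch>
  <duration>1</duration><type>quarter</type></note>
<note><pitch><step>F</step><alter>1</alter><octave>3</octave></pitch>
  <duration>1</duration><type>quarter</type></note>
<note><rest/><duration>2</duration><type>half</type></note>
</measure></part></score-partwise>"""

print(to_lilypond(SAMPLE))  # { c'4 fis4 r2 }
```

The measure element is never inspected: the bar structure is thrown away, and it would be up to the notation software to recreate it.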
For practical use as an interchange format, surely
MusicXML only has to be easy to write and parse, and to be
substantially more expressive than MIDI?
Perhaps.
Maybe I just have trouble comprehending the concepts of
“music format” and “easy to parse”. Music is a
time-based thing, so a music format should support parallel and
sequential composition. As a BNF grammar, you would have
Music ::= NOTE
        | SEQ Music*
        | PAR Music*
This is basically what the LilyPond format is all
about, and I can't see how you could make it much simpler than this.
It's a context-free grammar. How much easier parsing do you need?
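To show how little structure that grammar needs, here is a sketch of it as Python data types. This is an illustration, not LilyPond's implementation: the class names Note, Seq and Par and the helper length are assumptions made for the example, and durations are fractions of a whole note.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Tuple, Union


@dataclass(frozen=True)
class Note:
    pitch: str
    duration: Fraction


@dataclass(frozen=True)
class Seq:
    """Sequential composition: parts are played one after another."""
    parts: Tuple["Music", ...]


@dataclass(frozen=True)
class Par:
    """Parallel composition: parts start together."""
    parts: Tuple["Music", ...]


Music = Union[Note, Seq, Par]


def length(music):
    """Total duration of a piece of music, by recursion on its structure."""
    if isinstance(music, Note):
        return music.duration
    lengths = [length(part) for part in music.parts]
    if isinstance(music, Seq):
        return sum(lengths, Fraction(0))
    return max(lengths, default=Fraction(0))


melody = Seq((Note("c'", Fraction(1, 4)), Note("d'", Fraction(1, 8)),
              Note("e'", Fraction(1, 8)), Note("f'", Fraction(1, 2))))
bass = Note("c", Fraction(1, 2))
piece = Seq((bass, Par((melody, bass))))
```

Every operation on such a structure, like length here, is a three-case recursion that mirrors the three alternatives of the grammar.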
You see defining the format as 10% of the effort and
making good use of it as the other 90%, but I imagine from the point
of view of MusicXML, defining a format that could work is 10%
of the effort and getting a majority of software to agree to and use
it is the other 90%. Does the relative success of MusicXML in that sort
of environment, particularly compared to earlier attempts like NIFF,
not make it seem that it's a thing people do actually want?
I think you should ask this question to
“people”. As a developer of notation software, I prefer to
deliver the features that make users happy. Then they will continue
using LilyPond. By contrast, the main asset of having MusicXML-output
is that users can migrate away from LilyPond more easily, and that
doesn't give me warm fuzzies.
And, is MusicXML so successful? Sure, the diagram at www.recordare.com has a
neat little box saying MusicXML in the center, and many neat little
arrows going to neat little boxes listing other software, but are
people using it all that much?
Jan: In my view, MusicXML is a job poorly done.
You estimate an effort of 10% going into the design of the format
itself, and that shows. Used as a notation interchange format, it is
a step up from MIDI. MIDI has all the notes, a bit of tempo, a broken
key signature. That's maybe 50% of the “music”; the other
50% is lost. MusicXML adds another 25%: most notably, articulations.
Now we're already at 75%, yay! It also adds unnecessary stiffness and
clumsy verbosity and buzzword compliance. Because of these facts, I'm
afraid that MusicXML's lifespan may well be an order of magnitude
shorter than that of MIDI — about three rather than thirty years.
After some user pressure we implemented MIDI import for
LilyPond. In practice, re-entering a piece in LilyPond is often
quicker than adding to and touching-up the MIDI import result. As a
consequence I consider the MIDI import filter a mostly wasted
exercise. The question we have to ask ourselves is whether the
result/effort equation for supporting MusicXML is significantly more
favourable than it was for MIDI. If we can postpone worrying about
and supporting MusicXML until its successor comes along or until it
gets fixed, that may well be better for our users.
For LilyPond, MusicXML could work as a better way to
interchange music notation with other programs. But what is it that
people want from us: support any exchange format that is a bit richer
than MIDI and widely adopted? Or would they rather have LilyPond
draw fret diagrams?
Have any of your decisions in the design and
development of LilyPond turned out to be big mistakes?
Han-Wen: Like I said before, we are rewriting the
internals of LilyPond on a continuous basis, so we see the results of
design decisions very directly in the code. Since bad ideas lead to bad
code, they are refactored out relatively quickly.
Unfortunately that doesn't hold for syntax: for
compatibility reasons we cannot change the format too often. One big
thing that took us a long time to repair was the syntax for
chords. There used to be no distinction between a chord (a set of
pitches) and simultaneous music (pieces of music playing together). At
first this was rather neat, since having no chords made syntax
slightly leaner. Unfortunately, it also resulted in many clumsy
inconsistencies from the user's point of view. The main point of the
1.8/1.9/2.0 releases was to fix this problem in a graceful way.
I was certainly rather surprised to see such major changes to the
syntax in 2.0. Is that likely to happen again in future major
releases?
Frankly, I think that we are starting to hit rock-bottom when it comes
to simplifications of the syntax. Nevertheless, if we can think of
improvements, we will surely implement them: we have to make future
users happy too, and that's a much bigger group than the current
users.
Before anyone gets the wrong impression, we don't leave
current users out in the cold. We have a conversion script that
handles most of the syntax changes transparently.
How much time do you each manage to spend on LilyPond?
Han-Wen: It depends on my personal situation. It
has ranged between 10 and 50 hours per week. When I started LilyPond,
I had loads of free time (being a lazy MSc student), and for the last
half year I've been unemployed, which allowed me to spend obscene
amounts of time on Lily. This will change soon, though. I'm starting
as an IT/logistics consultant in mid-April, so it will probably go down
to a decent 10 to 15 hours a week.
Jan: If I'm lucky, about 12 to 20 hours a week.
Do you have many other contributors?
Han-Wen: The hard-core coding work is basically
a two-man show; we do get support from others with porting, packaging,
translating and proofreading. Also, some special features, like
tablature and ancient notation, have been contributed by other people.
What sorts of music do you listen to, rather than play?
Jan: Anything interesting, really. I like
medieval choirs but also modern stuff, especially minimal music: even
what Han-Wen plays, when I get to listen.
Han-Wen: Live music usually boils down to going
to concerts where friends play, so that tends to be classical. My CD
collection is quite varied, including jazz, rock and lots of
“classical” dating from between 1450 and 2001.
I have to admit that I don't often listen to CDs
any more. Background music is distracting when you're not listening,
and when I have the time to listen, I'd rather be coding or, even
better, playing music myself.
You can hear the VU Orchestra, with Han-Wen on French Horn,
performing in the Concertgebouw in Amsterdam on Saturday June 26,
2004. The programme includes Lutosławski's Concerto for Orchestra and
Bartók's Concerto for Viola.