m2m: mod->midi file converter for TiMidity++


BRIEF SYNOPSIS:

This adds the new -OM output mode to TiMidity++, which will read in a mod file
and output a midi file.  All parameters needed for the conversion are
contained in a .m2m file of the same base name as the mod.  If this file can
not be found, it will generate one for you.  Chord assignment and
transposition values can be very difficult and tedious to assign by hand.  It
is STRONGLY recommended that you let the program generate the initial .m2m
file, so that it can do most (if not all) of this for you.  You will still
need to assign drums, correct banks/programs, and tweak volume amps by hand.


EXAMPLE USAGE:

timidity -c c:\timidity\timidity.cfg -OM -V 2 -idt foo.mod

This will try to read in foo.m2m as the config file for the conversion, and
will output a file called foo.mid.  -V 2 tells it to generate midi for use
on a device that uses an X^2 volume curve (all GM/GS/XG hardware).  You can
use the -o flag to specify any other output name you wish.

If you don't have timidity installed, so that you don't have a valid
timidity.cfg file, just create a 0 byte file and use it instead of the real
thing.  Since you're playing mod files, timidity doesn't need to load any
midi instruments, so you don't need to have a set of patches or a real
timidity.cfg file :)


BACKGROUND:

MOD files are a lot like MIDI files.  Both formats are basicly a series of
events that control how notes get played with which instruments.  MODs
package the instruments along with the events into a single file, while MIDI
relies on external sources of instruments.  It is this fundamental
difference that creates the most difficulty in performing a mod->midi file
conversion.  The mod file does not need to know what pitch each sample is
tuned to, if it is a drum, or if it is a chord.  MOD players simply play the
packaged sample at the requested pitch, assuming all samples are tuned to
the same fixed frequency, whether they actually are or not.  Thus, if you
were to do a direct mod->midi event conversion, you would wind up with midi
instruments playing in the wrong keys, snare drums being treated as normal
melodic instruments, and single notes where there should be chords. 
Transposition, drum related channel movements, and chord emissions are the
most noticable obstacles to overcome when performing an accurace mod->midi
conversoin.

Paolo Bonzini has already done half of my job for me.  He contributed a good
amount of code that turns TiMidity++ into a first rate mod player.  This,
alone, would not have helped me very much; it was how he implemented it.  
Rather than handle the events like every other mod player known to man,
TiMidity++ converts them into standard midi events, loads the mod instruments
in as special patches, and then renders them just like it would any normal
midi file.  The mod event parsing, instrument parsing, and direct event
conversion was already done!  All I had to do was handle the problems I
mentioned above, along with many more minor ones I haven't mentioned, before
writing the internal TiMidity++ events out to a midi file.  See the comments
at the top of m2m.c if you are interested in some of the other issues that
needed to be addressed during the conversion process.  Although some of
these other issues were non-trivial to deal with, and pitch bends beyond 4
octaves may still sound a bit odd, they are nothing that the average user
needs to know about or keep in mind when trying to succesfully convert a mod
file.  The only thing you need to know is that, in order to address the
conversion problems disscussed above, some information about each sample in
the mod must be specified in a config file (.m2m) associated with each mod
file.  The format of this file is given below.


M2M CONFIG FILE FORMAT:

Comment lines must begin with a #.  Blank lines (no spaces or any other
character besides a newline or carriage return) are allowed.  All other lines
must specify ALL FIVE of the fields described below.  Each field is separated
by white space.


FIELD 1: Sample Number

This is the number of the sample that you are defining information for. 
The first sample in the mod file is 1 (not zero).


FIELD 2: Bank/Program, drum flag, chord, silent flag

This field specifies several different properties of the sample.  Optional
paramaters are given surrounded by parentheses.  The format for this field
is:

(!)(bank/)program(chord)(*)

If the field begins with an exclaimation mark, ! , then no notes will be
issued for this sample.  This can be used to silence samples that you can
not assign to a general midi instrument, such as speech, complicated drum
tracks, or any sound effect that you can not create a close approximation to
using GS sfx banks.

The bank portion of the field specifies an optional bank selection.  This is
the number of the bank to use, followed by a / to separate it from the
program number.

The program number is the midi instrument you are assigning to the sample. 
If the sample is a drum, this is the note that the drum is mapped to in the
drum set.

The optional chord field specifies what type of chord the sample is composed
of.  There are 4 types of chords, each of which has 3 subtypes.  The
supported chord types are (M)ajor, (m)inor, (d)iminished minor, and (f)ifth.
Each chord is specified by the letter surrounded by parantheses in the
previous line.  The subtype of the chord describes how much the chord is
"rotated" from a standard chord, which can be 0, 1, or 2.  As an example of
what I mean by "rotated", a major chord is composed of the following note
semitone offsets: 0,4,7.  If you were to rotate the chord one to the left, it
would be: -5,0,4.  Two to the left is: -8,-5,0.  If no subtype is given,
zero rotation is assumed.

The final part specifies if the sample is a drum.  Put a * at the end of the
field to indicate this.  Chord assignments will be ignored if the drum flag
is set.

Examples:
8/48M     bank 8, program 48 (Orchestra Strings), with a normal major chord
!8/48M    silence this sample
8/48M2    same as the first example, only the chord is rotated down twice
48        normal Marcato Strings in tone bank 0
16/38*    Power drum set, Snare1
38*       Snare1 on the regular drum set 0


FIELD 3: Transposition

This is how much to transpose the original note specified in the mod file. 
If the sample is tuned at middle C (pitch 60), it will need to be transposed
+24 semitones for the midi instrument to play on the correct pitch.  Samples
marked as drums will not be transposed, since they are fixed to a single
note on the drum channel.  You must still enter a value for the
transposition field, even if it is ignored by the drums, so that the config
file parser will not crash.


FIELD 4: Fine Tuning

All pitch bend events for this sample will be adjusted by the given fraction
of a pitch.  This is sometimes necessary for highly out of tune samples.
Some MOD composers, instead of tuning their samples correctly, use pitch
bends to tune the samples.  When you play this music with correctly tuned
samples, these pitch bends detune the note and it sounds out of tune.  So the
fine tuning value is used to compensate for these detuning pitchbends.

It is also common to find out of tune samples that were NOT tuned with
pitchbends, so adding in a pitch bend adjustment would only make them sound
worse in a midi file.  To disable fine tuning, an optional ! can be placed
before the fine tuning value.  This is the DEFAULT SETTING in the automatic
config file generator.  If you find that a mod requires fine tuning for a
sample, simply delete the ! and redo the conversion.

This feature is not yet fully implemented.  Only existing pitch bend events
are affected, so no new pitch bend events are issued.  This is not usually a
problem, however, since most cases where this feature needs to be applied
involve mods that issue pitch bends before the affected notes, since they
were intended to tune the samples to begin with.  I plan to eventually
implement insertion of new pitch bend events, so that this will be a true
fine tuning feature.

 
FIELD 5: %Volume

Each sample can be amplified by scaling the expression events.  100 is the
default amount, which is 100% of the original volume.  50 would decrease it
to half of the original volume, while 150 would be 1.5 times the original
volume.  Don't forget that the maximum expression value is 127, so any
expression events that get scaled higher than this will cap off at 127 and
you won't hear any difference.  It is mainly used for quieting instruments
that are too loud in the midi file, or for amplifying instruments whoose
expression values are too low to begin with.

Any fields beyond the first 5 will not be parsed.  You can type anything
here that you want.  You do not have to place a # before comment text, but
it is conventional to do so.


FREQUENCY ANALYSIS:

So, how do you figure out how much to transpose each sample and what chord
it is?  Load it up in a program that can perform an FFT on the sample and
display the frequency peaks.  The first peak is usually, but not always, the
fundamental pitch of the sample.  If the sample is a chord, take the first 3
major peaks and assign the chord from these.  Then enter the appropriate
chord and transposition values in the .m2m file and see if it sounds correct.
It is VERY time consuming to do all of this by hand....  So, I wrote routines
to do all of the assignments for you :)  It is not 100% accurate, but it's
pretty darn close.  And when it does miss a pitch or a chord, it always
assigns it the correct LOOKING answer.  That is, if I were to visually
inspect the FFT data, I would pick the same pitch the algorithm does.  I'm
no expert at this, but after spending so many hours testing this on many
different difficult to assign pitches, I think I'm pretty good at it now :) 
The only way I can see to improve it is to build in some sort of
psychoacoustical model that takes into account how the human ear percieves
the sound.  And I don't think I want to do that at the moment....  It does an
above average job at dealing with samples that have more than one pitch or
chord in them, but don't be surprised if a noisy or multi-tonal sample
doesn't get assigned correctly.  Garbage in, garbage out :)  The automatic
assignment is very good for the vast majority of samples and should DEFINATELY
be tried first before you start changing things by hand.  When it does mess
up, it's usually only off by a single semitone or an octave multiple, so it's
easy to tweak from there.

Before I wrote the automatic frequency analysis routines, I knew very little
about the field.  Pitch detection is a very old problem in the audio signal
processing literature.  I looked up references in the library dating from
the 1960's.  The stuff from back then is just as relevant as the later
literature, since the methods really haven't improved much since then.  The
two major camps on how to do this are "autocorrelation" and "cepstrum"
analysis.  It turns out that autocorrelation was not the answer to my
problems.  While it works well on "well behaved" samples, it breaks down
very quickly on synth instruments, noisy instruments, and instruments with
multiple fundamental frequencies.  A large number of samples encountered in
mod files exhibit these properties.  No matter what I did to try to tweak it,
and I tried a lot of good things, I just could not make it robust enough to
handle real world samples.  It's a good theory, but it falls apart in
practice.

Cepstrum analysis proved to be much more robust.  But even so, I had to do a
good deal of pitch filtering and peak weighting before I could get it to work
well.  The 2nd FFT analysis kept giving me frequency peaks that didn't exist
in the 1st FFT spectrum.  They were, however, very close to real peaks.  So I
throw away all frequencies that fall below a pitch peak area and maximum
magnitude filter, then force the cepstrum analysis to only choose pitches that
have made it through the filter in the 1st FFT spectrum.  I set a maximum
frequency based on zero point crossing analysis, going out two zero crossings
from the largest amplitude in the sample.  This was necessary to prevent
octave jumping errors.  I found that it is also important to weight the
cepstrum peak areas by the maximum magnitude within the corresponding pitch
peak in the 1st FFT.  This was a desperate attempt to get some especially
troublesome bass samples to assign correctly.  Surprisingly enough, it works
great, giving me a higher success rate on all my samples without inducing any
new misassignments!  The only catch is that the weighting only works well for
< 2 seconds of audio analysis.  Any larger than that and the FFT size gets
so big that the pitch peaks are too diffused, so the maximum magnitudes for
the pitches are too small, and the weighting starts to give wrong answers. 
If anyone wants to analyze >= 2 seconds of data, which isn't neccessary for
assigning pitches to mod/midi instruments, it would be easy to implement a
sliding window average that calls the existing frequency assignment
function.

It appears to work better than any of the other sample analysis software I
have.  If you are interested in more details of how I did the cepstrum
analysis, try looking over the code in freq.c and/or email me for a more
complete description of the algorithm I wound up with.  The new FFT routines
are not mine, but are public domain.  From all the benchmarks I could find,
this is the best FFT implementation for doing what I need to do (and for
future effects processing, should they ever be added to TiMidity++).  See
fft4g.c for info on where to get the original FFT package.


SUGGESTIONS ?:

Feel free to email me with any suggestions you may have on how I can do a
better job of converting the mods, or how I can implement things on the TODO
or WISH lists in m2m.c.  I am considering turning this into a stand alone
program, but until I get more free time and energy, it's going to stay as
just an addon for TiMidity++.


LEGAL STUFF:

TiMidity++ is distributed under the GPL, and since my code is derived from
and makes use of it, I guess it's under the GPL too.  So blah blah blah,
legal stuff, blah blah blah, etc..  You know the drill.


Eric A. Welsh <ewelsh@ccb.wustl.edu>
Center for Molecular Design
Center for Computational Biology
Washington University
St. Louis, MO