
Tuesday, August 23, 2011

MIDI Instruments on iOS

MIDI Best Practices Manifestos

There is a MIDI Best Practices Manifesto that is getting a lot of attention at the moment. It is focused on getting developers to write apps that cooperate so that CoreMIDI works really well on iOS. You can read it here:

http://groups.google.com/group/open-music-app-collaboration/browse_thread/thread/939e4f2998a8bc

This focuses on interoperability issues, very much from the perspective of developers playing piano-like instruments that drive traditional synthesizers and beat machines. It is very welcome news, and the timing could not be better. Until a few days ago, I did not know it was even possible to have a sound engine run in the background while a controller runs in a foreground app, communicating over CoreMIDI.

I am not an electrical engineer or a sound engineer, so the mandatory internal sound engine was always an especially time-consuming part of my app, yielding less than ideal results for all the work that went into it. I have always wanted to focus on making a controller with world-class, virtuoso-playable potential. The achievement above, using virtual MIDI ports, promises to free me from having the sound engine be such a huge focus in what I do.

This achievement won't matter if the sound engines get the pitch handling wrong. So here are some issues for sound engine creators to think about, which you will encounter as soon as you plug in something that's not a piano.

The Skill Mismatch

Frankly, I think there are a lot of synthesizers on the iPad with great sound, but playability on the device itself seems to be unconquered territory. I don't think it's possible to make a piano-like synth on the iPad and consistently produce players who can play it as well as a real piano. The small rectangular dimensions defy the layout, and the continuous surface fights against the discrete-key design of the instrument being emulated. A piano is wide and doesn't need to be very tall; as a result, most synthesizers are something like 80% controls and 20% playing area, which makes for a really small octave range. This is where a lot of the demand for synths to "support MIDI" comes from. The truth is, it's mostly so that you can at least plug in a 2-octave keyboard and play at a reasonable level of skill. It is not so that you can take your iPad and play a piano-like app to drive a Korg Karma. Nobody in the real world does that.

There is also a skill mismatch that bothers me. If you look at any rock band, you will notice something about its makeup. It is usually 4 or 5 guys. The keyboardist is optional, though he shows up more often than he used to. There is always a drummer. There is always a guitar player. There is almost always a bass player, and in a lot of cases there are two guitar players. The ratio of guitar players to piano players is very high. If you look at all the music stores in your area, you will notice a gigantic guitar section and a smaller piano and keyboard section. I think there are something like 5 guitar (including bass) players for every piano player.

The electronic instrument industry is out of balance in this regard. They don't get it.

Touchscreens versus MIDI

There is code that actually implements these ideas (the AlephOne instrument). It works very well. But it pushes all of the complexity of MIDI into a client library, and forces the synth (the server) to be as dumb as possible so that the client can get whatever it wants.

It is no coincidence that iPads and iPhones are posing a challenge to the instrument industry right now. iOS devices are essentially rapid prototyping devices that let you make almost anything you want out of a combination of touchscreen, audio, MIDI, accelerometer, networking, and so on. iOS developers are becoming the new instrument manufacturers, or are at least doing the prototypes for them at a very high turnover rate.

Multitouch has a unique characteristic of being tightly coupled to the dimensions of human hands. It does away with discrete keys, knobs, and sliders. If you put your hand on a screen in a relaxed position, fingers close together without touching, you can move them very quickly. Every spot that a user touches should be an oval a little larger than a fingertip. If you make the note spots any smaller, the instrument quickly becomes unplayable. So lining up all the notes beside each other to get more octaves does not work, and the awkward stretch to reach accidentals at this small size is equally unhelpful.

A densely packed grid of squares about the size of a fingertip is exactly what string instrument players are used to. So a simple row of chromatics stacked in fourths is what we want. We can play this VERY fast, and have room for many octaves. It's guitar layout. Furthermore, the pitch handling can be much more expressive because it's a glass surface: we know exactly where the finger is at all times, and frets are now optional, or simply smarter than they were before. An interesting characteristic of a stack of strings tuned in fourths is that there is a LOT of symmetry in the layout. There are no odd shapes to remember that vary based on the key being played in.

Transposition is simply moving up or down some number of squares. This even applies to tuning: just play a quartertone lower than you normally would, and you are playing along with something that was tuned differently, without actually retuning the instrument. It is a perfectly isomorphic instrument, with a layout that's already familiar to 80% of players. This is the real reason why this layout has a special affinity for the touchscreen.
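
In other words, moving k squares along a string just multiplies the frequency by a constant:

frequency' = frequency * 2^(k/12)

and the quartertone trick above is simply k = -0.5.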

The link below shows this layout played on a *phone*, and it's ridiculously easy to do. I could play at a similar level on Mugician 1.0 only a few weeks after I got it working. The layout matters. It matters more than a dozen knobs and sliders. You can fix a simple sound with external signal processing; but if you can't play fast, or the controller drops all the nuances, or it has latency, then no sound engine can ever fix that later.

http://www.youtube.com/watch?v=FUX71DAelno&feature=channel_video_title

This layout works, mostly because of the way that pitch handling is done.

Fretless MIDI Messages

Synths that take in MIDI messages only get right the things that a piano exercises. They seem to consistently get everything else wrong in various ways. So, without any more ranting, here is how MIDI must behave on touch screens:

A MIDI note number is a number such that 0 is a very low C note (8.1758 Hz in standard tuning). That low C is the reference frequency, so we will just speak in terms relative to "MIDI note 0":

frequency = c0 * 2^(n/12)

MIDI note 33 is an A, three octaves below concert A (A 440hz is MIDI note 69; note 33 comes out at 55 Hz). It has this frequency:

c0 * 2^(33/12)
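
That mapping is one line of C (here C0 is the 8.1758 Hz reference for MIDI note 0, and the note number may be fractional, as below):

#include <math.h>

#define C0 8.1757989f               /* frequency of MIDI note 0 */

/* Works for fractional notes too: 33.5f gives the quartersharp below. */
static float midiNoteToFrequency(float n)
{
    return C0 * powf(2.0f, n / 12.0f);
}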

But what happens when the MIDI note is not a whole number? That's a MIDI note with a bend. If we bend MIDI note 33 up with the pitch wheel at +25%, under the default whole-tone (2 semitone) pitch wheel setting, we get MIDI note:

33.5 ( frequency = c0 * 2^(33.5/12) )

"A quartersharp" if you must name it. So, when we want this pitch, we send midi messages like:

bend ch1 +25%
on ch1 33

So imagine for a moment a coordinate system that normalizes all touches to fit in a rectangle with <0,0> at the bottom left corner and <12,6> at the top right corner. This is an instrument with 6 strings stacked in fourths, with the lowest note in the bottom left corner. You don't have frets yet, but you know that at <2.5, 0> you are on the bottom string, halfway between fret 2 and fret 3. The pitch there is 2^(2.5/12) times the lowest note. Every string you go up adds 5 frets, and therefore multiplies by another 2^(5/12). This is a pixel-to-pitch mapping, and it generates the exact frequencies that we are trying to represent as (note, bend) pairs in MIDI.
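
Here is a minimal sketch of that pixel-to-pitch mapping in C (the names are hypothetical, not the AlephOne code):

#include <math.h>

#define FRETS_PER_STRING 5          /* strings stacked in fourths */

/* Continuous fret position for a touch at <x, string>, relative to the
   bottom left corner. <2.5, 0> gives 2.5: halfway between fret 2 and
   fret 3 on the bottom string. */
static float touchToPitch(float x, int string)
{
    return x + string * FRETS_PER_STRING;
}

/* Split the continuous pitch into the (note, bend) pair for MIDI: the
   nearest whole note plus a residual bend in semitones. */
static void pitchToNoteAndBend(float pitch, int lowestMidiNote,
                               int *note, float *bendSemis)
{
    float midiFloat = lowestMidiNote + pitch;
    *note = (int)floorf(midiFloat + 0.5f);
    *bendSemis = midiFloat - *note;
}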

We bend up to 50% and get an A sharp note....

bend ch1 +25%
on ch1 33
bend ch1 +50%

Bend up another semitone:

bend ch1 +25%
on ch1 33
bend ch1 +50%
bend ch1 +100%

So we are up at the top of the bend range. We can't do a vibrato now! We would exceed the bend width. But if the synth is set for the bend to mean "percentage of an octave up or down", then we can do this...

bend ch1 +0%
on ch1 33
bend ch1 +(3/12)

That bends 3/12 of the way up an octave, to the note C. Now we can do whatever we want, as long as we don't move a full octave. Because users will simply drop their fingers on the glass and actually drag a finger up an octave, this MUST be supported. This drag CANNOT happen on a piano, which is why this scenario never seems to get tested.
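
As a sketch, the 14-bit wheel value for a bend of some number of semitones, given the synth's configured bend range (2 for the whole-tone default, 12 for the octave setting above), could be computed like this (a hypothetical helper, not from the original code):

static int bendValue(float semis, float range)
{
    /* 8192 is wheel center; +/-range semitones maps to +/-8192 steps */
    int v = 8192 + (int)(semis / range * 8192.0f);
    if (v < 0) v = 0;               /* clamp to the 14-bit wheel */
    if (v > 16383) v = 16383;       /* +range itself clips to 16383 */
    return v;
}

For example, bendValue(0.5f, 2.0f) gives the +25% bend that produced note 33.5 above.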

The core problem with MIDI's model of bends is that it assumes there is one pitch wheel for the whole instrument. You have to bind together multiple monophonic instruments as one if you want every note to have its own pitch wheel. A finger dropped on the glass is a pitch wheel for that finger. These per-finger movements are what give string instruments their special character, and they cannot be fixed later in the sound engine after the gestures have been rounded off and lost. Here is an example of two instances of the same note being played and slightly detuned from each other:

bend ch1 0%
bend ch2 0%
on ch1 33
bend ch1 -1%
on ch2 33
bend ch2 +1%
off ch1 33
off ch2 33

This isn't possible on a piano, but it happens all the time on a string instrument. We have two concurrent instances of the same note, bent in different directions, on the same instrument. Here is another challenge, where note overlaps will cause trouble for synths that don't correctly interpret multiple channels:

on ch1 33
on ch2 33
off ch1 33
on ch3 36

On synths that rewrite every channel to the same number before interpreting, not only do the bends come out wrong, but the above scenario comes out wrong too. We should hear ch2 33 and ch3 36 sounding together as a chord, not just ch3 36 by itself. As for the next scenario, the problem of having more notes to play than channels available, just make sure you don't get a stuck note on this:

on ch1 33
on ch1 33
on ch1 33
off ch1 33

This should sound the note three times, with exactly one instance playing, and no stuck note at the end. But this:

on ch1 33
on ch2 33
on ch3 33
off ch1 33
off ch2 33

This will leave ch3 33 still playing. And while ch1 and ch2 both had notes on, it wasn't one note either; it was two instances of the sound playing, most likely out of phase with each other.
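
All of these scenarios decode correctly if the synth keys its voices on the (channel, note) pair rather than on the note number alone. A hypothetical sketch:

/* Hypothetical voice table keyed on (channel, note). With this, "off
   ch1 33" silences only channel 1's instance of note 33, and a repeated
   "on ch1 33" retriggers the existing voice instead of stacking a new
   one that can get stuck. */
#define MAX_VOICES 32

typedef struct {
    int active;
    int channel;
    int note;
} Voice;

static Voice voices[MAX_VOICES];

static Voice *findVoice(int channel, int note)
{
    for (int i = 0; i < MAX_VOICES; i++)
        if (voices[i].active &&
            voices[i].channel == channel && voices[i].note == note)
            return &voices[i];
    return 0;
}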

Channel cycling is a requirement for getting independent pitch wheels. Because a note off message isn't actually the end of the note, we have to give notes release time. This is done in the controller, but synth engines should be aware of why it is being done: we want to maximize the amount of time that a channel has been dead before we steal it to play a new note.

So when we make a fretless instrument, we spread it across a channel span, such as channels 4-7 for a 4-channel instrument. That allows 4 notes down, each with its own independent pitch wheel. It also means that if you play 4 notes per second, an old channel gets stolen (along with its pitch wheel) 4 times per second. So there is a speed limit to playing without anomalies in MIDI, because of the small number of channels.
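
Here is a minimal sketch of that stealing policy, assuming a 4-channel span starting at channel 4 (illustrative names, not the original code):

#include <limits.h>

#define CHAN_FIRST 4
#define CHAN_COUNT 4

static int  chanBusy[CHAN_COUNT];       /* 1 while a note is down */
static long chanReleasedAt[CHAN_COUNT]; /* time of the last note off */

/* A new note steals the channel that has been silent the longest,
   maximizing the release time a channel got before reuse. */
static int allocChannel(long now)
{
    int pick = -1;
    long oldest = LONG_MAX;
    for (int i = 0; i < CHAN_COUNT; i++) {
        if (!chanBusy[i] && chanReleasedAt[i] < oldest) {
            oldest = chanReleasedAt[i];
            pick = i;
        }
    }
    if (pick < 0)                       /* all channels still down: */
        pick = (int)(now % CHAN_COUNT); /* steal one round-robin */
    chanBusy[pick] = 1;
    return CHAN_FIRST + pick;
}

static void releaseChannel(int channel, long now)
{
    chanBusy[channel - CHAN_FIRST] = 0;
    chanReleasedAt[channel - CHAN_FIRST] = now;
}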

Chorusing For Real Men

Note that if you are playing fretlessly on an instrument that allows note duplication, you don't need no stinking chorus effect. Just play the line simultaneously at two different locations:

bend ch1 0%
bend ch2 0%
on ch1 33
bend ch1 -1%
on ch2 33
bend ch2 +1%
off ch1 33
off ch2 33

You can switch between chorusing and chording as you play, too. With microtonality turned on, you can get exact pitch ratios, so that the two pitches don't sound like distinct notes but instead fuse into a different wave shape.
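
For example, a justly tuned fifth is the ratio 3/2 = 2^(7.0196/12), about 2 cents sharper than the tempered fifth 2^(7/12); fretless play can land on that ratio exactly, at which point the two notes lock together instead of beating.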

Note Tie Non Registered Parameter Number

It is actually possible to bend MIDI notes to any width you want at the fullest possible resolution. The problem is that there is no de facto or de jure standard for how this is done. Imagine a piano player trying to simulate a bend on our channel-cycling instrument...

bend ch1 0%
bend ch2 0%
...
bend ch16 0%
on ch1 33
...
off ch1 33
on ch2 34
...
off ch2 34
on ch3 35
...
off ch3 35
on ch4 36
...

So he's playing chromatics to simulate the bend, because that's the best he can do. But a sender that inserts bend messages can at least bend from one chromatic to the next, like this:

bend ch1 0%
bend ch2 0%
...
bend ch16 0%
on ch1 33
bend ch1 20%
bend ch1 40%
bend ch1 60%
bend ch1 80%
bend ch1 100%
off ch1 33
on ch2 34
bend ch2 20%
bend ch2 40%
bend ch2 60%
bend ch2 80%
bend ch2 100%
off ch2 34
on ch3 35
bend ch3 20%
bend ch3 40%
bend ch3 60%
bend ch3 80%
bend ch3 100%
off ch3 35
on ch4 36
bend ch4 20%
bend ch4 40%
bend ch4 60%
bend ch4 80%
bend ch4 100%

So, this would be a smooth bend, except that we hear the note retrigger every time we reach the next chromatic. So let's say we have a special message noting that a note tie is coming, and that the tie completes when the next note on appears:

bend ch1 0%
bend ch2 0%
...
bend ch16 0%
on ch1 33
bend ch1 20%
bend ch1 40%
bend ch1 60%
bend ch1 80%
bend ch1 100%
tie ch1 33
off ch1 33
on ch2 34
bend ch2 20%
bend ch2 40%
bend ch2 60%
bend ch2 80%
bend ch2 100%
tie ch2 34
off ch2 34
on ch3 35
bend ch3 20%
bend ch3 40%
bend ch3 60%
bend ch3 80%
bend ch3 100%
tie ch3 35
off ch3 35
on ch4 36
bend ch4 20%
bend ch4 40%
bend ch4 60%
bend ch4 80%
bend ch4 100%

We can continue this from the lowest note on the keyboard to the highest for a super-wide bend. It is at full pitch resolution as well, because we aren't playing tricks with the MIDI bend width. And if we broadcast this both to a piano that can't bend and to a synth that understands, we get a similar result: it degrades gracefully on the piano, and sounds perfect on the synth that understands. We can use this to track up to 16 fingers at arbitrary pitches (in MIDI range of course!) bending in whatever wild directions they need.

The NRPN looks like this in our code:

#define TRANSITION 1223

//send one NRPN: CC 0x63 (99) selects the parameter MSB, CC 0x62 (98)
//selects the parameter LSB, and CC 6 carries the data value
static inline void sendNRPN(int ochannel, int msg, int val)
{
    //B0 63 6D
    //B0 62 30
    //B0 06 100
    int lsb = msg & 0x7f;
    int msb = (msg >> 7) & 0x7f;
    //midiPlatform_sendMidiPacket7(0xB0+ochannel, 0x63, msb, 0x62, lsb, 6, val);

    midiPlatform_sendMidiPacket3(0xB0 + ochannel, 0x63, msb);
    midiPlatform_sendMidiPacket3(0xB0 + ochannel, 0x62, lsb);
    midiPlatform_sendMidiPacket3(0xB0 + ochannel, 6, val);
}

//before retriggering a finger, tell the old channel that the note it is
//playing will be tied into the note on that follows
static inline void retriggerNewMidiNote(int finger, float midiFloat, int vol, int expr)
{
    int channel = midiFingerUsesChannel[finger];
    if (channel >= 0)
    {
        int ochannel = midiChannelOChannelSent[channel];
        sendNRPN(ochannel, TRANSITION, midiChannelNote[channel]);
    }
    stopMidiNote(finger);
    startNewMidiNote(finger, midiFloat, vol, expr);
}


Let us know if there is something unreasonable about that message. I haven't used NRPNs before, and since we write both ends of it, both ends could be 'wrong' and still work just fine between our synths.
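
For completeness, here is a guess at what the receiving side could look like. This is a hypothetical sketch; voiceStart, voiceStop, and voiceMigrate stand in for whatever the synth engine actually does:

/* provided elsewhere by the synth engine (illustrative) */
void voiceStart(int channel, int note, int velocity);
void voiceStop(int channel, int note);
void voiceMigrate(int fromCh, int fromNote, int toCh, int toNote);

#define TRANSITION 1223

/* The tie NRPN arrives on the old channel with the old note. The note
   off that follows is suppressed, and the next note on (possibly on a
   different channel) continues the same voice with no retrigger. */
static int tieFromChannel = -1;
static int tieFromNote = -1;

static void onNRPN(int channel, int param, int value)
{
    if (param == TRANSITION) {
        tieFromChannel = channel;
        tieFromNote = value;
    }
}

static void onNoteOff(int channel, int note)
{
    if (channel == tieFromChannel && note == tieFromNote)
        return;                      /* tied away: keep the voice alive */
    voiceStop(channel, note);
}

static void onNoteOn(int channel, int note, int velocity)
{
    if (tieFromChannel >= 0) {
        voiceMigrate(tieFromChannel, tieFromNote, channel, note);
        tieFromChannel = tieFromNote = -1;
    } else {
        voiceStart(channel, note, velocity);
    }
}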


What Is It Useful For

There is a very practical use for this: violin! Oud! You could even do an accurate rendition of the human voice in MIDI without auto-tuning it. Most real-world instruments exhibit this because the spectrum itself is actually microtonal; the exact 2^(n/12) adjustments can't actually be achieved in practice on an acoustic instrument. They are real-world resonating bodies, after all, and will resonate in whole-number ratios and have real harmonics. Acoustic pianos often use a stretched octave tuning to deal with this problem.

This opens the door to rendering music outside the 12-tone tempered tradition as well. MIDI should be a rendering format that doesn't break the capability to do what you mean by injecting its own prejudgment of what should be disallowed. MIDI itself, like auto-tune, seems to be one of the key factors in keeping electronic music sounding more like it was produced by a machine than acoustic instruments do. There has been a lot of progress in getting good timbre out of instruments, but that is meaningless if you round off all the nuances in pitch that really give an acoustic instrument its character.

I am working on a new project, and if you are already a tester or jailbroken, you can download it here (it's very minimal; the purpose right now is more to produce reusable code than to ship something):

http://rfieldin.appspot.com

Here is a similar post that puts this in perspective with more general multitouch instruments for iOS:

http://rrr00bb.blogspot.com/2011/09/multitouch-midi-recommendation.html

3 comments:

  1. Your manifesto is really thought-provoking. My decision to purchase an iPad several months ago was influenced in large part by a desire to try out Mugician. I think your approach to interface/latency issues is brilliant, but I have often found myself wishing I could use Mugician to control another synth, and I had never really considered some of the issues you raise here, especially that of polyphonic bending. Anyway, thanks for giving us a great app in Mugician and for spelling out these important ideas with such depth. If/when you develop your hypothetical MIDI controller, I will gladly buy it.

    Peter Bajzek
    unatta.net

  2. Peter: LOL... I know that Geo Synth *seems* hypothetical. But I have had MIDI working well for about a month and a half, getting caught up in delays related to rewriting the visuals to fit in with Wizdom Music standards. It should ship some time soon. I am going to let the schedule slip a little more to ensure that I work well with background MIDI capable synths. SampleWiz and NLog can already run in the background, driven by Geo. But this is pretty exotic MIDI messaging, and a lot of synths don't render it well.

  3. Oops, I didn't mean that as a negative. I was sort of limiting my comments to the hypotheticals you've described in this post. I am glad to hear your project is going well and look forward to it. Regards.

    Peter Bajzek
