Audio Glue

The audio architectures of Android and iOS are quite different, but libpd aims to provide a coherent interface across platforms, without sacrificing platform-specific functionality. We discuss the common features here and leave most platform-specific considerations for Chapter 5 and Chapter 6.

The common features of the audio glue include methods for initializing, starting, and stopping the audio components, as well as a method that checks whether the audio components are currently active. In Java, the audio glue is provided by a class called PdAudio; in Objective-C, a class called PdAudioController plays a similar role.

Audio glue in Java

public class PdAudio {
  static void initAudio(int sampleRate, int inChannels, int outChannels,
                        int ticksPerBuffer, boolean restart) throws IOException;
  static void startAudio(Context context);
  static void stopAudio();
  static boolean isRunning();
}

Audio glue in Objective-C

@interface PdAudioController : NSObject <AVAudioSessionDelegate>

@property(nonatomic, readonly) int sampleRate;
@property(nonatomic, readonly) int numberChannels;
@property(nonatomic, readonly) BOOL inputEnabled;
@property(nonatomic, readonly) BOOL mixingEnabled;
@property(nonatomic, readonly) int ticksPerBuffer;

@property (nonatomic, getter=isActive) BOOL active;

-(PdAudioStatus)configurePlaybackWithSampleRate:(int)sampleRate
                                 numberChannels:(int)numChannels
                                   inputEnabled:(BOOL)inputEnabled
                                  mixingEnabled:(BOOL)mixingEnabled;

-(PdAudioStatus)configureAmbientWithSampleRate:(int)sampleRate
                                numberChannels:(int)numChannels
                                 mixingEnabled:(BOOL)mixingEnabled;

-(PdAudioStatus)configureTicksPerBuffer:(int)ticksPerBuffer;

@end

libpd and Core Audio

If you are familiar with Core Audio in iOS, then you may have noticed that the configuration options of PdAudioController neatly map to audio session categories. To wit, the configurePlaybackWithSampleRate method will choose either AVAudioSessionCategoryPlayAndRecord or AVAudioSessionCategoryPlayback, depending on whether audio input is required. The configureAmbientWithSampleRate method will choose AVAudioSessionCategoryAmbient or AVAudioSessionCategorySoloAmbient, depending on whether mixing is enabled. The configuration options give access to all session categories that make sense for libpd, at least according to Apple’s documentation. In practice, some configurations may not be available on your target device. Make sure to test your code on an actual device and be prepared to try more than one audio configuration.
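The category selection just described can be written down as a small decision table. The following is only an illustrative sketch in plain Java, not code from libpd: the class and method names are hypothetical, and the category names appear as strings. The ambient mapping assumes that enabling mixing selects AVAudioSessionCategoryAmbient, which matches Apple's description of that category as one that mixes with audio from other apps.

```java
// Illustrative only: a plain-Java decision table for the category selection
// described in the text. Class and method names are hypothetical.
final class SessionCategoryTable {

    // Category chosen by configurePlaybackWithSampleRate, per the text:
    // PlayAndRecord when audio input is required, Playback otherwise.
    static String playbackCategory(boolean inputEnabled) {
        return inputEnabled
                ? "AVAudioSessionCategoryPlayAndRecord"
                : "AVAudioSessionCategoryPlayback";
    }

    // Category chosen by configureAmbientWithSampleRate. Assumption: mixing
    // selects Ambient, and no mixing selects SoloAmbient.
    static String ambientCategory(boolean mixingEnabled) {
        return mixingEnabled
                ? "AVAudioSessionCategoryAmbient"
                : "AVAudioSessionCategorySoloAmbient";
    }
}
```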

The mixing flag indicates whether the audio session will allow simultaneous output from other apps. Moreover, an instance of PdAudioController will register itself as an audio session delegate that suspends audio playback when a phone call comes in.

The goal of PdAudioController is to encapsulate commonly used behavior in order to protect developers from having to worry about configuration details of Core Audio. If the default behavior of PdAudioController does not meet your needs, you can create a subclass that overrides the methods that you wish to modify.

As a matter of fact, PdAudioController is merely a utility class that configures the audio session and then runs Pd in an audio unit provided by another class, PdAudioUnit. If you find PdAudioController entirely unsuitable, you can bypass it altogether and use AVAudioSession together with PdAudioUnit in order to gain complete control of your audio setup.

In order to initialize the audio glue, you need to specify a number of parameters. Most of them (sample rate, number of channels) are obvious, but one, the number of ticks per buffer, requires an explanation.

Pd computes audio in chunks of 64 frames, known as ticks. When specifying the number of ticks per buffer, you are effectively choosing the duration of the audio buffer through which Pd will exchange audio samples with the operating system. For example, if you request four ticks per buffer at a sample rate of 44100 Hz, then the duration will be (4 * 64) / 44100 Hz ≈ 5.8 ms. Note that this is only a request; PdAudio and PdAudioController will negotiate with the audio subsystem to get a buffer size that’s as close as possible to your request, but depending on the capabilities of your platform, the actual buffer size may be different.
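The arithmetic relating ticks, frames, and buffer duration can be sketched in a few lines of Java. The helper class below is hypothetical and not part of libpd. It relies only on the fact, stated above, that one tick is 64 frames. The last method rounds a desired latency to the nearest whole number of ticks, which is just one plausible way to choose a value to request.

```java
// Hypothetical helper, not part of libpd: converts between ticks, frames,
// and buffer duration, using Pd's fixed tick size of 64 frames.
final class BufferMath {
    static final int FRAMES_PER_TICK = 64;

    // Number of frames in a buffer of the given number of ticks.
    static int framesPerBuffer(int ticksPerBuffer) {
        return ticksPerBuffer * FRAMES_PER_TICK;
    }

    // Duration of one buffer in milliseconds at the given sample rate (Hz).
    static double bufferDurationMs(int ticksPerBuffer, int sampleRate) {
        return 1000.0 * framesPerBuffer(ticksPerBuffer) / sampleRate;
    }

    // Whole number of ticks closest to the desired latency, never less than 1.
    static int ticksForLatencyMs(double latencyMs, int sampleRate) {
        long ticks = Math.round(latencyMs * sampleRate / (1000.0 * FRAMES_PER_TICK));
        return (int) Math.max(1, ticks);
    }
}
```

For instance, BufferMath.bufferDurationMs(4, 44100) reproduces the figure of roughly 5.8 ms from the example above. The value you compute is still only a request, and the actual buffer size may differ.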

In Objective-C, you don’t have to explicitly specify the number of ticks per buffer because Core Audio will provide a usable default: If you don’t set the number of ticks per buffer, the buffer size will be 512 frames, i.e., eight ticks per buffer.

One minor difference between the Android version and the iOS version is that the Android version will let you choose any combination of input and output channels as long as the requested channel numbers are available, while most audio configurations for iOS will only allow audio output. When audio input is enabled, the number of input channels must equal the number of output channels because the audio unit that connects libpd to Core Audio uses the same channel configuration and buffer for both input and output.

Another difference is the way PdAudio and PdAudioController handle configuration failures. In Java, PdAudio will either give you the configuration you requested, or it will fail and throw an IOException. In Objective-C, the configuration method returns a value of type PdAudioStatus, which is an enum with three elements: PdAudioOK, PdAudioError, and PdAudioPropertyChanged. The first two indicate success or failure, as one might expect. The third one, PdAudioPropertyChanged, indicates partial success, i.e., the controller was able to configure the audio session and create an audio unit, but it had to adjust some parameters. For example, if the requested sample rate is not available, the controller will use the current hardware sample rate instead. When asked to configure input channels on a system that does not provide audio inputs, the controller will configure the audio without inputs. If an audio configuration method returns PdAudioPropertyChanged, you can query the properties of the controller in order to determine whether the outcome is acceptable, or you can just treat it as a failure.
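Because the Java glue either succeeds or throws an IOException, one way to cope with devices that reject a configuration is to try several configurations in order of preference. The sketch below illustrates that idea under stated assumptions. The Initializer interface is a hypothetical stand-in for PdAudio.initAudio so that the example runs without Android, and the row layout of the configuration table is a choice made for this example only.

```java
import java.io.IOException;

// Hypothetical sketch: try audio configurations in order of preference.
final class AudioConfigFallback {

    // Stand-in for PdAudio.initAudio; on Android, an implementation would
    // presumably forward to PdAudio.initAudio(sr, in, out, ticks, true).
    interface Initializer {
        void init(int sampleRate, int inChannels, int outChannels, int ticksPerBuffer)
                throws IOException;
    }

    // Each row is {sampleRate, inChannels, outChannels, ticksPerBuffer}.
    // Returns the index of the first row that initializes successfully,
    // or -1 if every row fails.
    static int firstWorking(Initializer initializer, int[][] configs) {
        for (int i = 0; i < configs.length; i++) {
            int[] c = configs[i];
            try {
                initializer.init(c[0], c[1], c[2], c[3]);
                return i;
            } catch (IOException e) {
                // This configuration is unavailable; try the next one.
            }
        }
        return -1;
    }
}
```

A caller might list a configuration with audio input first and an output-only configuration second, then check the returned index to find out which one is in effect.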

Not only does the audio glue protect you from having to deal with the complexity of the audio subsystem, it also lets your app benefit from the evolution of the underlying technology. You specify the buffer size (essentially, the latency) that you want, and PdAudio and PdAudioController will get you as close to that as currently possible.

Once the audio glue has been initialized, all you need to do is activate and deactivate it as needed. In Java, this is accomplished by calling startAudio and stopAudio. In Objective-C, a setter for the active property serves the same purpose. Finally, if you want to change the audio settings, you can call PdAudio.initAudio(…) again (make sure to set the restart parameter to true) or reconfigure your instance of PdAudioController. Keep in mind, however, that some patches configure themselves at load time, and they may malfunction if you change the sample rate after they have been loaded.

Note

In Chapter 2, I mentioned that the DSP toggle of Pd is redundant when working with libpd. Now we see why. With libpd, it is preferable to start or stop the audio thread instead of toggling the DSP state of Pd. Using two controls for the same purpose would be a recipe for confusion. PdAudio and PdAudioController will automatically enable DSP in Pd upon initialization; I strongly recommend that you don’t touch the DSP toggle of Pd, either in your patch or in your application code.

In both Java and Objective-C, the methods for starting and stopping the audio thread simply turn audio processing on and off. This can result in discontinuities in the sound, which will be audible as clicks. The basic audio glue makes no attempt to avoid clicks because there are many different ways of dealing with clicks, and different apps will have different requirements. If clicks on start or stop turn out to be a concern for you, you are responsible for dealing with them. A common technique is to ramp the audio output down before stopping the audio thread, and then ramp it back up when starting the thread again; you can implement this in a subclass of PdAudio or PdAudioController. Miller Puckette’s book, The Theory and Technique of Electronic Music, discusses clicks and their suppression in great depth.
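To make the ramping technique concrete, here is a minimal sketch of a linear gain ramp applied to a block of samples. It illustrates only the arithmetic. The class name is hypothetical, the buffer is assumed to be mono float samples, and where such a ramp belongs in a real app (in the patch itself or in the audio glue) is left open.

```java
// Hypothetical sketch: linear fade applied in place to a mono float buffer.
final class GainRamp {

    // Scales each sample by a gain that moves linearly from startGain at the
    // first frame to endGain at the last frame. A single-sample buffer is
    // scaled by endGain.
    static void applyLinearRamp(float[] samples, float startGain, float endGain) {
        int n = samples.length;
        for (int i = 0; i < n; i++) {
            float t = (n > 1) ? (float) i / (n - 1) : 1.0f;
            samples[i] *= startGain + t * (endGain - startGain);
        }
    }
}
```

Calling applyLinearRamp(buffer, 1.0f, 0.0f) on the last block before stopping fades the output to silence, and swapping the two gains fades it back in on the first block after restarting.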