VST Plugins
Virtual Studio Technology
Virtual Studio Technology (VST) is an audio plug-in software interface that integrates software synthesizers and effects units into digital audio workstations. VST and similar technologies use digital signal processing to simulate traditional recording studio hardware in software. Thousands of plugins exist, both commercial and freeware, and many audio applications support VST under license from its creator, Steinberg.
Steinberg's VST SDK is a set of C++ classes based around an underlying C API. The SDK can be downloaded from their website.
In addition, Steinberg has developed VSTGUI, another set of C++ classes that can be used to build a graphical interface. There are classes for buttons, sliders, displays, etc. Note that these are low-level C++ classes and the look and feel still have to be created by the plugin manufacturer. VSTGUI is part of the VST SDK and is also available as a SourceForge project.[13]
There are also several ports to other programming languages available from third parties.
Many commercial and open-source VSTs are written using the Juce C++ framework instead of direct calls to the VST SDK because this allows multi-format (VST, Audio Units and Real Time AudioSuite) binaries to be built from a single codebase.
Download
https://www.steinberg.net/developers/
• Audio Plug-Ins SDK (Format: zip, 104.9 MB)
• VST Module Architecture SDK (Format: zip, 0.35 MB)
• ASIO SDK (Format: zip, 5.70 MB)
• GameAudioConnect SDK (Format: zip, 8.70 MB)
Anatomy of an Audio Plugin
Because the audio plugin specifications for AAX, AU, and VST3 are all fundamentally the same, I will deal with VST only. They all implement the same sets of attributes and behaviors, tailored specifically to the problem of packaging audio signal processing software in a pseudo-generic way.
Plugins are encapsulated in C++ objects, derived from a specified base class, or set of base classes, that the manufacturer provides. The plugin host instantiates the plugin object and receives a base class pointer to the newly created object. The host may call only those functions that it has defined in the API, and the plugin is required to implement those functions correctly in order to be considered a proper plugin to the DAW—if required functions are missing or implemented incorrectly, the plugin will fail. This forms a contract between the manufacturer and the plugin developer. The deeper you delve into the different APIs, the more you realize how similar they really are internally. This chapter is all about the similarities between the APIs—understanding these underlying structural details will help your programming, even if you are using a framework like JUCE or our new ASPiK. In the rest of the book I will refer to the plugin and the host. The plugin of course is our little signal processing gem that is going to do something interesting with the audio, packaged as C++ objects, and the host is the DAW or other software that loads the plugin, obtains a single pointer to its fundamental base class, and interacts with it.
Each plugin API is packaged in an SDK, which is really just a bunch of folders full of C++ code files and other plugin components. None of the APIs come with precompiled libraries to link against so all of the code is there for you to see. To use any of the frameworks, like JUCE or ASPiK, you will need to download the SDK for each API you wish to support. The SDKs always come with sample code; the first thing you want to do is open some of that code and check it out, even if you are using a framework that hides the implementation details. Remember: do not be afraid. All of the APIs do the same basic thing. Yes, there is a lot of API-specific stuff, but you really need to get past that and look for the similarities. This chapter will help you with that, and the following four chapters will provide many more details and insights. A good exercise is to open up the most basic sample project that the SDK provides (usually this is a volume plugin) and then try to find each of the pieces in this chapter somewhere in the code.
2.1 Plugin Packaging: Dynamic-Link Libraries (DLLs)
All plugins are packaged as dynamic-link libraries (DLLs). You will sometimes see these called dynamic linked or dynamically linked but they all refer to the same thing. Technically, DLL is a Microsoft-specific term for a shared library, but the term has become so common that it seems to have lost its attachment to its parent company. We are going to use it universally to mean a precompiled library that the executable links with at runtime rather than at compile time. The executable is the DAW and the precompiled library is the DLL, which is your plugin. If you already understand how DLLs work then you can safely skip this section, but if they are new to you, then it is important to understand this concept regardless of how you plan on writing the plugin.
To understand DLLs, first consider static-link libraries, which you have probably already used, perhaps unknowingly. C++ compilers include sets of precompiled libraries of functions for you to use in your projects. Perhaps the most common of these is the math library. If you try to use the sin() function you will typically get an error when you compile stating “sin() is not defined.” In order to use this function you must link to the library that contains it. You do this by placing #include <math.h> at the top of your file; depending on your compiler, you might also need to tell it to link to math.lib. When you do this, you are statically linking to the math library, a precompiled set of math functions in a .lib file (suffixed with .a on MacOS). Static linking is also called implicit linking. When the compiler comes across a math function, it replaces the function call with the precompiled code from the library. In this way, the extra code is compiled into your executable, and you cannot un-compile the math functions. Why would you want to do this? Suppose a bug is found in the sin() function and the math library has to be recompiled and redistributed. You would then have to recompile your software with the new library to get the bug fix. Static linking is shown figuratively in Figure 2.1a, where the math.lib code is inside of the host code.

A solution to our math bug problem would be linking to the functions at runtime. This means that these precompiled functions exist in a separate file that our executable (DAW) knows about and communicates with, but only after it starts running. This kind of linking is called dynamic linking or explicit linking and is shown in Figure 2.1b. The file that contains the precompiled functions is the DLL. The advantage is that if a bug is found in the library, you only need to redistribute the newly compiled DLL file rather than recompiling your executable with a fixed static library. The other advantage is that this arrangement (a host that connects to a component at runtime) works perfectly as a way to extend the functionality of a host without the host knowing anything about the component when the host is compiled. It also makes an ideal way to set up a plugin processing system.
The host becomes the DAW and your plugin is the DLL, as shown in Figure 2.1c.
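To make the runtime-linking idea concrete, here is a minimal sketch of how a host might explicitly load a plugin library using the POSIX loader. The file name, the exported symbol createPlugin, and its signature are all hypothetical; a real plugin API exports its own factory entry point, and a Windows host would use LoadLibrary/GetProcAddress instead.

```cpp
#include <dlfcn.h>   // POSIX dynamic loader (Windows would use LoadLibrary/GetProcAddress)
#include <cstdio>

// Hypothetical plugin entry point: a factory function exported by the DLL.
typedef void* (*CreatePluginFunc)();

int main()
{
    // Explicit (runtime) linking: the host opens the shared library after it starts.
    void* handle = dlopen("./MyPlugin.so", RTLD_NOW);   // placeholder path
    if (!handle) { std::printf("load failed: %s\n", dlerror()); return 1; }

    // Look up the exported symbol by name; the string is illustrative only.
    auto createPlugin = reinterpret_cast<CreatePluginFunc>(dlsym(handle, "createPlugin"));
    if (createPlugin)
    {
        void* plugin = createPlugin();   // the host now holds a pointer into the DLL's code
        (void)plugin;                    // ... query, validate, and use the plugin here
    }

    dlclose(handle);                     // unloading phase: the library is released
    return 0;
}
```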
Figure 2.1: (a) A host with a statically linked math library compiled into it. (b) A host communicating with a DLL version. (c) A plugin version.

When the host is first started, it usually goes through a process of trying to load all of the plugin DLLs it can find in the folder or folders that you specify (Windows) or in well defined API-specific folders (MacOS). This initial phase is done to test each plugin to ensure compatibility if the user decides to load it. This includes retrieving information such as the plugin’s name, which the host will need to populate its own plugin menus or screens, and may also include basic or even rigorous testing of the plugin. We will call this the validation phase. For a plugin developer, this can be a very frustrating phase: if your plugin has a runtime bug in the loading code that the compiler did not catch, it will likely crash the host application. (Reaper® is one exception to this; it will tell you if your plugin failed to load, but it usually won’t crash as a result.) In addition, the host DAW may forbid itself from ever loading your plugin again in the future. This is particularly true with Apple Logic®, which will permanently forbid your plugin if it fails validation. This can make plugin development confounding.
You can find some strategies for dealing with validation phase errors at www.willpirkle.com but the best thing you can do is prevalidate your plugins prior to running them in the host. AU and VST provide mechanisms to do this.
Usually, the host unloads all of the plugins after the validation phase and then waits for the user to load the plugin onto an active track during a session. We will call this the loading phase. The validation phase already did at least part (if not all) of this; if you have bugs during validation, they are often caused during the loading operation. When your plugin is loaded, the DLL is pulled into the executable’s process address space, so that the host may retrieve a pointer from it that has an address within the DAW’s own address space. For most people, this is the only way they have ever used a plugin. There is a less common way of doing this between processes, meaning your DAW would load a plugin from another executable’s address space. The loading phase is where most of the plugin description occurs. The description is the first aspect of plugin design you need to understand, and it is mostly simple stuff.
After the loading phase is complete, the plugin enters the processing phase, which is often implemented as an infinite loop on the host. During this phase, the host sends audio to your plugin, your cool DSP algorithm processes it, and then your plugin sends the altered data back to the host. This is where most of your mathematical work happens, and this is the focus of the book. All of the plugin projects have been designed to make this part easy and universally transportable across APIs and platforms. While processing, the user may save the state of the DAW session and your plugin’s state information must also be saved. They may later load that session, and your plugin’s state must be recalled.
Eventually, your plugin will be terminated either because the user unloads it or because the DAW or its session is closed. During this unloading phase, your plugin will need to destroy any allocated resources and free up memory. This is another potential place your plugin can crash the host, so be sure to test repeated loading and unloading.
2.2 The Plugin Description: Simple Strings
The first place to start is with the plugin description. Each API specifies a slightly different mechanism for the plugin to let the host know information about it. Some of this is really simple, with only a few strings to fill in: for example, the text string with its name (such as “Golden Flanger”) that the plugin wants the host to show the user. The simplest part of the plugin description involves defining some basic information about the plugin. This is usually done with simple string settings in the appropriate location in code, or sometimes within the compiler itself. These include the most basic types of information:
• plugin name
• plugin short name (AAX only)
• plugin type: synth or FX, and variations within these depending on API
• plugin developer’s company name (vendor)
• vendor email address and website URL
These descriptions are set with simple strings or flags; there isn’t much more to discuss. If you use ASPiK, you will define this information in a text file—and it will take you all of about 30 seconds to do so. There are some four-character codes that need to be set for AU, AAX, and VST3, which date back to as early as VST1. The codes are made of four ASCII characters. Across the APIs, there are two of these four-character codes, plus one AAX-specific version, as follows here.
• product code: must be unique for each plugin your company sells
• vendor code: for our company it is “WILL”
AAX requires another code called the AAX Plugin ID that is a multi-character string, while VST3 requires yet another plugin code. This 128-bit code is called a globally unique identifier (GUID, which is pronounced “goo-id”); it is also called a universally unique identifier (UUID). Steinberg refers to it as a FUID as it identifies their FUnknown object (FUnknown IDentifier) but it is identical to the standard GUID. The GUID is generated programmatically with a special piece of software; there are freely available versions for most operating systems. The software aims to create a genuinely unique number based on the moment in time when you request the value from it, in addition to other numbers it gets from your network adaptor or other internal locations. The moment in time is the number of 100-nanosecond intervals that have elapsed since midnight of October 15, 1582. If you use ASPiK, this value is generated for you when you create the project.
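As a purely illustrative example, a VST3 plugin typically stores its generated GUID in a Steinberg::FUID constructed from four 32-bit words. The hex values below are placeholders, and the exact declaration style can vary between SDK versions.

```cpp
#include "pluginterfaces/base/funknown.h"   // Steinberg::FUID (VST3 SDK)

// The four 32-bit words come from a GUID generator tool; these values are
// placeholders only and must never be reused for a real product.
static const Steinberg::FUID kMyPluginProcessorUID(0x11223344, 0x55667788, 0x99AABBCC, 0xDDEEFF00);
```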
2.2.1 The Plugin Description: Features and Options
In addition to these simple strings and flags, there are some more complicated things that the plugin needs to define for the host at load-time. These are listed here and described in the chapters ahead.
• plugin wants a side-chain input
• plugin creates a latency (delay) in the signal processing and needs to inform the host
• plugin creates a reverb or delay “tail” after playback is stopped and needs to inform the host that it wants to include this reverb tail; the plugin sets the tail time for the host
• plugin has a custom GUI that it wants to display for the user
• plugin wants to show a Pro Tools gain reduction meter (AAX)
• plugin factory presets
Lastly, there are some MacOS-specific codes that must be generated called bundle identifiers or bundle IDs. These are also supposed to be unique codes that you create for each of your plugin designs and are part of the MacOS component handler system. If you have used Xcode, then you are already aware of these values that are usually entered in the compiler itself or in the Info.plist file. As with the VST3 GUID, if you use ASPiK, these bundle ID values are generated for you and you’ll never need to concern yourself with them.
2.3 Initialization: Defining the Plugin Parameter Interface
Almost all plugins we design today require some kind of GUI (also called the UI). The user interacts with the GUI to change the functionality of the plugin. Each control on the GUI is connected to a plugin parameter. The plugin parameters might be the most complicated and important aspects of each API; much of the next few chapters will refer to these parameters. The plugin must declare and describe the parameters to the host during the plugin-loading phase.
There are three fundamental reasons that these parameters need to be exposed:
1. If the user creates a DAW session that includes parameter automation, then the host is going to need to know how to alter these parameters on the plugin object during playback.
2. All APIs allow the plugin to declare “I don’t have a GUI,” in which case the host will provide a simple GUI for the user. The appearance of this default UI is left to the DAW programmers; usually it is very plain and sparse, just a bunch of sliders or knobs.
3. When the user saves a DAW session, the parameter states are saved; likewise when the user loads a session, the parameter states need to be recalled.

The implementation details are described in the next chapters, but the relationship between the GUI, the plugin parameters, the host, and the plugin is fundamentally the same across all of the APIs, so we can discuss that relationship in more abstract terms. Remember that the APIs are all basically the same! Let’s imagine a simple plugin named VBT that has some parameters for the user to adjust:
• volume
• bass boost/cut (dB)
• treble boost/cut (dB)
• channel I/O: left, right, stereo
Figure 2.2: (a) A simple plugin GUI. (b) In this GUI, each GUI control connects to an underlying parameter object; the host gathers control changes and they are delivered to the plugin as parameter updates. (c) DAW automation works the same way, except the host is interacting with the parameters rather than the user/GUI (note that the Channel parameter is not automatable).

Figure 2.2a shows a simple GUI that we’ve designed for VBT. Each GUI control corresponds to an underlying plugin parameter. The plugin must declare these parameters during the loading phase, with one parameter per GUI control. Without exception, these parameters are implemented as C++ objects, and the result of the declaration is a list of these objects. The actual location of this list of parameters depends on the API, but it is generally stored in a plugin base class object, and the DAW has a mechanism to get and set parameter values in the list. This is how it stores and loads the plugin state when the user stores and loads a session that includes it. In these diagrams I will show this parameter list as being part of the host regardless of where it is actually located. We’ll discuss the mechanisms that the host uses to update the parameters in a thread-safe way in the API programming guide chapters.
Figure 2.2b shows the concept, with each GUI control connecting to a parameter. The parameter updates are sent to the plugin during the processing phase. Figure 2.2c shows the same thing—only it is track automation that is updating the parameters rather than the user. The implementation details are different for each API, but the information being stored about the parameters is fundamentally the same. There are two distinct types of parameters: continuous and string-list. Continuous parameters usually connect to knobs or sliders that can transmit continuous values. This includes float, double, and int data types. String-list parameters present the user with a list of strings to choose from, and the GUI controls may be simple (a drop-down list box) or complex (a rendering of switches with graphic symbols). The list of strings is sometimes provided in a comma-separated list, and other times packed into arrays of string objects, depending on the API. For the Channels parameter in Figure 2.2a, the comma-separated string list would be:
“stereo, left, right”

The parameter attributes include:
• parameter name (“Volume”)
• parameter units (“dB”)
• whether the parameter is a numerical value or a list of string values, and if it is a list of string values, the strings themselves
• parameter minimum and maximum limits
• parameter default value for a newly created plugin
• control taper information (linear, logarithmic, etc.)
• other API-specific information
• auxiliary information
Some of the parameter attributes are text strings such as the name and units, while others are numerical such as the minimum, maximum, and default values. The tapering information is another kind of data that is set. In all of the APIs, you create an individual C++ object for each parameter your plugin wants to expose. If you compare the parameter interface sections of each of the following three chapters, you will see just how similar these APIs really are.
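Although each API supplies its own parameter class, a framework-neutral sketch of the idea looks something like this. The PluginParameter struct and the VBT values below are hypothetical stand-ins, not types from any SDK.

```cpp
#include <string>
#include <vector>

// Hypothetical, framework-neutral parameter description; each API supplies
// its own C++ class with attributes like these.
struct PluginParameter
{
    std::string name;                      // "Volume"
    std::string units;                     // "dB"
    double minValue = 0.0;                 // lower limit
    double maxValue = 1.0;                 // upper limit
    double defaultValue = 0.0;             // value for a freshly loaded plugin
    std::vector<std::string> stringList;   // non-empty for string-list parameters
};

// During the loading phase the plugin builds its parameter list, one entry per GUI control.
std::vector<PluginParameter> declareVBTParameters()
{
    return {
        { "Volume",   "dB", -60.0,  0.0, -3.0, {} },
        { "Bass",     "dB", -20.0, 20.0,  0.0, {} },
        { "Treble",   "dB", -20.0, 20.0,  0.0, {} },
        { "Channels", "",     0.0,  2.0,  0.0, { "stereo", "left", "right" } }
    };
}
```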
2.3.1 Initialization: Defining Channel I/O Support
An audio plugin is designed to work with some combination of input and output channels. The most basic channel support is for three common setups: mono in/mono out, mono in/stereo out, and stereo in/stereo out. However, some plugins are designed for very specific channel processing (e.g. 7.1 DTS® vs. 7.1 Sony). Others are designed to fold-down multi-channel formats into mono or stereo outputs (e.g. 5.1 surround to stereo fold-down). Each API has a different way of specifying channel I/O support; it may be implemented as part of the plugin description, or the initialization, or a combination of both. You can find the exact mechanisms for each API in the plugin programming guide chapters that follow.
2.3.2 Initialization: Sample Rate Dependency
Almost all of the plugin algorithms we study are sensitive to the current DAW sample rate, and its value is usually part of a set of equations we use to process the audio signal. Many plugins will support just about any sample rate, and its value is treated similarly to plugin parameters—it is just another number in a calculation. A few plugins may be able to process only signals with certain sample rates. In all APIs, there is at least one mechanism that allows the plugin to obtain the current sample rate for use in its calculations. In all cases, the plugin retrieves the sample rate from a function call or an API member variable. This is usually very straightforward and simple to implement.
2.4 Processing: Preparing for Audio Streaming
The plugin must prepare itself for each audio streaming session. This usually involves resetting the algorithm, clearing old data out of memory structures, resetting buffer indexes and pointers, and preparing the algorithm for a fresh batch of audio. Each API includes a function that is called prior to audio streaming, and this is where you will perform these operations. In ASPiK, we call this the reset() operation; if you used RackAFX v1 (RAFX1), the function was called prepareForPlay(), a secret nod to the Korg 1212 I/O product that I designed, whose driver API used a function with the identical name. In each of the programming guide chapters, there is a section dedicated to this operation that explains the function name and how to initialize your algorithm. One thing to note: trying to use transport information from the host (i.e. information about the host’s transport state: playing, stopped, looping, etc.) is usually going to be problematic, unless the plugin specifically requires that information, for example a pattern looping algorithm that needs to know how many bars are looping and what the current bar includes. In addition, some hosts are notoriously incorrect when they report their transport states (e.g. Apple Logic), so they may report that audio is streaming when it is not.
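As a sketch of what such a function typically does (the reset() name follows the ASPiK convention mentioned above; the plugin class and its members are made up for illustration):

```cpp
#include <vector>

// Hypothetical plugin state; real plugins hold filter states, delay lines, etc.
class SimpleDelayPlugin
{
public:
    // Called by the framework before audio streaming begins (ASPiK: reset()).
    bool reset(double sampleRate)
    {
        currentSampleRate = sampleRate;

        // Size the delay line for the new sample rate and flush stale audio.
        delayLine.assign(static_cast<size_t>(sampleRate * maxDelaySeconds), 0.0);
        writeIndex = 0;
        return true;
    }

private:
    double currentSampleRate = 44100.0;
    double maxDelaySeconds = 2.0;
    std::vector<double> delayLine;
    size_t writeIndex = 0;
};
```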
2.4.1 Processing: Audio Signal Processing (DSP)
Ultimately the effects plugin must process audio information, transforming it in some way. The audio data come either from audio files loaded into a session, or from real-time audio streaming into the audio adaptor. These days, and across all APIs, the audio input and output samples are formatted as floating point data on the range of [−1.0, +1.0] as described in Chapter 1. AU, AAX (native), and RAFX use float data types while VST3 allows either float or double data types, though these will likely change over time. In either case we will perform floating point math, and we will write all of our C++ objects to operate internally with double data types. In an interesting, but not unexpected, twist, all of the APIs send and receive multi-channel audio in the exact same manner. From the plugin’s point of view, it doesn’t matter whether the audio data comes from an audio file or from live input into hardware, or whether the final destination for the data is a file or audio hardware. In Figure 2.3 we’ll assume we have live audio streaming into the DAW.
On the left of Figure 2.3 is the Hardware Layer with the physical audio adaptor. It is connected to the audio driver via the hardware bus. The driver resides in the Hardware Abstraction Layer (HAL). Regardless of how the audio arrives from the adaptor, the driver formats it according to its OS-defined specification and that is usually in an interleaved format with one sample from each channel in succession. For stereo data, the first sample is from the left channel and then it alternates as left, right, left, right, etc. Note that this is the same way data are encoded in most audio files including the .wav format. So, if the audio originated from a file, it would also be delivered as interleaved data. The driver delivers the interleaved data to the application in chunks called buffers. The size of the buffers is set in the DAW, the driver preferences, or by the OS.
The DAW then takes this data and de-interleaves it, placing each channel in its own buffer called the input channel buffer. The DAW also prepares (or reserves) buffers that the plugin will write the processed audio data into called the output channel buffers. The DAW then delivers the audio data to the plugin by sending it a pointer to each of the buffers. The DAW also sends the lengths of the buffers along with channel count information. For our stereo example here, there are four buffers: left input, right input, left output, and right output. The two input buffer pointers are delivered to the plugin in a two-slot array and the output buffers are delivered the same way, in their own two-slot array. For a specialized fold-down plugin that converts 5.1-surround audio into a stereo mix, there would be six pointers for the six input channels and two pointers for the pair of output channels. It should be obvious that sending and receiving the buffer data using pointers to the buffers along with buffer sample counts is massively more efficient than trying to copy the data into and out of the plugin, or coming up with some difficult-to-implement shared memory scheme. When the plugin returns the processed audio by returning the pointers to the output buffers, the DAW then re-interleaves the data and ships it off to the audio driver, or alternatively writes it into an audio file. This gives us some stuff to think about:
• The plugin receives, processes and outputs buffers of data.
• Each channel arrives in its own buffer.
• Each buffer is transmitted using a single pointer that points to the first address location of the buffer.
• The length of each buffer must also be transmitted.
• The number of channels and their configuration must also be transmitted.
Figure 2.3: The complete audio path from hardware to plugin and back again.
With our audio data packaged as buffers, we have an important decision to make about processing the buffers. We can process each buffer of data independently, which we call buffer processing, or we can process one sample from each buffer at a time, called frame processing. Buffer processing is fine for multi-channel plugins whose channels may be processed independently—for example, an equalizer.
But many algorithms require that the samples from all channels be available at the moment each channel is processed. Examples would include a stereo-to-mono fold-down plugin, a ping-pong delay, many reverb algorithms, and a stereo-linked dynamics processor. Since frame processing is a subset of buffer processing and is the most generic approach, we will be doing all of the processing in this book in frames.
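A framework-neutral sketch of frame processing over de-interleaved buffers might look like this; the function name and signature are illustrative rather than taken from any particular API.

```cpp
// Process de-interleaved channel buffers one frame (one sample per channel) at a time.
// inputs/outputs: arrays of channel-buffer pointers; numChannels/numSamples: buffer geometry.
void processAudioBuffers(const float** inputs, float** outputs,
                         int numChannels, int numSamples)
{
    for (int sample = 0; sample < numSamples; ++sample)       // one frame per iteration
    {
        for (int ch = 0; ch < numChannels; ++ch)
        {
            // Every channel's sample for this frame is available here, which is what
            // stereo-linked or fold-down algorithms need; this example just passes audio through.
            outputs[ch][sample] = inputs[ch][sample];
        }
    }
}
```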
2.5 Mixing Parameter Changes With Audio Processing

Our plugin needs to deal with the parameter changes that occur in parallel with the audio processing, and this is where things become tricky. The incoming parameter changes may originate with the user’s GUI interaction, or they may occur as a result of automation. Either way we have an interesting issue to deal with: there are two distinct activities happening during our plugin operation. In one case, the DAW is sending audio data to the plugin for processing; in the other, the user is interacting with the GUI (or running automation, or doing both), and this requires the plugin to periodically re-configure its internal operation and algorithm parameters. The user does not really experience the plugin as two activities: they simply listen to the audio and adjust the GUI as needed. For the user, this is all one unified operation. We may also have output parameters to send back to the GUI; this usually takes the form of volume unit (VU) metering. We might also have a custom graph view that requires us to send it information.

All of these things come together during the buffer processing cycle. We’ve already established that in all APIs the audio is processed in buffers, and the mechanism is a function call into your plugin. For AU and VST the function call has a predefined name, and in AAX you can name the function as you wish. So in all cases, the entire buffer processing cycle happens within a single function call. We can break the buffer processing cycle into three phases of operation.
1. Pre-processing: transfer GUI/automation parameter changes into the plugin and update the plugin’s internal variables as needed.
2. Processing: perform the DSP operations on the audio using the newly adjusted internal variables.
3. Post-processing: send parameter information back to the GUI; for plugins that support MIDI output, this is where they would generate those MIDI messages as well.
2.5.1 Plugin Variables and Plugin Parameters
It is important to understand the difference between the plugin’s internal variables and the plugin parameters. Each plugin parameter holds a value that relates to its current state—that value may be originating with a user–GUI interaction or automation, or it may be an output value that represents the visual state of an audio VU meter. The parameter object stores this value in an internal variable.
However, the parameter’s variable is not what the plugin will use during processing, and in some cases (AU), the plugin can never really access that variable directly. Instead, the plugin defines its own set of internal variables that it uses during the buffer processing cycle, and copies information from the parameters into them. Let’s think about a volume, bass and treble control plugin. The user adjusts the volume with a GUI control that is marked in dB. That value in dB is delivered to the plugin at the start of the buffer processing cycle. The plugin needs to convert the value in dB to a scalar multiplier coefficient, and then it needs to multiply the incoming audio samples by this coefficient. There are a few strategies for handling this process. We are going to adhere to a simple paradigm for all of our plugin projects depicted in Figure 2.4. Notice that we name the input parameters as inbound and the output parameters as outbound.
• Each GUI control or plugin parameter (inbound or outbound) will ultimately connect to a member variable on the plugin object.
• During the pre-processing phase, the plugin transfers information from each of the inbound parameters into its corresponding member variable; if the plugin needs to further process the parameter values (e.g. convert dB to a scalar multiplier), then that is done here as well.
• During the processing phase, the DSP algorithm uses the plugin’s internal variables for calculations; for output parameters like metering data, the plugin assembles that during this phase as well.
• During the post-processing phase, the plugin transfers output information from its internal variables into the associated outbound parameters.

Figure 2.4: The audio plugin’s operational connections.

Figure 2.4 shows how the DAW and plugin are connected. The audio is delivered to the plugin with a buffer processing function call (processAudioBuffers). During the buffer processing cycle, input parameters are read, audio processing happens, and then output parameters are written. The outbound audio is sent to the speakers or an audio file and the outbound parameters are sent to the GUI.

Concentrating on the pre-processing phase, consider two different ways of handling the volume control on our plugin. In one naïve case, we could let the user adjust the volume with a GUI control that transmits a value between 0.0 and 1.0. This makes for a poor volume control because we hear audio amplitude logarithmically. Nevertheless, consider that arrangement of parameter and plugin variables. The GUI would transmit a value between 0.0 and 1.0. We would transfer that data into an internal variable named volume, then multiply the incoming audio data by the volume value. This is shown in Figure 2.5a: the raw GUI value is used directly in the calculation of the audio output.

In Figure 2.5b, the improved plugin transfers values in dB on a range of −60.0 to 0.0 into the plugin, which then cooks the raw data into a new variable, also named volume, that the plugin can use in its calculations. In this case, the cooking operation requires only one line of code to convert the dB value into a scalar multiplier. Figure 2.5c shows a panning control that transmits a value between −1.0 (hard left) and +1.0 (hard right). The plugin must convert this information into two scalar multipliers, one for each channel. Since this operation is more complicated, we can create cooking functions to do the work and produce nicer code.

For almost all of our plugin parameters, we will need to cook the data coming from the GUI before using it in our calculations. To keep things simple and consistent across plugins, we will adhere to the following guidelines:
• Each inbound plugin parameter value will be transferred into an internal plugin member variable that corresponds directly to its raw data; e.g. a dB-based parameter value is transferred into a dB-based member variable.
• If the plugin requires cooking the data, it may do so but it will need to create a second variable to store the cooked version.
• Each outbound plugin parameter will likewise be connected to an internal variable.
• All VU meter data must be on the range of 0.0 (meter off) to 1.0 (meter fully on) so any conversion to that range must be done before transferring the plugin variable to the outbound parameter.
Figure 2.5: (a) A simple volume control delivers raw data to the plugin for immediate use. (b) A better version delivers raw dB values to the plugin, which it must first cook before using. (c) A panning control requires more complex cooking.
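The cooking operations just described can be sketched as a pair of small helper functions. The formulas below are generic (a dB-to-linear conversion and a constant-power sin/cos pan law), not code from any SDK, and other pan laws are equally valid.

```cpp
#include <cmath>

// Cook a volume value in dB into the scalar multiplier used by the DSP code.
inline double dBToRaw(double volume_dB)
{
    return std::pow(10.0, volume_dB / 20.0);   // e.g. -6 dB -> ~0.501
}

// Cook a bipolar pan value (-1 = hard left, +1 = hard right) into left/right gains.
// A constant-power (sin/cos) pan law is assumed here.
inline void panToGains(double pan, double& leftGain, double& rightGain)
{
    const double kPi = 3.14159265358979323846;
    const double angle = (pan + 1.0) * 0.25 * kPi;   // map [-1,+1] to [0, pi/2]
    leftGain  = std::cos(angle);
    rightGain = std::sin(angle);
}
```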
If you want to combine the first two guidelines as a single conceptual block in your own plugin versions, feel free to do so. Our experience has been that when learning, consistency can be very helpful so we will adhere to these policies throughout. Another reason to keep raw and cooked variables separate involves parameter smoothing.
2.5.2 Parameter Smoothing
You can see in Figure 2.6a that the plugin parameters are transferred into the plugin only at the beginning of the buffer processing cycle. If the user makes drastic adjustments on the GUI control, then the parameter changes arriving at the plugin may be coarsely quantized, which can cause clicks while the GUI control is being altered (sometimes called zipper noise). If the user has adjusted the buffer size to be relatively large, the problem is only exacerbated, as the incoming parameter control changes will be even more spread out in time. To alleviate this issue, we may employ parameter smoothing. It is important to understand that if this smoothing is to occur, it must be done on the plugin side, not in the GUI or DAW.

Figure 2.6: (a) In ordinary frame processing, parameter updates affect the whole buffer. (b) In processing with parameter smoothing, each frame uses a new and slightly different parameter value. (c) The complete buffer processing cycle includes the transfers.

With parameter smoothing, we use interpolation to gradually adjust parameter changes so that they do not jump around so quickly. Our parameter smoothing operations will default to performing a new smoothing update on each sample period within the buffer processing cycle. This means that the smoothed internal plugin variable will be updated during every frame of processing, as shown in Figure 2.6b. The good news is that this will eliminate clicks or zipper noise. The bad news is that if the parameter change requires a complex cooking equation or function, we can burn up a lot of CPU cycles. The parameter smoothing object we use may also be programmed to perform the smoothing operation only every N sample periods (called granularity), but of course the more granularity we add, the more likely the control changes will produce clicks and noise. We’ll discuss the parameter smoothing object in more detail in Chapter 6. Since parameter smoothing will incur some CPU expense, you should use it only when absolutely necessary. For a GUI control that transmits a discrete value (for example, a three-position switch that transmits a value of 0, 1, or 2 corresponding to each position), there is usually no need to smooth this value to slow down the movement between positions. We will smooth only those parameters that are linked to continuous (float or double) plugin variables.
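A minimal linear-ramp smoother illustrates the idea; this is one common approach and is not the actual ASPiK smoothing object, whose details are covered in Chapter 6.

```cpp
// A simple linear-ramp parameter smoother: glides from the current value to a new
// target over a fixed time, updated once per sample (or once per N samples).
class LinearSmoother
{
public:
    void reset(double sampleRate, double rampTimeMs, double initialValue)
    {
        stepsRemaining = 0;
        current = target = initialValue;
        rampSamples = static_cast<int>(sampleRate * rampTimeMs / 1000.0);
    }

    void setTarget(double newTarget)        // called when a fresh GUI/automation value arrives
    {
        target = newTarget;
        stepsRemaining = rampSamples;
        increment = (rampSamples > 0) ? (target - current) / rampSamples : 0.0;
    }

    double getNextValue()                   // called once per sample in the processing loop
    {
        if (stepsRemaining > 0) { current += increment; --stepsRemaining; }
        else                    { current = target; }
        return current;
    }

private:
    double current = 0.0, target = 0.0, increment = 0.0;
    int rampSamples = 0, stepsRemaining = 0;
};
```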
2.5.3 Pre and Post-Processing Updates
Our ASPiK kernel object (or your chosen plugin management framework) should provide a mechanism for performing the operations of transferring parameter data into and out of the plugin. To break this operation up further, we will adhere to another convention that separates the transfer of parameter information from the cooking functions that may be required; this will produce code that is easier to read and understand, and that is more compartmentalized. In Figure 2.6c you can see that the pre-processing block calls a function we will call syncInboundVariables. During this function call, each parameter’s value is transferred into an associated plugin variable. After each transfer, the function postUpdatePluginParameter will be called. During this function call, the plugin can then cook incoming parameter information as it needs to. During the post-processing phase, the function syncOutboundVariables is called: this copies the outbound plugin variables into their corresponding plugin parameters. There is no need for a post-cooking function in this block as the plugin can format the information as needed easily during the processing phase.
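Putting the convention together, the body of the buffer processing call can be organized roughly as follows. The class, member names, and stubbed-out hooks are illustrative only, echoing the function names used above.

```cpp
class MonolithicPlugin
{
public:
    // Illustrative outline of one buffer processing cycle, organized into the
    // pre-processing, processing, and post-processing phases described above.
    void processAudioBuffers(const float** inputs, float** outputs,
                             int numChannels, int numSamples)
    {
        // Pre-processing: transfer inbound parameter values and cook them.
        syncInboundVariables();

        // Processing: run the DSP algorithm using the cooked member variables.
        for (int sample = 0; sample < numSamples; ++sample)
            for (int ch = 0; ch < numChannels; ++ch)
                outputs[ch][sample] = inputs[ch][sample] * volumeCooked;

        // Post-processing: send outbound values (e.g. metering) back to the parameters.
        syncOutboundVariables();
    }

private:
    // Stubs standing in for the framework hooks described in the text.
    void syncInboundVariables()  { /* copy parameters -> members, call postUpdatePluginParameter() per transfer */ }
    void syncOutboundVariables() { /* copy members -> outbound parameters */ }

    double volumeCooked = 1.0;   // example cooked variable
};
```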
2.5.4 VST3 Sample Accurate Updates
The VST3 specification differs from the others in that it defines an optional parameter smoothing capability that may be implemented when the user is running automation in the DAW session. This involves only automation and is entirely separate from the normal user–GUI interactions that produce the same per buffer parameter updates as the other APIs. In the case of automation, the VST3 host may deliver intermediate values in addition to the last parameter update received. The exact mechanism is covered in Chapter 3; it is not a perfect solution to the parameter smoothing problem with automation, but it is definitely a step in the right direction. ASPiK is already designed to handle these parameter updates with a feature that may be optionally enabled. As with normal parameter smoothing, care must be taken when the updated parameter values must be cooked with a CPU intensive function. The same issues will exist as with normal parameter smoothing.
At this point, you should have a nice big-picture view of how the plugin operates at a functional block level. You can see how the plugin parameter updates and the audio signal processing are interleaved such that each buffer is processed with one set of parameter values, smoothed or not. That interleaving is a key part of these plugin architectures that transcends any individual API. The API-specific chapters that follow will elaborate with the details as they relate to those plugin specifications. Before we move on, we need to discuss the labels in Figure 2.4 that involve the threads and safe copy mechanisms.
2.5.5 Multi-Threaded Software
If you’ve studied microprocessor design and programming, you’ve come across something called the instruction pointer (IP). At its lowest level, all software runs in a loop of three stages: fetch, decode, and execute. The program fetches an instruction from memory, decodes it, and then executes the instruction. The IP points to the current instruction that is being fetched. After the instruction executes—and assuming no branching occurred—the IP will be updated to point to the next instruction in the sequence and the three-step cycle repeats. Branching (if/then statements) will alter the IP to jump to a new location in code. Function calls do the same thing and alter the IP accordingly. We say that the IP “moves through instructions” in the program. There was a time when you could write a piece of software with only one IP; most of us wrote our first “Hello World” program like that. An instruction pointer that is moving through a program from one instruction to the next is called a thread or a thread of execution.
This is shown in Figure 2.7a. Most of us spent our formative programming classes writing single-threaded programs that did very important things like calculating interest on a savings account or listing prime numbers.
Figure 2.7: (a) A single-threaded application has only one instruction pointer (IP) moving through the instructions and data. (b) A multi-threaded program features multiple IPs accessing the same instructions and data. Here, IP1 tries to write the volume variable while IP3 reads it. (c) In this diagram, a race-condition exists between thread 1 and thread 3 as they compete to write the volume variable with different values.
Now think about a computer game that simulates a jet fighter. This is a different kind of program. The graphics are showing us flying through the air: as we launch our missiles and hit targets there is accompanying audio in sync, and when we move the joystick we watch the sky dance around in response. All the while, there is a small radar indicator spinning around to show that the enemy is behind us. Even if we release the controls and do nothing, the airplane still flies, at least for a while. Now a bunch of things need to appear to happen simultaneously, and the programmers need to write the code accordingly. In this kind of software, there are multiple IPs, each moving through different parts of the code: one IP is running the code that paints the images on the screen, while another IP is looping over the code that plays the music soundtrack and yet another is tracking the motion of the joystick. This is an example of a multi-threaded application, shown in Figure 2.7b. Each different instance of an IP moving through code is one of the threads, and this software has multiple IPs. The OS is involved in creating the illusion that all of this is happening at once. On a single-CPU system, each thread of execution is allowed a sliver of processing time; the OS doles out the CPU time-slices in a round-robin fashion, moving from one thread to the next. It certainly looks like everything is happening at once, but in reality it is all chopped up into processing slices so short in duration that your brain can’t discern one from the other—it all merges into the beautiful illusion that you really are flying that jet. Even with multi-CPU systems, most programs still execute like this, with one slice at a time per thread; adding more CPUs just adds more time-slices.
Now think about our audio plugin. The two things that need to appear to happen at once, the audio processing and the user interaction with the GUI, are actually happening on two different threads that are being time-shared with the CPU. This means that our plugin is necessarily a multi-threaded piece of software. It has at least two threads of execution operating on it. Each thread can declare its priority in an attempt to get more (or less) CPU time. The highest priority threads get the most CPU time, and the lowest priority threads get the least. This also requires that the threads run asynchronously. The OS is allowed to adjust the CPU thread-sliver time as it needs to: for example, if the system becomes bogged down with your movie download, it may decide to down-prioritize the already low-priority GUI thread messages so that your plugin GUI becomes sluggish. But the high-priority audio is not affected and audio arrives glitch-free at the driver.
The priority of each thread is really not the issue we need to deal with, though we do need to keep it in mind. The larger problem we need to address is the fact that the two threads need to access some of the same data, in particular the plugin parameters. The GUI needs to alter the parameters and the audio processing thread needs to access them as well to update its processing accordingly. The fact that the two threads running asynchronously need to access the same information is at the core of the multi-threaded software problem. Now think about the threads in that flight simulator: how much data do they need to share to give the user the proper experience? Figure 2.7b shows this issue in which thread 1 and thread 3 both attempt to access the volume variable in the data section of the program.
The problem with sharing the data doesn’t seem that bad on first glance: so what if the two threads share data? The CPU lets only one thread access the data at a time, so isn’t it already handling the thread synchronization process for us? Isn’t it keeping the shared data valid? The answer is no. The problem is that the data themselves are made of multiple bytes. For example, a float or int is actually 32 bits in length, which is four bytes. A double is 64 bits in length or eight bytes. Threads read and write data in bytes, one at a time. Suppose a GUI thread is writing data to a double variable, and it has written the first three bytes of the eight that are required when the OS gives it the signal that it must halt and relinquish control to the next thread. For ordinary variables, the thread will stop immediately and remember where it was, then pass execution to the next thread. When our GUI thread gets a new CPU time-slice, it finishes writing the remaining five bytes of data to the variable and moves on.
But what if our audio processing thread that was next in line for CPU time tried to read the partially written double variable? It would be accessing corrupt data, which is one type of multi-threading issue: inconsistent data between threads. Now suppose that the audio processing thread tried to write data to that double variable and succeeded in overwriting the three new bytes and five old existing bytes. Later, the GUI thread regains control and finishes writing the rest of the data with the remaining five bytes it believes need to be written. Once again we are left with a corrupt value in the variable. When two threads compete to write data to the same variable, they are said to be in a race condition: each is racing against the other to write the data, and in the end the data may still be corrupted. Race conditions and inconsistent data are our fundamental problems here. The good news is that each of the APIs has its own internal mechanism to ensure thread safety in the plugin. The reason we spend time discussing this is that these mechanisms not only affect how each API is designed but also trickle down into how our plugin code must be written. We’ll discuss each API’s multi-threading tactics in more detail in the following chapters to give you a deeper understanding and some more insight into the APIs. However, rest assured that ASPiK is 100% thread-safe in its operation within the various APIs; you will not need to concern yourself with these issues directly when programming.
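As a simple illustration of one way to avoid torn reads and writes of a single shared value (not necessarily the mechanism any particular API uses), the value can be stored in a std::atomic:

```cpp
#include <atomic>

// A shared parameter value written by the GUI thread and read by the audio thread.
// std::atomic guarantees the 8-byte double is read and written as a whole, so
// neither thread can observe a partially written (torn) value.
// (On most platforms std::atomic<double> is lock-free; check is_lock_free() if it matters.)
std::atomic<double> volumeParameter{ 1.0 };

void guiThreadWrite(double newValue)
{
    volumeParameter.store(newValue, std::memory_order_relaxed);
}

double audioThreadRead()
{
    return volumeParameter.load(std::memory_order_relaxed);
}
```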
2.6 Monolithic Plugin Objects
The plugin itself is compiled as a DLL, but it implements the plugin as a C++ object or set of objects. The chores of these objects include the following:
• describing the plugin for the host
• initializing the parameter list
• retrieving parameter updates from the user/DAW
• cooking parameter updates into meaningful variables for the DSP algorithm
• processing audio information with the DSP algorithm
• rendering the GUI and processing input from the user, then transferring that information safely into the parameter list

In all APIs, the very last item in the list, rendering the GUI, is preferably done in a separate and independent C++ object that is specifically designed to implement the GUI and nothing else. Trying to combine that GUI and platform-specific code into the same object that handles the other plugin duties is going to be problematic at best and fraught with basic object-oriented design errors at worst. So we will set that C++ object aside for now and look at the other items.
You can roughly divide the remaining duties into two parts, one that handles parameters and another that handles audio processing. The VST3 and AAX specifications both allow two distinct paradigms for creating the underlying C++ objects that encapsulate the plugin. In one version, the duties are split into these two components: processing audio and handling parameters. In the other paradigm, a single C++ object handles both of the chores. In the AAX SDK, this pattern is called the “monolithic plugin object” paradigm. For consistency, I will adhere to this monolithic object archetype throughout our plugin projects and within ASPiK. If you want to later split the code up along the parameter/processing boundary lines, feel free to do so—it is not overly difficult. For example, the monolithic plugin object in VST3 is actually just a C++ object that is derived from both the parameter and audio processing objects, inheriting both sets of functions. However, in no case will our GUI rendering object ever be merged with the monolithic processing object: it will always remain as an independent component and our monolithic processing object will neither require nor know of the GUI’s existence. Likewise, our GUI rendering object has no knowledge that there is an associated processing object that is using its information. The API-specific thread-safe mechanisms that ASPiK implements ensure a proper and legal relationship between the GUI and plugin object itself—these will be revealed in Chapter 6.
VST3 Programming Guide
VST3 is the current version of Steinberg’s mature VST plugin specification. The venerable VST2 API, which has arguably produced more commercial and free plugins than any other API in the history of audio plugins, was officially made obsolete in October 2018 with its removal from the VST SDK and the inability for new plugin authors to legally write for it, though there is a grandfather clause for older plugin developers who have already signed license agreements with Steinberg. VST3 was redesigned from the ground up and has almost nothing in common with VST2 except some naming conventions and other very minor details. VST3 specifically addressed some of the shortcomings of the VST2 API, including a somewhat vague side-chain specification and a messaging system. VST3 is also designed with upgrading and product extension in mind, and the developers add new features with every release. Finally, since the VST2 specification has been gracefully laid to rest, I will use the terms “VST3” and “VST” interchangeably in this text. You may download the VST3 SDK for free from Steinberg (www.steinberg.net/en/company/developers.html) and you are encouraged to join the developer’s forum, where you may post questions.
3.1 Setting Up the VST3 SDK
The VST SDK is simply a collection of subfolders in a specific hierarchy. You can download the SDK from Steinberg and unzip it to your hard drive. The VST SDK is already set up to work with CMake (more on that later), so you will notice that the folder names have no whitespaces. The root folder is named VST_SDK. As of SDK 3.6.10, the root folder contains two subfolders, one for VST2 and another for VST3. Steinberg has made the VST2 plugin API obsolete and is no longer developing or supporting it. The VST2 folder contains the old VST2 API files; these are used in a VST3-to-VST2 wrapper object that is currently included in the SDK, and is fairly simple to add to projects. Unless you have a previous license to sell VST2 plugins, you are not allowed to sell new VST2 plugins. I’m not sure if future SDKs will contain the VST2 portion or not, or if the VST2 wrapper will be around much longer.
3.1.1 VST3 Sample Projects
With the SDK planted on your hard drive, you need to run CMake to extract the sample projects. Instructions are included in the SDK documentation and a video tutorial at www.willpirkle.com. The VST3 SDK comes with a rich set of sample plugin projects that will generate more than a dozen example plugins in all shapes and sizes. Install the SDK on your system, then use the CMake software to generate a single Xcode or Visual Studio compiler project that contains every one of the sample plugins inside of it. You can build all of the plugins at one time with one build operation. If your compiler and SDK are set up correctly, you will get an error-free build and have a bunch of VST3 plugins to play with. You should verify that your compiler will properly build the SDK sample projects before trying to write your own plugins.
3.1.2 VST3 Documentation
The VST3 API documentation is contained within the SDK and is generated using Doxygen®, which creates HTML-based documents that you view with a web browser. You can open the documentation by simply double-clicking on the index.html file, which is the Doxygen homepage. This file is located close to the SDK root at VST_SDK/VST3_SDK/index.html; you should definitely bookmark it and refer to it frequently. The VST3 SDK documentation improves greatly with each new SDK release. It contains a wealth of information as well as documentation on all structures, classes, functions, and VST3 programming paradigms.
3.2 VST3 Architecture and Anatomy
VST plugins may be written for Windows, MacOS, iOS, and Linux. This means you must use the proper compiler and OS to target these products; you cannot use Xcode to write a Windows VST, for example. This book targets Windows and MacOS specifically, although the VST API is written to be platform independent. If you are interested in writing for iOS or Linux, I suggest you get the Windows and MacOS versions working first: there is much more information and help available for these. The latest version of the VST3 SDK packages both Windows and MacOS plugins in a bundle, which you may think of as simply a directory of subfolders and files. The bundle is set up to appear as a file for a typical user, though in reality it is a folder; in MacOS you may right-click on the plugin “file” and choose “Show Package Contents” to get inside the bundle. The VST3 SDK also includes VSTGUI4 for generating the custom GUIs, though you may use whatever GUI development library you like; VSTGUI4 was designed to interface directly with VST3, and ASPiK uses VSTGUI4 for its GUI library operations.
At its core, VST3’s Module Architecture (VST-MA) is based on (but not identical to) Microsoft’s Common Object Model (COM) programming paradigm. This allows the VST3 API to be easily updated and modified, without breaking older versions. COM is really just a “way” or approach to writing software and is technically independent of any programming language, though VST-MA supports only C++ as of this writing.
COM programming revolves around the concept of an interface, which we also use in our FX objects that accompany this book (IAudioSignalProcessor and IAudioSignalGenerator). For the VST-MA, an interface is a C++ object that defines only virtual functions, so it represents an abstract base class. There are no member variables, and the classes that inherit an interface must override any pure abstract function it defines. Some, but not all, of the VST-MA interfaces are pure abstract, in which case all of their functions must be implemented on the derived class. There is some resemblance to the protocol in Objective-C programming, where a class “conforms to a protocol” by implementing its required methods. The interface concept is powerful because it allows objects to hold pointers to each other in a safe manner: if object A holds an interface pointer to object B (i.e. a pointer to B’s interface base class), object A may call only those functions that the interface defines. Being abstract, the interface has no constructor or destructor, so object A cannot call the delete operator on the interface pointer; trying to do so produces a compile-time error. With multiple inheritance, an object may implement many interfaces (or conform to their protocols) so that it may safely be accessed via any one of these base class pointers.
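As a minimal illustration of the interface idea (a sketch only: these class names are hypothetical and are not part of the VST-MA API), an interface is simply a pure abstract C++ class, and one object may implement several of them through multiple inheritance:

// A minimal sketch of the interface concept; IVolumeControl, IMuteControl,
// and MyPluginObject are hypothetical names used only for illustration.
class IVolumeControl              // an "interface": pure virtual functions, no data
{
public:
    virtual void setVolume(double v) = 0;
    virtual double getVolume() const = 0;
};

class IMuteControl
{
public:
    virtual void setMute(bool mute) = 0;
};

// One object may implement (conform to) many interfaces:
class MyPluginObject : public IVolumeControl, public IMuteControl
{
public:
    void setVolume(double v) override { volume = v; }
    double getVolume() const override { return volume; }
    void setMute(bool mute) override { muted = mute; }

private:
    double volume = 1.0;
    bool muted = false;
};

// Object A holds only an interface pointer to object B, so it may call
// only the functions that interface defines:
void objectA(IVolumeControl* b) { b->setVolume(0.5); }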
In addition, the COM paradigm provides a device that allows for backwards compatibility when interfacing with older objects that were compiled with earlier SDKs. It consists of a querying mechanism so that one object may safely query another object to ask whether it supports an interface of some kind. If that object replies that it does, the first object may then ask for an interface pointer and use its methods. This means that when new features are added to the VST3 API, they may be implemented as new interface class definitions. A host may query the plugin to see if it has one of these new interfaces. If it does not, then the host does not attempt to use the new features. Or, the host may work its way “backwards,” querying the object to see if it has an older interface that supports at least part of the features it needs. If it does, and the host acquires an interface pointer for it, then the host may call only those older functions that are guaranteed to exist, preventing crashes and providing a robust mechanism for safely updating the API.
3.2.1 Single vs. Dual Component Architectures
There are two paradigms for writing VST3 plugins, which involve how the two primary plugin duties are distributed: audio signal processing and GUI implementation. In the dual component pattern, you implement the plugin in two C++ objects: one for the audio processing (called the Processor object) and the other for the GUI interface (called the Controller object). The VST3 architects went to great lengths to isolate these two objects from one another, making robust communication between the two nearly impossible. This is not meant to be a criticism, as it closely follows the C++ object-oriented design paradigm and it allows the audio processing and GUI connection/implementation to run on separate CPUs. For multi-CPU platforms, which use farm cards full of CPUs, this makes sense.
However, if you are trying to work with monolithic programming objects for implementing plugins, then your VST3 versions may be very different than your AU and AAX versions. In addition, there are some plugin architectures whose design patterns prohibit them from being distributed in two objects running on two separate CPUs, or whose implementations become overly complicated with this paradigm.
To address this, the VST3 API also allows for a single component version in which one monolithic C++ object handles both the processing and GUI interfacing chores. This is called the single component paradigm. The architects simply created a single C++ object with multiple inheritance from the processor and controller objects. Figure 3.1 shows the two different design patterns. Please note that, as with AAX and AU, the controller portion deals with communication with the GUI, but does not handle the implementation details: the part that actually renders the knobs, buttons, switches, and views is written in a separate GUI object.

Figure 3.1: (a) The dual component VST3 plugin uses two C++ objects; the thick black bar across the center represents the hard communication barrier between the two components. (b) In contrast, the single component version uses one C++ object; the thin dotted line denotes a less firm barrier (which is really up to the designer).
3.2.2 VST3 Base Classes
In the dual component version, you create two C++ objects that are derived from two separate base classes: AudioEffect (for the processor) and EditController (for the controller). In the single component version, which we use exclusively in this book and ASPiK, your plugin is derived from SingleComponentEffect, which essentially inherits from the dual components. Notice that unlike AU, the same base classes (listed here) are used regardless of the type of plugin, FX or synth.
• AudioEffect: the dual component processor base class
• EditController: the dual component controller base class
• SingleComponentEffect: the combined single object base class
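As a rough sketch of the single component approach we use in this book (the class name is hypothetical, the Steinberg::Vst namespaces are assumed to be in scope, and exact signatures may vary slightly between SDK versions), your plugin class derives from SingleComponentEffect and overrides the lifecycle and processing functions covered in the rest of this chapter:

// A minimal sketch only; MyFXPlugin is a hypothetical class name.
using namespace Steinberg;
using namespace Steinberg::Vst;

class MyFXPlugin : public SingleComponentEffect
{
public:
    // lifecycle (see Section 3.2.6)
    tresult PLUGIN_API initialize(FUnknown* context) override;
    tresult PLUGIN_API terminate() override;

    // audio processing (see Section 3.6)
    tresult PLUGIN_API canProcessSampleSize(int32 symbolicSampleSize) override;
    tresult PLUGIN_API process(ProcessData& data) override;

    // creation function handed to the class factory (see Section 3.3)
    static FUnknown* createInstance(void*) { return (IAudioProcessor*)new MyFXPlugin; }
};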
3.2.3 MacOS Bundle ID
Each bundle in MacOS is identified with a unique string called a bundle ID. Apple has recommendations on naming this string, as well as registering it with Apple when you are ready to sell your product commercially. Typically, you embed your company and product names into the bundle ID. The only major restriction is that you are not supposed to use a hyphen (“-”) to connect string components; use a period instead. For example, mycompany.vst.golden-flanger is not legal, whereas mycompany.vst.golden.flanger is OK. This is set in the Xcode project’s Info panel, which really just displays the information inside of the underlying Info.plist file (see Figure 3.2). Consult the Apple documentation if you have any questions. If you use ASPiK to generate your VST projects, then the bundle ID value is created for you and set in the compiler project in the Info.plist Preprocessor Definitions; you can view or change this value in the project’s Build Settings panel.
3.2.4 VST3 Programming Notes
The VST3 API is fairly straightforward as far as C++ implementation goes. However, the VST3 designers tend to stay close to the forefront of programming technology, and the current version of the SDK requires a compiler that is C++11 compliant. VSTGUI4 also requires C++11, and for MacOS/Xcode it also requires C++14. So be sure to check the release notes of your SDK to ensure that your compiler will work properly. If you compile the sample projects and immediately get a zillion errors, check to make sure you are compliant. Note that the MacOS versions require that several frameworks be linked with the plugin project, including CoreFoundation, QuartzCore, Accelerate, OpenGL, and Cocoa®; the latter two frameworks are for VSTGUI4. However, the compiler projects are actually very complex (more so than AU and AAX), due in part to the fact that the plugins must link to several static libraries that are compiled along with the plugin and are part of the compiler solution. In earlier versions of the VST3 API you precompiled your base library (the same as the current version of AAX), but later versions merged these libraries into the compiler project. Figure 3.2 shows what a stock VST3 plugin project looks like in Xcode. This plugin uses VSTGUI for rendering the GUI and has additional libraries for that support, but the base and sdk libraries are required. The Visual Studio solution view is similar, showing independent sub-projects that represent the libraries and validator.
As with AAX and AU, there are numerous C++ typedefs that rename standard data types, so you should always right-click and ask the compiler to take you to definitions that you don’t recognize. Most of the definitions are simple and are supposed to make the code easier to read, though sometimes they can have the opposite effect. In particular, you will see PLUGIN_API decorating many VST3 functions; it simply renames the __stdcall function specifier, which has to do with how the stack is cleaned up after function calls. You can safely ignore it if you don’t know what it means (it simply keeps consistency across platforms). Most of the VST3 plugin functions return a success/failure code: a 32-bit integer typedef’d as tresult. The success value is kResultTrue or kResultOk (they are identical), while failure codes include kResultFalse, kInvalidArgument, kNotImplemented, and a few more defined in funknown.h.
3.2.5 VST3 and the GUID
VST3 uses GUIDs as a method of uniquely identifying the plugin (for the DAW) as well as uniquely identifying each of the processor and editor halves when using the dual component paradigm (perhaps as a way to distribute the components across CPUs). The single component version uses just one GUID. If you use ASPiK, then CMake will generate the GUID for you as part of the project creation; if you do not use ASPiK, then consult your framework documentation. If you are making your own plugin from scratch, then you need to generate this value yourself. On Windows, you may use the built-in GUID generator that is installed with Visual Studio called guidgen.exe; on MacOS, you may download one of numerous GUID generator apps for free. You can see the GUID in the class definition codes provided in Section 3.3.
Figure 3.2: A standard VST3 project in Xcode includes multiple libraries, a validator executable, and the plugin target.
3.2.6 VST3 Plugin Class Factory
VST3 uses the class factory approach for generating multiple instances of a plugin during a typical DAW session. There are a couple of ramifications to this approach. First, you need to be careful about using singletons or other anti-patterns, as they probably won’t work properly with the class factory. We don’t use them with our ASPiK plugin shells and neither should you (please, no vicious emails). Second, the construction and destruction of your VST3 plugin’s dynamically allocated resources do not happen in the C++ object’s constructor and destructor as they do with the other APIs. Instead, there are two extra functions that you use for these purposes—initialize( ) and terminate( )—whose roles should be obvious by name. So, you just leave the constructor and destructor of the SingleComponentEffect alone. For the dual component paradigm, these functions are part of the processor object.
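A minimal sketch of this pattern follows; the myBigBuffer member is hypothetical and used only to illustrate where allocation and cleanup belong, and the base class calls mirror the SDK sample projects:

// A sketch only: allocate in initialize(), release in terminate().
tresult PLUGIN_API initialize(FUnknown* context) override
{
    tresult result = SingleComponentEffect::initialize(context);
    if (result != kResultTrue)
        return result;

    myBigBuffer = new float[8192];   // dynamically allocated resources go here,
                                     // NOT in the C++ constructor
    // ... declare busses and expose parameters here (Sections 3.4 and 3.5) ...
    return kResultTrue;
}

tresult PLUGIN_API terminate() override
{
    tresult result = SingleComponentEffect::terminate();
    if (result == kResultOk)
    {
        delete[] myBigBuffer;        // cleanup goes here, NOT in the destructor
        myBigBuffer = nullptr;
    }
    return result;
}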
3.3 Description: Plugin Description Strings
Your plugin description strings, along with the GUID, are part of this class factory definition. There are two macros used to define the class factory; your description information is embedded in them. The class factory macro named DEF_CLASS2 (for one of the SDK sample projects, AGainSimple) follows. It contains the following description strings, which are outlined in Section 2.2 except the FUID, which is unique to VST3: vendor name, vendor URL, vendor email, GUID, plugin name, and plugin type code. These are shown in bold in that order.
BEGIN_FACTORY_DEF ("Steinberg Media Technologies",
                   "http://www.steinberg.net",
                   "mailto:info@steinberg.de")

DEF_CLASS2 (INLINE_UID (0xB9F9ADE1, 0xCD9C4B6D, 0xA57E61E3, 0x123535FD),
            PClassInfo::kManyInstances,
            kVstAudioEffectClass,
            "AGainSimple VST3",
            0,                      // 0 = single component effect
            Vst::PlugType::kFx,
            "1.0.0",                // Plug-in version
            kVstVersionString,
            Steinberg::Vst::AGainSimple::createInstance)

END_FACTORY
Notice that the creation function is the last argument—this function serves up new instances of the plugin in a single line of code using the new operator. Also notice the single component effect designator (the 0 which has no predefined constant declaration); for the dual component version it is re-defined as Vst::kDistributable. The FX plugin type code follows it in the macro’s argument sequence as Vst::PlugType::kFx; for completeness, the synth version is Vst::PlugType::kInstrumentSynth. VST3 also supports one sub-category layer: for example, if your plugin is for reverb, you could alternatively assign its category as Vst::PlugType::kFxReverb. In that case the DAW may group your plugin with the other reverb plugins the user has installed.
3.4 Description: Plugin Options/Features
The plugin options include audio I/O, MIDI, side chaining, latency and tail time. For VST plugins, these are all done programmatically either in the plugin initializer, or in other base class functions that you override.
3.4.1 Side Chain Input
VST3 plugins work on the concept of audio busses. For an FX plugin there is one input buss and one output buss, which may have one or more channels each. Note that there does not necessarily need to be the same number of channels for input as for output. The side chain input is declared as a separate input buss, and it is typically declared as a stereo buss. We will work with the audio channel I/O declarations shortly, but since implementing robust side chaining was an important part of the VST3 specification, we can look at the code now. You add an audio buss using the addAudioInput function and pass it a name string, a channel I/O setup (called a “speaker arrangement”), and an optional buss type and buss info specifier. That buss type is key for identifying the side chain and is specified as kAux. To set up a stereo side chain input, you would write:
addAudioInput(STR16("AuxInput"), SpeakerArr::kStereo, kAux);

We will discuss the speaker arrangements shortly, but the main concept here is the kAux designator. The default value, used if no buss type is supplied in the arguments, is kMain. There are no other designations at the time of this writing.
3.4.2 Latency
If your plugin includes a latency that adds a necessary and fixed delay to the audio signal, such as the PhaseVocoder in Chapter 20 or the look-ahead compressor in Chapter 18, you may signify this to the VST3 plugin host. The host will call the function getLatencySamples( ) on your plugin object to query it about this feature. If your plugin includes a fixed latency, you override this function and return the latency value in samples. For example, if your plugin introduces 1024 samples of latency, you would write:
virtual uint32 PLUGIN_API getLatencySamples() override { return 1024;}
3.4.3 Tail Time
Reverb and delay plugins often require a tail time, which the host implements by pumping zero-valued audio samples through the plugin after audio playback has stopped, allowing for the reverb tail or delays to repeat and fade away. The host will query your plugin when it is loaded to request information about the tail time. For VST3 you report the tail time in samples rather than seconds. For algorithms that are sample-rate dependent, this means that your tail time in samples will likely change. The function you use to report the time is named getTailSamples; it will be called numerous times during the initialization as well as when the user modifies the session in a way that changes the sample rate. To report a tail time of 44,100 samples, or 1 second at a sample rate of 44.1 kHz, you would write this:
virtual uint32 PLUGIN_API getTailSamples() override {return 44100;}
You may also specify an infinite tail time by returning the predefined kInfiniteTail value; this forces the DAW to send your plugin a stream of zero-valued samples forever after the user stops playback.
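For example, an infinite tail would be reported like this (a one-line sketch using the kInfiniteTail constant mentioned above):

virtual uint32 PLUGIN_API getTailSamples() override { return kInfiniteTail; }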
3.4.4 Custom GUI
The controller portion of the VST3 plugin, whether using the dual or single component versions, inherits from an interface that implements the createView function. Your derived class should override this function if you are planning on generating a custom GUI. The host will call the function when the user requests to see the GUI: your plugin must create the GUI and return a specific type of interface pointer called IPlugView. Regardless of how the GUI is created and implemented, you must return an IPlugView pointer back to the host. This means that your GUI must conform to the IPlugView protocol and implement the pure virtual functions that it specifies. The ASPiK VST3 plugin shell defines the IPlugView object that implements these functions and handles the lifecycle of the GUI, including resizing it for hosts that allow this. It uses the VSTGUI4 PluginGUI object defined in Chapter 6. There are a couple of items to note: first, your plugin object can always implement the createView function and simply return 0 or nullptr, which signifies to the host that your plugin does not support a custom GUI—this can be valuable when debugging a new plugin if you think your GUI may have some internal fault or you are trying to narrow down your problems. Second, you should never cache (save) the IPlugView pointer you return, nor that of the internal GUI object, in an attempt to set up some kind of communication with the GUI—which is a bad idea to begin with. Make sure you use the VST3-approved mechanisms for sending parameter updates to and from the GUI.
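For example, a plugin with no custom GUI (or one temporarily disabled for debugging, as described above) can simply return nullptr from createView. This is a sketch only, and the exact argument type may differ slightly between SDK versions:

// A minimal sketch: report "no custom GUI" to the host.
IPlugView* PLUGIN_API createView(FIDString name) override
{
    // returning nullptr tells the host this plugin has no custom editor;
    // a real implementation would create and return its IPlugView-derived
    // GUI object here instead
    return nullptr;
}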
3.4.5 Factory Presets and State Save/Load
Factory presets are part of the IUnitInfo interface that the SingleComponentEffect object inherits. For the dual component architecture, it is part of the controller object. According to the VST3 documentation, IUnitInfo describes the internal structure of the plugin:
• The root unit is the component itself.
• The root unit ID has to be 0 (kRootUnitId).
• Each unit can reference one program list, and this reference must not change.
• Each unit using a program list references one program of the list.
A program list is referenced with a program list ID value. The root unit (the plugin) does not have a program list. However, you may add sub-units that contain these program lists, which represent your factory presets. In order to support factory presets, you declare at least two units: one for the root, and another for the preset program list. You then set up a string-list parameter—the VST3 object that is normally used to store string-list parameters for the plugin. Since that object holds a list of strings to display for the user, it may be used to store lists of factory preset name strings.
When you set up the special string-list parameter object, you supply optional flags—one of which is ParameterInfo::kIsProgramChange, which signifies that the string list is holding preset names. So, the order of operations is as follows:
1. Add a root unit that represents the plugin.
2. Add another unit that represents the preset list.
3. Set up a string-list parameter that holds the names of the factory presets.
4. Make sure to supply the kIsProgramChange flag for this special parameter.
5. Add the parameter to the plugin's list of other parameters (i.e. GUI controls).
In addition to exposing the preset names, there are several other functions you need to override to conform to the IUnitInfo interface and support presets. None of these functions are difficult or tedious, but there are too many to include here.
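As a rough sketch of steps 3-5 above (the control ID and preset names are hypothetical, the unit declarations are omitted, and the exact StringListParameter constructor arguments may vary by SDK version), the preset-name parameter might be declared like this:

// A sketch only: PRESET_PARAM_ID and the preset names are hypothetical.
const ParamID PRESET_PARAM_ID = 999;

StringListParameter* presetParam = new StringListParameter(
    USTRING("Factory Presets"), PRESET_PARAM_ID, nullptr,
    ParameterInfo::kCanAutomate | ParameterInfo::kIsProgramChange); // the key flag

presetParam->appendString(USTRING("Warm Hall"));
presetParam->appendString(USTRING("Bright Plate"));

parameters.addParameter(presetParam);  // add it to the parameter list like any other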
VST3 is unique in that it requires you to write your own serialization code for loading and storing the plugin state, which occurs when the user saves a DAW session or creates their own presets. You must write the code that reads and writes data to a file. You serialize the data to the file on a parameter-by-parameter basis, one parameter after another. This means you need to take care to ensure that you read data from the file in the exact same order as you wrote the data to the file. The fact that you must supply this code has advantages: you are in charge of saving and retrieving the exact state of the plugin, and you may write other information into that file that is not directly tied to any plugin parameter, allowing robust customization of the data you save. There are three functions that your plugin must override to support state serialization: two of them are for reading state information, while the third is for writing it. Two read functions are necessary to support the dual component architecture. The two state-read functions, which you typically code identically when using the SingleComponentEffect, are:

tresult setState(IBStream* fileStream)
tresult setComponentState(IBStream* fileStream)

The state-writing function is:

tresult getState(IBStream* state)
You are provided with an IBStream interface that you use to read and write data to the file. A helper class IBStreamer is defined that provides functions for reading and writing the standard data types, such as float, double, int, unsigned int, etc. In order to support Linux and pre-Intel Macs, both big-endian and little-endian byte ordering may be specified. For the book projects, which are designed for Windows and Intel-based Macs, we use little-endian byte ordering; if you want to support Linux, see the VST3 sample code for identifying big-endianness. You use the supplied IBStream pointer to create the IBStreamer object, and then use the necessary helper function to read and write data. Remember that you usually bind plugin member variables to GUI parameters (Chapter 2). Those bound variables are the ones that you are reading and writing. For example, to write a double variable named pluginVolume that is bound to the volume parameter, you would use the IBStreamer::writeDouble function:

tresult PLUGIN_API getState(IBStream* fileStream)
{
    // --- get a stream I/F
    IBStreamer strIF(fileStream, kLittleEndian);

    // --- write the data
    if (!strIF.writeDouble(pluginVolume))
        return kResultFalse;

    return kResultTrue;
}

To read that variable in either of the two read-functions, you would use the readDouble function, which returns the data as the argument:

tresult PLUGIN_API setState(IBStream* fileStream)
{
    // --- get a stream I/F
    IBStreamer strIF(fileStream, kLittleEndian);

    // --- read the data
    if (!strIF.readDouble(pluginVolume))
        return kResultFalse;

    return kResultTrue;
}
Have a look at the fstreamer.h file and you will find dozens of functions for writing all sorts of data types, including arrays of values as well as individual ones. You can also find functions that allow you to read and write raw blocks of data to serialize custom information that is not part of the plugin parameter system. While being required to write your own serialization functions for even the basic parameter states may be a drag, you can also enjoy the robust functions provided for serializing custom information as well.
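Since the read order must mirror the write order exactly, a sketch with two parameters might look like this (pluginVolume and pluginPan are hypothetical bound member variables, and the writeDouble/readDouble helpers are the ones shown above):

// A sketch only: the read order in setState MUST match the write order in getState.
tresult PLUGIN_API getState(IBStream* fileStream)
{
    IBStreamer strIF(fileStream, kLittleEndian);
    if (!strIF.writeDouble(pluginVolume)) return kResultFalse;  // 1st
    if (!strIF.writeDouble(pluginPan))    return kResultFalse;  // 2nd
    return kResultTrue;
}

tresult PLUGIN_API setState(IBStream* fileStream)
{
    IBStreamer strIF(fileStream, kLittleEndian);
    if (!strIF.readDouble(pluginVolume)) return kResultFalse;   // 1st
    if (!strIF.readDouble(pluginPan))    return kResultFalse;   // 2nd
    return kResultTrue;
}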
3.4.6 VST3 Support for 64-bit Audio

VST3 plugins allow you to support both 32- and 64-bit audio data. To keep consistency between all of the APIs, the ASPiK book projects support only 32-bit audio data. If you want to add your own support for 64-bit audio, it is fairly straightforward: you tell the host that you support these data, then during the audio processing you write code that may operate on either 32- or 64-bit data. This is accomplished most easily using function or class templates. The host will query your plugin to ask for the supported audio data types with a function called canProcessSampleSize, passing in a flag that specifies the data type of the inquiry, either kSample32 or kSample64. For our ASPiK book projects that support 32-bit audio, we write:

tresult PLUGIN_API canProcessSampleSize(int32 symbolicSampleSize)
{
    // --- we support 32 bit audio
    if (symbolicSampleSize == kSample32)
        return kResultTrue;

    return kResultFalse;
}

During the buffer processing function that I will address shortly, you will be passed a structure that includes the current “symbolic sample size”; you decode it, then branch your code accordingly.
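If you did want to support both sample sizes, the text above suggests templates; a rough sketch of that idea follows (processChannels and the volume member are hypothetical, and canProcessSampleSize would also need to return kResultTrue for both flags):

// A sketch of the template idea for 32- and 64-bit audio support.
template <typename SampleType>
void processChannels(SampleType** in, SampleType** out,
                     int32 numChannels, int32 numSamples, double gain)
{
    for (int32 ch = 0; ch < numChannels; ch++)
        for (int32 i = 0; i < numSamples; i++)
            out[ch][i] = static_cast<SampleType>(gain * in[ch][i]);
}

// inside process(): branch on the symbolic sample size
// if (data.symbolicSampleSize == kSample32)
//     processChannels(data.inputs[0].channelBuffers32, data.outputs[0].channelBuffers32,
//                     data.outputs[0].numChannels, data.numSamples, volume);
// else  // kSample64
//     processChannels(data.inputs[0].channelBuffers64, data.outputs[0].channelBuffers64,
//                     data.outputs[0].numChannels, data.numSamples, volume);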
3.5 Initialization: Defining Plugin Parameters
In VST3, parameters are moved to and from the GUI as normalized values on the range of [0.0, 1.0] regardless of the actual value. This is in contrast to AU and AAX, which operate on the actual data value. In VST3 lingo, the actual value is called the plain value. For example, if you have a volume control that has a range of −60 to +12 dB, and the user adjusts it to −3 dB, the parameter will store the normalized value of 0.7917 and transmit this to your plugin. If you want to write data out to the GUI parameter (e.g. user loads a preset), then you pass it the normalized value of 0.7917 as well. If you are using ASPiK, then all of this is handled for you, so you never need to bother with the normalized or plain values. However, if you intend to create custom parameters, you will need to work with both versions. The normalized values are always implemented as double data types, which VST3 typedefs as ParamValue, and the control ID is an unsigned 32-bit integer, which VST3 typedefs as ParamID.
VST3 supplies you with a base class named Parameter along with two sub-classed versions named RangeParameter and StringListParameter, which are used to encapsulate linear numerical parameter controls and string-list controls respectively. For our plugins, we define additional parameter types that are based on the control taper (see Chapter 6), including log, anti-log and volt/octave versions. In this case we need to sub-class our own Parameter objects that handle the chores of not only converting linear control values to their nonlinear versions, but also for converting from plain to normalized and back. These are built into the book project VST3 shell code and are independent of ASPiK, so feel free to use them as you like. We’ve sub-classed the following custom Parameter objects.
• PeakParameter: for parameters that are already normalized such that both the plain and normalized values are the same (this was borrowed directly from the VST3 SDK sample projects)
• LogParameter: for log-controls
• AntiLogParameter: for inverse log-controls
• VoltOctaveParameter: for controls that move linearly in octaves of 2^N

This gives you six total parameters to use, and you may always define your own hybrid versions; just see the file customparameters.h in the VST3 shell code for our versions, which will guide you into making your own.
As with all the other APIs, you declare these parameter objects to expose your plugin parameters to the host (in VST3 lingo, this is called exporting parameters). For VST3, you must declare one parameter object for each of your plugin’s internal controls, plus an additional parameter to operate as the soft bypass mechanism. For early versions of the VST3 API, this soft bypass was optional, but it is now required. Once you have exposed your parameters to the host, you should not alter their ordering, nor should you add or remove parameters. This makes sense: if you altered the parameter ordering, it would massively screw up the automation system and confuse the host, and the user as well. You expose your parameters in the initialize function that we discussed in Section 3.2.6, and this function will be called only once per lifecycle. The SingleComponentEffect base class includes a container for these parameters. You simply call a member function (addParameter) to add parameters to the list. Like ASPiK, VST3 parameters are ultimately stored in both a std::vector and std::map for fast iteration and hash-table accesses. As you learned in Chapter 2, all the APIs are fundamentally the same, and the VST3 parameter objects are going to encode the same types of information: an index value for accessing; the parameter name; its units; and the minimum, maximum and default control values. You just need to use the appropriate sub-classed parameter object that depends on the control type and taper. For example, suppose a plugin has two parameters with the following attributes.
Parameter 1
• type: linear, numeric
• control ID = 0
• name = “Volume”
• units = “dB”
• min/max/default = −60/+12/−3
• sig digits (precision past the decimal point): 4

Parameter 2
• type: string list
• control ID = 42
• name = “Channel”
• string list: “stereo, left, right”

You would declare them as:

// --- parameter 1
Parameter* param = new RangeParameter(USTRING("Volume"), 0,  // name, ID
                                      USTRING("dB"),         // units
                                      -60, +12, -3);         // min, max, default
param->setPrecision(4);              // fractional sig digits
parameters.addParameter(param);

// --- parameter 2
StringListParameter* slParam = new StringListParameter(USTRING("Channel"), 42); // name, ID
slParam->appendString(USTRING("stereo"));
slParam->appendString(USTRING("left"));
slParam->appendString(USTRING("right"));
parameters.addParameter(slParam);    // --- add to the list, as with parameter 1
To add the required soft bypass parameter, you first need to assign it a control ID value. Of course this will depend on your own numbering scheme; here we define the ID as PLUGIN_SIDE_BYPASS, then use the pre-supplied RangeParameter object to create the parameter; we name it “Bypass.” There are no units (""); the minimum value is 0 (off) and the maximum value is 1 (on), with the default value set to off (0). The next value (0) is the parameter’s step count, which is ignored. Note the use of the flags argument, which is used to OR the automate and bypass flags together. As an exercise, you can search for the ParameterFlags structure to reveal other flags that are used with the parameters, such as the kIsList flag, which is automatically enabled for the string-list parameter object. After the declaration, you add the soft bypass parameter to your list with the same addParameter function as before:
const unsigned int PLUGIN_SIDE_BYPASS = 131072;
int32 flags = ParameterInfo::kCanAutomate | ParameterInfo::kIsBypass;

// --- one and only bypass parameter
Parameter* param = new RangeParameter(USTRING("Bypass"),
                                      PLUGIN_SIDE_BYPASS,
                                      USTRING(""),
                                      0, 1, 0, 0, flags);
parameters.addParameter(param);
3.5.1 Thread-Safe Parameter Access
The controller component is used for accessing the parameters as normalized values. The SingleComponentEffect object inherits the controller component, so your plugin object simply calls the functions directly; their names and arguments are easily identified. The get function reads the parameter value and the set function writes it, both in a thread-safe manner.
ParamValue PLUGIN_API getParamNormalized(ParamID id);
tresult PLUGIN_API setParamNormalized(ParamID id, ParamValue value);

3.5.2 Initialization: Defining Plugin Channel I/O Support

As of this writing, VST3 has the broadest audio channel support of all of the APIs: it includes 33 traditional formats (mono, stereo, 5.1, 7.1, etc.), three Ambisonics® formats, and a whopping 27 different 3-D formats, for a total of 63 supported channel I/O types. These input/output channel formats are named speaker arrangements, and you can find their definitions in vstspeaker.h. When the user sets up a session track with a specific channel format and loads your plugin, the host will query it to find out if the format is supported. The system is quite robust and (as with the other APIs) it allows for different input and output formats. There is some dependence on the host’s own capabilities as well, but the specification allows for maximum flexibility. For example, if the user loads a 5.1 audio file, the host will query you to see if your plugin supports 5.1. But what if you have some kind of specialized plugin that operates on the left and right surround channels only? The host will query that as well: the speaker arrangement named kStereoSurround is a two-channel format just for this purpose.

For most DAWs, the host will call the function setBusArrangements repeatedly to see if your plugin can handle the audio file format, or any of its sub-formats as discussed earlier. Your plugin must not only reply with a true/false answer, but also must set up the audio busses on the plugin component in response. Two functions named addAudioInput and addAudioOutput are used for this purpose. However, you need to clear out any previously declared busses to adapt to the new scheme. Some very limited or very specialized VST3 hosts may support only one audio channel I/O scheme, so you should actually set up your audio I/O twice: once in the initialize function, and then again in response to the setBusArrangements queries. For example, suppose your plugin supports mono in/mono out and stereo in/stereo out operation, and your preferred default arrangement is stereo in/stereo out. First, you would set up the busses inside the initialize function:

addAudioInput(STR16("Stereo In"), SpeakerArr::kStereo);
addAudioOutput(STR16("Stereo Out"), SpeakerArr::kStereo);

Next, you would override the setBusArrangements function to handle other combinations and schemes. The arguments for setBusArrangements will pass you an array of input and output speaker arrangements and channel counts. The following snippet is based on the VST3 sample project AGainWithSideChain (you can find more variations in the numerous other sample projects, so definitely check out that code). If you are supporting many formats and sub-variants, this code can become tricky, as you might have a bunch of logic to sort through. There are a few things to observe here. Note how the audio busses are cleared out with removeAudioBusses: if you support a side chain input, you must re-create it any time you remove all busses. Here, the side chain is mono (some code has been removed for brevity):

tresult PLUGIN_API setBusArrangements(SpeakerArrangement* inputs, int32 numIns,
                                      SpeakerArrangement* outputs, int32 numOuts)
{
    // first input is the Main Input and the 2nd is the SideChain
    if (numIns == 2 && numOuts == 1)
    {
        // the host wants Mono => Mono (or 1 channel -> 1 channel)
        if (SpeakerArr::getChannelCount(inputs[0]) == 1 &&
            SpeakerArr::getChannelCount(outputs[0]) == 1)
        {
            removeAudioBusses();
            addAudioInput(STR16("Mono In"), inputs[0]);
            addAudioOutput(STR16("Mono Out"), inputs[0]);

            // recreate the Mono SideChain input bus
            addAudioInput(STR16("Mono Aux In"), SpeakerArr::kMono, kAux, 0);

            return kResultOk;
        }
    }
    // etc...
}
3.5.3 Initialization: Channel Counts and Sample Rate Information
For VST3, the channel count initialization is part of the channel I/O declaration and is all contained in the setBusArrangements function, so there is nothing else to do (note this is somewhat different from AU). That said, the channel configuration in VST3 is very advanced and may be set up in more complex arrangements than what we use for the FX plugins here, as well as the VST3 sample projects. The channel I/O buss configurations are dynamic, and you may set up multiple input and output busses and then activate, deactivate, or reactivate them as required. To inform the plugin of the current sample rate (and other information), the host will call the function setupProcessing and pass it a ProcessSetup structure with the relevant information. Extracting the pertinent information is straightforward—see the documentation for the ProcessSetup structure. To grab the sample rate and bit depth, you would override the function and implement it like this:
tresult PLUGIN_API setupProcessing(ProcessSetup& newSetup)
{
    double sampleRate = newSetup.sampleRate;
    int32 bitDepth = newSetup.symbolicSampleSize;

    // --- base class
    return SingleComponentEffect::setupProcessing(newSetup);
}
Note: this function is called when the ProcessSetup has changed. The base class object will then save a copy of this structure as a protected member variable; you may access it in other functions to get the sample rate or other information on the fly.
3.6 The Buffer Process Cycle
The VST3 buffer processing cycle follows the same basic paradigm as AAX and AU, as shown in Figure 3.3. The entire operation happens inside of one audio processing function called process. This function has a single argument of type ProcessData that contains all of the information needed to complete the procedure:

tresult PLUGIN_API process(ProcessData& data)
The ProcessData members are shown here; the most important ones for our FX plugins are the audio buffer pointers and the parameter change lists:

struct ProcessData
{
    int32 processMode;              // realtime, offline, prefetch
    int32 symbolicSampleSize;       // 32 or 64 bit
    int32 numSamples;               // number of samples to process
    int32 numInputs;                // number of audio input busses
    int32 numOutputs;               // number of audio output busses
    AudioBusBuffers* inputs;        // buffers of input busses
    AudioBusBuffers* outputs;       // buffers of output busses
    IParameterChanges* inputParameterChanges;
    IParameterChanges* outputParameterChanges;
    IEventList* inputEvents;
    IEventList* outputEvents;
    ProcessContext* processContext;
};
The processMode variable indicates real-time or offline processing, in addition to a flag for plugins that require process spacing that is nonlinear: for example, time stretching or shrinking. For all book projects, we use the real-time mode. The bit depth variable, block size in samples, and the channel counts are self-explanatory. You access the audio buffers using the AudioBusBuffers pointers; these also include the side chain input buffers when a side chain has been activated. The parameter changes that occurred while the audio data was captured are transmitted into the function in the inputParameterChanges list; this is where you grab the parameter updates and apply them to your signal processing algorithm before working on the audio data. You write data out to the GUI parameters (mainly for metering) with the outputParameterChanges list. We will discuss the processContext member shortly. The input and output events consist mainly of MIDI messages, which I cover in detail in Designing Software Synthesizers in C++. For the most part, the procedure for accessing the audio buffers for processing is fundamentally identical to that of AAX and AU. One VST3 exclusive is how the GUI parameter changes are sent into this function, which includes sample accurate automation (SAA): the host can interpret recorded automation data to send smoother control change information when automation is running. This has no effect during normal operation with the user manually adjusting controls.

Figure 3.3: The VST3 buffer process cycle follows the standard audio plugin pattern; the pseudo-code for each functional block is shown on the right.
We can break the buffer processing into three parts, exactly following Figure 3.3 where we have the succession:
1. Update parameters from GUI control changes, cooking variables as needed.
2. Process the audio data using the updated information.
3. Write outbound parameter information (meters, signal graphs, etc.).
A nice feature of VST3 (and AAX and RackAFX) that is not available in AU is that the input parameter change list includes only those parameters that actually changed during the audio block capture or playback. If no GUI controls were adjusted and automation is not running, there is no need to check parameters and update internal variables—saving CPU cycles and simplifying the operation.
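Putting those three steps together, the overall shape of the process function is a short skeleton like this (a sketch only; the comments stand in for the code shown in the following subsections):

tresult PLUGIN_API process(ProcessData& data)
{
    // 1) update parameters: walk data.inputParameterChanges (Section 3.6.1)

    // 2) process audio: read data.inputs, write data.outputs (Section 3.6.3)

    // 3) write outbound parameters such as meters to
    //    data.outputParameterChanges (Section 3.6.4)

    return kResultTrue;
}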
3.6.1 Processing: Updating Plugin Parameters From GUI Controls
The parameter changes arrive in the process function in a list of queues that is passed into the function via the processData structure. There is one queue for each parameter that has changed. The queue contains one or more timestamped parameter change values that were transmitted during the audio buffer capture/playback. The timestamp is really just an integer value that denotes the sample offset from the top of the audio buffer in which the parameter change event occurred. The information about the parameter change is provided with a function called getPoint. Figure 3.4a is a conceptual diagram of how the parameter change events line up with various samples in the audio buffer. Notice that the buffers are rotated vertically so that sample time and buffer index values move from left to right. Each normalized parameter value change event is shown as a diamond.
Under non-automation conditions, with the user manually adjusting the GUI controls, there is usually only one data point in any queue (I’ve only ever observed one point, but the spec does not explicitly limit it to one point). However, if the user is running automation with a non-constant automation curve, then there will be multiple points in the queue. The simplest way to handle both of these conditions is to take the first or last parameter change value in the queue and apply that to your plugin processing. If your framework supports parameter smoothing, then it will at least provide smooth ramping over those points.

Figure 3.4: (a) In this conceptual diagram of how the parameter change events line up, there is one control change queue for each parameter that needs updating, each queue containing one or more data points. (b) An example automation curve for a single parameter that changes over many buffer process cycles. (c) The DAW-linearized version. (d) This graph (adapted from the VST3 SDK documentation) shows the data points actually transmitted to the plugin (filled circles) and the data points that are implied (hollow circles).

If the user is running automation and the queue contains multiple points, Steinberg specifies that these points are to be connected using linear interpolation. Consider a single, automated parameter that evolves slowly over many buffer process cycles, with the recorded curve shown in Figure 3.4b. The VST3 API specifies that the host DAW must linearize this curve to produce something like Figure 3.4c. The exact method of linearization, including how many segments are used to encode a curve, is up to the host manufacturer. The parameter change points that are actually transferred into the process function are the endpoints of the piecewise linear approximation of the curve, shown as filled dots in Figure 3.4d, along with the hollow dots at the end of each buffer/queue. These hollow endpoints are not re-transmitted at the start of the next buffer/queue. If the automation value remains constant across more than one buffer cycle, then no value is transmitted: these points are implied and are shown as filled gray squares. If the plugin supports SAA, then it is required to first make sense of the queue’s data points and understand the concept of non-repeated or implied data points, then provide its own mechanism for slowly changing the parameter value across each sample interval, providing the smoothest possible parameter changes.

The parameter changes arrive in a normalized format on the range of [0.0, 1.0]. Your plugin must decode the control ID and pick up this normalized value, then convert it to the actual value that the plugin will use. The code for grabbing the last data point in the first queue in the list is shown here; notice how the control ID is transmitted as the queue’s “parameter ID” value, then the normalized value and sample offset are obtained with the getPoint function. The rest of the function (labeled 1, 2, 3) will depend on our plugin’s implementation. With your plugin parameters updated, you proceed to the next step and process the audio.
// get the FIRST queue
IParamValueQueue* queue = data.inputParameterChanges->getParameterData(0);
if (queue)
{
    // --- check for control points
    if (queue->getPointCount() <= 0)
        return false;

    int32 sampleOffset = 0;
    ParamValue normalizedValue = 0.0;
    ParamID controlID = queue->getParameterId();

    // --- get the last point in queue #0
    if (queue->getPoint(queue->getPointCount() - 1, sampleOffset,
                        normalizedValue) == kResultTrue)
    {
        // 1) un-normalize the normalizedValue
        // 2) cook the actual value to update the plugin algorithm
        // 3) here we ignore the sample offset because we are only
        //    using one control point
    }
}
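The snippet above examines only the first queue; a real plugin loops over every queue in the list, since there is one queue per changed parameter. A sketch of that outer loop follows (updateMyParameter is a hypothetical helper that un-normalizes and cooks the value):

// A sketch: visit every parameter-change queue delivered with this buffer.
if (data.inputParameterChanges)
{
    int32 numQueues = data.inputParameterChanges->getParameterCount();
    for (int32 i = 0; i < numQueues; i++)
    {
        IParamValueQueue* queue = data.inputParameterChanges->getParameterData(i);
        if (!queue || queue->getPointCount() <= 0)
            continue;

        int32 sampleOffset = 0;
        ParamValue normalizedValue = 0.0;

        // take the last point in this queue and apply it
        if (queue->getPoint(queue->getPointCount() - 1, sampleOffset,
                            normalizedValue) == kResultTrue)
            updateMyParameter(queue->getParameterId(), normalizedValue); // hypothetical helper
    }
}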
3.6.2 Processing: Resetting the Algorithm and Preparing for Streaming
VST3 plugins may be enabled or disabled during operation. The function that is called is named setActive, and it has a Boolean argument that identifies the state: true is enabled and false is disabled. You may then perform reset operations to flush buffers and prepare the plugin for the next audio streaming operation.

tresult PLUGIN_API setActive(TBool state)
{
    if (state)
    {
        // --- do ON stuff
    }
    else
    {
        // --- do OFF stuff
    }
    return SingleComponentEffect::setActive(state);
}
3.6.3 Processing: Accessing the Audio Buffers
In VST3, the audio input buffers arrive in the input buffer list passed into the process function in its processData argument. The buffer list is named inputs, and it contains pointers to each input buss. Buss number 0 is the main audio buss, and buss number 1 is the side chain. Each buss may contain one or more audio channels; each audio channel arrives in its own buffer (array), exactly as with the other APIs. To get the array of input buffer pointers for a 32-bit main audio input buss, you access data.inputs[0], where 0 indicates the main buss, and then get the array of pointers in the first slot [0] of the channelBuffers32 member:

float** inputBufferPointers = &data.inputs[0].channelBuffers32[0];

You then access the individual channel buffers:

float* inputL = inputBufferPointers[0];
float* inputR = inputBufferPointers[1];

The output buffers are accessed identically, except that you use the data.outputs member instead of the data.inputs variable, as follows:

float** outputBufferPointers = &data.outputs[0].channelBuffers32[0];
float* outputL = outputBufferPointers[0];
float* outputR = outputBufferPointers[1];

To get the side chain buffers, you would access buss 1 of the data.inputs array. Since the side chain is an input-only buffer, there is no corresponding output buffer. To access buss 1, you should first make sure that the buss actually exists, and then grab the channel count and buffer pointers:

BusList* busList = getBusList(kAudio, kInput);
Bus* bus = busList ? (Bus*)busList->at(1) : 0;
if (bus && bus->isActive())
{
    float** sidechainBufferPointers = &data.inputs[1].channelBuffers32[0];
}

The AudioBusBuffers variable is actually a structure that contains additional information about the buss, which you may use to find the exact channel count as well as access the silence flag. To get the channel counts, you would write:

int32 numInputChannels = data.inputs[0].numChannels;
int32 numOutputChannels = data.outputs[0].numChannels;
int32 numSidechainChannels = data.inputs[1].numChannels;

If the input buffer is full of zero-valued samples (e.g. the user has muted the track with the plugin), then the silence flag will be set. You may access this and quickly bail out of the function if there is nothing to do:

bool silence = (bool)data.inputs[0].silenceFlags;

With the buffer pointers acquired, you may then do your audio signal processing: you create a loop that processes the block of samples, finding the sample count in the data structure's numSamples member. Here we are processing stereo in/stereo out:

// --- loop over frames and process
for (int i = 0; i < data.numSamples; i++)
{
    // --- apply gain control to each sample
    outputL[i] = volume * inputL[i];
    outputR[i] = volume * inputR[i];
}

To use information in the side chain buffer list, you treat it the same way as the others:

float* sidechainL = sidechainBufferPointers[0];
float* sidechainR = sidechainBufferPointers[1];
3.6.4 Processing: Writing Output Parameters
If your plugin writes data to output meters or some kind of custom view (see Chapter 21), then you would do that here as well. With all of the processing done, you would then write the output information to the appropriate GUI parameter that needs to receive it. To do this, you need to write data points into the outbound parameter queues, but you must first find those queues to write into; you are now on the opposite side of the queue paradigm. To create/find the first outbound queue for a meter parameter with a control ID named L_OUTMTR_ID, you would write the following:

int32 queueIndex = 0; // return variable
IParamValueQueue* queue = data.outputParameterChanges->addParameterData(L_OUTMTR_ID, queueIndex);

The addParameterData function finds the queue for the parameter with the associated control ID and returns its index in the queueIndex variable, which you typically won't need to use. With the queue found, you may then add data points to it. For metering, we will add one data point that represents the metering value across the entire audio buffer. The meter is displayed on a low-priority GUI thread that is animated once every 50 milliseconds or so (depending on your framework's GUI code). Adding more points to the queue won't make the meter appear more accurate or "fast," as it is limited primarily by the GUI repaint interval. To add a new point to the queue, you use the addPoint function, which accepts the timestamp as a sample offset (we will use 0) and the normalized meter value. The index of the data point in the queue is returned as an argument (dataPointIndex here), and you will usually ignore it. Notice that for this meter we transmit the absolute value of the first sample in the left output buffer:

int32 dataPointIndex = 0; // return variable
queue->addPoint(0, fabs(outputL[0]), dataPointIndex);
3.6.5 Processing: VST3 Soft Bypass
All VST3 FX plugins must support the soft bypass function. Remember that we declared the soft bypass parameter along with the other plugin parameters, so it will arrive in its own queue inside of the inputParameterChanges list along with those same parameters. You set your own internal plugin flag (plugInSideBypass here) when bypass is enabled, then you use it in your processing block: you just pass the input to the output, like this:

if (plugInSideBypass)
{
    for (int32 sample = 0; sample < data.numSamples; sample++)
    {
        // --- output = input
        for (int32 i = 0; i < data.outputs[0].numChannels; i++)
        {
            (data.outputs[0].channelBuffers32[i])[sample] =
                (data.inputs[0].channelBuffers32[i])[sample];
        }
    }
}
3.7 Destruction/Termination
Because of the class factory approach, you destroy any dynamically allocated resources and perform any other cleanup operations in the terminate( ) function. You may treat the terminate method as if it were a typical object destructor, except that you should call the base class method at the head of the function and check its status:

tresult PLUGIN_API terminate()
{
    tresult result = SingleComponentEffect::terminate();
    if (result == kResultOk)
    {
        // --- do your cleanup
    }
    return result;
}
3.8 Retrieving VST3 Host Information
VST3 can provide a robust amount of host information to your plugin. This information arrives in the process method's processData argument, in the processContext member, and includes data about the current audio playback such as the absolute sample position of the audio file (the index of the first sample in the input audio buffer for the current buffer processing cycle), the DAW session's BPM and time-signature settings, and information about the audio transport (whether it is playing or stopped, or if looping is occurring). Not all VST3 hosts support every possible piece of host information. However, the absolute sample location and time, along with the host BPM and time signature information, are very well defined across all VST3 clients we've tested. You can find all of the possible information in the ivstprocesscontext.h file, which defines the ProcessContext structure. Note that this includes the current sample rate as well, so you have a lot of information available on each buffer process cycle. As an example, here is how to grab the BPM and time signature numerator and denominator:
if (data.processContext)
{
    double BPM = data.processContext->tempo;
    int32 tsNum = data.processContext->timeSigNumerator;
    int32 tsDen = data.processContext->timeSigDenominator;
}
3.9 Validating Your Plugin
Steinberg provides validation software in the VST3 SDK called the validator. If you go back and look at Figure 3.2, you can see that the validator executable is part of the VST3 plugin project. The VST3 sample projects are packaged this way, as are the ASPiK VST3 projects. The compiler projects are set up to compile the libraries, your plugin, and the validator together. The validator is built before your plugin in the sequence of operations; after your plugin builds without errors, it is run through the validator. If the validator passes, then your plugin binary is written to disk. If validation fails, the output file will not be written. If this occurs, you need to go back through the compiler's output log and find the location where the validator faulted. The validator's text output is verbose, so it is fairly easy to find the spot where it fails. You then need to fix the problem. There are some tactics you may use to do this, and you can ask for help at www.willpirkle.com/forum/, but you still have a lot of control: unlike the AU validator, which is part of the OS, the VST3 validator is part of your plugin project, and there are ways to debug through it. You should also use the printf statement to pipe text messages from the various plugin functions, as they will be printed to the output along with the validator's messages. This allows you to watch the sequence of operations as the validator runs.

3.10 Using ASPiK to Create VST3 Plugins

There is no substitute for working code when designing these plugins; however, the book real estate that the code requires is prohibitive. Even if you don't want to use the ASPiK PluginKernel for your development, you can still use the code in the projects outlined in Chapter 8. If you use ASPiK, its VST3 plugin shell will handle all of the description and initialization chores, and your PluginCore object will do the audio processing; you will only work within two files: plugincore.h and plugincore.cpp. You may also enable VST3 SAA when you create the new project with CMake. If the parameter you are supporting has a CPU-intensive cooking function, then you may not want to update the parameter value on each sample interval. In this case, you may set the sample accurate granularity in the constructor of your PluginCore object by altering one of the plugin descriptors named apiSpecificInfo. You alter the member variable to set the granularity in samples. For example, to set the accuracy to perform an update after every 32-sample interval, you would write:

apiSpecificInfo.vst3SampleAccurateGranularity = 32;

3.11 Bibliography

Pirkle, W. 2015. Designing Software Synthesizers in C++, Chap. 2. New York: Focal Press.
Steinberg.net. The Steinberg VST API, www.steinberg.net/en/company/developers.html. Accessed August 1, 2018.