Wednesday, December 29, 2010

Stereo, quad, 5.1, 7.1,… Ambisonics to the rescue

Prelude

Imagine you made a nice recording of an oratorio. You have just finished a fine stereo mix when the client calls you...

"Hey, did you see the guy with the video camera at the concert? That was just the angle for details. He also had a camera on a tripod, and his montage looks great. Wouldn't it be great if we had a 5.1 up-mix of your recording?"

No problem as you had two outriggers for ambience. Lucky they were there! So you dutifully do a 5.1 mix that will shake things up. Soon afterwards your client rings you up again.

"Nice work on that! The video guy is very happy. Actually I'd love to hear it as well, but I've just got this quad setup from way back. I wired it all up now so could you please give me a 4.0 mix?"

OK, back to mixing it is, and guessing as to how the balance will work for him, as you are loath to tear apart your standards-conforming monitoring rig.

"Thanks mate, I feel like I'm back in the 70's, but with a much better sound of course! Now, the video guy said he'd love to release this recording--not on DVD but on Blu-ray, do you dig it? He'll want 7.1 for that though, can you please do that?"

OK, stereo, 5.1, quad, now 7.1, every time requiring a dedicated mix to accommodate the discrete speaker feeds. So you do the 7.1 as well, moving to a friend's mixing space that takes some getting used to, but you get it done in the end.

"Lovely man! Your 7.1 mix really rocks! I get great feedback from everyone who saw the clips, and even the local TV wants to broadcast the performance. Say, is your mix mono compatible?"

Mixing to B-format

Ambisonics, developed in the early 1970's by the mathematician Michael Gerzon, has a solution for this problem: mix in B-format.

"In the basic version, known as first-order Ambisonics, sound information is encoded into four channels: W, X, Y and Z. [...] The W channel is the non-directional mono component of the signal, corresponding to the output of an omnidirectional microphone. The X, Y and Z channels are the directional components in three dimensions. They correspond to the outputs of three figure-of-eight microphones, facing forward, to the left, and upward respectively." [Wikipedia]

For a lot of music (and most playback configurations; really, how many setups with height channels have you encountered so far?) the Z channel can be omitted, leaving just three channels for first-order B-format.
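To make the channel description above a bit more concrete, here is a minimal sketch of how a single mono sample could be encoded into first-order B-format. It assumes the traditional convention in which W is attenuated by 3 dB (a factor of 1/sqrt(2)) and in which azimuth is measured counter-clockwise from straight ahead. The function name and parameters are made up for this illustration and are not taken from any particular plug-in.

```python
import math

def encode_bformat(sample, azimuth_deg, elevation_deg=0.0):
    """Encode one mono sample into first-order B-format (W, X, Y, Z).

    Assumptions: W carries a 1/sqrt(2) weighting, azimuth runs
    counter-clockwise from the front (+90 degrees is hard left),
    elevation is positive upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front-back figure-of-eight
    y = sample * math.sin(az) * math.cos(el)   # left-right figure-of-eight
    z = sample * math.sin(el)                  # up-down figure-of-eight
    return w, x, y, z

# A source panned hard left at ear height: Z stays at zero,
# which is why horizontal-only work gets by with W, X and Y.
w, x, y, z = encode_bformat(1.0, 90.0)
```

For horizontal-only material the elevation stays at 0 and Z is always zero, so it can simply be dropped.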

What do you need to do when recording?

If the ensemble is rather close and not too expansive, a Nimbus-Halliday array (developed by Dr. Jonathan Halliday, research director at Nimbus Records) can do the job. If not, a main mic setup based on symmetry or, better yet, arranged in an isotropic configuration can be used. Use directional spot microphones according to taste.

What do you need to do when mixing?

Configure your DAW for a 3 (or 4) channel main bus. Put an Ambisonic decoder (e.g. by Bruce Wiggins / WigWare or Daniel Courville) onto that bus. Now use an Ambisonic panner (see previous links) to pan the sources to specific locations, and that's about all there is to it.
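As a rough idea of what a decoder does with the three horizontal channels, the following sketch derives speaker feeds for any regular ring of speakers from the same W, X and Y. It is one simple textbook form (often called a basic or velocity decode) and assumes the 1/sqrt(2) weighting on W. Real decoders such as the ones mentioned above do more than this (shelf filters, distance compensation), so treat the function and its parameter names as hypothetical.

```python
import math

def decode_regular_array(w, x, y, num_speakers, first_speaker_deg=0.0):
    """Basic first-order decode of horizontal B-format to a regular ring.

    Assumptions: speakers are evenly spaced on a circle, angles run
    counter-clockwise from the front, and W carries a 1/sqrt(2) weighting.
    """
    feeds = []
    for i in range(num_speakers):
        angle = math.radians(first_speaker_deg + i * 360.0 / num_speakers)
        directional = x * math.cos(angle) + y * math.sin(angle)
        feeds.append((math.sqrt(2.0) * w + 2.0 * directional) / num_speakers)
    return feeds

# One source panned to front-left (45 degrees), encoded by hand:
w = 1.0 / math.sqrt(2.0)
x = math.cos(math.radians(45.0))
y = math.sin(math.radians(45.0))

quad = decode_regular_array(w, x, y, 4, first_speaker_deg=45.0)  # speakers at 45, 135, 225, 315
hexagon = decode_regular_array(w, x, y, 6)                       # same mix, different rig
```

The point of the exercise: the mix itself never changes, only the decode at the end does. And the mono question from the prelude answers itself, since W on its own is the mono signal.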

To be continued...

Tuesday, August 31, 2010

The "4x8" approach to quadraphonic miking

How do you capture a concert harp in an amazing sounding acoustic space in surround, playing solo pieces, compositions with additional playback through 6 widely distributed speakers as well as a duo performance involving both the main space and several surrounding cavern-like spaces? This was the specific case for one of the fine concerts during Berlin's __tiefKLANG festival.

I came up with what I named the "4x8" approach to quadraphonic miking: four microphones of a fig-8 characteristic are placed in four corners of a square centered around the main performer. All microphones are angled 45° inward and facing downward towards the center, kind of like the cross-hairs of a reticle, rotated sideways by 45°.

All four microphones were placed at a height of 170 cm. For the distance between the corners of the "4x8" I went for 220 cm in this case, but as with the flexible Decca tree you'll need to listen and adjust spacing and height to get it just right :-)
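For anyone who wants to mark out the floor before rigging, here is a small sketch of the geometry. It assumes that the 220 cm is the side length of the square (not the diagonal) and puts the performer at the origin, with x pointing forward and y to the left. The function name and the returned fields are invented for this example.

```python
import math

def four_by_eight_layout(side_cm=220.0, height_cm=170.0):
    """Corner positions of the "4x8" square, performer at the origin.

    Assumption: side_cm is the distance between neighbouring corners.
    The aim azimuth is the horizontal direction from each corner towards
    the centre, counter-clockwise from straight ahead.
    """
    half = side_cm / 2.0
    corners = (("front-left", 1, 1), ("front-right", 1, -1),
               ("rear-left", -1, 1), ("rear-right", -1, -1))
    layout = []
    for name, sign_x, sign_y in corners:
        x, y = sign_x * half, sign_y * half
        layout.append({
            "name": name,
            "position_cm": (x, y, height_cm),
            "distance_to_centre_cm": math.hypot(x, y),
            "aim_azimuth_deg": math.degrees(math.atan2(-y, -x)),
        })
    return layout

layout = four_by_eight_layout()
```

With these assumptions each microphone ends up a little under 156 cm from the centre, measured along the floor.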

What are the benefits of "4x8"?

Assuming the room is a true rectangle and the performer sits parallel to the side walls and perpendicular to the front and back walls, the 0-axis of each fig-8 capsule is equally distant from any parallels between walls. Tilting the microphones downwards towards the performer distances the vertical axis of maximal pickup from parallels between floor and ceiling and also places the audience closer to the vertical 0-axis.

I attached a small omnidirectional condenser to the resonant base of the harp that provides the C-channel of my 5.1 mix and taped an omnidirectional boundary layer microphone to the wall, centrally behind the quadraphonic configuration, that gives me the LFE after appropriate low-pass filtering. You do need to be able to compensate for the LF-rolloff of condenser fig-8 designs!

The "4x8" approach served me well for this setting, with a stationary central sound source and the necessity of capturing ambient sounds in a 360° radius. Be careful when employing it with an instrument that moves a lot in space and has strongly directional qualities at any frequency range. You will quite likely end up with the central sound source jumping around sideways and front to back on quadraphonic playback.

Friday, June 11, 2010

2d, 3d, how many dimensions really?

I've done some more thinking after having posted What is 3d audio

Strictly speaking a dimension is "a spatial characteristic of an object; that is, length, width, or height." (I'll skip the fact there was a best-selling shampoo named alike; thanks Wikipedia ;-)

So a static image has two dimensions whereas reproduced sound originally has only one: the lateral location of a transducer--as it changes over time, resulting in the propagation of a sound wave through space by compression and rarefaction of air.

You can see that it's necessary to add the parameter of "time" to the available physical dimensions.

Moving pictures, be they film- or video-based, add the aspect of change over time to their dimensionality. So a traditional movie has three dimensions. As explained above, sound cannot exist without time--even sound pressure level (SPL) readings need a reference value on which they are based--so it requires two dimensions.
  • "3d"-movies are four-dimensional as they include depth,
  • stereophonic sound is three-dimensional as you require _two_ sources of correlated sound waves, but the result is a fourth dimension: depth can now be perceived (when located at the correct position: usually one point of an equilateral triangle, with both speakers facing in the observer's direction).
You can now add more dimensions to the sound by introducing more sound sources…
  • L+R+Surround in the rear
  • Quad (L+R+Ls+Rs)
  • 5.x (L+C+R+Ls+Rs)
  • etc.
…or you can consider the introduction of a second sound source (with relation to a listener's position) as being the establishment of an array of sources on a two-dimensional plane, which would result in the observation that adding another speaker does _not_ increase the number of dimensions.

To summarize, so-called "3d" in movies or video, as well as surround sound, require four dimensions (three physical ones plus time), but only surround sound has the capacity to envelop the listener and create a truly three-dimensional experience, placing the observer at the center of all action. In that respect "3d" movies are about where stereophonic sound was in the early 1930's ;-)

Disclaimer: I very much enjoy watching good movies (with well crafted soundtracks--preferably in surround) on DVD!

Monday, June 07, 2010

What is 3d audio?

The traditional movie theater / home DVD (or video for ye old-timers ;-) experience is flat. Conventional screens have two dimensions, width and height, but no depth. What I mean by that is that while a picture certainly can seem deep, any perceived depth is just that: perceived. It is quickly recognized as such, and does not change when one moves one's eyes or head.

Audio on the other hand, especially simple stereophonic recordings, has always been able to convey depth--but no height. The latter may be due more to the playback setup than the recording technology itself! I had an amazing experience of height information encoded in a simple AB recording when attending a presentation of the "Bloomline loudspeaker system" (formerly named the "inaudible loudspeaker") by Leo de Clerk at 2008's VDT International convention. [Abstract]

So what makes video three dimensional? It is either (1) depth or (2) envelopment. And what puts audio in the same position? Either (1) height or (2) envelopment. So the question at the start can be answered in that surround sound, be it quad(raphonic) or 5.0, 7.0 etc., already satisfies the requirement.

Surround sound is 3d audio!

And so it appears we have had 3d audio long before moving pictures went 3d. BTW, adding height to the equation introduces yet another dimension with little or no sweat ;-)

Wednesday, May 12, 2010

The album lives!

According to Nate Anderson's The death of the album "40 percent of all [Tunecore] sales were single-track downloads, 57 percent were streams, and a mere 2.3 percent were full albums." [Original markup] Please note that the percentages refer to the strange numbers on the graph in relation to 65.18!

I think the real question is: How much income did Tunecore-sellers generate?

Turning sales-percentages into dollars the situation looks very different to me: streams (costing 1c) made 5.70$, albums (costing 10$) 23$, and single tracks (costing 1$) 40$. With albums & tracks a commission of roughly 1/3 is deducted from the profits. In the end people selling albums generated more than twice the amount of money that people selling tracks did, and the overall direct income from streaming seems comparatively negligible.

All of this besides the fact that one has to closely examine the demographics of the potential clientele. E.g. lovers of the opera and symphonic music will rarely purchase a single aria or movement--and Wolfgang Spahr writes: "Classical music was the big winner at retail in 2009 according to a statement issued by the German Federal Music Industry Association (BVMI). Double-digit growth was achieved in both volumes and revenues." [Classical Music Sales Soar In Germany]

Friday, April 09, 2010

Errata - Exotic Positions

I just read Paul Stamler's Exotic Positions, and here are some necessary corrections:

3 out of 4 ways of recording in stereo...

Let me start with (the old master) Alan Dower Blumlein. While it is certainly correct that "he did most of the theoretical analysis necessary for the development of stereo recording" he was also very much into exploring the practical aspects of recording. And while "he developed a stereo miking system [the Blumlein setup] that solves many of the problems inherent in XY and ORTF techniques" it is important to note not only that ORTF came 30 years later, but also that Mr. Blumlein invented all three coincident setups, including XY and M/S. The latter is listed under the "true exotica" section although it is the de facto standard of stereo in moving picture sound.

"XY"

Back to the start of the post. XY creates a stereo image strictly by differences in sound pressure level. The term "volume differences" is not accurate. And while often arranged at an angle of ±45°, resulting in a huge recording angle of 196°, any angle can be used. It is important to note that the recording angle shrinks as the inter-capsule-angle grows. Point the capsules farther apart for a smaller soundstage!

Visit Eberhard Sengpiel's stereophonic playground for some experimentation.

Rather than saying that "typically, XY recording produces a narrow soundstage" I'd explain that using an XY pair with an inter-capsule-angle of 90° results in a semicircular pickup range. I use this configuration when I have to mike very close to an ensemble and don't want to go for AB, perhaps because I want to reduce the influence of the room or attenuate audience noise.
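The relation between capsule angle and recording angle can be checked with a few lines of code. The sketch below makes two assumptions that are mine and not from the article under discussion: ideal first-order capsules with a sensitivity of a + (1 - a)·cos(angle), where a = 0.5 is a cardioid, a = 0 a fig-8 and a = 1 an omni, and a level difference of 18 dB as the point at which a source sits fully in one loudspeaker. The function name and the `omni_share` parameter are hypothetical.

```python
import math

def recording_angle(half_axis_angle_deg, omni_share, full_side_db=18.0):
    """Recording angle (degrees) of a coincident pair, or None if the
    level difference never reaches full_side_db.

    half_axis_angle_deg: each capsule is turned this far off centre.
    omni_share: 1.0 = omni, 0.5 = cardioid, 0.0 = figure-of-eight.
    """
    alpha = math.radians(half_axis_angle_deg)
    step_deg = 0.01
    for i in range(int(180.0 / step_deg)):
        theta_deg = i * step_deg
        theta = math.radians(theta_deg)
        near = abs(omni_share + (1.0 - omni_share) * math.cos(theta - alpha))
        far = abs(omni_share + (1.0 - omni_share) * math.cos(theta + alpha))
        # Stop at the first source angle that is fully on one side.
        if far < 1e-12 or 20.0 * math.log10(near / far) >= full_side_db:
            return 2.0 * theta_deg
    return None

xy_90 = recording_angle(45.0, 0.5)        # cardioids at +-45 degrees
xy_180 = recording_angle(90.0, 0.5)       # cardioids at +-90 degrees
blumlein = recording_angle(45.0, 0.0)     # fig-8s at +-45 degrees
omni_pair = recording_angle(45.0, 1.0)    # coincident ideal omnis
```

Under these assumptions the results come out at roughly 196° for cardioids at ±45°, 102° for cardioids at ±90° and 76° for fig-8s at ±45°, which matches the figures used in this post. Ideal coincident omnis never produce a level difference at all, so the function returns None for them.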

"Getting fancy: ORTF"

I'd personally place the near-coincident pair stereo technique after XY, Blumlein and M/S. Or, if I were to sort the approaches by their practical relevance to me it would be at the top.

The very nice stereo image that ORTF (and other near-coincident techniques: NOS, DIN, EBS, ...) can produce relies on a combination of differences in sound pressure level and time of arrival cues. The reason that "sources are rendered in correct spatial perspective rather than in the narrow soundstage endemic to XY recording" is due to the fact that the recording angle is 98°, a little more than half that of XY (at ±45°).

"The old master: Blumlein"

I consider Blumlein to be a special case of XY, using bidirectional capsules / ribbon transducers angled at ±45°. While it is true that "Figure-8s tend to maintain their pattern at all frequencies" the statement that "at least some of them have excellent bass response" is doubtful. Look at the frequency charts of all bidirectional capsules for a pronounced proximity effect / LF-attenuation. As an example please compare the chart of the cardioid Schoeps MK 4 with that of the fig-8 Schoeps MK 8. David Royer has argued that this does not hold for ribbons (see the Royerlabs SF-1) and I have done a few one-point Blumlein recordings that seem to justify the assessment, but I am as yet uncertain on this point.

I don't agree with the statement that "because the pickup pattern is bidirectional the microphones will pick up lots of room sound, leading to a very wet recording." AB gives me much more spaciousness, but since AB (with two parallel omnis spaced at a distance of 51.5 cm) results in a recording angle of 180° I place it much closer to the ensemble than a Blumlein setup with its recording angle of 76°. Therefore it is definitely true that the sonics of the recording space matter a lot when using a Blumlein setup.
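Unlike the coincident setups, a spaced pair of omnis works with differences in arrival time. As a back-of-the-envelope check on the AB spacing mentioned above, this sketch uses the far-field approximation (source much farther away than the mic spacing) and a speed of sound of 343 m/s. Both are assumptions of mine, and the function name is made up.

```python
import math

def arrival_time_difference_ms(spacing_m, source_angle_deg, speed_of_sound=343.0):
    """Arrival-time difference between two spaced mics, in milliseconds.

    Far-field approximation: the path difference is spacing * sin(angle),
    with the angle measured from straight ahead.
    """
    path_difference_m = spacing_m * math.sin(math.radians(source_angle_deg))
    return 1000.0 * path_difference_m / speed_of_sound

hard_side = arrival_time_difference_ms(0.515, 90.0)   # source fully to one side
centre = arrival_time_difference_ms(0.515, 0.0)       # source straight ahead
```

For a spacing of 51.5 cm a source at 90° arrives about 1.5 ms earlier at the nearer microphone, which is in the range usually cited for a phantom source sitting fully in one loudspeaker.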

"True exotica"

Read all about Tony Faulkner's Phased Array here: Part-1, Part-2, Part-3.

The Jecklin disk uses "a pair of omnidirectional mics spaced a foot or so apart with a large plastic disc between them." That much I can follow. The statement that "this creates the equivalent of a pair of cardioids pointed outward at 180˚" sounds wrong as (in the case of coincident capsules) this would result in a recording angle of 102°.

Regarding M/S the post states that "what’s most useful about this is a high degree of mono compatibility". In fact M/S is completely mono compatible: Use only the central channel...
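The mono compatibility of M/S follows directly from the decoding arithmetic, as this minimal sketch shows. The `width` parameter is an addition of mine to illustrate narrowing or widening the image in post. It is not part of the article under discussion.

```python
def ms_to_lr(mid, side, width=1.0):
    """Decode one M/S sample pair to left/right.

    width scales the side signal: 0.0 collapses to mono,
    1.0 is the plain sum-and-difference decode.
    """
    left = mid + width * side
    right = mid - width * side
    return left, right

def mono_sum(left, right):
    """Fold left/right down to mono."""
    return 0.5 * (left + right)

left, right = ms_to_lr(0.3, 0.2)
mono = mono_sum(left, right)                 # the side signal cancels completely
narrow_left, narrow_right = ms_to_lr(0.3, 0.2, width=0.5)
```

Whatever the width setting, the side signal cancels in the mono sum and only the mid channel remains, without any time offsets that could cause comb filtering.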

Interesting to me: the "Swedien technique"--although in theory coincident yet angled true omnidirectional microphones cannot generate differences in sound pressure level. Paul Stamler does say the setup pertains to the use of large diaphragm condensers though. I'll give it a try when I get the chance.

"Step up to the bar"

The sturdy yet inexpensive K&M 23550 stereo bar does a good job for XY & ORTF. Don't forget one or two K&M 218 thread adapters to be able to position one mic above the other without undue vertical angling.

"What, when, and where"

Much of this is a matter of taste, but I'd usually not recommend using "a pair of good condensers overhead in ORTF formation (panned hard left and right) [...]" for recording drums. True, a stereo overhead and two mics, one for the snare (or a central position) and one for the kick is all you'll need, but if you don't use a coincident overhead you cannot reduce the spread of the image in post without introducing comb filtering!

"Try a few true-stereo single pair recordings just to hear what the technique can do. And if you have the tracks, try using stereo miking techniques on multitracked projects. You won’t be sorry." I am happy to say that I wholeheartedly agree with that final statement. "Happy pairing" to you as well :-)

[Please check out Eberhard Sengpiel's website for detailed insights into various stereophonic configurations, recording angles etc.]

Thursday, April 08, 2010

Do we need 24 bit audio for sonic nirvana?

"The noise level in the average residence is about 43 decibels" [Harry Ferdinand Olson (1967): Music, physics and engineering], whereas a house in the country can be as quiet as 35 dB. Let's deduct 6 dB from that number as it is quite possible to discern musical content that level-wise lies within and seemingly should be masked by the noise floor.

16 bit audio has a theoretical dynamic range of 96 dB which, added onto the baseline of 29 dB, amounts to a peak of 125 dB, slightly below the threshold of pain. It seems to me that that span should suffice to adequately present the finest dynamics inherent in music, especially since the range I usually experience recording very high dynamic range avant-garde music lies at 54 dB, and many real-life concert venues have a noise floor at -60 dB FS. Of course less is highly preferable, but then it is also dependent on the quality of the ambient noise.

Note the use of the word "suffice". Clearly, having a theoretically usable resolution of 144 dB when working with (real) 24 bit audio is even better. It is paramount when recording music while leaving adequate headroom--with no manual gain riding required and no need to use a compressor while tracking--and during mixing to avoid introducing distortion while processing.
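The numbers are easy to verify. The sketch below uses the usual rule of thumb that the theoretical dynamic range of linear PCM is 20·log10(2^bits), i.e. about 6 dB per bit, and ignores dither and noise shaping. The function and variable names are just for illustration.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM, about 6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bits)

quiet_house_db = 35.0                     # quiet house in the country
baseline_db = quiet_house_db - 6.0        # music remains audible a bit below the noise
range_16 = dynamic_range_db(16)           # roughly 96 dB
range_24 = dynamic_range_db(24)           # roughly 144 dB
peak_16_db = baseline_db + range_16       # loudest peak if the quietest detail sits at the baseline
```

With a baseline of 29 dB, 16 bit reaches a peak of a good 125 dB.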

But once you are done, dithering carefully to 16 bit will be OK.

[December-2009]

John Watkinson in Resolution, March-2010 (p. 59): CD "[...] was put together by a skilled group of people who knew what they were doing, and it has stood the test of time. It's not broken and it doesn't need fixing."