
44/48kHz vs 96kHz or 192kHz - is there a difference? Yes, and it might surprise you!

Good points Mark. 48/24 for dialogue makes sense. My concern would be someone choosing 48/16 in the menu of a high-quality recording device when they could have chosen 48/24. I'm not arguing the frequency range, or even the precision, as much as giving post enough meat to run noise-reduction filters and other tactics to save live sound that has issues.
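The bit-depth half of that argument is easy to put in numbers. A quick sketch using the ~6.02 dB-per-bit rule of thumb for ideal PCM quantization (real converters land somewhat lower):

```python
# Rule of thumb: each PCM bit buys ~6.02 dB of theoretical dynamic range,
# so 48/24 leaves post roughly 48 dB more room under the dialogue for
# noise-reduction work than 48/16 does.
def dynamic_range_db(bits):
    return 6.02 * bits

for bits in (16, 24):
    print(f"{bits}-bit: ~{dynamic_range_db(bits):.0f} dB")
```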

Cheers - #19
 
I don't know if this is the best thing to do ...
But I end up doing a lot of stems and cleanup, so I have been saving at 96kHz float. I might have 12 layers of stems (a stem that reads a stem, which reads a stem) once I include heavy foley and sound design. So my thinking isn't that 96k float is what someone hears; it's that as I'm working and saving stuff, I'm at a high enough resolution that I'm not degrading things in my write/save stem process. Also, some of my big workhorse plugins do tons of "phase" stuff: in RX5 I paint out what I want for sound design (I kind of think of it like scratching out a painting that has a lot of layers). Another thing I do a lot for animation is morphing operations, when I want something to sound like something else (a person to sound like "wood"); my gut feeling is that 96k float is better for this sound-design morphing, mainly because I'm mangling the waveforms so much. I also use contact mics for sound-design foley where I drastically modify the frequency (water drops are a good example). A final place I find 96k useful is surround-sound mic configurations, slightly adjusting the timing to avoid phase problems.
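The stem-generation point can be sketched numerically. This is a toy model, not anyone's actual bounce chain: it only shows that re-quantizing to 16-bit on every save adds a little error each generation, while float round-trips stay essentially transparent:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-0.5, 0.5, 96_000)          # stand-in for one second of audio

def bounce_16bit(a):
    # a tiny gain tweak (standing in for per-stem processing) forces a
    # fresh quantization grid, then we snap back to 16-bit steps
    return np.round(a * 0.999 * 32767) / 32767 / 0.999

int_gen = x.copy()
float_gen = x.astype(np.float32)
for _ in range(12):                          # twelve generations of stems
    int_gen = bounce_16bit(int_gen)
    float_gen = (float_gen * np.float32(0.999)) / np.float32(0.999)

print("16-bit stems, max drift:", np.max(np.abs(int_gen - x)))
print("float stems,  max drift:", np.max(np.abs(float_gen - x)))
```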

In RX5 I can see the beating (repeating dark spots in the RX5 view) across most frequencies because of phase issues. Sometimes I really like what is happening with those beats, which I have noticed can come from phase interactions up around 20kHz (i.e. even a 2kHz signal may have phase beating near 20kHz that causes "canceling" beats in my viewport). I also notice, when I go back and forth on ADR, that 96kHz is real nice, not because of the actual phase, but because I think the software processing through the entire system has less latency. So I'm often using 96kHz to "remove" echo in people's headphones, which messes up people talking; I think hearing a very slight phase delay while talking (and listening on headphones) can be very distracting for talent. When it's all processed, though, I personally remove almost everything above 12kHz as I go to do my temp soundtrack prints.
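The "beating" described here is just two components at nearby frequencies summing; the repeating dark spots in a spectrogram fall at the difference frequency. A minimal sketch (the frequencies are illustrative, not from the post):

```python
import numpy as np

fs = 96_000                       # sample rate
t = np.arange(fs) / fs            # one second
f1, f2 = 20_000, 20_004           # two tones 4 Hz apart
x = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

# Trig identity: the sum equals a (f1+f2)/2 carrier multiplied by a slow
# cosine envelope whose nulls repeat (f2 - f1) times per second.
carrier = np.cos(np.pi * (f1 + f2) * t)
envelope = 2 * np.cos(np.pi * (f2 - f1) * t)
print("beat rate:", f2 - f1, "nulls per second")
```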

I do a lot of experimenting with sound, which I find enjoyable. I think it will add to the film; if not, at least I am connecting to the story better as I edit. BTW, I find the m903 headphone amp's integration with the Mac and Pro Tools so easy at 96kHz. On other interfaces I have gone a bit crazy with 96kHz, where the signal gets all messed up on the way to my monitors... and when that happens, 96kHz is way worse, since my monitoring is off. [I.e. this has nothing really to do with 48 vs 96 from a sound-quality viewpoint; it just seems that a lot of the computer-to-monitor processing works great at 48kHz, but at 96kHz a lot of configs seem to do something wrong in the protocol handling.] On the m903 I see it auto-shift to 96kHz, and I think it also auto-shifts to an advanced USB protocol when it does that (I don't think this is happening on some of my other interfaces when I play back at 96kHz, so the sound is actually worse than at 48kHz).
 
I don't think the playback chain in 99% of theaters and broadcasting companies can handle it, either. I have no problem with anybody recording (say) classical music at 96kHz or 192kHz, particularly if it's going to get a high-res audiophile Blu-ray Audio or SACD release. But for motion picture sound, I think it's hopeless overkill, and I don't think there's a rational workflow that can handle it.


Sound processing does not end at the film mix stage.

Room EQ in the theater is mostly not very good - DataSat/CP850 ok, CP750 not great, everything else crummy.
The higher quality your input, the less awful it sounds for your viewers.
96kHz oversampling definitely helps, obviously more so for music and sound design.

96kHz is part of the DCI spec: if a theater can play a DCP, it must support 96kHz.

There is no reason not to mix and deliver a 24-bit/96kHz DCP; the processing and storage overhead is now so minimal.
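A back-of-envelope check of that storage claim (the 5.1 channel count and two-hour runtime are assumptions for the example):

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Uncompressed PCM payload size in bytes."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds

runtime = 2 * 60 * 60                        # a two-hour feature
for fs in (48_000, 96_000):
    gb = pcm_bytes(fs, 24, 6, runtime) / 1e9
    print(f"{fs // 1000} kHz / 24-bit / 5.1: {gb:.1f} GB")
```

Doubling the sample rate doubles the audio payload to roughly 12 GB, which is still small next to the picture track of a typical DCP.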
 
Sound processing does not end at the film mix stage.
I'm very aware of that. Did you read the links I posted?

Marc, please note that Blair's summary, "any sound captured live on set should go into a 96/24 format or better so it can hold up to manipulation without significant damage," does not necessarily mean it must be captured at 96K. It just means that if you are going to do any audio processing, you get far better results doing all processing at 96K (and, obviously, not converting up/down more than once: once at the very beginning and once at the very end).
I don't think it's necessary to do lots of manipulation provided you get good locations, good people, and put the microphones in the right place. Again: talk to any major re-recording mixer in New York, LA, London, or anywhere else and see what they say.
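The "convert once" advice can be illustrated with a toy sample-rate converter. Real SRCs are far better than the linear interpolation used here, but the accumulation argument is the same: every extra round trip adds its own error, so you want exactly one conversion in and one out.

```python
import numpy as np

def resample(x, n_out):
    # crude linear-interpolation SRC, purely for illustration
    return np.interp(np.linspace(0, 1, n_out), np.linspace(0, 1, len(x)), x)

fs_hi, fs_lo = 96_000, 48_000
t = np.arange(fs_hi) / fs_hi
x = np.sin(2 * np.pi * 1_000 * t)            # 1 kHz tone at 96 kHz

once = resample(resample(x, fs_lo), fs_hi)   # one down/up round trip
many = x
for _ in range(10):                          # ten round trips
    many = resample(resample(many, fs_lo), fs_hi)

print("max error, 1 trip  :", np.max(np.abs(once - x)))
print("max error, 10 trips:", np.max(np.abs(many - x)))
```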

I think a lot of people get hung up on the theoretical advantages of certain ideas, but they fail to consider what the real post workflow problems are going to be down the line, nor are they aware of standard industry practices. Trust me, Skywalker, Sony Pictures, Warner Bros., Fox, Disney, and so on are not mixing dialogue at 96kHz or 192kHz. In fact, I'm not sure if any aspect of their workflow beyond sound FX can go beyond 48kHz... and sound effects are a very specialized, rarefied application.

I can readily agree that if you're doing massive time-compression or expansion, it might make sense to do 96kHz for FX acquisition, particularly if you had very high-end microphones. Regular everyday dialogue? No.

Look at the last 100 films that were nominated for an Oscar or Bafta for Best Sound. Tell me if any of them were recorded at 96kHz. Even those with very, very complex recording schemes (I'd consider Les Miserables to be high on that list) just recorded at traditional 48kHz/24-bit, often using a couple of dozen microphones. I can totally understand using multiple microphones for a complex production like this -- that boils down to getting the right mic in the right place at the right time. Sampling frequency is not part of that solution.
 
No argument with Marc's notes, though I would point out that the bigger shows he mentions are simply going to ADR any live dialogue that came out too sketchy, which isn't always an option on indies, who, coincidentally, are also the most likely to have less-skilled boom ops/mixers on set.

+1 that a more robust format from a technical perspective is far less important than proper technique, pre-amp quality, etc.

Cheers - #19
 
...There is no reason not to mix and deliver a 24-bit/96kHz DCP; the processing and storage overhead is now so minimal.
The main problem I have with my 96k workflow isn't what I was expecting: basically no problems with the entire recording chain, storage, CPU, and monitoring output. My problem with 96k is that I get a lot of plugin and OS errors. For example, yesterday I was doing some sound design with Pro Tools feeding a separate sound-processing process, and if I walked away from the workstation for too long I got a crash. These types of OS/software things rarely happen in Pro Tools if I'm at 48k. So I think my main issue now with 96k is getting a perfect software config and basically freezing that entire machine. Freezing a machine is kind of tough for me, though, since I do my previz and daily processes on my Pro Tools box. I think the software is a bit "flaky" at 96k because so few people use it in that mode... and those who do have perfectly configured machines in high-end Atmos-type systems which are not multi-use.

... fail to consider what the real post workflow problems are going to be down the line, nor are they aware of standard industry practices. Trust me, Skywalker, Sony Pictures, Warner Bros., Fox, Disney, and so on are not mixing dialogue at 96kHz or 192kHz. In fact, I'm not sure if any aspect of their workflow beyond sound FX can go beyond 48kHz... and sound effects are a very specialized, rarefied application.
I think my big "gotcha" on 96k is what you're talking about with post workflow: most people are on a 48k workflow, and I just run into a lot of workflow issues. I think mixed-reality animation is a different world than normal production, one where 96k is a huge help (i.e. I often divide the frequency I'm working on in half to get the sound to "sound like" what people expect). Even still, the flakiness of the software at 96k might just force me to do all my major stem work at 48k. It's kind of frustrating to work through all the 96k workflow issues and find that the last 10% of the problems turn out to be software-stability related, which I have no control over.
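"Dividing the frequency in half" is classic varispeed: reinterpret 96 kHz samples at 48 kHz playback and every frequency halves while the duration doubles. A minimal sketch of the arithmetic (the 880 Hz tone is just an example):

```python
import numpy as np

fs = 96_000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 880 * t)   # one second of 880 Hz recorded at 96 kHz

# Same samples, tagged at half the rate on playback: pitch drops an octave
# and the clip plays twice as long.
playback_fs = fs // 2
heard_duration = len(tone) / playback_fs     # seconds
heard_pitch = 880 * playback_fs / fs         # Hz
print(f"heard as {heard_pitch:.0f} Hz for {heard_duration:.0f} s")
```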
 
As an update on this, I'm finding 96k most important on production dialogue where the character is going to VFX. It's also really important if performance-capture audio (from the performance-capture production) can be used in post. Basically most of this has to go through RX5 in order to be usable, and the de-reverb and other noise-reduction techniques really use the 96k. So, long story short, I'm finding 96k primarily very useful when I'm trying to use production audio from a green-screened performance, which inherently has a bit of echo and all sorts of other more horrible noises. Also, some of these de-reverb and noise-reduction algorithms, along with pitch-shifting algorithms, really like a 96k Pro Tools session even if the data is 48k; I'm finding this very important on characters that need heavy sound design. Outside of removing complex noise, I'm not finding much use for 96k. For really complex de-reverb on close-mic problems, I'm finding I really need to remove the echo at around 100kHz, but my system can't handle that frequency (I use a CMIT 5U, and the impulse response smooths that close-mic echo, so there is no reason to go to 192kHz; also, I really love the feel of the CMIT 5U, so I don't want to go to another mic just for those rare close-mic/close-wall-near-mouth echo problems).
 
As an update on this, I'm finding 96k most important on production dialogue where the character is going to VFX. It's also really important if performance-capture audio (from the performance-capture production) can be used in post.
The recent performance capture stuff done for Captain America and Jungle Book was almost completely ADR'd in post. The little head-attached whip mics used on the actors in the 3D/capture stage were just recording live audio for timing and performance reasons. But they redid it all. The audio won't work because a) the environment itself is too noisy, and b) the mic is only about 3" from the actor's face, which sounds unreal and too close. Pull up the "Making of Avatar" videos, and I believe there is a segment showing the actors in the ADR booth re-recording all the dialogue, generally with 2 mics. The live-on-set dialogue sounds like it was recorded in a tile bathroom, really awful. Even the live-action scenes in the various control rooms and hallways were rough, but they were able to get usable sound out of them. But the CG stuff... mostly ADR'd.

I think shooting in better locations is really the best answer to getting cleaner audio, simply because you aren't having to do any major processing after the fact. Note also that the wireless rigs being used for performance capture won't provide any usable response over 15kHz, so it's ludicrous to try to record at 96kHz. Heck, even 48kHz is way, way beyond what those mics can do (assuming Nyquist applies for a 24kHz peak frequency response). Even the best lavaliers in the world -- and I'd put DPA in that category -- are nowhere near flat out to 20kHz. Indeed, it's hard to find $2000-$3500 studio mics and preamps that can go above 25kHz.
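The Nyquist arithmetic behind that point, as a sketch (the 1.1 guard factor for anti-alias filter roll-off is an assumption, not a standard, and the response figures are the illustrative ones from the post):

```python
def min_sample_rate_hz(top_frequency_hz, guard=1.1):
    # Nyquist: sample at more than twice the highest frequency you need,
    # with a little headroom for a realistic anti-alias filter
    return 2 * top_frequency_hz * guard

# wireless rig, very good lavalier, high-end studio chain
for mic_top in (15_000, 20_000, 25_000):
    print(f"{mic_top} Hz response -> ~{min_sample_rate_hz(mic_top):,.0f} Hz sampling")
```

A 15 kHz wireless chain is fully served well below 48 kHz, which is the point above: raising the sample rate can't add bandwidth the microphone never captured.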

They do use fairly expensive studio mics in ADR, and often record actors in the VO stage from at least 2 positions at the same time (going to multiple tracks) to allow different levels of yelling, whispering, jumping around, and so on. It's quite an interesting art.
 
The recent performance capture stuff done for Captain America and Jungle Book was almost completely ADR'd in post. The little head-attached whip mics used on the actors in the 3D/capture stage were just recording live audio for timing and performance reasons. But they redid it all. ...
I've completely given up on the head performance-capture mics; those would need to be captured at around 192k and have super-complex processing to be useful (probably about 3-5 years out in technology, if even doable). What I do in performance capture is use "traditional" boom-based techniques with a CMIT 5U and a few acoustic baffle stands. The actor does separate waist-up performance capture for medium and close-up "shots" where I have minimal suit movement. So, as a clarifier, my approach only works when the VFX is framed head-and-shoulders; this takes some planning too, since a lot of the time in performance capture they don't know the "real" camera angle. By using this "trick" of production-recording performance-capture audio only on close-up shots, I can use traditional boom mic techniques with the actor in the suit, so I'm not dealing with the hopeless noise of a full performance-capture stage. I think the big guys don't want to be so constrained, and they have about 1000 times more money than I do, so they go ADR all the way. I feel that for low budgets, doing everything in ADR just costs too much to get a compelling performance. With my little technique I also don't have to worry about aligning the VFX to ADR on the wide shots and over-the-back shots; again, a cost saver.

My more general feeling is that some of the "techies" just approached things by removing people with technology, but boom operator and experienced production mixer are probably among the most difficult jobs to replace with technology. It's like trying to replace the cinematographer with previz; it just doesn't make sense. I learned a lot from the Revenant technology approach: use the new tools, but it's not about the new tools. So when many productions replaced the boom operators with technology, they went down a dead-end path leading to ADR (whose cost profile doesn't work for low budgets). Sometimes they didn't fully replace the boom operators, but I suspect (I don't have the experience to back this up) that the mixer couldn't call a shot bad if he didn't like the sound, whereas on a low budget with a mixer, if the mixer can't hear the lines, the shot will often be redone because we don't have the money to go to ADR.

Some of the traditional film techniques shouldn't be ditched as the technology shifts; those techniques are actually part of the performance, and the result is a less compelling, "flat" experience if they are not used. So in some weird way, I use 96k and things like RX5 not as new technology per se, but as technology that lets me use traditional storytelling techniques with performance capture. Anything that loses the commitment of a real-time performance is bad to me, but then I'm a bit old-fashioned.
 
I've completely given up on the head performance-capture mics; those would need to be captured at around 192k and have super-complex processing to be useful (probably about 3-5 years out in technology, if even doable). What I do in performance capture is use "traditional" boom-based techniques with a CMIT 5U and a few acoustic baffle stands. The actor does separate waist-up performance capture for medium and close-up "shots" where I have minimal suit movement.
Although I would say one major issue with trying to capture real audio in motion capture is that the final sound is going to sound like an actor on a motion-control stage. A guy standing on a bridge, a guy in a forest, or a woman running through a tunnel is going to sound a lot different. It all winds up sounding like somebody who is not really where the image logically dictates.

I feel that for low budgets, doing everything in ADR just costs too much to get a compelling performance. With my little technique I also don't have to worry about aligning the VFX to ADR on the wide shots and over-the-back shots; again, a cost saver.
I'd say there are two solutions: 1) raise much more money! Kickstarter, IndieGoGo... there are lots of different ways to find more money for production and post if you need to. 2) Find lower-budget facilities. Very, very affordable recording studios exist. You don't necessarily have to record at a union facility like Sony Pictures or Paramount or Warner Bros. There are hundreds of hole-in-the-wall places all over California that can do this kind of work affordably. You'd be amazed at the number of cable networks that get by with very, very modest post budgets and yet crank out 20 or 30 new reality shows and documentaries a year, some with vast amounts of narration and ADR. I've seen indie films go through facilities like this, and the work can be done affordably provided you're prepared and know what you're doing.

My more general feeling is that some of the "techies" just approached things by removing people with technology, but boom operator and experienced production mixer are probably among the most difficult jobs to replace with technology. It's like trying to replace the cinematographer with previz; it just doesn't make sense. I learned a lot from the Revenant technology approach: use the new tools, but it's not about the new tools. So when many productions replaced the boom operators with technology, they went down a dead-end path leading to ADR (whose cost profile doesn't work for low budgets).
ADR is a necessary part of all film production, particularly when your locations dictate no other solution. I guarantee you, when Leonardo DiCaprio is sloshing through a river fighting two other guys, a lot of that dialogue (and fighting sound FX) is dubbed in afterwards. There are also tricks that sound crews use to get the job done; one I've used before is to get the actors together and have them do the scene a couple of times just for sound, in the general area in which they're shooting but away from noise contamination, just to give the editor alternate lines and reads for editorial purposes. In some cases, these lines and phrases help cover up any bumps during the actual picture production, so they can lift lines from different scenes and takes and salvage the scene without traditional ADR.

I think you'll find a lot of very experienced re-recording mixers would be horrified that you're using 192kHz recording in this way, because it's not what it was intended to do. I also think the improvements with RX 5 Advanced will be minimal, because so much depends on acoustics, not just the microphone and the related gear. What preamp are you using that's giving you response up to 100kHz? I really think you're looking for a problem when you've already got the solution in hand. My advice would be to avoid iZotope RX entirely except for scenes that absolutely cannot be done any other way, and don't look at it as something to run most (or all) of your dialogue through. It's a specialized tool, not an excuse to record all the dialogue in a substandard way.
 