Komodo 12.2 stops of dynamic range in CineD test.

Mike P. · Jan 19, 2021

Seems a little odd that the test was done poorly, but only the RED succumbed to the problem of really poor above-mid-grey performance and false-positive recovered highlight stops (the top two stops of the P6k still have clear RGB definition/capture even with the improper testing). Also a little ironic that, after fixing the colour temp issue, the results didn't change with regard to the highlight recovery phenomenon/missing ~2 colour channels in the highest recovered stops (Side note: Did they re-do the chart with 3200K balanced incandescent/full spectrum lights or just change the WB to 3200K?)

While it's cool that everyone is focusing on the Imatest/Xyla chart results and coming to their own DR conclusions (which are moot apparently), isn't the practical/real-world (and saturated) shot, which demonstrates why those top two stops aren't all there, more important? CineD could not recover accurate skintone highlights after one stop overexposure (for comparison, they mentioned getting 4 stops over from the P6k).

While I'd like to chalk this up to bad testing, even RED's GioScope puts mid-grey at chip 11 of 16... which would put the skintone at ~12, which aligns fairly accurately with CineD's results, but whatever, it must be that time of year... And hey, at least the massive green tint discrepancy was solved (which was more of a concern, to be honest).

Noah Yuan-Vogel said:
I do think the issue of whether to include recovered highlights or not is a complicated one. Is Komodo the only camera currently to have highlight recovery built in and visible on the monitor during recording? I would say it is a bit complicated since conceivably such highlight recovery is available to other raw cameras but not visible until processed in post. In your opinion how should the xyla chart be exposed with respect to RGB channel clipping? Should the top chip be counted as in as long as at least one color channel is not yet clipped?

Admittedly I presumed that's how IPP2 handles highlights on all RED sensors -- takes the non-clipped channel(s), ramps down saturation the closer it gets to clip, so that there's "detail/texture" there, but it lacks solid colour (hence it's good for improving perceived roll-off, but you can't recover those stops as "usable" image unless it's a b&w project). I noticed as much when testing IPP2 on MX; more discernible detail, but lack of colour. It works well for things that are monochromatic, like clouds, but not so good on, say, skin tones (it shows texture where there was none, but it's greyish when you try to bring it into usable range). Again, that's not a (b&w) chart, but anecdotal observation.

As for others, BRAW has a highlight recovery check box in Resolve (which someone above mentioned wasn't used for CineD's testing), so it'd be neat to see how that looks on the scope. If memory serves (and again, anecdotally, by eye) it's not nearly as aggressive as IPP2. Come to think of it, pretty sure Canon raw has a 'Highlight Recovery' check box too.

OH, and not to muddy the waters further, but back when Weapon came out, someone noticed if you set the raw whitebalance to something right near clip, then used a WB node (not the raw settings) to attempt to rebalance the image to proper wb in post, you could get an additional ~1+ stops of detail in the high-end. It didn't work most of the time because the colours had a tendency to get wonky/thin (the correction node didn't always look good, and you'd have to selectively correct tons of other things in the frame). Pretty sure this is that, but on the scopes.

Michael Tiemann · Jan 19, 2021

AndreeMarkefors said:
As I wrote on another forum:

Call it 8 stops, call it 20. As long as you compare camera tests done by the same individual/site you should be good. All is relative.

I personally much prefer results that land around 12 stops than those that reach 16+.

I read this as a half-hearted attempt at humor, and also a half-bitter expression of cynicism. And since we're all human, I'll say "I hear you!"

But I'll also say that when the opportunity presents to use science to understand underlying systems, we should not treat objective truth as unknowable. Of course it is possible for people to do science wrong. And some people do it wrong intentionally, which is particularly insidious. But as long as people follow the cardinal rules: state your hypothesis, explain your methods, show your data, and prepare to be corrected if others cannot reproduce your results, then those who engage (and who practice well) should be trusted until proven otherwise.

I saw the data of a chart with red, green, and blue showing that white balance for a tungsten source had not been corrected. I saw further the dropping of data based on the inconsistencies that resulted from that error. I read further that there were some other mistakes made in terms of codec parameters. When a result is reported with such large asterisks, it's best to put that result aside and either make a fresh one or wait for somebody else to do so. If and when the experiment is done properly, in proper conditions, according to scientific best practices, we'll get an objective result. Which is good for what it is.

The objective result won't tell us what's not being measured, but it will tell us what is being measured, within the standards of error of the technique. That's all we can ask for. And it's a whole lot more than "you say 8 and I say 20. Believe whatever you want!"

Christoffer Glans · Jan 19, 2021

AndreeMarkefors said:
I agree with a lot of what you say, but they can't change their method around every now and then. Super-sampling and NR have not been part of their standardised test (I think they do mention it sometimes in the text though), so I wouldn't expect them to include it all of a sudden.

The main win is that all cameras they test use a similar setup.

But you can’t standardize a test done between a camera that shoots a raw format that is intended for post-processing, and cameras that use heavy in-camera processing.

The “in-camera processing cameras” get to a finished image that is greatly reduced in post production possibilities, but might look good straight out of the camera.

A true RAW camera, however, does not look good straight out of the camera but greatly enhances post processing possibilities and gets better, more consistent images at the end. It’s even proved to be the case in the test when he mentions that ProRes behaves better. Of course it does, it’s supersampled from 6K, and these are people saying they do a “lab test” without understanding what supersampling does to the final image •facepalm•

If you’re gonna compare true performance, the R3D needs to be processed into something that is closer to the processed images you get from an “in-camera processing” camera.

There can’t be one standard for processed footage and non-processed footage.

Karim D. Ghantous · Jan 19, 2021

This goes to show that good methodology is hard to come by. People still think that focal length causes 'compression', for goodness sakes. No wonder it's hard to do tests properly.

I haven't seen this test and I don't really care to. But I get the impression that they used a tungsten source and did not use a filter in front of the lens. Is that right? If so... wow.

A couple of years ago I saw a visual test between the Alexa and the Dragon. The Dragon 'won' by a small amount, IIRC. Of course I can't find it now.

Christoffer Glans · Jan 20, 2021

Are there any unbiased, independent testers who are actually doing real tests? What about DXO, they tested Dragon when it hit the 101-point mark in their testing history. Would much rather see them test Komodo.

Han Vogen · Jan 20, 2021

I’m not sure why so many have an issue with this test. It’s my understanding that all cameras are tested using the same method. Using special consideration for each brand would skew the results. 12.5 stops is more than usable and right in line with what I expected from this camera. All manufacturers quote higher numbers and then seem to score lower in independent tests. Canon claims 16+ for their C300 Mk3 with its DGO sensor, yet it only scores 13.1 in the CineD lab test. Sony claims 15+ for the FX9, but only managed 11.5 stops in the same test. This score puts the Komodo at essentially the same score as the Sony A7S3, granted the Sony will score much better at higher ISOs. It’s a very workable result, especially considering the global shutter.

Christoffer Glans · Jan 20, 2021

Han Vogen said:
I’m not sure why so many have an issue with this test. It’s my understanding that all cameras are tested using the same method. Using special consideration for each brand would skew the results. 12.5 stops is more than usable and right in line with what I expected from this camera. All manufacturers quote higher numbers and then seem to score lower in independent tests. Canon claims 16+ for their C300 Mk3 with its DGO sensor, yet it only scores 13.1 in the CineD lab test. Sony claims 15+ for the FX9, but only managed 11.5 stops in the same test. This score puts the Komodo at essentially the same score as the Sony A7S3, granted the Sony will score much better at higher ISOs. It’s a very workable result, especially considering the global shutter.

How do you compare a RAW system without proper post processing with cameras that use in-camera processing? If you don't process the RAW footage correctly before comparing and getting the numbers, then you basically half-bake the performance of that system. So the biggest problem is that they standardize the test around direct out-of-camera performance, which suits cameras that bake everything into the file, while not properly processing the RAW files. The footage from cameras using in-camera processing therefore gets improvements that you don't see on the RAW footage.

The test needs to position each camera in a similar end point, not starting point. It's not rocket science to understand why. Either put everything into ACES apply NR and render out at 4K, or apply brand LOG modes, NR and render out at 4K. The end point should be a decided standard delivery format. Most common right now is 4K. So have every camera at an end point of 4K and process RAW footage correctly. Then compare the numbers.
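To put a rough number on what the downscale alone is worth, here is a back-of-envelope sketch. It assumes per-pixel noise is uncorrelated and that the scaler behaves like a plain average, which real debayered footage and real scaling filters only approximate. The 6144 and 4096 pixel widths and the function name are assumptions for illustration.

```python
import math


def supersampling_gain_stops(src_width: int, dst_width: int) -> float:
    """Idealised noise-floor gain, in stops, from averaging a larger frame down."""
    # Each output pixel averages (src/dst)^2 source pixels.
    pixels_averaged = (src_width / dst_width) ** 2
    # Averaging n uncorrelated samples cuts the noise by sqrt(n).
    noise_reduction = math.sqrt(pixels_averaged)
    return math.log2(noise_reduction)


# Assumed frame widths: 6144 px ("6K") rendered out at 4096 px ("4K").
gain = supersampling_gain_stops(6144, 4096)
print(f"6K -> 4K: roughly {gain:.2f} stops lower noise floor (idealised)")
```

Under those assumptions the downscale is worth a bit over half a stop at the noise floor before any NR is applied, which is why comparing a supersampled file against a native-resolution one is not like for like.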

If you don't do this, then you don't actually measure performance based on actual performance but choose a specific workflow that just suits one type of system and judge everything accordingly. It's just plain invalid and flawed as a test.

Christoffer Glans · Jan 20, 2021

Essentially this is the problem and a possible solution.

Phil Holland · Jan 20, 2021

Han Vogen said:
I’m not sure why so many have an issue with this test.

I've engaged with Gunther online about this now and am awaiting a few answers.

My main issues with this test are:

- When published, the data was off from the currently revised figures by ~2 stops, which produced a somewhat viral wildfire spread of misinformation. This has been tended to and corrected, which is good.
- Why the light and shadow slides are not being used to measure DR. They come with the Xyla and are used to negate the impact of the light side's flare lifting, contaminating, and adding noise to the noise floor. Yes this means all of their test results are invalid btw due to the unique optical pathways in each system as well as likely different lenses potentially used in these tests. You need to mitigate all variables when performing work like this.
- Not fully digesting or acknowledging how highlight information is captured, measured, and impacted by a tapered highlight roll-off and simply "not using" a full stop or two of captured dynamic range.

I'm not the only one who does tests such as these who has noticed this.

The stranger notion is now he is mentioning "hence my suspicion that an in-built highlight recovery algorithm is at work here" and not recognizing that as captured DR.

"Xyla21 chart is shot off-center to avoid lens flares" is not the same as using those slides. Not at all.

In terms of color stability a good +/- test shows off where luminance and chromatic information lands within the usable latitude.

One way or another visibly with our human eyes and compounded with the incorrect noise floor reading we are seeing different results than what is being presented.

Mike P. · Jan 20, 2021

But does any of that actually justify why the skintone highlight clipped after +1 (which, in their defence/coincidentally enough, corresponds with what their xyla results are showing, as faulty as they may or may not be)?

Even if the scientific testing was completely botched, I presume they did the same botch job with the P6k charts (which again, in their defence, didn't show the missing colour channels in the two top stops), but they claim the p6k still managed ~+4 over before being unable to recover that skintone highlight accurately.

I guess my question is, can that real-world Komodo +1 vs P6k +4 delta be fixed with better scientific testing (without underexposing 3 stops to protect the highlights enough to merely match the p6k's highlight performance, as that risks a noisy [potentially green] mid-low end)?

Christoffer Glans said:
Essentially this is the problem and a possible solution.

Pretty sure that's why they include the baked prores results; to have baked vs baked, and then include raw to make sure it doesn't have substantially different performance from the processed/polished baked codec. I mean, one could argue that compressed raw shouldn't be compared to uncompressed for the same reason.

Nick Vera · Jan 20, 2021

For anyone seeing comparisons, my recommendation is to do your own under/over tests with a light meter in a controlled environment, if you want to compare film cameras. Stop using manufacturer stats and others' opinions. This is the only way to find the truth

Christoffer Glans · Jan 20, 2021

Nick Vera said:
For anyone seeing comparisons, my recommendation is to do your own under/over tests with a light meter in a controlled environment, if you want to compare film cameras. Stop using manufacturer stats and others' opinions. This is the only way to find the truth

We're not, but we're also not listening to tests that aren't conducted correctly. If someone wants to test manufacturers' claims through a test that compares systems based on a standard, then they damn well need to have the standardization and procedures locked in a correct way. Not only does this spread misinformation, it also reinforces a lack of trust in others doing tests.

If someone claims to do unbiased tests in their "lab", it becomes problematic if, again and again, they don't know how to handle the cameras and post processing. That's not a "lab" with any experts. I'm still waiting for the experts.

Nick Morrison · Jan 21, 2021

All I will say is in *practical* terms we've been shooting Komodo, Gemini, and Dragon side by side and here's our conclusion:

Komodo is noticeably better than DSMC1 Dragon - more usable DR, and less mid-tone noise.

Komodo isn't quite as good as Gemini, but intercuts effortlessly. If you don't need off-speed, Komodo is a no brainer.

Komodo has a "fat negative", and is very gradable. Our colorist has been impressed.

Komodo is, quite frankly, remarkable value for what you're paying.

Karim D. Ghantous · Jan 21, 2021

Nick Morrison said:
Komodo is noticeably better than DSMC1 Dragon - more usable DR, and less mid-tone noise.

Maybe I'm naive, but this says it all. (Figuratively speaking!)

Joshua Cadmium · Jan 21, 2021

Here's a super long post about measuring dynamic range and why it's such a hard number to pin down. I think it's helpful in this situation.

---

There is actually hard math for dynamic range and it's really easy to calculate - it's simply the ratio between the clipping point of the sensor and the average noise of the sensor.

The math is simply 20 × log10 (the voltage-ratio form of the decibel) of the Full Well Capacity in electrons (clipping point) divided by the root mean square Read Noise in electrons (average noise).

So the formula is just DR = 20 × log10(FWC / RN), which gives the result in dB. That's very simple, relatively speaking.
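As a worked example of that formula, here is a minimal sketch that evaluates it in dB and also expresses the same ratio in stops via log2. The full well capacity and read noise figures are made-up values for a hypothetical sensor, not specs for any camera in this thread.

```python
import math


def dynamic_range_db(full_well_e: float, read_noise_e: float) -> float:
    """Engineering dynamic range in decibels: 20 * log10(FWC / RN)."""
    return 20.0 * math.log10(full_well_e / read_noise_e)


def dynamic_range_stops(full_well_e: float, read_noise_e: float) -> float:
    """The same ratio expressed in stops (doublings): log2(FWC / RN)."""
    return math.log2(full_well_e / read_noise_e)


# Hypothetical sensor values, for illustration only.
FULL_WELL_E = 30_000.0  # electrons at clipping
READ_NOISE_E = 3.0      # RMS read noise in electrons

print(f"{dynamic_range_db(FULL_WELL_E, READ_NOISE_E):.1f} dB")
print(f"{dynamic_range_stops(FULL_WELL_E, READ_NOISE_E):.2f} stops")
```

One stop is a factor of two, so it corresponds to 20 × log10(2), or about 6.02 dB. Dividing a dB figure by 6.02 gives the stop count.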

However, there are two problems with this approach.

---

One is that while the clipping point of the sensor is really easy to figure out, the average noise is absolutely not, at least while measured externally. Imatest's page on dynamic range has so much information that I have not seen elsewhere that explains the challenges in actually measuring DR: https://www.imatest.com/solutions/dynamic-range/

One huge takeaway is that even tiny amounts of internal flare from lens optics absolutely affect dynamic range. That means that the dynamic range of a sensor, when paired with a lens, has an absolute limit (unless lens coatings get dramatically better) regardless of what the dynamic range of the sensor is actually capable of.

Imatest says that they have never seen anything better than about 16.5 stops of DR, even though there are sensors out there with a specified 20-25 stops worth of range. Simply adding a lens completely negates that higher dynamic range.

They also say specifically that dynamic range is improved by noise reduction. So, because of that, they recommend only using raw data that is stripped of any noise reduction and minimally demosaiced.

They say specifically that "Noise reduction can have a profound effect on DR measurements. In particular, SNR = 1, which is a criterion for the DR limit in some standards, may never be reached."

That means that noise reduction can prevent seeing where the noise floor is reached (SNR = 1 is where the signal and the noise floor are equal.) If you can't see the bottom limit, you have no way of actually measuring the average noise of the sensor.

So, yes, mathematically, comparing the output of a sensor that is highly processed with noise reduction to something that is only processed as minimally as possible is not going to be a fair comparison.
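A toy illustration of why that matters, assuming a step chart whose patches are each one stop apart and a constant noise floor (shot noise ignored). The patch values, the noise levels, and the function name are all hypothetical, and this is not Imatest's actual algorithm, just the SNR = 1 criterion applied naively.

```python
def patches_above_threshold(signals, noises, snr_threshold=1.0):
    """Count chart patches, brightest first, until SNR falls below the threshold."""
    count = 0
    for signal, noise in zip(signals, noises):
        if signal / noise < snr_threshold:
            break
        count += 1
    return count


# Hypothetical chart: 16 patches, each one stop (2x) darker than the last.
signals = [1000.0 / 2 ** k for k in range(16)]

raw_noise = [1.0] * 16       # assumed constant noise floor, no NR
denoised_noise = [0.5] * 16  # same footage after NR that halves the noise

print("no NR:  ", patches_above_threshold(signals, raw_noise))
print("with NR:", patches_above_threshold(signals, denoised_noise))
```

In this toy setup, halving the noise floor pushes the SNR = 1 crossing one patch deeper, so the same sensor data "measures" a stop better purely because of processing.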

On top of that, Imatest actually has their own alternative to the Xyla chart because of challenges in using that chart accurately. The much brighter initial steps were leaking into the rest of the image due to lens flare, causing an inaccurate reading. (However, this was causing an increase in Dynamic Range, not a reduction in Dynamic Range.)

---

The second problem with calculating Dynamic Range is that measured dynamic range is actually considered something completely different.

Measured dynamic range is not Dynamic Range at all - it is a similar, but separate concept known as the Signal to Noise Ratio, and it is that SNR number that we actually want to know, not the true Dynamic Range.

The discrepancy has come about from how a sensor engineer would define what Dynamic Range means versus what we would consider dynamic range.

To a sensor engineer, Dynamic Range, in sensor-land, is only a mathematical ratio between Full Well Capacity and Read Noise. It is not theoretical, it is hard math. You have two numbers and you look at the ratio between them and that ratio can only be one thing.

However, that hard number Dynamic Range does not include the impact that simply putting a lens in front of the sensor causes. It also does not include any other type of noise, just Read Noise. When there is lots of light available, the main type of noise limiting the amount of Signal to Noise Ratio is something called Shot Noise.

---

Shot Noise comes from the fact that photons arrive randomly, following a pattern known as the Poisson distribution. This distribution means that lower light levels equal a noisier image, regardless of anything else.

Basically, photons do not come all at once, hitting the sensor evenly, but come in a staggered way that creates noise.

The only solution to the inherent noise of light (you can't currently escape it) is to increase the full well capacity. If you are able to capture more photons, the shot noise grows only as the square root of the photon count, so it becomes a smaller fraction of the signal.
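The square-root behaviour can be sketched in a few lines. This assumes shot noise equals the square root of the collected electrons (the Poisson property) and that read noise adds in quadrature; the sensor figures are hypothetical, not taken from any camera here.

```python
import math


def snr(signal_e: float, read_noise_e: float = 0.0) -> float:
    """SNR for a given signal in electrons, with shot noise and read noise combined."""
    shot_noise = math.sqrt(signal_e)  # Poisson: variance equals the mean
    total_noise = math.sqrt(shot_noise ** 2 + read_noise_e ** 2)
    return signal_e / total_noise


# Shot-noise-limited case: four times the light only doubles the SNR.
print(snr(10_000.0), snr(40_000.0))

# Hypothetical sensor: 30,000 e- full well, 3 e- read noise.
peak_snr_stops = math.log2(snr(30_000.0, 3.0))
dr_stops = math.log2(30_000.0 / 3.0)
print(f"peak SNR ~{peak_snr_stops:.1f} stops vs DR ~{dr_stops:.1f} stops")
```

Even at full well, the best achievable SNR for this made-up sensor sits several stops below its FWC/RN dynamic range, because shot noise, not read noise, dominates once there is plenty of light.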

This website has a good breakdown of all of this, including the chart I am listing below, which shows that SNR has fewer stops than actual Dynamic Range: https://www.lumenera.com/blog/unders...paring-cameras .

This quote sums it all up: "The result is that a larger dynamic range is always preferable because it allows for a higher signal-to-noise ratio, but it is not guaranteed. SNR will always be less than the dynamic range because it is limited by the noise in the image and is not always maximized due to challenging lighting conditions, exposure time limitations, and the choice of optics."

Bastien Tribalat · Jan 21, 2021

I'm late to the party and the rabbit hole but whatever.
Another (reputable) team of testers from France have conducted those tests (here) + compared their results to what they found conducting the exact same tests on an Arri Alexa Mini and a Canon C300MKIII.

So first, the TL;DR :

Alexa Mini : 14+ stops (up to 15 in RAW after denoising)
RED Komodo : 14 stops (up to 14+ in RAW after denoising)
C300 Mark III : 13+ stops (up to 14+ in RAW after denoising)

And now, a few more details.

[Image: Arri Alexa, ProRes 422 HQ, ARRI Log C vs RED Komodo, ProRes 422 HQ, Log3G10]

[Image: Arri Alexa, ProRes 422 HQ, ARRI Log C, denoised vs RED Komodo, ProRes 422 HQ, Log3G10, denoised]

[Image: Arri Alexa, ARRIRAW 12-bit vs RED Komodo, R3D HQ]

[Image: Arri Alexa, ARRIRAW 12-bit, denoised vs RED Komodo, R3D HQ, denoised]

[Image: RED Komodo, ProRes 422 HQ, Log3G10 vs Canon C300 Mk III, C-Log 2]

[Image: RED Komodo, ProRes 422 HQ, Log3G10, denoised vs Canon C300 Mk III, C-Log 2, denoised]

[Image: RED Komodo, R3D HQ vs Canon C300 Mk III, Cinema RAW Light 12-bit]

[Image: RED Komodo, R3D HQ, denoised vs Canon C300 Mk III, Cinema RAW Light 12-bit, denoised]

Christoffer Glans · Jan 21, 2021

So I've been in talks with Gunther at CineD as well to explain the criticism, and I think the consensus really boils down to the test having been performed correctly once the decode mistake was fixed (which he genuinely feels bad about). So Gunther is not really at fault here; there's just a lack of information that has spiraled the community into negativity.

The problem is that the article does not communicate the differences between the numbers. First off, it doesn't pinpoint how in-camera processing affects the Xyla test, or that comparing RAW systems with in-camera processing systems isn't really valid because of this difference. Second, the interpretation of the test varies based on what we're actually trying to conclude. It's easy to blame Red for saying 16.5+ stops when it does not have that, but... it has, just count the visible bars on the Xyla test, it's 16.5+ stops. However, they're not usable stops. By measuring unprocessed R3D you get closer to the numbers that Gunther got to, so as a reference of tests it's valid, as long as only unprocessed RAW is compared against each other and all other in-camera processing systems are left out of the equation.

This is why the French test is an interesting comparison since it reaches 14 stops and is closer to what I get from the Xyla test when processing RAW with supersampling and NR.

So basically, the problem is more how the article is framed, which led the ideas of testing misconduct to spiral out of hand, together with the pitchforks against Red for "lying" about stops. This shows how important it is to communicate the variables and details of testing in order to underline what is actually measured and how to interpret the concluded numbers.

For some reason, Gunther is unable to join Reduser and can't at this time explain or defend the methodology, but hopefully, until then, we can have an understanding of what the real issues are here.

Phil Holland · Jan 21, 2021

Christoffer Glans said:
For some reason, Gunther is unable to join Reduser and can't at this time explain or defend the methodology, but hopefully, until then, we can have an understanding of what the real issues are here.

I don't have access to new member approvals, but we'll get him registered to post.

Here's the link he's been providing in many online replies about their testing methods:
https://www.cined.com/the-cinema5d-camera-lab-is-back-dynamic-range-tests/?fbclid=IwAR2wNRxOIfP3cNvXCNyO-Bfk9DKJilKmyLGYMD35T58msN_cSsQAV-UKjS4

Noah Yuan-Vogel · Jan 21, 2021

Bastien Tribalat said:
I'm late to the party and the rabbit hole but whatever.
Another (reputable) team of testers from France have conducted those tests (here) + compared their results to what they found conducting the exact same tests on an Arri Alexa Mini and a Canon C300MKIII.

I saw that test but I do question the usefulness of judging DR entirely from a waveform... Seems they judge more DR from komodo than the c300iii entirely because of the way the log curve is balanced, which seems somewhat meaningless when trying to judge what can be brought up from the shadows in a 10bit or even 12-16bit source. Also there is no mention of the fact that you can clearly see in their komodo waveform that the top two stops of DR appear to rely entirely on the built-in highlight recovery which may only work for monochromatic detail at certain whitebalances among other issues with it. And then I wonder if one were to use highlight recovery in post on the arriraw or C300iii raw if they would also benefit from an additional 1-2 stops of DR from highlight recovery. Definitely seems like a problematic issue to face. From some of the other tests I've seen, it does appear those highlight recovery stops really cannot be relied upon for certain kinds of tones and detail, but they may particularly improve the apparent results of a monochromatic xyla chart...

Mike P. · Jan 21, 2021

Noah Yuan-Vogel said:
Also there is no mention of the fact that you can clearly see in their komodo waveform that the top two stops of DR appear to rely entirely on the built-in highlight recovery which may only work for monochromatic detail at certain whitebalances among other issues with it. And then I wonder if one were to use highlight recovery in post on the arriraw or C300iii raw if they would also benefit from an additional 1-2 stops of DR from highlight recovery. Definitely seems like a problematic issue to face. From some of the other tests I've seen, it does appear those highlight recovery stops really cannot be relied upon for certain kinds of tones and detail, but they may particularly improve the apparent results of a monochromatic xyla chart...

This is the concern I'm talking about, and why I'm not totally convinced by the 'xyla was done wrong' claim either... Even if the charts are done wrong, the weak highlight performance (poorly balanced over/under DR) still remains and is something I've noticed with MX, Dragon and Helium (to varying degrees).

Case in point, CineD only being able to recover +1 skintone highlight accurately, when the p6k recovered +4. That's a huge difference, which requires counterintuitive tricks to minimize (but still not remedy entirely)... And if those same measures are applied to footage from other cameras (unnecessarily, if they have comparatively balanced over/under), would they have similarly (usable) DR gains?

This isn't a weakness that shows up in the monochrome charts. It shows up in post, when the high side isn't just hot, it's burned/clipped (even though it only measured +3). Is it the end of the world? No, if you know about it you can compensate/alleviate. Does it make your job more difficult than it needs to be? Yeah, I think so.
