Author Topic: Discussion of 'Equivalence'  (Read 56191 times)

David H. Hartman

  • NG Member
  • *
  • Posts: 2787
  • I Doctor Photographs... :)
Re: Discussion of 'Equivalence'
« Reply #60 on: May 15, 2017, 22:49:20 »
This holds only for the same exposure.
Without the condition, the statement is too easy to misconstrue.

Yes, please add that (same exposure). I think I stated that in another thread many pages back. :)

Dave
Beatniks are out to make it rich
Oh no, must be the season of the witch!

Jack Dahlgren

  • NG Member
  • *
  • Posts: 1528
  • You ARE NikonGear
Re: Discussion of 'Equivalence'
« Reply #61 on: May 16, 2017, 00:14:37 »
From that formula your statement follows only if you keep DRPIX fixed. What is the problem? I think John made his notation quite clear. Otherwise you can always ask for clarification.

If you disagree with the formula for fixed DRPIX, please explain why!

You did not say why it is a strange idea, and where the units are missing. Incidentally, DR is dimensionless.

This is not what I said. Please read again. I was talking about why the equivalence theory assumes a consistent output target.

DRPIX is the dynamic range of a pixel. Why would you multiply it by the number of pixels? If we do, then we find that the dynamic range of the D5 is half that of the D800.


simsurace

  • NG Member
  • *
  • Posts: 835
Re: Discussion of 'Equivalence'
« Reply #62 on: May 16, 2017, 00:27:01 »
DRPIX is the dynamic range of a pixel. Why would you multiply it by the number of pixels? If we do, then we find that the dynamic range of the D5 is half that of the D800.

Because the image is made up of more than one pixel. And it is multiplied by the square root of N (not N) because the signal scales linearly with N while the standard deviations add in quadrature, so the combined noise only grows with sqrt(N) (as John already said -- my earlier remark about the SNR of a Poisson random variable scaling with the square root of its mean confused the issue for a moment, sorry).

The full well capacity and noise floor of a photosite are hardly the same between the D5 and D800, so DRPIX has to be set to a different value for each camera when calculating DR. Check the tables on http://www.sensorgen.info for numerical values for some common cameras. The D5 is not in the list, but the D4, for example, has an FWC of 118339 electrons and a read noise of 18.8 electrons, while the D800 has an FWC of 48818 electrons and a read noise of 4.6 electrons. (Please click on the camera model to get the read noise at base ISO, since the table oddly lists the minimum read noise across all ISO values.)

Unsurprisingly, the fat pixels of the D4 are designed to swallow many more photons before saturating. That is good, because their bigger area means they receive more photons during a normal exposure.

When designing a sensor, the full-well capacity of a photosite has to be chosen such that it is filled by what the ISO norm considers the correct number of photons at base ISO (accounting for losses due to the filters in front of the sensor). In other words, sensors with differently sized photosites but the same quantum efficiency and the same base ISO (as is the case for the D4 and D800) should not have the same FWC.
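
To make this concrete, here is a minimal Python sketch, using the sensorgen.info figures quoted above as approximate inputs, that converts FWC and read noise into per-photosite dynamic range in stops:

import math

# Approximate figures from sensorgen.info (quoted above): full well capacity
# and read noise at base ISO, both in electrons.
cameras = {
    "D4":   {"fwc": 118339, "read_noise": 18.8},
    "D800": {"fwc": 48818,  "read_noise": 4.6},
}

for name, c in cameras.items():
    dr_pix = c["fwc"] / c["read_noise"]   # per-photosite DR as a linear ratio
    print(f"{name}: DRPIX = {dr_pix:.0f}:1 = {math.log2(dr_pix):.1f} stops")

With these numbers the D800 photosite actually comes out slightly ahead in per-photosite DR (roughly 13.4 vs. 12.6 stops), despite the much smaller FWC, because its read noise is so much lower.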
Simone Carlo Surace
suracephoto.com

Andrea B.

  • Technical Adviser
  • *
  • Posts: 1671
Re: Discussion of 'Equivalence'
« Reply #63 on: May 16, 2017, 00:34:09 »
Wrong way around? Lower density of light (= lower exposure) on the larger (FX) sensor.

JohnMM - yes of course. Sorry about that! A typo in the "heat" of the discussion. Has been corrected.

Jack Dahlgren

  • NG Member
  • *
  • Posts: 1528
  • You ARE NikonGear
Re: Discussion of 'Equivalence'
« Reply #64 on: May 16, 2017, 01:12:29 »
Because the image is made up of more than one pixel. And it is multiplied by the square root of N (not N) because the signal scales linearly with N while the standard deviations add in quadrature, so the combined noise only grows with sqrt(N) (as John already said -- my earlier remark about the SNR of a Poisson random variable scaling with the square root of its mean confused the issue for a moment, sorry).

The full well capacity and noise floor of a photosite are hardly the same between the D5 and D800, so DRPIX has to be set to a different value for each camera when calculating DR. Check the tables on http://www.sensorgen.info for numerical values for some common cameras. The D5 is not in the list, but the D4, for example, has an FWC of 118339 electrons and a read noise of 18.8 electrons, while the D800 has an FWC of 48818 electrons and a read noise of 4.6 electrons. (Please click on the camera model to get the read noise at base ISO, since the table oddly lists the minimum read noise across all ISO values.)

Unsurprisingly, the fat pixels of the D4 are designed to swallow many more photons before saturating. That is good, because their bigger area means they receive more photons during a normal exposure.

When designing a sensor, the full-well capacity of a photosite has to be chosen such that it is filled by what the ISO norm considers the correct number of photons at base ISO (accounting for losses due to the filters in front of the sensor). In other words, sensors with differently sized photosites but the same quantum efficiency and the same base ISO (as is the case for the D4 and D800) should not have the same FWC.

Dynamic range exists at the pixel level. Pixels are independent and self-contained. If you have an image with 1000 pixels and one with 1000000, both using the same photosite, the dynamic range is the same. Why would you multiply by the square root of the number of pixels?

simsurace

  • NG Member
  • *
  • Posts: 835
Re: Discussion of 'Equivalence'
« Reply #65 on: May 16, 2017, 02:15:53 »
Dynamic range exists at the pixel level. Pixels are independent and self-contained. If you have an image with 1000 pixels and one with 1000000, both using the same photosite, the dynamic range is the same. Why would you multiply by the square root of the number of pixels?

Dynamic range begins at the pixel level, but the analysis does not stop there because you cannot build an image from one pixel.

At the pixel level, DR is the ratio of the maximum number of photons* that you can record in one photosite (the full well capacity, FWC) to the noise floor (the read noise, RN).

At the sensor level, DR is the ratio of the maximum number of photons that you can record (the sum of all FWCs, or N*FWC), which would give a completely white frame, to the noise floor of a completely dark frame. The noise standard deviation is sqrt(N)*RN if the read noise in different photosites is independent (which is an approximation that is valid to varying extent depending on the sensor readout technology). The 'fake signal' due to read noise will have a similar size, on the order of sqrt(N)*RN.

So the sensor-wide DR is N*FWC / (sqrt(N)*RN) = sqrt(N) * FWC/RN = sqrt(N) * DRPIX

*I'm always writing photons, but the correct term would be photo-electrons.
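
As a rough numerical sketch of the sqrt(N) formula above (the FWC and read-noise figures are the sensorgen.info values quoted earlier, the pixel counts are approximate, and this only illustrates the arithmetic, not a measured result):

import math

# Approximate inputs: FWC and read noise (electrons) from sensorgen.info,
# pixel counts rounded (D4 ~16.2 MP, D800 ~36.3 MP).
cameras = {
    "D4":   {"fwc": 118339, "read_noise": 18.8, "n": 16.2e6},
    "D800": {"fwc": 48818,  "read_noise": 4.6,  "n": 36.3e6},
}

for name, c in cameras.items():
    dr_pix = c["fwc"] / c["read_noise"]      # DRPIX = FWC / RN
    dr_sensor = math.sqrt(c["n"]) * dr_pix   # sensor-wide DR = sqrt(N) * DRPIX
    print(f"{name}: DRPIX = {math.log2(dr_pix):.1f} stops, "
          f"sensor-wide DR = {math.log2(dr_sensor):.1f} stops")

The absolute sensor-wide numbers come out huge because nothing has been referred to a viewing condition yet; the point is only the sqrt(N) scaling when comparing sensors with each other.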
Simone Carlo Surace
suracephoto.com

simsurace

  • NG Member
  • *
  • Posts: 835
Re: Discussion of 'Equivalence'
« Reply #66 on: May 16, 2017, 02:47:19 »
Earlier in the discussion I brought up Poisson statistics, which, however, do not appear in the previous calculation. I will clear that up later if needed. Sorry for any confusion that arises because of this.
Simone Carlo Surace
suracephoto.com

Andrea B.

  • Technical Adviser
  • *
  • Posts: 1671
Re: Discussion of 'Equivalence'
« Reply #67 on: May 16, 2017, 04:21:56 »
if the read noise in different photosites is independent (which is an approximation that is valid to varying extent depending on the sensor readout technology)

A question here -- is sensor readout a hardware based process or combo of hardware and algorithm process? Is it done per pixel or per row?
And I'm not even sure I am asking this question correctly.  ;D So let's just stipulate that I've forgotten whatever specific details I used to know about readout and read noise and thus I'm getting a bit tangled up as to whether independence can be assumed.

Added later:  It was binning I was trying to remember about. For independence there wouldn't be any binning, right? One photosite through one a/d converter (then amplifier) would map to one "pixel" in the photo? (Before demosaicing.)

Jack Dahlgren

  • NG Member
  • *
  • Posts: 1528
  • You ARE NikonGear
Re: Discussion of 'Equivalence'
« Reply #68 on: May 16, 2017, 06:34:46 »
Dynamic range begins at the pixel level, but the analysis does not stop there because you cannot build an image from one pixel.

At the pixel level, DR is the ratio of the maximum number of photons* that you can record in one photosite (the full well capacity, FWC) to the noise floor (the read noise, RN).

At the sensor level, DR is the ratio of the maximum number of photons that you can record (the sum of all FWCs, or N*FWC), which would give a completely white frame, to the noise floor of a completely dark frame. The noise standard deviation is sqrt(N)*RN if the read noise in different photosites is independent (which is an approximation that is valid to varying extent depending on the sensor readout technology). The 'fake signal' due to read noise will have a similar size, on the order of sqrt(N)*RN.

So the sensor-wide DR is N*FWC / (sqrt(N)*RN) = sqrt(N) * FWC/RN = sqrt(N) * DRPIX

*I'm always writing photons, but the correct term would be photo-electrons.

This definition of "sensor-wide" dynamic range is the part that makes no sense to me. Imagine for a moment that we are talking about film (chemical rather than electrical - but same idea). Cut a piece of Tri-X in half and expose it to a step chart. Both pieces will record the same gradient from black to white. You won't magically get more dynamic range by increasing the size of the film. If that were the case, 8x10 would have a tremendous dynamic range advantage compared to 35mm. Sadly, it doesn't. The same is true of a silicon-based sensor.

It appears that this definition is just part of a circular exercise where a smaller sensor has less dynamic range because the definition of dynamic range for sensors is based on their size, not on the range of values that they can capture (which, in the case of selecting a DX-sized area of an FX sensor, is identical).

David H. Hartman

  • NG Member
  • *
  • Posts: 2787
  • I Doctor Photographs... :)
Re: Discussion of 'Equivalence'
« Reply #69 on: May 16, 2017, 08:11:09 »
Cut a piece of Tri-X in half and expose it to a step chart.

The step chart is woefully short of the dynamic range available in a scene in nature. Think of the range from a diffused highlight on a white object to a darker object in deep shadow.

Once one captures whatever dynamic range one can, one will have to compress that dynamic range into that of the printing paper or display monitor. To get a pleasing image, the shadows and highlights will be compressed. In the days of film, the toe of the film compressed the shadows and the toe of the paper compressed the highlights.

I'm guessing here, but I think I could represent 13 stops on grade 2 paper with N-1 development of the Tri-X of the '80s and '90s, along with dodging and burning. Anyway, Tri-X, the way I used it, had good dynamic range. Velvia, on the other hand, sucked!

Dave Hartman who will go back to reading.

Beatniks are out to make it rich
Oh no, must be the season of the witch!

simsurace

  • NG Member
  • *
  • Posts: 835
Re: Discussion of 'Equivalence'
« Reply #70 on: May 16, 2017, 09:13:22 »
This definition of "sensor-wide" dynamic range is the part that makes no sense to me. Imagine for a moment that we are talking about film (chemical rather than electrical - but same idea). Cut a piece of Tri-X in half and expose it to a step chart. Both pieces will record the same gradient from black to white. You won't magically get more dynamic range by increasing the size of the film. If that were the case, 8x10 would have a tremendous dynamic range advantage compared to 35mm. Sadly, it doesn't. The same is true of a silicon-based sensor.

It appears that this definition is just part of a circular exercise where a smaller sensor has less dynamic range because the definition of dynamic range for sensors is based on their size, not on the range of values that they can capture (which, in the case of selecting a DX-sized area of an FX sensor, is identical).

I don't understand this idea of 'circular exercise'. The definition of dynamic range at the sensor level that John and I gave, like the PDR definition by Bill Claff, is one that lends itself to comparison of images at a standardized output size. I think that this is the only way to meaningfully compare images of the same scene, and I think the definition is useful for that. Others prefer to think of secondary magnification.

It is not about 'better' or 'worse' definitions. Each definition has a specific, precisely delineated purpose.

If you use the per-pixel or per-unit-area DR, you will find that it does not correspond to what you can see in a standardized output size in terms of noise.
That is why sensor-level DR makes sense as a concept, and why Bill Claff uses PDR as a y-axis, as opposed to engineering dynamic range.

The definition is what it is. The non-trivial part is that if you accept the definition, you can do the little calculation that John and I did in order to estimate sensor-level DR. You get certain predictions. Then you can test those by looking at actual images and running statistical analyses on them. This part is definitely non-circular.
Using a different definition, you might end up with the same predictions, but using a slightly different calculation.

I said before that Bill Claff's graphs don't make sense if you don't understand the definitions. Perhaps too many people draw conclusions from the graph without understanding the definitions, but this is their own fault, not Bill Claff's, because he gives all required information.
The definition ensures that an iPhone 7 ends up lower on the scale than a D800, even though the individual photosites of the iPhone are likely as efficient as the ones in the D800, or even more efficient. And we probably agree that a given output from an iPhone does not look as good as, let alone better than, one from a D800.
It's all about how to present the data.

If the axis were engineering DR, all sensors from the same generation would be very close. One would have to separately calculate the effect on a standardized output by invoking secondary magnification, instead of being able to read off the difference in stops directly from the graph. Both lead to the same conclusions.
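
To sketch the difference between the two presentations numerically (this is only an illustration of the sqrt(N) rescaling used in this thread, not Bill Claff's actual PDR formula, which uses a different noise criterion), one can refer the per-pixel DR to a hypothetical fixed output size, say 8 MP:

import math

def output_referred_dr_stops(fwc, read_noise, n_pixels, n_ref=8e6):
    # Engineering DR per photosite, rescaled to a reference output of n_ref
    # pixels: downsampling averages n_pixels / n_ref photosites, which cuts
    # the noise by sqrt(n_pixels / n_ref) under the independence assumption.
    return math.log2(fwc / read_noise) + 0.5 * math.log2(n_pixels / n_ref)

# Approximate inputs quoted earlier in the thread (sensorgen.info values).
print(output_referred_dr_stops(118339, 18.8, 16.2e6))  # D4
print(output_referred_dr_stops(48818, 4.6, 36.3e6))    # D800

On such a scale the per-pixel ("engineering") numbers are shifted by half a stop for every doubling of the pixel count, which is what lets one read off differences between formats directly from the graph.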

About your film example: if you print your gradients from the smaller piece of film, the grain of the film will be more strongly magnified, giving you less certainty about the dark tones of the gradient, and about blackness, than if you print from a bigger piece of film. Again, this is looking at a standardized output size. Bjørn has described this as 'different secondary magnification', but both concepts are 'equivalent' in terms of predictive power, if you allow me the pun.

The analogy with film is not strict because a piece of film, unlike a digital sensor, does not have a full-well capacity, so its dynamic range is not as precise a number as it is for a sensor. I do not know the technical definition of the dynamic range of film, but I guess that the upper limit involves some kind of cutoff where the density no longer changes meaningfully with more exposure.
Simone Carlo Surace
suracephoto.com

simsurace

  • NG Member
  • *
  • Posts: 835
Re: Discussion of 'Equivalence'
« Reply #71 on: May 16, 2017, 09:44:28 »
if the read noise in different photosites is independent (which is an approximation that is valid to varying extent depending on the sensor readout technology)

A question here -- is sensor readout a hardware based process or combo of hardware and algorithm process? Is it done per pixel or per row?
And I'm not even sure I am asking this question correctly.  ;D So let's just stipulate that I've forgotten whatever specific details I used to know about readout and read noise and thus I'm getting a bit tangled up as to whether independence can be assumed.

Added later:  It was binning I was trying to remember about. For independence there wouldn't be any binning, right? One photosite through one a/d converter (then amplifier) would map to one "pixel" in the photo? (Before demosaicing.)

I think it is mainly the transfer of charge from the pixel to the downstream circuitry.
There is also quantization noise in the A/D conversion, and thermal noise.

However, I have to pass on those questions. I would have to read some literature about what is known.

The correlations of all noise sources combined (except photon noise of course) can be measured by looking at correlations in the RAW image of a dark frame. I think I have seen some pictures of Discrete Fourier Transforms of dark frames somewhere, but I don't remember the place.
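
As a sketch of how such a check could be set up (assuming the dark-frame RAW data has already been decoded into a 2D array; the synthetic frame below just stands in for real data), the 2D power spectrum makes spatial correlations of the read noise visible: independent noise gives an essentially flat spectrum, while row- or column-correlated readout shows up as bright lines:

import numpy as np

# Stand-in dark frame: independent Gaussian read noise, 4.6 e- RMS.
# In practice this array would come from a decoded RAW dark frame
# with the black level subtracted.
rng = np.random.default_rng(0)
dark = rng.normal(0.0, 4.6, size=(512, 512))

dark -= dark.mean()                                   # drop the DC term
power = np.abs(np.fft.fftshift(np.fft.fft2(dark)))**2

# For independent read noise the spectrum is flat up to statistical
# fluctuations; correlated readout would produce visible structure.
print(power.mean(), power.std())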
Simone Carlo Surace
suracephoto.com

Les Olson

  • NG Member
  • *
  • Posts: 502
  • You ARE NikonGear
Re: Discussion of 'Equivalence'
« Reply #72 on: May 16, 2017, 11:21:42 »
[...] comparison of images at a standardized output size. I think that this is the only way to meaningfully compare images of the same scene [...]

The requirement that the same scene is reproduced at a standardised output size creates the circularity.  As soon as that requirement is accepted everything else follows, but if it is rejected the equivalence argument falls to the ground. 

But why should that requirement be accepted?  Other than to make the argument possible, of course, which is as clear a definition of circularity as you could ask for.  Under what real-world circumstances does anyone use different cameras to take the same photograph, for a standardised output size? 

Bjørn Rørslett

  • Fierce Bear of the North
  • Administrator
  • ***
  • Posts: 8252
  • Oslo, Norway
Re: Discussion of 'Equivalence'
« Reply #73 on: May 16, 2017, 11:22:03 »
A pity people today aren't more familiar with film technology. Then it would be obvious that even a small piece of the same emulsion would have had precisely the same ability to encompass a light range (or dynamic range) as a larger sheet of the same material. This would avoid the confusing step of making DR a direct function of area, and avoid the inherent circularity, or at least eschew spuriously correlated parameters in later models. Is the film analogy without solid backing? Not at all. In the ancient age of film, cutting pieces of photosensitive material for testing exposures was commonplace, in particular for the development of sheet film, plus the similar procedure of exposing strips of paper in the darkroom. If the final outcome had depended in any way on area we would have been royally screwed, but fortunately we weren't. Using a fraction kept the properties of the whole. A lesson apparently quickly forgotten in the onslaught of digital photography, and one of the reasons pseudo-science is generated.

The crucial aspect not addressed by this 'equivalence theory' is WHY larger formats are perceived as "better" in the sense that they allegedly have a superior dynamic range. Partly this arises because one fails to recognise the entire imaging chain as a self-contained system. Singling out a single component and drawing system-wide conclusions therefrom is fraught with problems. We need to include the optics, the image circles projected, the nature of light, magnification requirements, and the purpose for which the system was created in the first place in order to relate different systems to each other, and at the same time understand that for each end purpose there is an "optimal" system; no single system can be optimised for every user requirement.

So, why is a tiny format like that of a smartphone inherently less 'dynamic' than a larger format? Is it because the pixel count is lower? No, today's smartphones have pixels aplenty. Now imagine a smartphone sensor with photosites of outstanding dynamic range and the same number of pixels as a larger camera format. By the presented calculations these formats should have the same inherent dynamic range. Oh well, by the requirement of the fixed output target the small format would actually need more of both pixels and DR (even though pixels as such are dimensionless, the constraints imposed by the "same framing" criterion, and their side effects on other parameters, induce a dimensionality dependence nonetheless). To make life easier for the smartphone, we might compare it against a larger camera with a lower pixel count, though. Again, we now have this über-super smartphone at our disposal, fulfilling the wildest dreams of pixel count and dynamic range - so now we can end up with a print of the same quality as from the larger format? Not likely, as the tiny format will very soon run into empty magnification and no more information can be conveyed to the output medium. Thus the bandwidth of information has a ceiling that no amount of pixels or DR can remove. Any format will be constrained in this manner, but the problem manifests itself much earlier with the smaller formats.

Is the example sketched above far-fetched? Absolutely not. It corresponds to the film-era practice of making ultra-high-resolution images on small formats with extremely fine-grained, specially developed film. It was easy to reach or surpass the output from, say, sheet film or medium format in that way, up to the inevitable ceiling set by empty magnification. This was of course facilitated by the fact that, on an area basis, the optics of smaller formats (within limits) can be made to resolve better. This outcome was easy to understand when one thinks in terms of the overall system, and an enigma if the "sensor" (i.e. film) was thought to be the principal determinant. I once designed and built an underwater stereographic device following these principles, in which high-resolving film and 35 mm format lenses were used, and tested it against the existing 6x6 cm Hasselblad-based device serving the same documentation purpose (frame coverage on the sea bed was identical). The smaller system easily won out in image quality when we scrutinised the film under 20X magnification in a binocular loupe. However, frames from the Hasselblad could be enlarged to mural size; the smaller frames couldn't unless a severe drop in perceived quality was accepted.

Whatever one's attitude towards 'equivalence theory', it is pretty obvious that the 'theory' has little to do with practical photography. Just use the current thread as an example.

simsurace

  • NG Member
  • *
  • Posts: 835
Re: Discussion of 'Equivalence'
« Reply #74 on: May 16, 2017, 11:47:48 »
Partly this arises because one fails to recognise the entire imaging chain as a self-contained system.

A very compact mathematical description of the entire imaging chain could be achieved by considering the transformation T between the (normalized) image space light density map (light intensity as a function of normalized image space coordinates) and the (normalized) reflectance map (for a print) or screen light intensity map (for on-screen viewing). The criteria of equivalence would be formulated as the changes of parameters of the intermediate stages of this transformation, such that T does not change. Or more concisely: two imaging chains are called equivalent if they have the same T. This terminology fits with the mathematical notion of 'equivalence class'.
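
A minimal sketch of this definition in code (purely illustrative; the stages and the numerical test are my own stand-ins): an imaging chain is a sequence of stages, T is their composition, and two chains are equivalent when they realize the same T, here checked numerically on a few test scenes.

from typing import Callable, Sequence
import numpy as np

Stage = Callable[[np.ndarray], np.ndarray]

def chain_T(stages: Sequence[Stage]) -> Stage:
    # Compose the stages of an imaging chain into the overall transformation T
    # from the normalized scene light map to the normalized output map.
    def T(scene: np.ndarray) -> np.ndarray:
        out = scene
        for stage in stages:
            out = stage(out)
        return out
    return T

def equivalent(chain_a: Sequence[Stage], chain_b: Sequence[Stage],
               test_scenes: Sequence[np.ndarray], tol: float = 1e-9) -> bool:
    # Same equivalence class <=> same T; here compared on sample inputs.
    Ta, Tb = chain_T(chain_a), chain_T(chain_b)
    return all(np.allclose(Ta(s), Tb(s), atol=tol) for s in test_scenes)

# Example: losing one stop at capture but compensating with gain downstream
# leaves T unchanged, so the two chains land in the same equivalence class.
scenes = [np.random.default_rng(1).random((4, 4))]
print(equivalent([lambda x: 0.5 * x, lambda x: 2.0 * x], [lambda x: x], scenes))  # True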

I would argue that it is precisely equivalence theory that attempts to consider the whole chain, and treat it more or less as a black box with a small set of parameters, for better or worse. As someone already said: a useful (to some extent) abstraction and simplification.
Simone Carlo Surace
suracephoto.com