The purpose of this article is to address a prevalent misconception about the relationship between binary computer numbering systems and the range of digital image tonal values. There is no such relationship. The most serious misconception is that because of the computer’s binary ordering, there are digitally more highlight tones in an image than shadow tones. It goes something like this:
|2048 levels||first f/stop|
|1024 levels||second f/stop|
|512 levels||third f/stop|
|256 levels||fourth f/stop|
|128 levels||fifth f/stop|
This misguided perception is based on the premise that each bit in a digital number represents twice or half the values of the adjacent bits. This is true. But, “therefore, the highlights have twice as many tonal values as all the other zones”? That is simply not true. The chart is misleading.
Tonal values are simply integer numbers and the values are linear. How they are encoded in the computer is irrelevant. The dynamic range of the sensor determines how many exposure stops of light can be recorded in a single image. This will range from 8 to 12 stops in various typical sensors. It is sometimes referenced in decibels, 48 to 72. The bit depth determines the precision of the recorded values, not the dynamic range.
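As a quick check on the decibel figures, the conversion can be sketched as follows. It assumes the common 20·log10 ratio convention for sensor dynamic range, under which one stop (a doubling) is roughly 6.02 dB. The function name is made up for illustration.

```python
import math


def stops_to_decibels(stops):
    """Convert a dynamic range in stops (doublings) to decibels.

    ASSUMPTION: the 20*log10 ratio convention, so one stop is
    20*log10(2), roughly 6.02 dB.
    """
    return 20 * math.log10(2 ** stops)


for stops in (8, 12):
    print(f"{stops} stops is about {stops_to_decibels(stops):.0f} dB")
```

Under that assumption, 8 and 12 stops land on the 48 and 72 dB figures quoted above. Bit depth does not appear anywhere in the calculation.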
8-bit, 12-bit, or 16-bit encoding does not change the dynamic range.
I have seen this perception in several web articles. More recently, I saw this as the lead-in chapter in a hard copy book. What started as two photographer/campers imbibing around a campfire has become "group-think", "parrot speak". If you are interested, this is my rebuttal.
It is true that computers store all data and numbers in binary format, as strings of ones and zeros. Each bit position represents a power of two, so each additional bit doubles the range of values previously available. It is also true that exposure f/stops follow a powers-of-two progression, with each stop doubling or halving the light. This is a coincidence, not a correlation.
Computers use binary encoding to store numbers. Thus, the number of bits available determines the maximum value that can be contained. There are many ways to encode a number into a series of bits. There are integers, short integers, long integers, floating point, double precision floating point, and packed decimal, to name just the most common ones. Integers can also be signed or unsigned. Each data type occupies a different amount of space in memory and has a different maximum value. Some hardware systems store the bytes of a multi-byte number with the most significant byte first and some store the least significant byte first. These are known as big endian (as in Unix, IBM, & Macintosh) and little endian (as in Intel & AMD). This is also sometimes called byte ordering. The binary encoding of these numbers is important to the computer circuitry and its internal arithmetic operations, but it does not change the resulting values.
The bit-depth of an image file determines the maximum data value that can be stored for a single pixel. Therefore, it constrains the granularity of the individual tones represented. Black is still zero and white is still the maximum value. Bit-depth should not be confused with dynamic range either. Dynamic range refers to the minimum and maximum luminosity that can be captured by a sensor in a single image, with perceivable details in each. Even the human eye cannot capture the details of a moonless night and the details of a sunny day at noon without time to adjust. In other words, it forms two separate images. White and black will always be the min-max values in the image file. The bit-depth merely determines how many unique tones will lie between these.
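To make the point about byte ordering concrete, here is a minimal sketch using Python's standard `struct` module. The value 4095 is just an example (the largest 12-bit tone); the variable names are illustrative.

```python
import struct

value = 4095  # example: the largest 12-bit tonal value

# Pack the same number as a 16-bit unsigned integer in both byte orders.
big_endian = struct.pack(">H", value)     # most significant byte first
little_endian = struct.pack("<H", value)  # least significant byte first

# The bytes stored in memory differ between the two layouts.
print(big_endian.hex(), little_endian.hex())

# Decoding each with its own byte order gives back the same number.
decoded_big = struct.unpack(">H", big_endian)[0]
decoded_little = struct.unpack("<H", little_endian)[0]
print(decoded_big, decoded_little)
```

The storage layout changes, but the tonal value that comes back out is identical in both cases.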
To tie the bit-depth to the relevant effects of binary encoding assume for a moment only a single color channel. An 8-bit image has a range of tones of 0 to 255. A 12-bit camera image has a range from 0 to 4095. Note that this is 256 unique tones in 8-bit mode and 4096 unique tones in 12-bit mode.
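The counts above follow directly from the bit depth. The short sketch below (function names are illustrative) also counts how many 8-bit code values fall below and above the midpoint, to show that the integer tones themselves are evenly spaced.

```python
def tone_count(bit_depth):
    """Number of unique tones an unsigned integer of this depth can hold."""
    return 2 ** bit_depth


def max_tone(bit_depth):
    """Largest tonal value; the smallest is always 0 (black)."""
    return tone_count(bit_depth) - 1


for bits in (8, 12):
    print(f"{bits}-bit: 0 to {max_tone(bits)}, {tone_count(bits)} unique tones")

# Count the 8-bit code values on each side of the midpoint.
levels = range(tone_count(8))
lower_half = sum(1 for v in levels if v < 128)
upper_half = sum(1 for v in levels if v >= 128)
print(lower_half, upper_half)
```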
When a 12 or 14-bit image is loaded into Photoshop, it automatically becomes a 16-bit image. So, one would assume that the unsigned values are 0 to 65535. Wrong. Photoshop treats these as a hybrid, unsigned 15-bit plus 1, integer data so the possible values are 0 to +32768. This accommodates 32769 unique tones. I have seen Adobe refer to this as quasi 15-bit. That's probably the most descriptive term. Please note that this is not a standard data type in any way.
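For illustration only, here is one way a 12-bit or 14-bit value could be rescaled into the 0 to 32768 range described above. This is a simple proportional rescale chosen as an assumption. It is not taken from Adobe documentation and may not match what Photoshop actually does internally.

```python
# Per the text: Photoshop's 16-bit mode holds values 0..32768 (32769 tones).
PHOTOSHOP_16BIT_MAX = 32768


def rescale(value, source_bits, target_max=PHOTOSHOP_16BIT_MAX):
    """Proportionally rescale an unsigned value to the range 0..target_max.

    ASSUMPTION: plain linear scaling with round-to-nearest, done in
    integer arithmetic. Hypothetical helper, not Adobe's algorithm.
    """
    source_max = 2 ** source_bits - 1
    return (2 * value * target_max + source_max) // (2 * source_max)


print(rescale(0, 12), rescale(4095, 12))    # black and white from 12-bit
print(rescale(0, 14), rescale(16383, 14))   # black and white from 14-bit
```

Whatever the exact conversion, black stays at zero and white stays at the maximum; only the number of steps in between changes.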
So, let’s start with the fact that an image can be divided into lighting zones, as taught by Ansel Adams. The zone system consists of 11 zones from white to black. Zone 5 is the 50% gray tone (18% reflectance) and in the center of the scale. Each zone represents double or half of the light (one exposure stop) in the adjacent zone, so they follow a logarithmic progression (powers of two) in terms of the amount of light. But they still represent a linear progression in terms of their tonal values.
Let’s assume a black and white image with an 8-bit color depth. Zone 10 has a value of 255, zone 0 has a value of 0, and zone 5 has a value of 128. Each zone’s value follows a linear progression. In theory, each zone represents half the light of its neighbor to the left. The following figure shows gray scale zones and their values on the top. This is followed by corresponding gray scale zones from a single target at different exposures on the bottom.
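The linear zone values described above can be sketched in a few lines. The function name and the round-to-nearest choice are assumptions made for illustration.

```python
def zone_to_tone(zone, max_tone=255, top_zone=10):
    """Map a zone number (0..top_zone) linearly onto 0..max_tone.

    Rounds to the nearest integer using integer arithmetic,
    with exact halves rounding up.
    """
    return (2 * zone * max_tone + top_zone) // (2 * top_zone)


zone_tones = [zone_to_tone(zone) for zone in range(11)]
print(zone_tones)
# prints [0, 26, 51, 77, 102, 128, 153, 179, 204, 230, 255]
```

Each zone is about 25.5 tonal values from its neighbor, which is the linear progression the text describes.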
This was produced by shooting a photographic gray card in one-stop exposure increments, eleven times. The camera was a Nikon D1X with all manual settings. The correctly metered exposure produced a mid-tone at 127 as expected. If this is placed in zone V there are only four stops before the highlights max out. At the same time, with six stops of under exposure the image is still not pure black (0).
The following chart illustrates the relationship produced with this camera.
The data demonstrates the fact that the exposure versus tone relationship is not linear or logarithmic. Over exposure will clip tonal details quickly. Under exposure also distorts tonal values, but it retains shadow detail longer and can be recovered more easily. This should not be news to any digital photographer.
It also demonstrates the fact that a relatively small miscalculation in the base exposure can result in a somewhat significant shift in tonal values. Coupled with the fact that each color channel (RGB) reacts differently, some might consider digital imaging pure magic.
Gamma is another term thrown around to describe logarithmic properties of tones. The dictionary defines gamma as a number indicating the degree of contrast between the darkest and lightest parts of a photographic image. This would be a straight line. A better description of gamma is a non-linear power function describing the relationship between two variables. In either case, it describes a curve on a linear scale or a straight line on a logarithmic scale.
The need for gamma correction is fundamentally that most devices are non-linear. In a display, a linear change in voltage does not produce a linear change in brightness. In a photodiode, a linear change in brightness does not produce a linear change in voltage. Both follow power-law curves whose exponent is the gamma number. Windows video drivers typically use a gamma of 2.2 while Macintosh drivers typically use 1.8, even for the same monitor. Gamma is set in the video driver, not the monitor. The chemicals in film have similar properties. Photons striking the silver crystals dislodge electrons, creating energy (voltage), which alters the chemical structure. Hence, we can legitimately talk about the gamma curve of a particular film. For a long time it was thought that the gamma curve of a video camera was the inverse of the gamma curve for a TV CRT, thus canceling each other. This is not precisely correct, hence the need for gamma correction. To wrap this up, the gamma of light, based on the inverse square law, is 2. The gamma of two sets of numbers in the same ordinal space is 1, the same as a linear relationship.
Gamma (other than 1) produces a curve similar to the one below.
Here, the chart is linear but the function is logarithmic. If the left axis of the chart were logarithmic, the line would be straight. The exception would be a gamma of 1. Most photographic texts illustrate the gamma curve as a straight line on a logarithmic scale. The chemicals in film and print media and the sensors in digital cameras and scanners also have characteristic gamma curves. Note that if we replace voltage on the x-axis with development time we have a typical published film gamma curve.
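The straight-line behavior can be checked numerically. The sketch below treats gamma as the exponent of a simple power function, output = input ** gamma, on values normalized to the range 0 to 1, and measures the slope between sample points when both axes are logarithmic. The sample points and function names are illustrative assumptions, not measurements from any device.

```python
import math


def apply_gamma(value, gamma):
    """Simple power-law transfer function on a normalized 0..1 value."""
    return value ** gamma


def log_log_slope(x1, x2, gamma):
    """Slope between two points of the curve when both axes are logarithmic."""
    y1, y2 = apply_gamma(x1, gamma), apply_gamma(x2, gamma)
    return (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))


samples = [0.1, 0.2, 0.4, 0.8]
for gamma in (1.0, 1.8, 2.2):
    slopes = [log_log_slope(a, b, gamma) for a, b in zip(samples, samples[1:])]
    # A constant slope means a straight line; the slope equals gamma.
    print(gamma, [round(s, 6) for s in slopes])
```

The slope is the same between every pair of points, so the curve plots as a straight line on logarithmic axes, and a gamma of 1 is a straight line on linear axes as well.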
Logarithmic charts are not unique to photography. They have been used for years to show stock price movements over time. These are log base 10 rather than log base 2. A linear chart would not illustrate the most recent stock changes adequately since as the base price rises, it takes more of a change to preserve the historic percentage of growth. A straight trend line on a logarithmic scale quantifies the periodic compound rate of return handily. Gamma is also used in physics, chemistry, medicine, and statistics.
In addition, logarithms are used in physics and engineering. Logarithms are fundamental to the design and operation of the slide rule, a precursor of modern calculators.
Most of the manipulations done in digital image editors should be called tone adjustments rather than gamma correction. Gamma corrections are done in the respective capture and display devices. Often, this is done in analogue circuits rather than by digital processing.
S-Curves describe non-linear end-point compression in tones. They represent a mathematical non-linear periodic function similar to the curves in a sine wave. Mathematically, Fourier transforms are sometimes used to generate a curve of this shape. These should not be confused with gamma, as they track sensitivity limits near the extremes of a function instead of logarithmic progressions. As with gamma, most electronic and chemical processes exhibit some sort of s-curve function at their practical design limits. As the voltage reaches its clipping points, minimum and maximum, changes in brightness have smaller effects. We call this tone compression.
A typical curve for film will have this characteristic S shape. The straight-line portion in the middle tones follows the gamma function. It is sometimes referred to as the gamma and sometimes as the characteristic curve. With film, curves and gamma are addressed with chemicals and the developing and printing processes. With digital cameras or scanners, they are addressed in the device’s analogue-to-digital circuits or the firmware.
With image editors such as Photoshop you can alter the relationship between the tones in an image. With a curves adjustment you can increase or decrease contrast locally (shadows or highlights), in a particular color channel, or globally. This is simply mapping an input tonal value to a new output tonal value. As a Photoshop curve, it is a linear chart that starts out with gamma=1. When you create curve shapes, you are effectively adding Fourier transforms. These are then used to map input tones to output tones.
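The mapping of input tones to output tones can be pictured as a lookup table. The sketch below builds one for the straight default line and one for a contrast-raising S shape. It uses the smoothstep polynomial as a stand-in for the S shape; that choice is an assumption for illustration and does not reproduce Photoshop's own curve interpolation.

```python
def build_lut(curve, max_tone=255):
    """Build a lookup table mapping each input tone to an output tone.

    `curve` takes and returns a normalized value in the range 0..1.
    """
    lut = []
    for tone in range(max_tone + 1):
        normalized = tone / max_tone
        lut.append(int(curve(normalized) * max_tone + 0.5))
    return lut


def identity(t):
    """The default straight line: output equals input."""
    return t


def s_curve(t):
    """Smoothstep polynomial, used here as a stand-in S-shaped curve."""
    return t * t * (3 - 2 * t)


identity_lut = build_lut(identity)
s_lut = build_lut(s_curve)

# Applying a curve is just a table lookup per pixel.
pixels = [0, 64, 128, 192, 255]
adjusted = [s_lut[p] for p in pixels]
print(adjusted)
```

With this S shape the shadows are pushed darker and the highlights lighter, while black and white stay pinned at the end points.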
A histogram basically shows the frequency of tonal values in an image. It is compressed and scaled to show the data in a small window. Read more about histograms here.
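A minimal sketch of the counting behind a histogram follows. The pixel list is made-up sample data for a single 8-bit channel, and the function name is illustrative.

```python
def tonal_histogram(pixels, max_tone=255):
    """Count how many pixels have each tonal value from 0 to max_tone."""
    counts = [0] * (max_tone + 1)
    for tone in pixels:
        counts[tone] += 1
    return counts


# Made-up sample data: mostly mid-tones, with one black and one white pixel.
pixels = [0, 120, 127, 127, 127, 128, 130, 255]
histogram = tonal_histogram(pixels)

for tone, count in enumerate(histogram):
    if count:
        print(f"tone {tone:3d}: {'#' * count}")
```

A displayed histogram is this same table of counts, scaled to fit a small window.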
Once the image data has been recorded, it can be mapped into a histogram for easy review and analysis. The only way to “correct” a histogram while shooting is to change the base exposure. After image capture, it can be manipulated as the file is being opened (raw files only) or later with image editing software (such as Photoshop) and tools such as levels and curves.
Note that the histogram is only a tool. In the figure below, the vertical scale is dominated by the light blue background. However, there is still plenty of shadow detail. There are no discernable highlights because the image has almost none. In fact, there is white in the eyes. There is nothing wrong with this exposure.
Some processing assumptions have to be applied before a preview image can even be shown. Thus, the sensor gamma correction, contrast, saturation, and many other adjustments may be only approximations in the preview. This is the image used to form the camera's histogram.
The three histograms below are taken from the gray scale experiment above.
Note that with the proper exposure (EV 0) the data is all in the middle as it should be. It is wider than you might expect for a single tone gray card. This is because the lighting was a little uneven. This was not an accident. In the over exposed histogram (EV +3) the data shifts right. The shape of the curve changes some and some data is clipped. In the under exposed histogram (EV –3) there is no clipping but the curve has been compressed by about 60%. Tonal information has been severely compressed. Neither incorrect exposure shows the correct range of tones that was available.
The first experiment could justifiably be criticized as not representative of real world shooting conditions. So, I decided to perform a second experiment. This time the objective was to shoot an image that contains black, white, and gray tones. The gray was the same photographic target (Delta 18%). The white was the whitest paper I could find. The black was velvet fabric. I used two opposing tungsten lights to get the flattest lighting possible. Each swatch was sized to contribute one third of the image pixels. Again, I took 11 shots starting at ISO 125, f/8, 1/15. This time, I recorded the white, gray, and black tonal values.
At the base exposure (1/15), the gray tone is spot on. Both black and white are just below their respective clipping levels. The black velvet is about as black as I can get. It is much blacker than any paper target I could find. The white paper is not the whitest that I can find. I have some "silver white" stationery that I'm sure is whiter, but it is somewhat reflective. This may be unscientific, but it sure indicates that this camera can handle eleven zones of dynamic range.
As I increased and decreased the exposure you can see that the middle tone does not exactly track the zones, though it is close. Moving up, it gets too bright too fast. Moving down, it gets too dark too fast, then changes direction. This is the same result as in the first experiment. But this one demonstrates that both the highlights and the shadows start to clip quickly at only one stop change in exposure.
The next set of numbers shows the effects of using ACR exposure adjustments to compensate for the exposure errors. This indicates that we can compensate for one stop over exposure or two stops under exposure and keep the tones close to where they should be. This would be a reasonable exposure latitude of three stops. Beyond that, there is some skew in the mid tones. Note that there is very little change in the highlight or shadow tones. The bottom line is that nothing's changed. The correct exposure is a photographic event, not a digital event.
One last concept to clear up and we can be finished with this discussion. A standard photographic gray card is spec'ed at 18% reflectance and as middle gray (50%). If you use this to determine your exposure, the middle tones will be right on, properly exposed. So, how do these numbers relate? The answer has to do with the gamma of light itself. The intensity of light and therefore exposure values are based on powers of two, logarithmic base 2, or the same thing restated as gamma 2.
If we take 11 exposure zones and map them into 255 Adobe RGB tones we get the following values.
|Tonal %||Illumination %|
The tonal values are based on a simple linear percentage of the range (0-255). The illumination is based on the same logarithmic progression as the exposure zones. The middle tone at 50% corresponds with the illumination of 17.7%, close enough for me. Here, the illumination is the amount of light "reflected" from the subject (or gray card).
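The two illumination figures quoted in the text (17.7% at the middle tone and 3.1% at black) can be reproduced with a short calculation. The formula below is an assumption inferred from those two figures only: illumination falls by a factor of the square root of two per zone step below the top zone. It is a sketch, not the author's stated method, and it does not stand in for the values in the original table.

```python
def tonal_percent(zone, top_zone=10):
    """Linear tonal position of a zone as a percentage of full scale."""
    return 100 * zone / top_zone


def illumination_percent(zone, top_zone=10):
    """Illumination as a percentage of the top zone.

    ASSUMPTION: each zone step scales illumination by 1/sqrt(2).
    Inferred from the 17.7% and 3.1% figures quoted in the text.
    """
    return 100 * 2 ** (-(top_zone - zone) / 2)


for zone in range(10, -1, -1):
    print(f"{tonal_percent(zone):5.0f}%  {illumination_percent(zone):6.1f}%")
```

Under this assumption the 50% tone works out to about 17.7% illumination and the 0% tone to about 3.1%, matching the two values the text mentions.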
One discrepancy needs some explanation. At a zone or tone of zero (black) the illumination is still 3.1%. How can this be? The answer is that mathematically it takes another 13 stops of exposure reduction to get to less than 0.05% illumination. That would make the dynamic range 24 stops. The zone system is based on only 11 stops. The mathematicians will tell you that there can always be more light (to infinity) and you can never reach zero darkness. The engineers will say this is close enough and go build something useful.
There is tone compression in the shadows. But it is due to the properties of light. It has nothing to do with whether the numbers have been digitized or not. It has nothing to do with the minimum or maximum density of any single device. It has nothing to do with the gamma of any single device, including the eyeball.
Negative film behaves differently than digital sensors or slide film simply because the negative inverts the tones. Black is recorded as white (transparent) and white is recorded as black (opaque). These tones will be inverted again during printing. The fundamental paradigms are inverted. Thus with film one exposes to avoid clipping in the shadows, and with digital (or slides) one exposes to avoid clipping the highlights. Notwithstanding, the optimal exposure is based on the mid tones.
Computers do use base 2 (binary) arithmetic for their internal data representations. The properties of light are also based on log base 2 arithmetic. There is no correlation at all between these two fundamental properties.
This article has focused on neutral tones. All the factors discussed so far have similar effects on color tones.
I hope you also gained some new insight from this article. For one, logarithms can mathematically explain, but they do not modify physical properties. If you have any comments, or suggestions, I would welcome your input. Please send me an Email. Read more about Linear Gamma and Linear Capture or ETTR.
Rags Int., Inc.
204 Trailwood Drive
Euless, TX 76039
July 25, 2004
This page last updated on: Saturday December 08 2007