HK1167768B - Automated raw image processing method and apparatus, and processing device

Publication number: HK1167768B (application HK12108405.8A)
Authority: HK (Hong Kong)

Description
This application is a divisional application of the patent application having application number 200780038185.3, filed on September 27, 2007, and entitled "System and method for processing images using a predetermined tone reproduction curve".
Cross Reference to Related Applications
This application is a non-provisional of U.S. provisional application Serial No. 60/829,519, entitled "System and Method for Raw Image Processing," filed on October 13, 2006, the priority of which is hereby claimed and which is hereby incorporated by reference in its entirety.
Technical Field
The disclosed subject matter relates generally to processing RAW images into absolute color space, and more particularly to deriving tone reproduction curves for image conversion during pre-processing of RAW images and systems and methods for applying the tone reproduction curves.
Background
Digital image sensors of digital cameras, such as charge coupled devices (CCDs), have a plurality of photo-sites arranged in a color filter array or pattern, which may be an RGB Bayer pattern as described in U.S. Patent No. 3,971,065. In the RGB Bayer pattern, each photo-site is filtered to accept red, green, blue, or some variant thereof. The bit-for-bit digital image files captured by digital imaging sensors are referred to as RAW files or RAW images. RAW images may typically require 8 to 18 MB of storage space, depending on a number of variables. The types of color filter arrays and digital imaging sensors typically vary by digital camera make and model. For example, some color filter arrays use yellow, cyan, green, and magenta patterns.
In general, a digital camera has an image processing flow (or "pipeline") that performs demosaicing (demosaicing) or de-Bayer processing on a RAW image and transforms the image using a compression algorithm to output JPEG or other type of compressed file suitable for display and viewing. However, RAW images captured by a digital camera can be uploaded to a computer, and computer software operating on the computer, such as Apple's Aperture 1.0, can then allow the user to perform various manual operations on the RAW images.
The color information of a processed digital image may be characterized by a plurality of color models. One such color model is the RGB color space, which uses combinations of red (R), green (G), and blue (B) to generate multiple colors. RGB color spaces used for digital cameras include standard RGB (sRGB) and Adobe RGB. Another color model is the CIE XYZ color space created by the International Commission on Illumination (CIE) in 1931. Mathematical techniques may be used to transform color information from one color space to another.
Digital cameras have white balance settings such as auto, incandescent, fluorescent, cloudy, clear, or sensitivity (e.g., ISO 100, ISO 2000, etc.). These settings are used to match the digital camera's white balance to the color temperature of light from the light source illuminating the subject in the image. The characteristics of various standard light sources are defined in the art. For example, light source A is used to represent incandescent lighting and is defined by the profile of a black body radiator at 2856 K. The D series of light sources is used to represent natural daylight. Numbers are used in D-series light sources to indicate the correlated color temperature (CCT) of a source. For example, the CCT of light source D50 is 5000 K, and the CCT of light source D65 is 6500 K. The F series of light sources is used to represent various types of fluorescent lighting. For example, light source F2 represents cool white fluorescent, while light source F11 represents narrow-band fluorescent. When a digital image is acquired by a digital camera under a certain light source, the characteristics of the image under different light sources can be estimated by using white point conversion techniques.
Various factors must be considered when processing digital images acquired with a digital camera or other imaging device. One of the considerations relates to preserving the spatial quality and detail of the digital image, while the other consideration relates to adequately representing the colors of the digital image. In many respects, these two considerations are interrelated.
Drawings
Preferred embodiments and other aspects of the disclosed subject matter can be best understood by reference to the following detailed description of specific embodiments when read in conjunction with the accompanying drawings, wherein:
FIG. 1 illustrates one embodiment of a system for generating and processing a RAW image.
Fig. 2 illustrates one embodiment of a software stack of a general purpose processing device that implements RAW processing according to certain teachings of the present disclosure.
Fig. 3 illustrates one embodiment of an automated RAW image processing flow according to certain teachings of the present disclosure.
Fig. 4 illustrates one embodiment of the automated RAW processing stage of fig. 3.
FIG. 5 illustrates one embodiment of a process for performing highlight recovery (highlight recovery) for the RAW processing stage of FIG. 4.
Fig. 6 illustrates one embodiment of a process for reducing noise and bright points (stuck pixels) by deriving distributions for the cameras in the RAW processing stage of fig. 4.
FIG. 7 illustrates one embodiment of a process for creating a half-size green edge image for the RAW processing stage of FIG. 4.
FIG. 8 illustrates one embodiment of a process for creating unsharpened and sharpened green reconstructed images for the RAW processing stage of FIG. 4.
FIG. 9 illustrates one embodiment of a process for creating a half-size RGB image for the RAW processing stage of FIG. 4.
Fig. 10 illustrates one embodiment of a process for creating red and green reconstructed images for the RAW processing stage of fig. 4.
Fig. 11 illustrates one embodiment of a chroma blur (chroma blur) operation for the RAW processing stage of fig. 4.
FIG. 12 illustrates one embodiment of a process for deriving the feature matrix used in the first conversion stage of FIG. 3.
FIG. 13 illustrates one embodiment of a process for deriving black level (black level) adjustments for use in the first conversion stage of FIG. 3.
FIG. 14 illustrates one embodiment of a process for deriving the conversion matrix used in the first conversion stage of FIG. 3.
FIGS. 15A-15C illustrate one embodiment of a process for deriving tone reproduction curves for use in the second conversion stage of FIG. 3.
Fig. 16A illustrates additional automated processing for the automated RAW image processing flow in fig. 3.
FIG. 16B shows a detailed view of the automated process of FIG. 16A in terms of input and output luminance tables.
Fig. 16C illustrates one embodiment of brightness enhancement (luminance boost) used in the automated RAW image processing flow of fig. 3.
While the disclosed subject matter is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are herein described in detail. The drawings and written description are not intended to limit the scope of the present principles in any way. Rather, these drawings and written description are provided to illustrate the principles of the invention to those skilled in the art with reference to particular embodiments, as required by 35 U.S.C. § 112.
Detailed Description
RAW image processing system
Referring to fig. 1, one embodiment of a system 100 for generating and processing RAW images is schematically illustrated. The system 100 includes an imaging device 110 and a general purpose processing device 150. The imaging device 110 and the general purpose processing device 150 may typically be integrated together in one device having the necessary processing and storage capabilities, such as a digital camera. Alternatively, devices 110 and 150 may be separate elements as shown in FIG. 1. For example, the imaging device 110 may be a digital camera, camera phone, etc., while the processing device 150 may be a computer, laptop computer, etc. For purposes of illustration, the imaging device 110 will be referred to as a camera in the following description, and the processing device 150 will be referred to as a computer.
The camera 110 includes an imaging sensor 120 for capturing digital images. The imaging sensor 120 may be any of various types of imaging sensors known and used in the art, such as a charge coupled device. The camera 110 also has processor and memory hardware 130, and an interface 140 for communicating data to a computer 150. The computer 150 has an interface 160, processor and memory hardware 170, an operating system and software 180, and an output device 190.
In use, the imaging sensor 120 of the camera 110 captures a RAW image 122. As previously described, the imaging sensor 120 has a color filter array that may be arranged in an RGB Bayer pattern. Thus, the color value of each photo-site in the RAW image 122 represents a red intensity value, a green intensity value, or a blue intensity value, and each value is typically a 12-bit value ranging between 0 and 4095. Other camera models may have other color filter arrays, and thus the RAW image 122 may contain different information. Taking the Sony '828' digital camera as an example, its color values are in four-channel RGBE coordinates representing red, green, blue, and emerald (a deep yellow-green). For descriptive purposes, the subsequent discussion will refer to the RAW image 122 as having RGB values.
After capturing the RAW image, the processor/memory hardware 130 may output the RAW image 122 via an interface 140, which interface 140 may be a Universal Serial Bus (USB) interface. The computer 150 can then use its interface 160 to receive the RAW image 122. Preferably, the computer 150 stores the RAW image 122 as a master file in order to maintain the original copy of the RAW image 122 while also performing the following pre-processing on the copy of the RAW image 122. In an alternative to inputting the RAW image 122 from the camera 110, the image may be input from a network, a memory card, another computer, or an external storage medium such as flash memory, an external hard drive, or a CDROM.
Using techniques detailed later, the processor/memory hardware 170 and operating system/software 180 preprocess the copy of the RAW image 122 and ultimately convert it to an absolute color space (e.g., sRGB, Adobe RGB, etc., also referred to as a rendered color space) suitable for display and viewing. After the RAW image 122 is pre-processed and converted to the absolute color space, the processor/memory hardware 170 and operating system/software 180 make the resulting image available to various types of software and applications on the computer 150. For example, the resulting pre-processed image is available to imaging applications, such as the Preview, iPhoto, and Aperture software offerings from Apple Computer, in which the image may be processed, viewed, manipulated, etc.
As described above, the operating system/software 180 is used to pre-process the RAW image 122. Thus, we turn now to a description of the software architecture for pre-processing the RAW image 122.
Software stack of RAW image processing system
FIG. 2 shows a layered diagram of one embodiment of a software stack 200 for a general purpose processing device, such as the computer 150 of FIG. 1. Some of the components of the software stack 200 may even be part of the digital camera 110 of FIG. 1. The software stack 200 includes an operating system (O/S) core layer 210, an O/S services layer 220, a resources layer 230, an application framework and services layer 240, and an application layer 250. These layers are illustrative, and certain features are omitted. For example, the low-level software and firmware that underlies the O/S core layer 210 is not shown here. Typically, a software element in a given layer uses resources of the layer below it and provides services to the layer above it. In practice, however, not all components of a particular piece of software operate exactly in this manner.
The O/S core layer 210 provides core O/S functionality in a highly protective environment. Above the O/S core layer 210, the O/S services layer 220 extends functional services to the layers above it. The O/S service layer 220 for computer operating systems provides a variety of functional services, while the camera built-in operating system for digital cameras may provide more limited services. In the present embodiment, the O/S service layer 220 has a RAW image processing service 222 that preprocesses a RAW image according to various techniques described below.
A resource layer 230 is located above the O/S service layer and exposes graphics resources, such as the Open Graphics Library ("OpenGL"), Apple Computer's PDF Kit, and so forth. OpenGL, developed by Silicon Graphics, Inc., is a specification for various graphics functions. A RAW processing Application Programming Interface (API) 232 is located between the resource layer 230 and the RAW image processing service 222 in the O/S service layer 220. Layer 240 is a functional amalgamation that is generally represented as two layers, an application framework and application services. This layer 240 provides high-level and generally functional support for application programs that reside at the highest level, shown here as application layer 250.
The RAW image processing service 222 pre-processes the RAW image using a Central Processing Unit (CPU) in the general purpose processing device. The service 222 may also use OpenGL to take advantage of a Graphics Processing Unit (GPU) associated with the general purpose processing device. The RAW processing API 232, in turn, makes the preprocessed images of the service 222 available to various applications in the application layer 250. Thus, the RAW image can be pre-processed using the RAW image processing service 222 in the O/S service layer 220 and the hardware (e.g., CPU/GPU) of the general purpose processing device. In this way, the RAW processing can be concentrated in the O/S service layer 220 to generate an image with higher quality and enhanced editability compared to a conventional JPEG file provided by a camera, because the RAW image contains more information.
Further, the pre-processed image may be made available to various applications in the application layer 250 with or without RAW image processing capabilities. Applications that may use the pre-processed image in layer 250 may include a word processing application, a graphics application, an address book application, an email application, or any other application that may use an image. Furthermore, the RAW processing techniques disclosed herein may also be applied to RAW video, thereby enabling applications using video to benefit from the disclosed techniques as well.
Overview of RAW image processing flow
As shown in the above embodiment, the preprocessing of the RAW image can be performed in the RAW image processing service 222, where the service 222 operates in the O/S service layer 220 of a general-purpose computer. This image processing can be characterized as a flow. FIG. 3 illustrates one embodiment of a RAW image processing flow 300 according to certain teachings of the present disclosure. The flow 300 is illustrated as a plurality of stages 301, 302, 303 and 304 for performing pre-processing on a RAW image 312 from a digital camera 310 or other source and outputting a resulting image 349 for use by various applications (not shown). In an initial image capture stage 301, a camera or other imaging device 310 captures an image and stores it in RAW format as the aforementioned RAW image or file 312. At this point, depending on the processing power of the camera 310, the RAW image 312 may be transferred from the camera 310 to a general purpose processing device or computer (not shown) or may remain on the camera 310 for RAW processing.
Since the subsequent processing stages will still depend on the properties of the image 312 and the camera 310, to enable subsequent processing, the white balance and other metadata 314 of the RAW image 312 and the particular camera 310 used to acquire the RAW image 312 will be identified. The metadata 314 may be in an exchangeable image file (EXIF) format and may include image information of how the RAW image 312 is captured, including, for example, shutter speed, aperture, white balance, exposure compensation, photometry settings, ISO settings, date and time.
In the RAW processing stage 302, RAW processing is performed on the RAW image 312 using one or more different processes, including a black subtraction process 320, a highlight restoration process 321, a highlight elimination process 322, an automatic exposure adjustment process 323, a demosaicing (de-Bayer) process 324, and a chroma blurring process 325. One purpose of the RAW processing stage 302 is to preserve the spatial quality and detail of the scene captured in the RAW image 312.
To implement these processes 320-325, the metadata 314 is used to obtain various RAW processing algorithms associated with these processes (block 328). While the processes 320-325 are shown in a particular order, they may be rearranged according to particular needs or details associated with the camera. Furthermore, one or more of these processes 320-325 may or may not be used for various reasons, and other processes common in image processing may also be used. Further, some of these processes 320-325 may in practice be combined in ways not necessarily expressed herein.
At the end of the RAW processing stage 302, a demosaicing process 326 generates a camera RGB image 329 from the initial RAW image 312. Here, camera RGB indicates that the image 329 has an unrendered RGB color space, such as the ISO RGB color space defined in ISO 17321. The camera RGB image 329 embodies the raw colorimetry estimation of the scene captured in the initial RAW image 312, and the camera RGB image 329 maintains the dynamic range and color gamut of the RAW image. The camera RGB image 329 must undergo additional processing in order to be displayed, viewed, or printed via an output device.
In general, the demosaicing process 326 will transform RAW data of the device-specific initial image 312 into unrendered RGB data in a resultant image that is independent of the device (e.g., camera) used to capture the original image 312. The actual implementation of the demosaicing process 326 may depend on the details of the image 312 and/or the imaging sensor used to capture the image 312. For example, since each light locus has only one color value for R, G or B, the actual color for a pixel in the camera RGB image 329 is interpolated from the multiple light loci of the RAW image 312. This image reconstruction process is called demosaicing (de-Bayer for a Bayer matrix filter). Generally, demosaicing process 326 will select a plurality of color values, represented by R, G or B, located at the light locus of RAW image 312 and will output pixels having R, G and B values in camera RGB image 329. Each component of a given pixel is typically calculated from its neighboring pixel intensities. Thus, demosaicing process 326 is typically device specific, since the process depends on the arrangement of filters in the camera color filter array.
Following the RAW processing stage 302 are two conversion stages 303 and 304. One purpose of these conversion stages 303 and 304 is to adequately represent the color of the original scene captured. In many respects, the considerations for adequately representing color in these stages 303 and 304 are interrelated with maintaining detail in the RAW processing stage 302. The final result of these conversion stages 303 and 304 is a resulting image 349 in a rendered color space, e.g., standard RGB or Adobe RGB. The resulting RGB image 349 may then be available for use by a word processing application, a picture application, an address book application, an email application, or any other application that may use images, as previously described with reference to FIG. 2.
In the first conversion stage 303, the camera RGB image 329 is automatically processed and converted to XYZ tristimulus values in the XYZ color space, thereby generating an XYZ image 339. Because of its various advantages, the XYZ color space is used here, whereby the image data will be compatible with the processing performed in this stage 303. For example, by converting to XYZ color space, a correlation of color values to a measurable color temperature can be achieved to adequately represent the original colors of the scene captured by the camera 310.
At this stage 303, the black compensation process 330 adjusts the black level in the camera RGB image 329. Meanwhile, the matrix conversion process 332 converts the camera RGB image 329 into an XYZ image 339 having XYZ tristimulus values. To perform the above-described processing during this stage 303, some metadata 314 (including white balance) is used to obtain the transformation matrix (M) (block 334), as well as perform other necessary calculations detailed later.
In general, the black compensation process 330 subtracts the black level from the camera RGB image 329 to provide more efficient encoding and processing at a later stage. Furthermore, the black level to be subtracted depends on the image content and the camera model, and may also be adjusted to reduce noise or generate a better color rendering (rendition). The black compensation process 330 is configured for those particular cameras used to capture the RAW image 312. The process 330 receives as input pixels having R, G and B values and outputs black compensated R, G and B values. The black compensation process 330 optimizes the black offset (black offset) on each RGB channel so that the conversion from camera RGB tristimulus values to XYZ tristimulus values will have minimal error. The black compensation process 330 is derived from the process described in more detail with reference to fig. 13.
The matrix conversion process 332 incorporates the black compensation described above, takes the R, G and B values of the camera RGB image 329 as inputs, and outputs XYZ tristimulus values. In general, the process 332 uses two or more camera feature matrices M1 and M2 that are specific to a camera make, model, etc. and associated with predetermined light sources. Each camera feature matrix M1 and M2 is derived by comparing the measured color values of a color chart with the data sensed when the color chart is photographed by a given camera. One embodiment for deriving the feature matrices will be described below with reference to FIG. 12. One advantage of this stage 303 is that the pre-calculation and derivation of the camera feature matrices involves an automatic process that is independent of user subjectivity and of other implementations of RAW conversion. Thus, the automated processing in flow 300 can provide quality processed images for use by various applications of the computer.
Using the camera feature matrix specific to the camera used to acquire the RAW image 312, the process 332 derives a conversion matrix (M) that is then used to perform black compensation on the camera RGB image 329 and convert it to XYZ tristimulus values. The conversion matrix M is used in conjunction with white balance in the metadata 314, whereby an optimal matrix conversion from camera RGB to XYZ tristimulus values can be estimated. The transformation matrix M is derived from a process discussed in more detail with reference to fig. 14.
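By way of illustration only, the overall shape of such a conversion can be sketched as follows. The sketch assumes two pre-derived feature matrices M1 and M2 for two reference illuminants and a single interpolation weight derived from the white balance metadata; the matrix values, the weighting scheme, and the function names are hypothetical and are not taken from this disclosure.

```python
import numpy as np

# Hypothetical camera feature matrices for two reference illuminants
# (e.g., illuminant A and D65); real matrices are derived per camera model.
M1 = np.array([[0.60, 0.25, 0.10],
               [0.30, 0.65, 0.05],
               [0.02, 0.15, 0.80]])
M2 = np.array([[0.55, 0.30, 0.12],
               [0.28, 0.68, 0.04],
               [0.03, 0.12, 0.85]])

def conversion_matrix(weight):
    """Blend the two feature matrices; `weight` in [0, 1] is assumed to be
    derived from the white-balance metadata (0 -> illuminant 1, 1 -> illuminant 2)."""
    return (1.0 - weight) * M1 + weight * M2

def camera_rgb_to_xyz(rgb_image, weight, black_offset=(0.0, 0.0, 0.0)):
    """Apply black compensation and convert camera RGB pixels to XYZ tristimulus values."""
    rgb = np.clip(rgb_image - np.asarray(black_offset), 0.0, None)
    M = conversion_matrix(weight)
    # Multiply every pixel's RGB vector by the 3x3 conversion matrix.
    return rgb @ M.T

# Example: a 2x2 camera-RGB image converted with an interpolation weight of 0.4.
camera_rgb = np.array([[[0.8, 0.5, 0.2], [0.1, 0.4, 0.7]],
                       [[0.3, 0.3, 0.3], [0.9, 0.9, 0.9]]])
xyz = camera_rgb_to_xyz(camera_rgb, weight=0.4)
```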
Finally, in a second conversion stage 304, the converted XYZ tristimulus value data is processed and converted to values in an absolute color space, for example standard RGB (sRGB) or Adobe RGB. First, the color adaptation process 340 reproduces the color appearance in the image 339 by applying a color adaptation transform. The color adaptation transform converts the XYZ tristimulus values of input colors acquired under the input light source into the corresponding XYZ tristimulus values of the output colors under the predicted output light source. Color adaptation transforms are known in the art, with most transforms based on the von Kries model.
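The general idea of a von Kries-style adaptation can be sketched as follows; the Bradford cone-response matrix and the white points shown are standard published values used purely for illustration, and the sketch is not the specific transform used in the flow 300.

```python
import numpy as np

# Bradford cone-response matrix (a common choice for von Kries-style adaptation).
M_BRADFORD = np.array([[ 0.8951,  0.2664, -0.1614],
                       [-0.7502,  1.7135,  0.0367],
                       [ 0.0389, -0.0685,  1.0296]])

def adapt_xyz(xyz, white_src, white_dst):
    """Von Kries-style chromatic adaptation of XYZ values from one
    illuminant white point to another.  Inputs are arrays of shape (..., 3)."""
    lms_src = M_BRADFORD @ np.asarray(white_src)
    lms_dst = M_BRADFORD @ np.asarray(white_dst)
    scale = np.diag(lms_dst / lms_src)
    M_adapt = np.linalg.inv(M_BRADFORD) @ scale @ M_BRADFORD
    return np.asarray(xyz) @ M_adapt.T

# Example: adapt a color captured under illuminant A toward D65.
white_a   = [1.0985, 1.0000, 0.3558]
white_d65 = [0.9504, 1.0000, 1.0888]
xyz_out = adapt_xyz([[0.4, 0.3, 0.2]], white_a, white_d65)
```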
Next, the color tuning process 342 adjusts the tone curve of the XYZ tristimulus values of the image. Finally, a transformation process 344 in conjunction with the color tuning process 342 transforms the XYZ tristimulus values into RGB tristimulus values in a rendered absolute color space, e.g., sRGB or Adobe RGB, thereby obtaining a resulting image 349 in a particular RGB color space. The rendered color space (e.g., sRGB or Adobe RGB) is based on colorimetry of real or virtual output characteristics. To perform processing during this second conversion stage 304, some metadata 314 from the camera 310 is used here to acquire the tone curve (block 346) and perform other necessary calculations detailed below.
In general, the color tuning process 342 sets the gain and output tone reproduction curve in order to optimize the rendering of the image to the output media. Process 342 automatically derives conversion parameters according to the camera type and manufacturer. Being automatic, this process 342 allows the derivation of output based on known goals without manual intervention by the user, thereby eliminating the subjectivity associated with manual color tuning by the user. One embodiment of a process for deriving tone reproduction curves for an automatic color tuning process will be discussed in detail below with reference to fig. 15A-15C.
The transform process 344 is combined with the tone tuning process 342 described above. In general, the transform process 344 transforms XYZ tristimulus values to RGB tristimulus values that lie in a defined rendered color space, e.g., sRGB and Adobe RGB. The resulting RGB image 349 may then be accessed by an application or other software for performing further processing and operations.
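A minimal sketch of this last conversion, using the published XYZ-to-sRGB matrix and the standard sRGB transfer function as stand-ins (the flow 300 may use different parameters and an output-specific tone curve):

```python
import numpy as np

# Published XYZ (D65) -> linear sRGB matrix.
XYZ_TO_SRGB = np.array([[ 3.2406, -1.5372, -0.4986],
                        [-0.9689,  1.8758,  0.0415],
                        [ 0.0557, -0.2040,  1.0570]])

def xyz_to_srgb(xyz):
    """Convert XYZ tristimulus values (shape (..., 3)) to gamma-encoded sRGB."""
    linear = np.clip(np.asarray(xyz) @ XYZ_TO_SRGB.T, 0.0, 1.0)
    # Piecewise sRGB transfer function (gamma encoding).
    return np.where(linear <= 0.0031308,
                    12.92 * linear,
                    1.055 * np.power(linear, 1.0 / 2.4) - 0.055)

srgb = xyz_to_srgb([[0.2, 0.3, 0.4]])
```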
Details of RAW processing stage in RAW image processing flow
As described above, the RAW image processing flow 300 of fig. 3 has the RAW processing stage 302 including various processes. Fig. 4 shows details of one embodiment of the RAW processing stage 400. Although stage 400 is described as a series of steps, it is to be understood that a given implementation may employ a different order of steps, may exclude certain steps, or may add additional steps not shown. The inputs for the RAW processing stage 400 are the RAW image 402 and metadata. RAW image 402 is Bayer packed (e.g., the light sites in image 402 are arranged in a Bayer pattern or the like, and each light site has an R, G or B value).
1. Black subtraction
In a first step 410 of the RAW processing stage 400, black subtraction is performed on the input RAW image 402. In this step 410, the color samples (R, G, B) of the light locus are scaled and biased based on camera-specific factors. For most cameras, an image with a light locus having a small non-zero color value is obtained even if the image is taken with the lens cover closed. These non-zero color values correspond to black offsets or bias values of the camera that may be caused by noise or other causes. For the final image to be corrected, the black offset value must be removed from the color sample values.
The RAW image 402 from the camera may or may not be subjected to black compensation processing when generated. Some cameras automatically perform black compensation processing on the sensor data after it is generated. In this case, the metadata of the RAW image may have an offset value of 0.
In some cases, the camera may provide, in the metadata of the image 402, an offset value below which no sensor data is produced. In this case, in step 410, the black offset value is subtracted from the color values of the RAW image 402. For example, for a RAW image 402 from some cameras, the black offset value can be estimated from the average of the values at the masked margins of the imaging sensor that are blocked from incident light. For a camera model that does not mask the sensor at the margins, the black offset value may be a fixed value for the camera model. Either way, the black offset value may then be subtracted from each intensity value of the image 402. In general, the R, G and B photo-sites of a camera may have different offset values, and G may typically have two offset values. In addition, the camera may also provide offset values that depend on the row.
As can be seen from the above, the metadata associated with the RAW image 402 is one source of the black offset value. An override file may be a secondary source of black offset values. The override file may include black offset values that have been predetermined for each camera model. This secondary black offset may be used to augment or replace the value associated with the metadata, and may be different for each color channel (R, G, B).
After subtracting the black offset value, the resulting values are preferably clipped at zero to eliminate negative values in the RAW image 402. Further, since different cameras have different ranges of RAW sensed values, it is necessary to scale the values to a common range. This may be done by multiplying all of the black-subtracted values by a predetermined value and then dividing them by the current camera's white value. The white value may be predetermined and may be obtained from the override file or the metadata.
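A minimal sketch of the black subtraction and rescaling described above, assuming per-channel black offsets arranged per 2x2 Bayer cell and a per-camera white value taken from the metadata or the override file; the array layout, values, and function names are illustrative only.

```python
import numpy as np

def black_subtract(raw, black_offsets, white_value, scale_to=65535.0):
    """Subtract per-channel black offsets from a Bayer-packed RAW image,
    clip at zero, and rescale to a common range.

    `raw` is a 2-D array; `black_offsets` is a 2x2 array giving the offset
    for each position in the repeating Bayer cell (e.g., R, G1, G2, B)."""
    raw = raw.astype(np.float64)
    offsets = np.asarray(black_offsets, dtype=np.float64)
    # Tile the 2x2 offsets over the whole sensor grid.
    tiled = np.tile(offsets, (raw.shape[0] // 2, raw.shape[1] // 2))
    out = np.clip(raw - tiled, 0.0, None)
    # Scale the black-subtracted values to a common range using the camera's white value.
    return out * (scale_to / white_value)

# Example with a hypothetical 4x4 sensor crop, offsets near 128, and a 12-bit white value.
raw = np.array([[200, 300, 150, 280],
                [310, 260, 290, 240],
                [180, 320, 170, 300],
                [305, 250, 295, 230]], dtype=np.float64)
offsets = np.array([[128, 130],
                    [130, 126]])
scaled = black_subtract(raw, offsets, white_value=4095)
```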
2. Highlight restoration
In a second step 420 of the RAW processing stage 400, the RAW image 402 is subjected to highlight restoration processing to correct sensed values in the image 402 that have been clipped at the sensor's maximum response level. The imaging sensor used to acquire the RAW image 402 responds to light only up to a maximum value, referred to herein as a maximum output (max-out) value; the sensor does not distinguish any received light above that value. To restore areas of the image 402 where the light values have exceeded the sensor's maximum output value, sensed values in the image 402 that were clipped at the maximum output value are replaced by new estimate values, estimated from adjacent un-clipped values of the different color channels (R, G, B).
In one embodiment, values of neighboring sensors near a maxed-out sensor value are located. These neighboring sensors may have R, G or B values of approximately 85% to 90% of the maximum output value and may be used to determine the value that should be used for the maxed-out sensor. This approach works for isolated maxed-out sensor values inside the RAW image. In general, however, a given image may have clusters or regions where the sensor values are maxed out, so averaging over adjacent pixels is not feasible.
In another embodiment, FIG. 5 shows a highlight recovery process 500 for recovering the luminance and chrominance of pixels in which one or more channels are clipped at the sensor limit. This process 500 is performed very early in the demosaicing (de-Bayer) process. First, the R, G and B sensor channels are scaled so that they are neutral (block 502). The scale factors are determined using the white balance, which is obtained from metadata associated with the RAW image. Next, the value at which each channel reaches saturation is calculated (block 504). This calculation is performed using correlation information relating channel saturation values to overexposed images that have been examined for a given camera make and model. This allows the correlation information to be predetermined and stored for a plurality of camera makes and models for later access by the highlight recovery process 500. In this way, camera information obtained from metadata associated with the RAW image is used to access the corresponding correlation information for the camera make or model used to obtain the RAW image.
The saturation values are then scaled (block 506) using the same factors used in block 502. The green channel has the smallest saturation value for most cameras, and this property can be exploited by the highlight recovery process 500. The process 500 samples and analyzes each pixel in the image (block 510). If none of the three channels R, G and B of a given pixel has been clipped (block 520), the pixel remains intact. If all three channels of a given pixel are clipped (block 520), the pixel is replaced with a neutral gray value (block 522).
If at least one, but not all, of the channels are clipped (block 520), the process 500 looks at the saturation of each channel. As described in detail below, the channel of a given pixel having a saturation value is replaced by an estimate value. This process of replacing the saturated value with the estimated value is performed stepwise and also depends on how close the original value is to saturation. The estimate is calculated from the raw RGB values at the pixel.
For example, if the red value in a given pixel is at or near saturation (block 530), the original red value is replaced with an estimated red value calculated from the original R, G and B values (block 532). This replacement is based on a weighted average between the original value and the estimate, with the weighting depending on how close the red value is to saturation. For example, if the red value is at 85% of the sensor clipping value, then 100% of the original red value is used. If the red value is at 90% of the sensor clipping value, then 50% of the original red value is used together with 50% of the estimate. And if the red value is at 95% of the sensor clipping value, then 100% of the estimate is used.
If the green value in a given pixel is at or near saturation (block 540), it is gradually replaced with an estimated green value computed from the original R, G and B values (block 542). The estimate for the G channel is based on a weighted average of the original R, G and B values. The weights of R, G, B used range from (0.5, 0, 0.5) to (0.375, 0.625, 0.0) and depend on the calculation of the degree of greenness of the original pixel. In addition, the substitution also uses the weighted average described above.
If the blue value in a given pixel is at or near saturation (block 550), it is gradually replaced with an estimated blue value calculated from the original R, G and B values (block 552). As with the R channel, this replacement uses the weighted average described above.
As a final step, when one or more channels have been estimated, if the recovered RGB values are very bright and saturated, the color is shifted toward neutral (block 560). Magenta tones, which are unlikely to be seen in the bright areas of an image, are moved more aggressively toward neutral. The extent to which colors are moved toward neutral may be influenced by user preference. After a given pixel is analyzed, a determination is made as to whether additional pixels in the image need to be analyzed (block 570), and the process 500 returns to block 510 to sample and analyze additional pixels of the image.
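A rough sketch of the per-pixel decision logic of process 500 is given below. The saturation thresholds follow the 85%/90%/95% examples above, but the estimate used for a clipped channel is only a placeholder (the mean of the unclipped channels); the disclosure computes its estimates from the original R, G and B values with channel-specific weights that are not reproduced here.

```python
import numpy as np

def recover_pixel(rgb, sat, neutral=0.5):
    """Sketch of the per-pixel highlight-recovery decision logic of process 500.
    `rgb` are white-balance-scaled channel values, `sat` their scaled saturation
    values.  The estimate used for a clipped channel is a placeholder (mean of
    the other channels)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    sat = np.asarray(sat, dtype=np.float64)
    ratio = rgb / sat
    clipped = ratio >= 0.85                      # "at or near saturation"
    if not clipped.any():
        return rgb                               # leave the pixel intact
    if clipped.all():
        return np.full(3, neutral)               # replace with a neutral gray value
    out = rgb.copy()
    for ch in np.flatnonzero(clipped):
        estimate = rgb[~clipped].mean()          # placeholder estimate
        # Weight of the estimate ramps from 0 at 85% of saturation to 1 at 95%.
        w = np.clip((ratio[ch] - 0.85) / 0.10, 0.0, 1.0)
        out[ch] = (1.0 - w) * rgb[ch] + w * estimate
    return out

recovered = recover_pixel([0.97, 0.60, 0.40], sat=[1.0, 1.0, 1.0])
```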
3. Bright spot cancellation/noise processing
Returning to fig. 4, the RAW processing stage 400 includes a third step 430 of using a bright spot removal and/or noise processing technique to change the light sites in the RAW image 402 that have outliers. In a given RAW image, different photo sites may have atypical values due to the sensor not having the correct response (e.g., sensor retention or charge accumulated in the photo site).
For these stuck photo-sites, the 3×3 neighborhood of like-colored photo-sites is examined, and the range of the 8 surrounding values obtained from the neighboring photo-sites is determined, yielding maximum and minimum values. For example, for a green photo-site, only green photo-sites are examined. The size of the range equals the maximum value minus the minimum value. The value at the suspect photo-site is then compared to this range of neighboring like-colored photo-sites. If the value of the selected photo-site is higher than the maximum by a predetermined multiple of the range size, or lower than the minimum by a predetermined multiple of the range size, the value of the selected photo-site is set to the average of its like-colored 3×3 neighbors. The predetermined multiple may be configured according to the camera and other factors. By replacing atypical values with the average of their neighbors, the number of bright point values appearing in the RAW image 402 can be reduced.
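A minimal sketch of this bright spot test, operating on a plane that contains only like-colored photo-sites; the predetermined multiple and the example values are illustrative.

```python
import numpy as np

def suppress_stuck_pixel(plane, row, col, factor=2.0):
    """Sketch of the bright-spot test for one photo-site.  `plane` holds only
    like-colored photo-sites (e.g., the green samples), so the 3x3 neighborhood
    contains 8 like-colored neighbors.  `factor` is the predetermined multiple."""
    block = plane[row - 1:row + 2, col - 1:col + 2].astype(np.float64)
    neighbors = np.delete(block.ravel(), 4)        # the 8 surrounding values
    lo, hi = neighbors.min(), neighbors.max()
    rng = hi - lo                                  # size of the range
    value = plane[row, col]
    if value > hi + factor * rng or value < lo - factor * rng:
        return neighbors.mean()                    # replace with neighborhood average
    return value

# Example: a like-colored plane with one suspiciously bright value in the center.
green = np.array([[10, 12, 11],
                  [13, 90, 12],
                  [11, 10, 13]], dtype=np.float64)
fixed = suppress_stuck_pixel(green, 1, 1)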
Noise in a given image generates channel samples with outliers. This noise may be amplified when constructing an RGB image from a RAW image that is Bayer-encoded. Typically, the rendered noise is colored, which is undesirable. This noise may be related to the ISO setting (i.e., sensor readout gain) and the amount of exposure time used in capturing the RAW image 402. Thus, the noise processing in sub-step 430 preferably uses a correlation between ISO settings, exposure time, and noise amount to reduce or control noise color for the processed RAW image 402.
Fig. 6 shows a process 600 for deriving noise and bright spot distribution data for a digital camera. The profile data relates to the amount of noise generated by the camera at a given ISO setting and exposure time. Thus, the distribution data can be used to process RAW images that were pre-processed in the RAW processing stage 400 of fig. 4.
First, it is ensured that the lens cover and eyepiece of the digital camera are closed, and multiple images are acquired for that particular camera across a wide range of ISO settings and exposure times (block 602). The RAW images are then grouped into a plurality of organized sets (block 604). For example, a first set may have images with a small range of exposure times and a wide range of ISO settings, a second set may have images with a small range of ISO settings and a wide range of exposure times, and a third set may have images with a wide range of exposure times and a wide range of ISO settings.
The RAW images are then processed to quantify the noise response (e.g., the amount of noise) and the bright spot response in each RAW image (block 606). Thereafter, a noise distribution and a bright spot distribution are calculated for the grouped images (block 608). For each set, a quadratic regression, for example, creates a noise profile of the camera's noise response as a function of the given set's ISO settings and exposure times. For each set, a quadratic regression likewise creates a bright spot profile of the camera's bright spot response as a function of ISO settings and exposure times. The resulting profiles for the camera are then saved in a database in the form of a small number of floating point values for subsequent use during pre-processing in the RAW processing stage 400 of FIG. 4 (block 610). The previous steps are then repeated for one or more additional cameras, storing the profiles for those particular cameras (block 612).
Finally, the stored profile data characterizing the various cameras can be used to decide how to perform adjustments on a given RAW image with respect to noise. For example, when processing a RAW image in the RAW processing stage 400 of fig. 4, the camera model, ISO settings, and exposure time associated with the RAW image are all obtained from the associated metadata (block 614). The appropriate distribution data for the camera model is located in memory (block 616). Given ISO settings and exposure time information are inserted into the formula for the noise distribution for a particular camera model, and a noise estimate for the image is determined from the calculation (block 618). Then, given ISO settings and exposure time information are inserted in the formula for the bright spot distribution of the camera model, and it is determined whether to enable or disable bright spot estimation for the image (block 620).
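The profile fitting and lookup can be sketched as follows, assuming a simple quadratic regression in ISO and exposure time; the calibration values, the exact regression form, and the function names are hypothetical.

```python
import numpy as np

def fit_noise_profile(iso, exposure, noise):
    """Fit a quadratic surface noise ~ f(ISO, exposure time) for one camera model,
    returning a small vector of floating-point coefficients as in blocks 608/610."""
    iso = np.asarray(iso, dtype=np.float64)
    exp_t = np.asarray(exposure, dtype=np.float64)
    # Design matrix for a full quadratic in two variables.
    A = np.column_stack([np.ones_like(iso), iso, exp_t,
                         iso * exp_t, iso ** 2, exp_t ** 2])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(noise, dtype=np.float64), rcond=None)
    return coeffs

def estimate_noise(coeffs, iso, exposure):
    """Evaluate the stored profile for the ISO/exposure found in the image metadata."""
    terms = np.array([1.0, iso, exposure, iso * exposure, iso ** 2, exposure ** 2])
    return float(terms @ coeffs)

# Hypothetical calibration measurements (dark frames at several settings).
coeffs = fit_noise_profile(iso=[100, 400, 1600, 100, 400, 1600],
                           exposure=[0.01, 0.01, 0.1, 1.0, 1.0, 0.5],
                           noise=[1.0, 2.0, 7.0, 1.8, 4.0, 12.0])
print(estimate_noise(coeffs, iso=800, exposure=0.5))
```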
When processing the RAW image in the RAW processing stage 400 of FIG. 4, adjustments can be made using the noise and bright spot estimates determined in the previous steps. For example, this information may determine whether and to what extent noise reduction and bright spot elimination are performed when reconstructing the RAW image. It may also determine whether and to what extent sharpening is ramped down during or after reconstruction of the RAW image. Finally, it may determine whether and to what extent chroma blurring is used when converting the RAW image to RGB at the end of the reconstruction process, whether explicit desaturation is applied in deep shadow areas, and how the tone reproduction curve controls saturation during contrast enhancement so as to control noise chroma. These processes are described later in the present disclosure.
4. Automatic exposure adjustment
Returning to FIG. 4, the RAW processing stage 400 includes a fourth step 440 in which the RAW image 402 undergoes automatic exposure adjustment. The automatic exposure adjustment adjusts the brightness of the RAW image 402 so that its exposure satisfies a predetermined criterion. Preferably, the adjustment uses a predetermined brightness variable that has been stored for the adjustment in the RAW processing stage 400. The predetermined brightness variable is based on survey information obtained from a plurality of people observing images with adjusted exposures. The survey uses reference images produced by various cameras at multiple exposures. The average luminance of these reference images is calculated. The brightness of these reference images is varied using Monte Carlo simulation, and survey participants are asked to select the most visually pleasing example. The results of the survey are then aggregated into an acceptable resulting exposure whose brightness variation is related to the original input brightness.
The result is associated with a particular camera and stored for subsequent processing. When the RAW image 402 is received for processing, the automatic exposure adjustment calculates the average brightness of the RAW image 402 and determines the exposure from the associated metadata. Using the stored survey results, the automatic exposure adjustment process will then determine what brightness variables to apply to the image 402 to achieve a desired result based on the average brightness and the exposure used. Finally, the brightness of the RAW image 402 is adjusted by a brightness variable determined in the automatic processing.
5. Interpolation process
The fifth step of the RAW processing stage 400 involves an interpolation process 450 that is part of the demosaicing (de-Bayer) process. The interpolation process 450 generates a resulting RGB image using a plurality of sub-steps 451-458 that end in a final chroma blurring operation 460. These sub-steps comprise: creating a half-size green edge image (451), determining an interpolation direction decision map (452), constructing a green reconstructed image (453), sharpening the green reconstructed image (454), creating a blurred half-size RGB image (456), and constructing a red reconstructed image and a blue reconstructed image (458). Each of these steps 451-458 and 460 is discussed below. Finally, the green, red, blue and blurred RGB images from these sub-steps 451-458 are combined in the chroma blur operation 460 described below.
In general, the interpolation process 450 uses gradient-based interpolation, i.e., first interpolating the luminance channel (i.e., the green channel) for the light locus of the RAW image 402, then interpolating the chrominance channels (i.e., the red and blue channels) for the light locus of the RAW image 402, and finally combining the channels to create a resulting RGB image having R, G and B-valued pixels.
a. Half-size green edge image
Since the human eye is sensitive to luminance variations, finding edges in the green channel enables interpolation to be guided by these edges. Sub-step 451 constructs a half-size green edge image from the RAW image 402, whereby the green edge image indicates edges caused by brightness changes in the original RAW image 402. FIG. 7 shows a process 700 for creating a half-size green edge image 750 from the Bayer-encoded data of a RAW image 710 (only a portion of which is shown). The RAW image 710 has a plurality of 2×2 cells of photo-sites arranged in a Bayer pattern, where each cell has two green samples (G1, G2), one red sample (R), and one blue sample (B). Various patterns are known and used in the art, and the patterns shown herein are merely exemplary.
To create the half-size green edge image 750, only the green samples (G1, G2) of the Bayer-encoded RAW image 710 are used here, while the red and blue samples are ignored. First, the two green samples (G1, G2) in each 2 × 2 unit of the RAW image 710 are averaged, thereby generating a corresponding green value in the intermediate half-size green image 720. Thus, the half-size green image 720 has only a green value, and its size is half of the size of the original RAW image 710. Next, the first derivative (i.e., rate of change) of the green sample values is calculated in both the horizontal and vertical directions. The squares of the horizontal and vertical first derivatives of each value are then summed to generate an edge image 730. Thereafter, a two-pass single pixel blurring process 740 is performed on the edge image 730 to generate a blurred final half-size green edge image 750.
In the single pixel blur process 740, a first pass is performed on the edge image 730 using the four neighboring pixels of each pixel with a center weight of 0.5. A second pass is then performed on the previously blurred edge image 730, using the four neighboring pixels of each pixel with a center weight of 0.125, thereby generating the final half-size green edge image 750. The resulting half-size green edge image from this sub-step 451 is used in further processing by other steps, as shown in FIG. 4.
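A minimal sketch of process 700, assuming an RGGB cell layout and using simple forward differences and edge padding; the layout and blur details are illustrative rather than taken from the disclosure.

```python
import numpy as np

def half_size_green_edge(bayer):
    """Sketch of process 700: build a half-size green image from the Bayer data,
    compute the sum of squared horizontal and vertical derivatives, and apply two
    single-pixel blur passes (center weights 0.5 and 0.125, remainder split over
    the four neighbors)."""
    # Average the two green samples of each 2x2 cell (RGGB layout assumed here).
    g1 = bayer[0::2, 1::2].astype(np.float64)
    g2 = bayer[1::2, 0::2].astype(np.float64)
    green = 0.5 * (g1 + g2)

    # Sum of squared first derivatives in the horizontal and vertical directions.
    dh = np.diff(green, axis=1, append=green[:, -1:])
    dv = np.diff(green, axis=0, append=green[-1:, :])
    edge = dh ** 2 + dv ** 2

    def single_pixel_blur(img, center_weight):
        pad = np.pad(img, 1, mode="edge")
        neighbors = (pad[:-2, 1:-1] + pad[2:, 1:-1] +
                     pad[1:-1, :-2] + pad[1:-1, 2:])
        return center_weight * img + (1.0 - center_weight) * neighbors / 4.0

    edge = single_pixel_blur(edge, 0.5)
    edge = single_pixel_blur(edge, 0.125)
    return edge

bayer = np.random.randint(0, 4096, size=(8, 8))
edges = half_size_green_edge(bayer)
```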
b. Interpolation direction determination mapping table
Returning to FIG. 4, sub-step 452 of the interpolation process 450 creates an interpolation direction decision map for determining how to fill in the missing green values at the R or B photo-sites in the RAW image 402. In one embodiment, the missing green value of each red or blue photo-site may be determined by averaging adjacent samples within each 2×2 cell of the Bayer pattern using standard demosaicing techniques. In a preferred embodiment, vertically or horizontally adjacent samples in the RAW image 402 are used to determine the green channel for these red or blue samples. For example, if a region of the original image has horizontally oriented edges or stripes, then the horizontally adjacent photo-sites of the R and B photo-sites in that region are preferably used to determine their green channel values. On the other hand, if a region of the original image has vertically oriented edges or stripes, then the vertically adjacent photo-sites of the R and B photo-sites in that region are preferably used to determine their green channel values.
To determine the horizontal or vertical direction for interpolation, R, G and B-channel gradients of the RAW image 402 are calculated for each light locus. The gradient is then mapped in an interpolation direction determination map to determine how to fill in the missing green channel values at a given R or B light site in the RAW image 402. In general, if the green channel gradient shows that it is locally unchanged, the gradient of the green channel will have the strongest influence in determining how to fill in the missing green channel values for the R or B light locus. If the green channel gradient is changing locally, the non-green channel gradient for a given light locus is used to determine how to fill in its missing green value. The gradient of the green value over the photosite can be determined by looking at the half-size green edge image at that location.
The determined direction for each light location point is then entered into the mapping table as a vote value. For example, a vote value of "1" corresponds to a decision to use the horizontal direction in the mapping table during interpolation of the missing green channel at the light site, while a vote value of "0" corresponds to a decision to use the vertical direction in the mapping table during interpolation of the missing green channel at the light site. These voted values may then also be used in the voting process to fill in the missing green channel values. The actual steps for filling in the missing green channel values are discussed below with reference to green reconstruction step 453.
c. Green reconstructed image
Sub-step 453 of the interpolation process 450 uses the original Bayer-packed image from step 440, the half-size green edge image from sub-step 451, and the interpolation direction decision map from sub-step 452 to create a green reconstructed image. One embodiment of a process 800 for creating a green reconstructed image 850 is schematically illustrated in FIG. 8. To create the image 850, each green channel value G of the photo-sites in the RAW image 810 is kept for the corresponding photo-site in the green reconstructed image 850. For example, the green sample G43 in the RAW image 810 is used unchanged in the green reconstructed image 850. However, the photo-sites for the R and B channels in the RAW image 810 do not have green values, so their values in the green reconstructed image 850 must be interpolated using the interpolation direction decision map 820. For example, the selected photo-site 852 in the green reconstructed image 850 corresponds to the blue sample B44 in the RAW image 810 and thus does not have a green value.
As described above, when filling in the missing green values of the R and B samples, a voting process is used to determine which direction to use. This voting process allows analysis of adjacent samples sampled in the interpolation direction decision mapping table 820 so that an average of direction decisions inside adjacent samples can be used to determine which direction to use to fill in the missing green value.
The interpolation direction decision mapping table 820 has a directional vote at each pixel position. This voting process uses the decision mapping table 820 and can be implemented in a variety of ways. For example, the direction decision D in the mapping table 820 may be "1" for the horizontal direction and "0" for the vertical direction. In an alternative, the single table decision 822 corresponding to the selected light location point 852 can be used alone to determine the interpolation direction. In another alternative, a consensus of several voting opinions from the decision's neighboring decisions 824 may be taken, and the voting opinion consensus may decide which direction to use to interpolate the green value for the selected photosite point 852.
Overall voting is one kind of consensus voting that can be used. The overall vote uses a neighborhood of decisions 824 containing an odd number of samples (either 3×3 or 5×5 samples). The sum of the decision values of all of the neighboring decisions 824 is calculated. If the sum is greater than half of the total number of photo-sites in the neighborhood 824, then "1" (horizontal direction) is used for the decision of the selected photo-site 852. Otherwise "0" (vertical direction) is used.
Pass-band voting is another form of voting that may be used. In this voting, the sum of the neighboring photo-site decisions is calculated (excluding the center decision 822 of the selected photo-site 852). If the sum is less than a predetermined lower threshold, the decision value of the selected photo-site 852 is set to "0" (vertical direction). If the sum is greater than another predetermined upper threshold, the decision value of the selected photo-site 852 is set to "1" (horizontal direction). The thresholds depend on the size of the neighborhood 824 used. If the sum is equal to or between the two thresholds, the center decision 822 of the selected photo-site 852 is left unchanged.
For a 3×3 neighborhood 824, there are 8 neighboring decisions (excluding the center decision 822). For "wide pass-band voting," the lower threshold may be "2" and the upper threshold may be "6". For "narrow pass-band voting," the upper threshold may be "4" and the lower threshold may likewise be "4," meaning that the center decision 822 is used only when the neighboring decisions are exactly tied. Wide pass-band voting is preferred because it does not override the natural directional preference of the selected photo-site 852.
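The two voting schemes can be sketched as follows for a 3x3 neighborhood; the thresholds shown are the wide pass-band values from the example above, and the function name and map values are illustrative.

```python
import numpy as np

def vote_direction(decisions, row, col, mode="wide", low=2, high=6):
    """Sketch of the voting schemes: `decisions` is the interpolation direction
    decision map (1 = horizontal, 0 = vertical).  Returns the direction to use
    for the photo-site at (row, col)."""
    block = decisions[row - 1:row + 2, col - 1:col + 2]
    center = decisions[row, col]
    neighbor_sum = int(block.sum() - center)          # 8 neighboring decisions

    if mode == "overall":
        # Overall vote: majority of all 9 decisions in the neighborhood.
        return 1 if block.sum() > block.size / 2 else 0

    # Pass-band vote: override the center only outside the [low, high] band.
    if neighbor_sum < low:
        return 0                                       # vertical
    if neighbor_sum > high:
        return 1                                       # horizontal
    return int(center)                                 # leave the center decision alone

decision_map = np.array([[1, 1, 0],
                         [1, 0, 1],
                         [1, 1, 1]])
direction = vote_direction(decision_map, 1, 1)         # wide pass-band vote
```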
After determining the direction (vertical or horizontal) in which the green value is to be determined for the selected photo-site 852, a mathematical calculation then determines the green value for the selected photo-site 852 in the green reconstructed image 850 using the samples in the RAW image 810. First, the average of the two green samples in the selected direction is used as the base result for the green value of the selected photo-site 852. For example, in FIG. 8 the determined direction is vertical, so the green samples G43 and G45 are averaged and the average is used as the base result (the green result) for the selected photo-site 852. However, the green result is also adjusted according to the value of the corresponding sample (red or blue) 812 in the RAW image 810. First, the average of the two like-colored samples adjacent to the corresponding sample in the sampling direction is calculated (e.g., the average of the two blue samples adjacent to B44 in the vertical direction). Then, the difference between the value of the center sample 812 and that average is calculated (e.g., B44 minus the average). The difference is then scaled according to the green edge intensity at the corresponding sample 812, producing a scaled difference. (The green edge image previously constructed in sub-step 451 of FIG. 4 may be used to determine the edge intensity at the corresponding sample.) Finally, the resulting green value G for the selected photo-site 852 in the green reconstructed image 850 is calculated as the scaled difference added to the green result. The effect of this is to rely more on the values of the other channels (R or B) in regions with more green edges.
The entire process 800 is repeated until the green reconstructed image 850 is completely filled with green values, i.e., green values directly obtained from the RAW image 810 and green values interpolated using the decision mapping table and voting described above.
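A minimal sketch of the directional green interpolation described above, assuming a full-resolution green-edge-strength image normalized to 0..1 (the disclosure uses the half-size green edge image) and like-colored neighbors two photo-sites away in the chosen direction; the edge-based scaling is illustrative.

```python
import numpy as np

def interpolate_green(raw, edge, row, col, horizontal):
    """Sketch of filling the missing green value at an R or B photo-site.
    `raw` is the Bayer mosaic, `edge` a green-edge-strength image
    (values assumed normalized to 0..1); the edge-based scaling is illustrative."""
    if horizontal:
        g_a, g_b = raw[row, col - 1], raw[row, col + 1]       # green neighbors
        c_a, c_b = raw[row, col - 2], raw[row, col + 2]       # like-colored neighbors
    else:
        g_a, g_b = raw[row - 1, col], raw[row + 1, col]
        c_a, c_b = raw[row - 2, col], raw[row + 2, col]

    green_result = 0.5 * (g_a + g_b)                 # base result from green samples
    diff = raw[row, col] - 0.5 * (c_a + c_b)         # center sample minus its neighbors
    scaled = diff * edge[row, col]                   # scale by green edge intensity
    return green_result + scaled

raw = np.random.randint(0, 4096, size=(8, 8)).astype(np.float64)
edge = np.clip(np.random.rand(8, 8), 0.0, 1.0)
g = interpolate_green(raw, edge, 4, 4, horizontal=False)
```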
d. Green sharpening
Returning to FIG. 4, a subsequent sub-step 454 of the interpolation process 450 performs a green sharpening operation on the green reconstructed image 850 from the above sub-step. First, the green image is transformed into a space that is as close to perceptually uniform as possible. The perceptually graded reconstructed green image is then blurred by the radius of the sharpening operation, thereby generating a blurred reconstructed green image. By subtracting this blurred reconstructed green image from the perceptually graded reconstructed green image, a green high-pass image is generated. The green high-pass image contains the high-frequency information of the green reconstructed image and is used for two purposes. The first purpose is to sharpen the image: a temporarily sharpened green reconstructed image is generated by adding the green high-pass image, multiplied by a predetermined sharpening factor, to the original image. The second purpose is to compute an edge mask, so that the sharpening operation can be limited to the edges of the image. The edge mask is generated by taking the absolute value of the green high-pass image, blurring it by a small amount, and then increasing its contrast by a large factor. The result is thresholded at a predetermined level and clamped to the range 0..1 to generate an edge mask image. The edge mask image is used as a mask to decide which regions of the temporarily sharpened green reconstructed image to blend with the green reconstructed image to form the sharpened green reconstructed image 870. This sharpened green reconstructed image 870 will later be combined with the red reconstructed image and the blue reconstructed image to generate an image with RGB values at each sample.
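A rough sketch of this edge-masked sharpening, using a square root as a stand-in for the perceptual transform and a box blur in place of the actual blur; the radius, amount, threshold, and contrast values are illustrative.

```python
import numpy as np

def box_blur(img, radius):
    """Simple separable box blur used as a stand-in for the blur of the sharpening radius."""
    size = 2 * radius + 1
    kernel = np.ones(size) / size
    pad = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def sharpen_green(green, radius=2, amount=1.0, threshold=0.02, contrast=8.0):
    """Sketch of the edge-masked green sharpening: high-pass image, temporarily
    sharpened image, edge mask, and a blend limited to edge regions."""
    perceptual = np.sqrt(np.clip(green, 0.0, None))      # rough perceptual grading
    blurred = box_blur(perceptual, radius)
    high_pass = perceptual - blurred                      # high-frequency detail
    sharpened = perceptual + amount * high_pass           # temporarily sharpened image

    mask = box_blur(np.abs(high_pass), 1)                 # blur the absolute high-pass
    mask = np.clip((mask - threshold) * contrast, 0.0, 1.0)  # threshold and clamp to 0..1

    blended = mask * sharpened + (1.0 - mask) * perceptual
    return blended ** 2                                    # back out of the perceptual space

green = np.random.rand(16, 16)
result = sharpen_green(green)
```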
e. Blurred half-size RGB image
Returning to fig. 4, sub-step 456 of the interpolation process 450 creates a blurred half-size RGB image, and this image will be used in the chroma blur operation 460, discussed later. Fig. 9 shows an example of a portion of the Bayer encoded data of the RAW image 910 used as the input to this step. First, the green value (G) contributed to each pixel 922 in the intermediate half-size RGB image 920 is determined using the average of the two green photosites (G1, G2) in the 2 × 2 cell 912 of the original full-size RAW image 910. The red and blue values (R, B) contributed to each pixel 922 come primarily from the single sample of each of those colors found in the 2 × 2 cell 912 of the full-size original image 910, along with smaller contributions added from adjacent samples of the same color using convolution resampling or a similar technique. Finally, the intermediate half-size RGB image 920 is blurred with a Gaussian kernel 930, thereby generating the blurred half-size resultant RGB image 940. Alternatively, one or more other blurring operations may be used, including but not limited to Gaussian blur, selective blur, bilateral filtering, median filtering, and box filtering. Because this resulting image 940 is blurred, it may be used in the chrominance blurring step 460 of fig. 4 as described below.
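The following sketch illustrates the construction of the blurred half-size RGB image, assuming an RGGB Bayer layout with even image dimensions; the smaller convolution-resampled contribution from neighboring red and blue samples is omitted for brevity, and the Gaussian sigma is a placeholder.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def blurred_half_size_rgb(raw, blur_sigma=1.0):
        r  = raw[0::2, 0::2]                  # one red sample per 2x2 cell (layout assumed)
        g1 = raw[0::2, 1::2]
        g2 = raw[1::2, 0::2]
        b  = raw[1::2, 1::2]
        half = np.stack([r, (g1 + g2) / 2.0, b], axis=-1)   # intermediate half-size RGB image
        # Blur each channel with a Gaussian kernel to obtain the blurred half-size result.
        return np.stack([gaussian_filter(half[..., c], blur_sigma) for c in range(3)], axis=-1)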
f. Red and blue reconstructed images
In the interpolation process 450 of fig. 4, sub-step 458 creates a red reconstructed image and a blue reconstructed image. Fig. 10 schematically illustrates a process 1000 for creating a red reconstructed image 1050 and a separate blue reconstructed image 1070, where the image 1070 is only partially shown in fig. 10.
In the reconstruction process 1000, each photosite in the RAW image 1010 is sampled. If the current sample of the RAW image 1010 is the correct channel (i.e., R or B), then the value of the resulting sample in the reconstructed image (1050 or 1070) is equal to the original sample value plus a "green sharpening difference". At this point in the process, the RAW image processing 400 of fig. 4 has produced a sharpened green reconstructed image (sub-step 454) and an unsharpened green reconstructed image (sub-step 453). The "green sharpening difference" is the difference between the green channel at the sample point in the sharpened green reconstructed image (from sub-step 454) and the green channel at the sample point in the unsharpened green reconstructed image (from sub-step 453). The effect of the "green sharpening difference" is to apply any green sharpening to the red and blue channels as those channels are reconstructed. For example, in reconstructing the red reconstructed image 1050 in fig. 10, the current sample 1012 is the correct R channel, and the resulting red photosite 1052 in the red reconstructed image 1050 is the original sample 1012 plus the "green sharpening difference".
If the current sample in the RAW image 1010 is not the correct channel (R or B), it is determined whether the current sample has an immediate horizontal or vertical neighbor of the desired channel. If so, the average of those horizontal or vertical neighbors is taken, and the resulting sample in the reconstructed image is generated by adding the "green sharpening difference". For example, in reconstructing the red reconstructed image 1050 in fig. 10, the current sample 1014 is not the correct channel (i.e., not R). The resulting red photosite 1054 in the red reconstructed image 1050 is therefore the average of the horizontally adjacent samples of the correct channel (i.e., R) plus the "green sharpening difference".
If the current sample in the RAW image 1010 is not the correct channel and does not have an immediate horizontal or vertical neighbor of the correct channel, then the average of all four diagonal neighbors is computed and the resulting sample in the reconstructed image is generated by adding the "green sharpening difference". For example, in reconstructing the blue reconstructed image 1070 in fig. 10, the current sample 1016 is not the correct channel (i.e., B for blue reconstruction) and does not have a horizontal or vertical neighbor of the correct B channel. Thus, the resulting blue photosite 1076 in the blue reconstructed image 1070 is the average of the diagonally adjacent samples of the correct channel (i.e., B) plus the "green sharpening difference". The final result of the reconstruction process is a red reconstructed image 1050 and a blue reconstructed image 1070, which may be stored in a buffer until combined in a subsequent process to generate the resulting RGB image.
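A compact sketch of this red/blue reconstruction rule is shown below; it assumes a boolean mask marking the photosites that already hold the desired channel and a per-photosite green sharpening difference computed from sub-steps 453 and 454, and it skips border handling for clarity.

    import numpy as np

    def reconstruct_channel(raw, is_channel, green_sharp_diff):
        h, w = raw.shape
        out = np.zeros((h, w), dtype=float)
        for y in range(2, h - 2):
            for x in range(2, w - 2):
                if is_channel[y, x]:
                    base = raw[y, x]                                   # correct channel: keep the sample
                elif is_channel[y, x - 1] and is_channel[y, x + 1]:
                    base = (raw[y, x - 1] + raw[y, x + 1]) / 2.0       # horizontal neighbors
                elif is_channel[y - 1, x] and is_channel[y + 1, x]:
                    base = (raw[y - 1, x] + raw[y + 1, x]) / 2.0       # vertical neighbors
                else:
                    base = (raw[y - 1, x - 1] + raw[y - 1, x + 1] +
                            raw[y + 1, x - 1] + raw[y + 1, x + 1]) / 4.0   # four diagonal neighbors
                out[y, x] = base + green_sharp_diff[y, x]              # add the green sharpening difference
        return out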
g. Chroma blur operation
Finally, the chroma blur operation 460 in FIG. 4 uses the sharpened green reconstructed image from sub-step 454, the blurred half-size RGB image from sub-step 456, and the red and blue reconstructed images from sub-step 458 to generate the complete RGB image. FIG. 11 illustrates one embodiment of a chroma blur operation 1100 for the automated process of FIG. 4. Initially, a red reconstructed image 1102, a sharpened green reconstructed image 1104, and a blue reconstructed image 1106 are obtained from the previous processing (block 1110), and a reconstructed RGB image is created by combining the R, G, and B samples from each of these images into RGB pixels in the reconstructed RGB image (block 1112). The luminance of the reconstructed RGB image at each pixel is then calculated (block 1114). Bilinear interpolation is then used to scale the blurred half-size RGB image 1108 up to a blurred full-size image (block 1120), and the luminance of each pixel in the blurred full-size image is calculated (block 1122). The blurred colors within the blurred full-size image are then scaled at each pixel so that their luminance matches the luminance of the reconstructed RGB image (block 1130). Finally, the chroma blur operation 1100 produces a full-size RGB image with reduced color edge artifacts (block 1140).
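A sketch of this combination follows, assuming NumPy arrays for the reconstructed channels and SciPy's zoom for the bilinear upsampling; the Rec. 709 luma weights are an illustrative choice, since the disclosure does not specify how luminance is computed here.

    import numpy as np
    from scipy.ndimage import zoom

    def luminance(rgb):
        return rgb @ np.array([0.2126, 0.7152, 0.0722])     # illustrative luma weights

    def chroma_blur(red, green_sharp, blue, blurred_half_rgb, eps=1e-6):
        recon = np.stack([red, green_sharp, blue], axis=-1)            # block 1112: reconstructed RGB
        recon_lum = luminance(recon)                                   # block 1114
        h, w = red.shape
        zy = h / blurred_half_rgb.shape[0]
        zx = w / blurred_half_rgb.shape[1]
        blurred_full = zoom(blurred_half_rgb, (zy, zx, 1), order=1)    # block 1120: bilinear upsample
        blurred_lum = luminance(blurred_full)                          # block 1122
        scale = recon_lum / np.maximum(blurred_lum, eps)               # block 1130: match luminance
        return blurred_full * scale[..., None]                         # block 1140: full-size result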
E. Details of the conversion stages in the RAW image processing flow
1. Derivation of feature matrices
As previously described in block 334 of fig. 3, the feature matrices used to convert the camera RGB image to XYZ tristimulus values are pre-computed and used in conjunction with the white balance of the image to estimate the optimal matrix conversion from camera RGB to XYZ tristimulus values. FIG. 12 shows, in flow chart form, one embodiment of a process 1200 for deriving the feature matrices used for this conversion in automated RAW processing. In process 1200, a plurality of camera feature matrices (e.g., M1, M2, etc.) are derived for a plurality of cameras, camera types, camera models, manufacturers, imaging sensors, or other categories. The purpose of process 1200 is to exclude user intervention and subjectivity from the interpolation matrix selection during the RAW processing of fig. 3, and to automatically derive parameters for RAW conversion according to camera, type, manufacturer, or other category.
Initially, a reference image of a color table (e.g., a Macbeth Color Checker) or a similar target is acquired using a particular camera and some known light source (e.g., light source A, D65, etc.). This reference image is not the only reference image produced by the camera: as discussed in detail below, a plurality of such reference images are acquired for a given camera, camera type, camera model, camera manufacturer, camera imaging sensor type, or other category. The reference image may be made in a laboratory, and the color table may contain various color patches, highlights, shadows, etc., and may be located in a light box with multiple standard light sources.
XYZ color values are measured for each color region or patch of the color table under the same light source using standard instrumentation such as a colorimeter (block 1204). The RGB values for these regions are then mathematically fit to the corresponding XYZ tristimulus values measured for these regions (block 1206). The fitting process involves solving for the coefficients of a matrix that correlates the RGB values of the reference image with the XYZ tristimulus values measured from the color table. Because a tristimulus color space is involved, the matrix may be 3x3, so that nine variables are involved in solving the fit. In this fitting process, the original white balance of the reference image is ignored; the fitting process therefore does not assume that the camera is white-balance calibrated. Instead, a fixed white balance is used in solving for the matrix coefficients. White balance is usually expressed as a vector having three values. For this fitting process, each element of the vector is set equal to 1, so that white balance has no effect on the fit.
In the fitting process, it is preferred to use multiple color patches or regions of the color table to derive the matrix coefficients. Accordingly, process 1200 may be repeated for different color patches of the same color table under the same light source and/or for different color tables under the same light source (block 1210). This allows a balance to be found for the derived matrix coefficients, so that the coefficients have the proper weighting to correlate the RGB values of the various colors with the XYZ tristimulus values measured for these different colors under the same light source (block 1212). Finally, process 1200 derives a first feature matrix M1 for the first camera and the first light source. This first matrix M1 is a 3 x N matrix, where N corresponds to the number of channels (e.g., R, G, B) of the camera.
After the first camera feature matrix M1 is derived, blocks 1202-1212 are repeated one or more times for one or more additional light sources using the same camera as before (block 1214). Previous measurements of one or more color tables may be reused. The result of repeating these steps is that multiple camera feature matrices under different light sources are acquired for the same camera. The matrix coefficients of these matrices are optimized or adjusted to reduce color errors when matching each image to the corresponding measurements of the one or more color tables under each light source (block 1216). This can be done by testing and tuning. Once optimized, the coefficients of each matrix are normalized using a normalization factor specific to that matrix (block 1218). Preferably, the normalization factor for a given matrix is the sum of the diagonal coefficients of the matrix, whereby each coefficient of the matrix is divided by this sum.
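As an illustration of the fit and normalization just described, the following sketch derives one 3x3 feature matrix by least squares from patch RGB values to the measured XYZ values and then normalizes it by the sum of its diagonal coefficients; the use of an ordinary least-squares solve is an implementation assumption rather than the method prescribed by the disclosure.

    import numpy as np

    def derive_feature_matrix(camera_rgb, measured_xyz):
        """camera_rgb: (N, 3) patch RGB values with fixed (unity) white balance;
        measured_xyz: (N, 3) XYZ tristimulus values measured from the color table."""
        coeffs, *_ = np.linalg.lstsq(camera_rgb, measured_xyz, rcond=None)
        M = coeffs.T                                 # 3x3 matrix mapping camera RGB to XYZ
        return M / np.trace(M)                       # normalize by the sum of the diagonal coefficients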
Finally, the entire process of blocks 1202 through 1218 may be repeated for one or more additional cameras, models, etc. (block 1220). The result is a plurality of camera feature matrices, each associated with a respective light source and a respective camera, type, model, manufacturer, etc. These camera feature matrices are then stored for access when interpolation is performed in the subsequent processing discussed below with reference to fig. 14 (block 1222).
As described above, multiple camera feature matrices may be made for a camera in a laboratory, and these feature matrices stored for later access in the automated RAW image processing of the present disclosure. Thus, the camera feature matrices for multiple cameras may be stored in memory. In the automated pre-processing, the metadata of the RAW image can be used to select the appropriate camera feature matrix for the image from memory. In addition to using the predetermined camera matrices, a user can also independently create a feature matrix for his or her camera by applying the techniques described above with a carefully constructed lighting environment and color calibration targets.
2. Derivation of black compensation
As previously described in fig. 3, the black compensation process 330 is performed on the camera RGB image 329 in the first conversion stage 302 to the XYZ color space. The process 330 performs compensation using a derived black level adjustment. The black level adjustment is used together with the camera feature matrices (M1 and M2) derived for the camera and with the transformation matrix (M) generated from these feature matrices and from the white balance information received from the camera. Fig. 13 illustrates one embodiment of a process 1300 for deriving the black level adjustments used in the automated RAW processing of the present disclosure.
The process 1300 for deriving the black compensation values is incorporated into the process for deriving the camera feature matrices discussed above with reference to fig. 12. As previously described, the matrix derivation process 1200 of fig. 12 involves acquiring a reference image of a color table using a known light source. Part of this step also involves subtracting the standard black level offset, if any, from the captured image (block 1302). The RGB values for each region within the reference image are then fitted to the measured XYZ tristimulus values from the color table as before to derive a feature matrix (block 1304). This fitting process yields a certain amount of fitting error regardless of how many patches are analyzed, because the feature matrix coefficients provide only 9 variables for a 3x3 matrix.
To reduce this error, variables for black compensation are added to the fitting process. Each of these black compensation variables is subtracted from one of the color channels (R, G, B) of the reference image. In this manner, block 1304 fits the RGB values of the reference image to the measured XYZ tristimulus values using 12 variables (i.e., 9 variables from the matrix and 3 variables for the RGB channel offsets). The values of the black compensation variables in the fitting process of block 1304 are then varied until the color error is reduced to some threshold level (block 1306). In this way, by reducing the color error between the two color sets, i.e., the measured colors and the colors estimated from the reference image, an optimal black compensation variable can be derived for each color channel. These black compensation variables represent an additional black adjustment beyond the standard black offset.
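One way to realize this 12-variable fit is sketched below using a generic nonlinear least-squares solver; the solver choice and starting point are assumptions, and only the structure (nine matrix coefficients plus three per-channel black compensation offsets) follows the description above.

    import numpy as np
    from scipy.optimize import least_squares

    def fit_matrix_and_black(camera_rgb, measured_xyz):
        def residuals(params):
            M = params[:9].reshape(3, 3)             # 9 feature matrix coefficients
            black = params[9:]                       # 3 black compensation variables, one per channel
            predicted = (camera_rgb - black) @ M.T   # subtract black, then convert to XYZ
            return (predicted - measured_xyz).ravel()
        x0 = np.concatenate([np.eye(3).ravel(), np.zeros(3)])
        result = least_squares(residuals, x0)
        return result.x[:9].reshape(3, 3), result.x[9:]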
The black compensation variables may then be organized and stored for subsequent access during processing (block 1310). For example, to organize the black compensation variables, these variables may be associated with the feature matrices of the respective cameras and light sources used to derive them. This results in each feature matrix having an associated set of black compensation variables. Furthermore, the black compensation variables may also be associated with the different conditions under which they were derived. These conditions may include, but are not limited to: the light source used, the image content, the white point camera setting, and the ISO sensitivity involved. If the values of the variables cluster, the black compensation variables within a cluster can be averaged. Furthermore, if the variables differ across conditions, different black compensation variables can be distinguished and classified according to those conditions.
Process 1300 provides a more objective and consistent determination of the optimal black level compensation for use in RAW image processing. For example, in the process 300 of FIG. 3, these black compensation variables may be subtracted from the RGB values of the camera RGB image 329 in step 330, before the matrix conversion is performed, to reduce color errors in the conversion. The selection of the set of black compensation variables to use may then be based on the camera, type, model, etc. indicated in the metadata 314. Further, the selection may also be based on different internal shooting conditions (e.g., camera ISO settings, exposure time, etc.) or external conditions such as the light source, which may also be indicated in the metadata 314.
Furthermore, the selection of the black compensation variables used in the process may be based on an average of, or an interpolation between, sets of variables. As previously described, a first set of black compensation variables is derived for the first light source and is associated with the first feature matrix M1, while a second set of black compensation variables is derived for the second light source and is associated with the second feature matrix M2. These two sets may be stored and associated with the feature matrices M1 and M2 used to compute the optimal transformation matrix M discussed in the next section. In one embodiment, the optimal black compensation variables are calculated as a fixed average of the first and second sets of variables associated with the feature matrices M1 and M2, which are in turn used to calculate the optimal transformation matrix M as described below. Alternatively, the optimal black compensation variables are calculated using linear interpolation between the sets of black compensation variables, applying the same interpolation factors as those used to calculate the optimal conversion matrix M described below.
3. Computing transformation matrices
As previously mentioned in the RAW image processing of fig. 3, a matrix conversion process 332 is performed on the camera RGB image 329 to convert it into XYZ tristimulus values. Process 332 performs the transformation using the transformation matrix M. The transformation matrix M is a 3 × N matrix, where N is the number of channels of the camera. The conversion matrix M depends on the white balance and at least two pre-computed camera characteristic matrices corresponding to the two reference light sources, and is computed according to the particular camera used, the type of camera, the article, etc.
Fig. 14 illustrates one embodiment of a process 1400 for computing a transformation matrix (M) for use in the automated RAW process of the present disclosure. The process 1400 is performed dynamically for each image at the time of conversion, and finds a conversion matrix M to convert camera RGB to XYZ tristimulus values using the white balance from each image of the camera and using predetermined feature matrices corresponding to at least two reference light sources.
In particular, the transformation matrix M is derived from the feature matrices M1 and M2 and from the camera's white balance information, where the white balance information is image dependent and provided by the camera as metadata (e.g., 314 of fig. 3). The white balance information is a three-valued vector that, when normalized, has only two significant elements, since the third element is equal to 1. Through an iterative process, M1, M2, and the white balance of the image can be used to solve the matrix equation for the white point W of the image. The solution converges to a unique matrix M and a unique white point W.
The process 1400 solves for a transformation matrix M that may be specific to a given camera type, a given camera model, a single camera, a manufacturer, or another criterion, and in doing so finds the chroma scaling factor and the appropriate white balance from which the optimal transformation matrix M is derived. Only two feature matrices M1 and M2 are used in the process 1400 described below; other embodiments may use more than two feature matrices, which generally improves the interpolation.
As described above with reference to FIG. 6, the feature matrices M1, M2, ... Mn for a given camera are generated by capturing images of the same scene under different light sources. For example, M1 may be the camera feature matrix obtained with light source A (xa = 0.4476, ya = 0.4074), and M2 may be the camera feature matrix obtained with light source D65 (xd = 0.3127, yd = 0.3290). These light sources can be represented as vectors in the XYZ space or in the CIE-1931 chromaticity space x, y.
The feature matrices M1, M2, ... Mn are ordered according to correlated color temperature from 1 to n. The reference illuminants can then be used to estimate the actual illuminant of the scene by deriving a preferred transformation matrix M for the given camera that converges to the illuminant estimated for the RAW image being processed. Initially, the previously determined camera feature matrices M1 and M2 are input (block 1402) and initial values are set (block 1404). The initial values include a first intermediate matrix ma1 set equal to the first feature matrix M1 and a second intermediate matrix ma2 set equal to the second feature matrix M2. These intermediate matrices ma1 and ma2, initialized from M1 and M2, are processed iteratively as described below and eventually converge to the preferred transformation matrix M. The iteration counter K is set to 0, and the total iteration limit N is set to 20. Further, the first scaling factor fa1 for white point chromaticity is set equal to 1, and the second scaling factor fa2 for white point chromaticity is set equal to 0.
The threshold T for determining convergence is also set to a value, e.g. 0.001, whereby the process will be repeated until the absolute value of the difference between the first and second chrominance scaling factors fa1 and fa2 is below the threshold. By using this threshold, the iterative process may typically require 3-5 iterative steps for most cameras. In an alternative embodiment the norm differences of the matrices ma1 and ma2 may be calculated for comparison with thresholds. However, in the following example, the first and second chrominance scaling factors fa1 and fa2 are used to measure convergence, since the first and second chrominance scaling factors fa1 and fa2 are scalars that have been calculated.
Initially, it is determined whether the absolute value of the difference between the first and second chrominance scaling factors fa1 and fa2 is greater than the preset threshold T (block 1410). For the first iteration, the chrominance scaling factors fa1 and fa2 are 1 and 0, respectively, so the difference is larger than the threshold T. The process then proceeds with the first feature matrix M1 (block 1420).
For the first feature matrix M1, the inverse mi of the intermediate matrix ma1 is computed (block 1422). Then, the weights in the XYZ color space are calculated from the image white balance (block 1424). For example, Xw = mi[0][0]/p[0] + mi[0][1]/p[1] + mi[0][2]/p[2]; Yw = mi[1][0]/p[0] + mi[1][1]/p[1] + mi[1][2]/p[2]; and Zw = mi[2][0]/p[0] + mi[2][1]/p[1] + mi[2][2]/p[2]. Next, x-y weights are computed from the XYZ weights (block 1426). For example, xw = Xw/(Xw+Yw+Zw), and yw = Yw/(Xw+Yw+Zw). X-Y weighting factors are then calculated from the x-y weights and the chromaticities of the reference light sources associated with the feature matrices (block 1428). For example, fx1 = (xw-xa)/(xd-xa) and fy1 = (yw-ya)/(yd-ya).
The first chrominance scaling factor fa1 for the intermediate matrix ma1 is then calculated as fa1 = sqrt(fx1*fx1 + fy1*fy1) (block 1430). Finally, a new value of the first matrix is calculated as m1 = (1-fa1)*ma1 + fa1*ma2 (block 1432).
Similar processing is then repeated for the second feature matrix M2 using the following equation (block 1440):
mi=Inverse(ma2);
Xw=mi[0][0]/p[0]+mi[0][1]/p[1]+mi[0][2]/p[2];
Yw=mi[1][0]/p[0]+mi[1][1]/p[1]+mi[1][2]/p[2];
Zw=mi[2][0]/p[0]+mi[2][1]/p[1]+mi[2][2]/p[2];
xw=Xw/(Xw+Yw+Zw);yw=Yw/(Xw+Yw+Zw);
fx2=(xw-xa)/(xd-xa);
fy2=(yw-ya)/(yd-ya);
fa2=sqrt(fx2*fx2+fy2*fy2);
m2=(1-fa2)*ma1+fa2*ma2;
Once the matrices m1 and m2 are calculated in the manner described above, the intermediate matrices are set equal to them: ma1 = m1 and ma2 = m2 (block 1442).
Finally, the process may be repeated as long as neither the convergence condition of block 1410 nor the total number of allowed iterations (block 1450) has been met. If the process 1400 has exceeded the allowed total number of iterations (block 1450), the process 1400 is stopped and proceeds to block 1460. Otherwise, the process 1400 returns to block 1410 to check whether the difference between fa1 and fa2 still exceeds the threshold T; if it does, blocks 1410 through 1450 are repeated in another iteration to further converge the feature matrices.
In any case, once convergence is complete or the total number of iterations is exceeded, the final chroma scaling factor is calculated as fa = (fa1+fa2)/2 (block 1460). Finally, the resulting conversion matrix M is determined by first calculating m = (1-fa)*ma1 + fa*ma2, and then calculating M = Inverse(m) (block 1462). The resulting conversion matrix M may then be used in the automated RAW processing for the particular camera to transform the camera RGB image, regardless of the particular light source under which the image was captured (block 1464).
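Putting blocks 1402 through 1464 together, the following sketch reproduces the iterative computation described above for two feature matrices; the helper function and array-based formulation are illustrative, p is the three-valued white balance vector from the image metadata, and (xa, ya) and (xd, yd) are the chromaticities of the two reference light sources.

    import numpy as np

    def compute_conversion_matrix(M1, M2, p, xa, ya, xd, yd, T=0.001, max_iter=20):
        ma1, ma2 = np.array(M1, dtype=float), np.array(M2, dtype=float)
        fa1, fa2 = 1.0, 0.0
        k = 0

        def chroma_factor(ma):
            mi = np.linalg.inv(ma)
            Xw, Yw, Zw = mi @ (1.0 / np.asarray(p, dtype=float))   # blocks 1422-1424
            xw = Xw / (Xw + Yw + Zw)                               # block 1426
            yw = Yw / (Xw + Yw + Zw)
            fx = (xw - xa) / (xd - xa)                             # block 1428
            fy = (yw - ya) / (yd - ya)
            return np.sqrt(fx * fx + fy * fy)                      # blocks 1430/1440

        while abs(fa1 - fa2) > T and k < max_iter:                 # blocks 1410 and 1450
            fa1 = chroma_factor(ma1)
            m1 = (1.0 - fa1) * ma1 + fa1 * ma2                     # block 1432
            fa2 = chroma_factor(ma2)
            m2 = (1.0 - fa2) * ma1 + fa2 * ma2                     # block 1440
            ma1, ma2 = m1, m2                                      # block 1442
            k += 1

        fa = (fa1 + fa2) / 2.0                                     # block 1460
        m = (1.0 - fa) * ma1 + fa * ma2                            # block 1462
        return np.linalg.inv(m)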
4. Derivation of tone reproduction curves
As previously described in the RAW image processing flow 300 of fig. 3, the color tuning process 342 and the transformation process 344 are used to convert the XYZ tristimulus values to generate the resultant RGB image 349. The conversion is implemented using a 3 x N conversion matrix, where N is the number of channels of the camera (e.g., R, G, and B), as described above. Furthermore, the conversion uses a tone reproduction curve, which is intended to optimize the rendering of the image from linear space into the display or output space. Thus, the tone reproduction curve used in the color tuning process 342 affects the appearance, contrast, shadow and highlight detail, and overall image quality of the image.
Fig. 15A-15C illustrate one embodiment of a process 1500 for deriving tone reproduction curves for use in the automated RAW process of the present disclosure. The process 1500 may automatically derive an optimized tone reproduction curve for a particular camera, camera type or brand, manufacturer, or other camera-related criteria to be used automatically in the color tuning process 342 of fig. 3 when performing a RAW process for the particular camera, camera type or brand, manufacturer, or other camera-related criteria.
Initially, a plurality of reference images 1502, 1504 are selected in the derivation process 1500. For convenience, only two reference images 1502, 1504 are shown in fig. 15A. The first reference image 1502 is associated with camera information 1503, i.e., with a particular camera, camera style, artifact, and so forth. This reference image 1502 is referred to as a target reference image. One or more other reference images 1504 may be generated by the same or different cameras and may be processed with different techniques and processing software. Each reference image 1502, 1504 contains the same scene (e.g., color table, etc.) using the same settings, light sources, etc. Preferably, the other reference image 1504 contains highlights, shadows, good tone distribution, and distribution of underexposed to overexposed images. Accordingly, the other reference image 1504 may undergo certain automatic and/or manual operations performed using image processing software. In general, the reference images 1502, 1504 may be generated in a laboratory, whereby the scene, exposure, size, etc. may be substantially the same between images. However, the RAW processing embodiments disclosed herein may allow a user to independently generate a reference image in order to generate a tone reproduction curve.
In the reference images 1502, 1504, the color of each pixel depends on the sensor of the camera used, the demosaicing process applied, the noise reduction algorithm applied, and other details used to generate the pixels. Thus, even with the same scene, the two images 1502, 1504 are less likely to have pixel-by-pixel corresponding colors due to differences in the imaging sensors used to capture the images, differences in the noise reduction process, differences in the demosaicing process used, and other possible causes.
In a first step 1510, the images 1502, 1504 are scaled down to a size intended to reduce demosaicing differences and noise artifact effects therebetween. Since the debayer, highlight recovery, image size, and other details of the two images 1502, 1504 are unlikely to be identical, it is often necessary to perform scaling. A low pass filter is preferably used here to average the pixel colors into the regions. The result is two scaled reference images 1512, 1514 of substantially the same size. In an optional step (not shown), gamma correction may be performed on the reference images 1512, 1514 as necessary.
In a second step 1520, the scaled images 1512, 1514 are converted into grayscale images 1522, 1524. The images 1512, 1514 may include gray ramp patches to aid in converting the images from color to grayscale. The downscaling and color-to-grayscale conversion are intended to reduce noise and the effects of differences in the demosaicing process between the original images 1502, 1504.
In a third step 1530, a gain factor is determined that matches the maximum brightness of the second image 1524 to the maximum brightness of the first (target) image 1522. This gain factor is then used to scale the gray levels of the second image 1524. In a fourth step 1540, a pairing between the grayscale images 1522, 1524 is performed, wherein the pairing compares gray values pixel by pixel between the images 1522, 1524. In a fifth step 1550, the one-to-one pixel pairs between the grayscale target image 1522 and the grayscale reference image 1524 are plotted as a coarse tone curve y(x), where x is the gray level of the reference image 1524 and y is the gray level of the target image 1522. An example of such a coarse tone curve is shown in graph 1580 of fig. 15B. In a sixth step 1560, interpolation, averaging, and smoothing are used to fill the missing gray levels of x and the discontinuities of the curve in order to refine this coarse tone curve. In addition, any outliers or erroneously mapped values may be removed during this refinement. In a final step 1570, the final tone reproduction curve is generated. An example of such a final tone curve is shown in graph 1590 of fig. 15C. Generally, the tone reproduction curve has an S-shape that enhances the color saturation and brightness contrast of the image to produce a desirable result for each camera model. In one example, the tone reproduction curve may be described by four consecutive cubic polynomials.
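A simplified sketch of steps 1530 through 1560 is given below; the binning into 256 gray levels and the moving-average smoothing are illustrative stand-ins for the interpolation, averaging, and smoothing described above, not the exact refinement used in the disclosure.

    import numpy as np

    def derive_tone_curve(target_gray, reference_gray, levels=256):
        # Step 1530: gain so the reference maximum matches the target maximum.
        gain = target_gray.max() / reference_gray.max()
        ref = np.clip(reference_gray * gain, 0.0, 1.0)
        tgt = np.clip(target_gray, 0.0, 1.0)
        # Steps 1540-1550: pair pixels and average the target gray level per reference gray bin.
        x = np.round(ref.ravel() * (levels - 1)).astype(int)
        sums = np.bincount(x, weights=tgt.ravel(), minlength=levels)
        counts = np.bincount(x, minlength=levels)
        coarse = np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)
        # Step 1560: fill missing gray levels by interpolation, then smooth the curve.
        xs = np.arange(levels) / (levels - 1)
        valid = ~np.isnan(coarse)
        curve = np.interp(xs, xs[valid], coarse[valid])
        curve = np.convolve(curve, np.ones(5) / 5.0, mode='same')
        return xs, curve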
Then, for the camera 310 for which the RAW processing is performed, the tone reproduction curve 1590 described above may be used in the automatic color tuning process 342 of fig. 3 when the camera satisfies the initial camera information criterion 1503 associated with the target reference image 1502 used to generate the tone reproduction curve 1590. Accordingly, a plurality of tone reproduction curves 1590 are generated for a plurality of different cameras, camera types, and other criteria. In this manner, when the tone reproduction curve is acquired in block 346 of the process of FIG. 3, the metadata 314 from the camera 310 for the processed image may be used to select an appropriate tone curve from a pool of previously stored and configured tone curves.
In the current embodiment of the process 1500 of FIG. 15A, the final tone reproduction curve is generated using gray scale levels rather than mutually distinct color channels (e.g., R, G, B). When this tone reproduction curve is used in the automatic color tuning process 342 of fig. 3, it is actually applied to each channel in the same manner. In an alternative embodiment, a process similar to that disclosed above may be used to generate a separate tone reproduction curve for each color channel of the image, whereby the automatic color tuning process 342 of FIG. 3 may apply a separate tone reproduction curve independently for each channel in order to implement color tuning. For example, the reference images 1502, 1504 may be subjected to a filtering process to generate separate images for each channel. The image associated with each channel may then undergo stages of processing 1500 to generate an independent tone reproduction curve for the associated channel, which may then be applied to each channel of the processed image.
Additional processing in the RAW image processing flow
A number of additional automated processes 1600 used in the RAW processing from the original RAW image 1610 to the resulting RGB image 1690 are shown in fig. 16A. These automated processes 1600 include deep shadow desaturation processing 1620, luminance enhancement 1650, and RGB separable enhancement 1680. In the RAW image processing flow 300 of fig. 3, the time and position at which these processes are actually performed depend on the details of the camera, the RAW image, and other characteristics.
1. Deep shadow desaturation processing
In one embodiment, the automated deep shadow desaturation process 1620 can be implemented in the first conversion stage 303 from camera RGB to the XYZ color space in fig. 3. The process 1620 operates on RGB color values in the camera RGB image whose luminance values are below a shadow desaturation threshold. Fig. 16B illustrates the shadow desaturation threshold as line T in the input/output luminance graph 1602. The desaturation process 1620 reduces the saturation of those color values below the threshold T in proportion to how close those deep shadow RGB color values are to black. The shadow desaturation threshold T may correspond to a brightness level within 1% of black.
In the desaturation process 1620, the original luminance Lo is calculated for each pixel in the RGB image using standard techniques. The calculated luminance Lo is compared with the shadow desaturation threshold T to determine whether it is below the threshold T. If it is below the threshold T, an interpolated gray-level luminance LG is calculated using an interpolation function. This interpolated gray-level luminance LG then replaces each original color value R, G, and B of the deep shadow pixel. In the interpolation, the gray-level luminance LG is preferably proportional to how close the original luminance Lo of the pixel is to black, so that the interpolated gray-level luminance LG transitions smoothly from the deep shadow desaturation threshold T down to black. By replacing the deep shadow values with interpolated gray-level values, color noise in the deep shadows of the image can be reduced.
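The following sketch shows one plausible form of this desaturation, assuming a linear RGB image normalized to 0..1; the 1% threshold follows the text above, while the luminance weights and the exact blending function toward gray are assumptions.

    import numpy as np

    def deep_shadow_desaturate(rgb, threshold=0.01):
        lum = rgb @ np.array([0.2126, 0.7152, 0.0722])        # original luminance Lo (weights illustrative)
        # Blend factor: 0 at the threshold, 1 at black, so color fades smoothly to gray toward black.
        t = np.clip((threshold - lum) / threshold, 0.0, 1.0)
        gray = lum[..., None]                                  # interpolated gray-level target
        out = rgb.copy()
        mask = lum < threshold
        out[mask] = (1.0 - t[mask, None]) * rgb[mask] + t[mask, None] * gray[mask]
        return out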
2. RGB separable enhancement
The automated RGB separable enhancement 1680 may be performed in flow 300 of fig. 3 when the image is in an RGB color space. In general, RGB separable enhancement is a preferred technique for enhancing image contrast by adjusting the tone curve of the image, modifying each of the R, G, and B channels individually. When contrast is enhanced using RGB separable enhancement in the shadow and lower mid-tone regions of an image, increased color saturation can occur as a side effect. When the image has a lot of noise, this enhancement of contrast makes the color of the noise more prominent and less desirable. The more "monochromatic" the noise appears (in line with the local colors of the image), the more the noise resembles conventional analog film grain. If the RGB separable enhancement enriches the color of the noise, the noise appears to shift away from the local color of the image toward red and blue, which is undesirable. In tonal regions where such undesirable artifacts occur, the RGB separable enhancement 1680 is therefore replaced with the luminance enhancement described below. Accordingly, in the present embodiment, the RGB separable enhancement 1680 is applied primarily to the higher tones (e.g., those above the mid-tone level indicated by line 1606 in the graph 1602 of FIG. 16B). When applied to these higher tones, the RGB separable enhancement 1680 tends to reduce color in the highlight regions of the image.
3. Luminance enhancement and tone region transition
The automated luminance enhancement 1650 may also be performed in flow 300 of fig. 3 when the image is in an RGB color space. As with the RGB separable enhancement 1680, the luminance enhancement 1650 may be used to control contrast in the RGB image 1690. Unlike the RGB separable enhancement 1680, however, the luminance enhancement 1650 does not modify each of the R, G, and B channels individually; instead, it is applied to all three channels simultaneously. Further, the luminance enhancement 1650 is concerned with the transition tone region, which lies below the mid-tone level used for the RGB separable enhancement 1680. As indicated roughly in the graph 1602 of fig. 16B, the transition tone region 1604 for the luminance enhancement 1650 is the region between the quarter-tone level 1608 and the mid-tone level 1606. When applied in this tone region 1604, the luminance enhancement 1650 tends to reduce the drawbacks associated with enhancing each of the R, G, and B channels separately in that region.
Fig. 16C shows one embodiment of an automated luminance enhancement process 1660. First, the luminance values of the pixels in the RGB image are calculated (block 1662), and those luminance values below the maximum luminance value (e.g., level 1606 in FIG. 16B) are determined (block 1664). As previously described, the defined transition tone region may be the region 1604 in fig. 16B, bounded by the upper luminance value 1606 and the lower luminance value 1608. For those pixels whose original luminance value lies inside the tone region 1604, a new luminance value is calculated using interpolation (block 1666).
When the processing is performed on a GPU of the processing device, an interpolation function, which may be embodied as a cubic equation, is used for the interpolation. When a CPU performs the processing, the interpolation is preferably embodied as a look-up table containing a plurality of entries calculated from the cubic equation. Such a look-up table may have about 65,000 entries. For example, standard techniques are used to calculate the luminance value L1 of a pixel having channel values R1, G1, B1 in the original image.
In the interpolation process of block 1666, the luminance values inside the tone region preferably transition smoothly to those outside the tone region. To determine a new luminance value L2, the calculated luminance L1 is applied to a piecewise cubic function of luminance or to a look-up table (e.g., L2 = Table[L1]), where the look-up table embodies an evaluation of the piecewise cubic function of luminance. In one embodiment, the piecewise cubic equation used on the GPU, and used to construct the look-up table for the CPU, may be characterized as:
if (x < luminance_threshold_1)
    y = a*x^3 + b*x^2 + c*x + d;
else if (x < luminance_threshold_2)
    y = e*x^3 + f*x^2 + g*x + h;
else if (x < luminance_threshold_3)
    y = i*x^3 + j*x^2 + k*x + l;
else
    y = m*x^3 + n*x^2 + o*x + p;
Any two adjacent cubic curves are preferably designed to have matching values at the luminance threshold separating them. Furthermore, adjacent cubic curves are designed to have matching slopes at that same luminance threshold. In one example, luminance_threshold_1 is 0.1, luminance_threshold_2 is 0.2, and luminance_threshold_3 is 0.5. In addition, the value of y at x = 0 is zero, which means that the coefficient d is equal to 0. At x = 1, the value of y is 1, which constrains the cubic defined by the coefficients m, n, o, and p.
After interpolating the new luminance value, the ratio of the new luminance L2 to the old luminance L1 is calculated (block 1668), and this ratio is multiplied with the original color values R1, G1, B1 to obtain the new color values R2, G2, B2 for the image, thereby generating a pixel with brightness-enhancement-modified color values (block 1670). For example, a modified pixel will have the color values R2 = R1*(L2/L1), G2 = G1*(L2/L1), and B2 = B1*(L2/L1).
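The look-up table construction and the ratio-based application of the new luminance can be sketched as follows; the segment coefficients, table size, and luminance weights are placeholders, and only the piecewise-cubic structure and the L2/L1 scaling follow the description above.

    import numpy as np

    def build_luminance_lut(coeffs, thresholds=(0.1, 0.2, 0.5), entries=65536):
        # coeffs holds the four tuples (a,b,c,d), (e,f,g,h), (i,j,k,l), (m,n,o,p).
        x = np.linspace(0.0, 1.0, entries)
        segment = np.searchsorted(thresholds, x, side='right')   # which cubic applies to each x
        lut = np.empty_like(x)
        for seg, (c3, c2, c1, c0) in enumerate(coeffs):
            sel = segment == seg
            lut[sel] = c3 * x[sel]**3 + c2 * x[sel]**2 + c1 * x[sel] + c0
        return lut

    def luminance_enhance(rgb, lut):
        L1 = rgb @ np.array([0.2126, 0.7152, 0.0722])            # illustrative luminance weights
        idx = np.clip((L1 * (len(lut) - 1)).astype(int), 0, len(lut) - 1)
        L2 = lut[idx]                                            # L2 = Table[L1]
        ratio = np.where(L1 > 1e-6, L2 / np.maximum(L1, 1e-6), 1.0)
        return rgb * ratio[..., None]                            # R2,G2,B2 = R1,G1,B1 * (L2/L1)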
As shown in fig. 16B, the transition tone region 1604 is defined by the minimum luminance value 1608 and the maximum luminance value 1606. Below the minimum luminance value 1608 of the transition tone region 1604, the luminance enhancement 1650 described above is used directly to calculate the enhanced RGB values. Inside the transition tone region 1604, however, a hybrid enhancement is used, depending on the amount of noise in the image. In the hybrid enhancement, both the luminance enhancement 1650 and the RGB separable enhancement 1680 are evaluated.
Thus, at block 1672 of FIG. 16C, a determination is made as to whether the brightness value of the selected pixel is within the tonal region. If not, then the enhanced RGB values from the previously computed luminance enhancement are used for the given pixel, and process 1600 proceeds to block 1678.
At block 1672, if the original luminance value of the selected pixel is within the transition tone region, the RGB separable enhancement 1680 is also calculated for that pixel (block 1674). The enhanced RGB values for these pixels inside the tone region are then calculated by interpolating between the previously calculated luminance enhancement 1650 and the RGB separable enhancement 1680, according to the position of the original luminance value within the transition tone region 1604 (block 1676). The interpolation is calculated using a smoothing function that ranges from "0" at the minimum luminance value 1608 of the transition tone region 1604 to "1" at the maximum luminance value 1606 of the transition tone region 1604. As previously described, the RGB separable enhancement is used alone for values above the maximum luminance level 1606, while the luminance enhancement 1650 is used alone for values below the minimum luminance level 1608.
Preferably, the hybrid enhancement that interpolates between the RGB separable and luminance enhancements is used only when the image has a given amount of noise. For example, when the image has little noise, it is preferable to use the RGB separable enhancement 1680 over the entire tonal range. However, if the amount of noise in the image is above a certain threshold, the hybrid enhancement discussed above is used. If the image has an intermediate amount of noise, both the direct RGB separable enhancement and the hybrid enhancement are computed, and an interpolation between the two determines the resulting enhancement to be used. Once the selected pixel has been enhanced, the process 1600 repeats blocks 1666-1678 for the additional pixels in the original image (block 1678). Finally, a resulting image having the modified luminance values can be generated (block 1674).
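Finally, the blend inside the transition tone region can be sketched as below; the smoothstep weighting is an illustrative choice for the smoothing function that runs from 0 at level 1608 to 1 at level 1606, and the lo/hi values are placeholders.

    import numpy as np

    def smoothstep(x, lo, hi):
        t = np.clip((x - lo) / (hi - lo), 0.0, 1.0)
        return t * t * (3.0 - 2.0 * t)            # 0 at lo, 1 at hi, with matching slopes

    def blended_enhance(lum, lum_enhanced, rgb_enhanced, lo=0.25, hi=0.5):
        # lum_enhanced / rgb_enhanced: results of the luminance and RGB separable enhancements.
        w = smoothstep(lum, lo, hi)[..., None]
        return (1.0 - w) * lum_enhanced + w * rgb_enhanced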
As previously described, a processing device (e.g., a computer, imaging device, camera, etc.) having an operating system may perform the automated RAW image processing methods, services, and techniques disclosed herein. Further, a program storage device readable by a programmable processing device may have instructions stored therein to cause the programmable processing device to perform the automated RAW image processing methods and techniques disclosed herein.
The above description of preferred and other embodiments is not intended to limit or restrict the scope or applicability of the inventive concepts conceived of by the applicants. For example, although the present disclosure focuses on RGB Bayer patterns and RGB color spaces, it should be appreciated that the teachings of the present disclosure are equally applicable to other embodiments of color filter arrays and color spaces. In exchange for disclosing the inventive principles contained herein, the applicants desire all patent rights afforded by the appended claims. Thus, it is intended that the appended claims cover all such modifications and variations as fall within the true scope of the appended claims or their equivalents.
Claims (35)
1. An automated RAW image processing method implementable by a processing device, the method comprising:
receiving a representation of a target RAW image having photosite pixels each indicating a single color value and metadata indicating an imaging device or capture condition associated with the target RAW image;
obtaining predetermined information for RAW image processing based on the metadata, the predetermined information being obtained by processing at least one RAW image previously obtained with the imaging apparatus or the capturing condition;
pre-processing the target RAW image to produce a resultant image defined in a color space, wherein pre-processing comprises: automatically pre-processing the target RAW image based on the predetermined information with an operating system service of the processing device, and wherein pre-processing the target RAW image comprises: creating a reduced-size image from the target RAW image that is interpolated into the color space; and
making the resulting image available to an application executable on the processing device.
2. The method of claim 1, wherein the act of automatically pre-processing comprises: a first image is generated in a first unrendered color space, wherein the first unrendered color space comprises an unrendered RGB color space.
3. The method of claim 2, wherein the act of automatically pre-processing comprises: the second image is generated in a second unrendered color space based on the first image by applying a transformation matrix to transform RGB tristimulus values of the unrendered RGB color space in the first image into XYZ tristimulus values in the second image.
4. The method of claim 3, wherein the transformation matrix is derived from a plurality of feature matrices associated with the predetermined information.
5. The method of claim 3, wherein the act of automatically pre-processing comprises: based on the second image, the result image is generated in the rendered color space.
6. The method of claim 5, wherein the act of generating the result image in the rendered color space comprises: the XYZ tristimulus values in the second image are transformed into RGB tristimulus values of the rendered RGB color space in the result image using a color-adaptive transformation.
7. The method of claim 5, wherein the act of generating the result image comprises:
obtaining a stored tone curve associated with the predetermined information; and
applying the obtained tone curve to each color channel of the second image to produce the resultant image.
8. The method of claim 1, wherein the reduced-size image comprises: a half-size image based on the target RAW image.
9. The method of claim 1, wherein the predetermined information comprises one or more of: black shift information, saturation information, noise amount, bright point amount, luminance information, information for interpolating the RAW image, information for converting from one color space to another color space, and information for adjusting hue.
10. The method of claim 1, wherein the predetermined information is obtained by processing the at least one RAW image previously obtained with the imaging device or the capture conditions to produce at least one predetermined result, the at least one predetermined result including one or more of: the contrast level, saturation level, number of pixels affected by bright spots, number of pixels affected by noise, brightness level, color level, sharpness level, and hue level in the resulting image.
11. The method of claim 1, wherein the capture conditions comprise one or more of: shutter speed, aperture, white balance, exposure compensation, photometry setting, ISO setting, black offset value, properties of the imaging device, and properties of the RAW image.
12. The method of claim 1, wherein the act of automatically pre-processing comprises:
obtaining one or more black offset values from the predetermined information; and
subtracting the one or more black offset values from each photo-site pixel of the target RAW image.
13. The method of claim 1, wherein the act of automatically pre-processing comprises:
scaling each color channel of the target RAW image based on a white balance indicated by the metadata;
calculating a saturation value at which each color channel reaches saturation based on correlation information obtained from the predetermined information, the correlation information correlating the saturation value with an image;
for each color channel of each pixel, analyzing whether the saturation is higher than the calculated saturation value associated with the color channel; and
modifying any pixel of the target RAW image based on the analysis.
14. The method of claim 1, wherein the act of automatically pre-processing comprises:
obtaining a bright spot distribution from the predetermined information;
calculating an estimated value of a bright point in the target RAW image using the capturing condition and the bright point distribution; and
processing the target RAW image for a bright spot based on the estimate.
15. The method of claim 1, wherein the act of automatically pre-processing comprises:
obtaining a noise distribution from the predetermined information;
calculating an estimate of noise of the target RAW image using the capture conditions and the noise distribution; and
preprocessing the target RAW image based on the estimated amount of noise.
16. The method of claim 1, wherein the act of automatically pre-processing comprises:
calculating the average brightness of the target RAW image;
obtaining a brightness variable from the predetermined information based on the imaging device, the average brightness, and the capture condition; and
adjusting the brightness of the target RAW image by the determined brightness variable.
17. The method of claim 1, wherein the act of automatically pre-processing comprises:
determining a first luminance value for each first pixel position of an interpolated image interpolated from the target RAW image;
creating a blurred reduced size image from the target RAW image;
adjusting the blurred reduced size image to a blurred full size image;
determining a second luminance value for each second pixel location of the blurred full-size image; and
scaling a first luminance value in the interpolated image to match a second luminance value in the blurred full size image.
18. A processing device, comprising:
at least one processor;
a memory that stores predetermined information for RAW image processing, the predetermined information being associated with an imaging device or a capturing condition previously used to obtain a RAW image and being obtained by processing a previously obtained RAW image;
a device interface module to receive data associated with a RAW image from an imaging device; and
an operating system executable on the at least one processor, the operating system configured to:
receiving, via the device interface module, a target RAW image and associated metadata obtained from an imaging device, the target RAW image having photosite pixels, each photosite pixel indicating a single color value;
determining the predetermined information based on the metadata;
automatically pre-processing the target RAW image with the determined predetermined information to demosaic the target RAW image and generate a resultant image defined in a color space; and
making the resulting image available to an application executable on the processing device.
19. An automated RAW image processing apparatus implementable by a processing device, the apparatus comprising:
means for receiving a representation of a target RAW image having photo site pixels each indicating a single color value and metadata indicating an imaging device or capture condition associated with the target RAW image;
means for obtaining predetermined information for RAW image processing based on the metadata, the predetermined information being obtained by processing at least one RAW image previously obtained with the imaging apparatus or the capturing condition;
means for preprocessing the target RAW image to produce a resultant image defined in a color space, wherein preprocessing comprises: automatically pre-processing the target RAW image based on the predetermined information using an operating system service of the processing device, and wherein the means for pre-processing the target RAW image comprises: means for creating a reduced-size image from the target RAW image that is interpolated into the color space; and
means for making the resulting image available to an application executable on the processing device.
20. The apparatus of claim 19, wherein the act of automatically pre-processing comprises: a first image is generated in a first unrendered color space, wherein the first unrendered color space comprises an unrendered RGB color space.
21. The apparatus of claim 20, wherein the act of automatically pre-processing comprises: the second image is generated in a second unrendered color space based on the first image by applying a transformation matrix to transform RGB tristimulus values of the unrendered RGB color space in the first image into XYZ tristimulus values in the second image.
22. The apparatus of claim 21, wherein the transformation matrix is derived from a plurality of feature matrices associated with the predetermined information.
23. The apparatus of claim 21, wherein the act of automatically pre-processing comprises: based on the second image, the result image is generated in the rendered color space.
24. The apparatus of claim 23, wherein the act of generating the result image in the rendered color space comprises: the XYZ tristimulus values in the second image are transformed into RGB tristimulus values of the rendered RGB color space in the result image using a color-adaptive transformation.
25. The apparatus of claim 23, wherein the act of generating the result image comprises:
obtaining a stored tone curve associated with the predetermined information; and
applying the obtained tone curve to each color channel of the second image to produce the resultant image.
26. The apparatus of claim 19, wherein the reduced-size image comprises: a half-size image based on the target RAW image.
27. The apparatus of claim 19, wherein the predetermined information comprises one or more of: black shift information, saturation information, noise amount, bright point amount, luminance information, information for interpolating the RAW image, information for converting from one color space to another color space, and information for adjusting hue.
28. The apparatus of claim 19, wherein the predetermined information is obtained by processing the at least one RAW image previously obtained with the imaging device or the capture conditions to produce at least one predetermined result, the at least one predetermined result including one or more of: the contrast level, saturation level, number of pixels affected by bright spots, number of pixels affected by noise, brightness level, color level, sharpness level, and hue level in the resulting image.
29. The apparatus of claim 19, wherein the capture conditions comprise one or more of: shutter speed, aperture, white balance, exposure compensation, photometry setting, ISO setting, black offset value, properties of the imaging device, and properties of the RAW image.
30. The apparatus of claim 19, wherein the act of automatically pre-processing comprises:
obtaining one or more black offset values from the predetermined information; and
subtracting the one or more black offset values from each photo-site pixel of the target RAW image.
31. The apparatus of claim 19, wherein the act of automatically pre-processing comprises:
scaling each color channel of the target RAW image based on a white balance indicated by the metadata;
calculating a saturation value at which each color channel reaches saturation based on correlation information obtained from the predetermined information, the correlation information correlating the saturation value with an image;
for each color channel of each pixel, analyzing whether the saturation is higher than the calculated saturation value associated with the color channel; and
modifying any pixel of the target RAW image based on the analysis.
32. The apparatus of claim 19, wherein the act of automatically pre-processing comprises:
obtaining a bright spot distribution from the predetermined information;
calculating an estimate of bright spots in the target RAW image using the capture conditions and the bright spot distribution; and
processing the target RAW image for bright spots based on the estimate.
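Claim 32 estimates how many photo-sites are likely affected by bright (hot) spots from the capture conditions and a stored distribution, then repairs them. The rate parameterization and the median-based repair below are assumptions; `rate_per_iso_second` is a hypothetical figure standing in for the stored bright-spot distribution.

```python
import numpy as np
from scipy.ndimage import median_filter

def expected_bright_spots(exposure_s: float, iso: float, rate_per_iso_second: float) -> int:
    """Estimate the number of photo-sites affected by bright spots from the capture
    conditions and a hypothetical per-sensor rate."""
    return int(round(rate_per_iso_second * iso * exposure_s))

def repair_bright_spots(raw: np.ndarray, count: int) -> np.ndarray:
    """Replace roughly `count` of the brightest photo-sites with a median-filtered value."""
    out = raw.astype(np.float32)
    if count <= 0:
        return out
    # Indices of the `count` largest values in the frame.
    idx = np.unravel_index(np.argsort(out, axis=None)[-count:], out.shape)
    out[idx] = median_filter(out, size=3)[idx]   # crude neighborhood-based repair
    return out
```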
33. The apparatus of claim 19, wherein the act of automatically pre-processing comprises:
obtaining a noise distribution from the predetermined information;
calculating an estimate of noise of the target RAW image using the capture conditions and the noise distribution; and
pre-processing the target RAW image based on the estimated amount of noise.
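Similarly, claim 33 can be read as looking up an expected noise level for the capture conditions and driving a noise-reduction step with it. The `noise_profile` structure and the Gaussian smoothing below are placeholders for whatever distribution and denoiser the pipeline actually uses.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def estimate_noise_sigma(iso: float, exposure_s: float, noise_profile: dict) -> float:
    """Look up an expected noise level from a stored per-sensor profile.

    `noise_profile` is a hypothetical mapping such as
    {"base_sigma": 1.5, "per_iso": 0.01, "per_second": 0.2}; the claim only states
    that a noise distribution is obtained from the predetermined information.
    """
    return (noise_profile["base_sigma"]
            + noise_profile["per_iso"] * iso
            + noise_profile["per_second"] * exposure_s)

def denoise(raw: np.ndarray, sigma_estimate: float, strength: float = 0.1) -> np.ndarray:
    """Apply a smoothing whose radius grows with the estimated noise (a stand-in
    for whatever noise reduction is actually applied)."""
    return gaussian_filter(raw.astype(np.float32), sigma=strength * sigma_estimate)
```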
34. The apparatus of claim 19, wherein the act of automatically pre-processing comprises:
calculating the average brightness of the target RAW image;
determining a brightness variable from the predetermined information based on the imaging device, the average brightness, and the capture conditions; and
adjusting the brightness of the target RAW image by the determined brightness variable.
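Claim 34's brightness step might look like the sketch below: measure the frame's mean level, bucket it, and look up a gain keyed by camera and capture conditions. The bucket thresholds and the `brightness_table` layout are invented for illustration and assume values normalized to [0, 1].

```python
import numpy as np

def adjust_brightness(raw: np.ndarray, brightness_table: dict, iso: int) -> np.ndarray:
    """Measure the average brightness of the frame, look up a gain for these capture
    conditions, and apply it. The lookup structure here is hypothetical."""
    avg = float(raw.mean())                       # average brightness of the target image
    # e.g. brightness_table = {(100, "dark"): 1.4, (100, "normal"): 1.0, ...}
    bucket = "dark" if avg < 0.25 else "normal" if avg < 0.75 else "bright"
    gain = brightness_table.get((iso, bucket), 1.0)
    return raw * gain
```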
35. The apparatus of claim 19, wherein the act of automatically pre-processing comprises:
determining a first luminance value for each first pixel position of an interpolated image interpolated from the target RAW image;
creating a blurred reduced-size image from the target RAW image;
scaling the blurred reduced-size image up to a blurred full-size image;
determining a second luminance value for each second pixel location of the blurred full-size image; and
scaling a first luminance value in the interpolated image to match a second luminance value in the blurred full-size image.
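Claim 35 scales the luminance of the demosaiced image toward the luminance of a blurred, upscaled copy of the reduced-size image. The sketch below uses Rec. 601 luma weights, a Gaussian blur, and bilinear resampling as stand-ins; the claim itself does not fix any of those choices, and `raw_half` is assumed to be a half-size RGB rendering of the RAW frame (see claim 26).

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])   # Rec. 601 luma weighting (an assumption)

def match_luminance(interpolated: np.ndarray, raw_half: np.ndarray) -> np.ndarray:
    """Scale each pixel of the demosaiced image so its luminance follows a blurred,
    upscaled version of the reduced-size image, per the structure of claim 35."""
    lum_full = interpolated @ LUMA_WEIGHTS                        # first luminance values
    blurred_small = gaussian_filter(raw_half @ LUMA_WEIGHTS, sigma=2.0)
    # Resize the blurred reduced-size luminance up to the full-size grid.
    fy = lum_full.shape[0] / blurred_small.shape[0]
    fx = lum_full.shape[1] / blurred_small.shape[1]
    blurred_full = zoom(blurred_small, (fy, fx), order=1)         # second luminance values
    gain = blurred_full / (lum_full + 1e-6)                       # per-pixel scaling factor
    return interpolated * gain[..., None]
```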
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US82951906P | 2006-10-13 | 2006-10-13 | |
| US60/829,519 | 2006-10-13 | | |
| US11/756,918 US7893975B2 (en) | 2006-10-13 | 2007-06-01 | System and method for processing images using predetermined tone reproduction curves |
| US11/756,918 | 2007-06-01 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1167768A1 (en) | 2012-12-07 |
| HK1167768B (en) | 2016-04-08 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN102420995B (en) | | Automated RAW image processing method and device, and processing device |
| US8493473B2 (en) | | System and method for RAW image processing |
| US7835569B2 (en) | | System and method for raw image processing using conversion matrix interpolated from predetermined camera characterization matrices |
| US9741099B2 (en) | | Systems and methods for local tone mapping |
| US8659675B2 (en) | | Image processing apparatus, image processing method, and program |
| US20130322746A1 (en) | | Systems and methods for ycc image processing |
| US8125533B2 (en) | | Undeveloped image data developing apparatus, method for developing undeveloped image data, and computer program for developing undeveloped image data |
| CN101305397A (en) | | Method for forming an image based on a plurality of image frames, image processing system and digital camera |
| JP2008205691A (en) | | Image processing apparatus and method, recording medium, and program |
| JP2012165204A (en) | | Signal processing apparatus, signal processing method, imaging apparatus, and imaging processing method |
| JP2006310999A (en) | | Image processing apparatus and method, and program |
| CN101273623B (en) | | Camera device, image processing device, and image processing method |
| Le et al. | | Gamutnet: Restoring wide-gamut colors for camera-captured images |
| Brown | | Color processing for digital cameras |
| JP7504629B2 (en) | | Image processing method, image processing apparatus, image processing program, and storage medium |
| Deever et al. | | Digital camera image formation: Processing and storage |
| US8390699B2 (en) | | Opponent color detail enhancement for saturated colors |
| Kao et al. | | Tone reproduction in color imaging systems by histogram equalization of macro edges |
| HK1167768B (en) | | Automated raw image processing method and apparatus, and processing device |
| KR101903428B1 (en) | | System and Method of Color Correction for Related Images |
| JP2019029781A (en) | | Image processing system |