Image Processing System, Method and Computer Program Product
The present invention relates to an image data processing system, method, and computer program product.
There exists a significant quantity of recorded image data such as, for example, video, television, films, CD-ROM images and computer generated images. Typically, once the images have been recorded or generated and then recorded, it is very time consuming and very difficult to give effect to any changes to the images.
It is often the case that a recorded sequence of images may contain artefacts which were not intended to be included within the sequence of images. Alternatively, artefacts may be added to a sequence of images. For example, it may be desirable to add special effects to the images. The removal or addition of such artefacts is referred to generically as post-production processing and involves a significant degree of skill and time.
The quality of images, in terms of both noise and resolution, recorded upon some form of image carrier, such as a film, video tape, cd-rom, video disc, mass storage medium or the like, depends upon, firstly, the quality of the camera used to capture the image and the recording process and, secondly, upon the quality of the image carrier. In order to improve the quality of the images derived from such carriers, complex digital filtering is typically required which is, again, both time consuming and expensive in terms of supporting hardware.
It is therefore an object of the present invention to at least mitigate some of the problems of the prior art.
Accordingly the present invention provides a method for producing a first image from a second image and a third image within an image processing system comprising storage means for storing the first image and first image camera data governing the first orientation (ak, bk, qk) and the first focal length (lk) of the first image, and for storing the second and third images with second and third image camera data corresponding to the second and third orientations (ai, bi, qi; aj, bj, qj) and second and third focal lengths (li, lj) of a camera when the second and third images were captured, the method comprising the steps of setting the first image camera data; and deriving first image data for the first image selectively from at least the second and third images using the first, second and third image camera data.
A second aspect of the present invention provides an image processing system for producing a first image from a second image and a third image, the system comprising means for storing the first image and first image camera data governing the first orientation (ak, bk, qk) of the first image and storing the second and third images with second and third image camera data corresponding to the second and third orientations (ai, bi, qi; aj, bj, qj) and second and third focal lengths (li, lj) of a camera when the second and third images were captured; the system comprising means for setting the first image camera data; and means for deriving first image data for the first image selectively from at least the second and third images using the first, second and third image camera data.
A third aspect of the present invention provides a computer program product for producing a first image from a second image and a third image within an image processing system comprising storage means for storing the first image and first image camera data governing the first orientation (ak, bk, qk) and first focal length (lk) of the first image and for storing the second and third images with second and third image camera data corresponding to the second and third orientations (ai, bi, qi; aj, bj, qj) and second and third focal lengths (li, lj) of a camera when the second and third images were captured, the computer program product comprising computer program code means for setting the first image camera data; and computer program code means for deriving first image data for the first image selectively from at least the second and third images using the first, second and third image camera data.
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings in which: figure 1 illustrates a computer suitable for implementing an image processing system and/or method according to the embodiments of the present invention; figure 2 depicts a sequence of source images from which at least one generated image can be produced; figure 3 shows the production of a first image using image data derived from two source image frames i and j; figure 4 illustrates a frame co-ordinate system for an image; figure 5 shows the mapping of the frame co-ordinate system of figure 4 into a world co-ordinate system; figure 6 shows in terms of a world co-ordinate frame of reference how a vector may intersect several image frames; figure 7 illustrates from another perspective the intersection of a vector with several image frames; figure 8 depicts a frame co-ordinate system and image frame of a virtual image or image to be generated; figure 9 shows the extrapolation of a portion of the image frame of figure 8 onto a source image; figure 10 illustrates a flowchart for mapping the image portion of an image to be generated onto a given source image; figure 11 illustrates the generation of a clean plate; figures 12 to 15 show the source image frames from which a clean plate can be generated; figure 16 illustrates a partially completed clean plate; figure 17 shows a completed clean plate; figure 18 illustrates the propagation of an edit throughout a sequence of images; figure 19 shows a single generated image having a significantly different aspect ratio as compared to that of the source images; figure 20 illustrates the process of removing a scratch from a source image; and figure 21 shows the generation of a clean plate from several source images.
Referring to figure 1 there is shown a computer 100 suitable for implementing embodiments of the present invention. The computer 100 comprises at least one microprocessor 102, for executing computer instructions to process data, and a memory 104, such as ROM and RAM, accessible by the microprocessor via a system bus 106. Mass storage devices 108 are also accessible via the system bus 106 and are used to store, off-line, data used by and instructions for execution by the microprocessor 102. Information is output and displayed to a user via a display device 110 which typically comprises a VDU together with an appropriate graphics controller. Data is input to the computer using at least one of either of the keyboard 112 and associated controller and the mass storage devices 108.
Referring to Figure 2, there is illustrated schematically the concept of the present invention. Data for a first image 200, that is an image to be generated or modified, is produced by selecting portions of image data from at least two images 202 and 204 of a sequence of images 206. Each image 1 to N of the sequence of images 206 and the first image to be generated 200 has associated therewith image camera data 208 to 218. The image camera data 208 to 218 governs the position of the images 1 to N and the first image 200 with reference to, for example, a world co-ordinate frame of reference. The image camera data represents or is related to the focal length of a camera which was used to capture the images 1 to N as well as the orientation of the camera when the images were captured. For any given image, for example image i 202, the associated image camera data comprises ai and bi, which determine the orientation within the world co-ordinate frame of reference of a frame, and hence indirectly the camera, by way of an axis of rotation. qi represents a degree of rotation about that axis relative to a reference within the world co-ordinate system. The focal length li of image i 202 governs the distance from the origin, which coincides with the optical centre of the camera, to the centre of the image frame i 202.
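The per-image camera data described above can be sketched as a simple record; the field names and sample values below are illustrative assumptions, not taken from the application:

```python
from dataclasses import dataclass

@dataclass
class CameraData:
    # a and b fix the orientation of the axis of rotation within the
    # world co-ordinate frame of reference; q is the degree of rotation
    # about that axis relative to a world reference.
    a: float
    b: float
    q: float
    # distance from the optical centre (the origin) to the image centre
    focal_length: float

# hypothetical camera data for one image of the sequence
frame_i = CameraData(a=0.10, b=0.05, q=0.20, focal_length=50.0)
print(frame_i.focal_length)
```

In this sketch every stored image, including the image to be generated, would carry one such record alongside its pixel data.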
Referring to Figure 3, there is shown, in greater detail, the derivation of the first image, image k 200, from second and third images, image i 202 and image j 204. The majority of image k 200 is derived from image i, that is to say, image k is identical to image i but for a first image portion 300. The first image portion 300 is derived from a corresponding portion 302 of the second image 204. Therefore, rather than the first image 200 containing the image data contained within a correspondingly located portion 304 of the second image 202, the image data for first image portion 300 is derived from the image data contained within portion 302 of the third image.
Each image 1 to N of the sequence of images 206 is stored with reference to a corresponding image frame co-ordinate system such as that shown in Figure 4. The image frame co-ordinate system 400, in the present embodiment, comprises three mutually orthogonal axes 402, 404 and 406 which correspond to the x, y and z axes in a right handed Cartesian co-ordinate system respectively. An image 408 is centred on the z-axis 406 at a distance l, which represents the focal length of the camera at the time when the image 408 was captured, generated or retrieved.
With reference to Figure 5, there is shown the image 408 of Figure 4 and the corresponding image frame co-ordinate system when that co-ordinate system has been mapped into, for example, a world co-ordinate system 500 which also comprises three mutually orthogonal axes 502, 504 and 506 forming a right handed Cartesian co-ordinate system. A pixel Vp within image 408 is mapped from its position within the image frame co-ordinate system to a corresponding position, Vp', within the world co-ordinate system by a suitable matrix S. Therefore, the position, in terms of the world co-ordinate system 500, of any pixel within the image 408 can be determined by multiplying the image frame reference co-ordinates of that pixel by the matrix S.
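The mapping by the matrix S can be sketched as a plain 3x3 matrix-vector product; the matrix and pixel position below are hypothetical stand-ins for a real orientation matrix and frame co-ordinates:

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (given as a list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

# An illustrative matrix S (a quarter turn about z); a real S would encode
# the frame's orientation within the world co-ordinate system.
S = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
vp = [12.0, -7.0, 50.0]    # pixel in frame co-ordinates; z is the focal length
vp_world = mat_vec(S, vp)  # Vp' = S . Vp
print(vp_world)            # [7.0, 12.0, 50.0]
```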
There is shown schematically in Figure 6 the sequence of images 1 to N of Figure 2 when positioned within the world co-ordinate system 500. It can be seen that a vector 602, which may represent the line of sight of a virtual camera (not shown), passes through several images of the sequence of images 206. The point of intersection 604 is illustrated as a single pixel on image frame number 74. It will be appreciated that the frame co-ordinates of the point of intersection of the line of sight 602 of the camera within each of the image frames will vary between image frames. Figure 7 shows a sequence of consecutive but non-contiguous images 700 having identified within each of the images 702 to 708 a corresponding portion 710 of the images as the camera which captured the images 702 to 708 panned about a vertical axis from left to right. It can be seen from Figure 7 that the position of the point of intersection, within the images, varies as between image frames. It can be seen that the location of the corresponding portion 710 traverses the images. Therefore, the position, in terms of frame co-ordinates, of the corresponding portion 710 will also change.
The procedure for determining or selectively obtaining image data of a first image to be generated will now be described. Referring to Figure 8, there is shown a first image to be generated or modified 800 as positioned within a corresponding frame co-ordinate system 802 comprising three mutually orthogonal axes 804, 806 and 808 which form a right handed Cartesian co-ordinate system. The focal length lk of the image 800 is shown along the z axis. A portion 810 of the image 800 to be generated is selected. The portion 810 may correspond to a single pixel of the image 800 or to a group of pixels. The location, or co-ordinates, of the portion 810 with respect to the frame co-ordinate system 802 is determined. The co-ordinates of the portion 810 are mapped, using an appropriate transformation, into corresponding world co-ordinates of the world co-ordinate system illustrated in Figure 5.
A determination or selection from the sequence of images 206 is then made in order to establish those images from which image data for the first portion 810 can be derived. Each of the source images 1 to N also has associated therewith a matrix R1 to RN which transforms that source image from corresponding frame co-ordinates into world co-ordinates, as well as the inverse of such a matrix, R1^-1 to RN^-1. The inverse matrices map the images 1 to N from the world co-ordinate system 500 into the corresponding frame co-ordinate systems.
Assume that image i has been selected as a source image from which image data for the first image portion can be derived. The location, Vp', of the first image portion 810 of the first image within the world co-ordinate system is transformed into the frame co-ordinate system for frame i using the inverse matrix Ri^-1 as follows: Vp'' = Ri^-1 Vp'. The result of the mapping is illustrated in Figure 9. It can be seen from Figure 9 that the location Vp'' of the first portion 810 has fallen short of image i. It is therefore necessary to project the mapped first portion 810 onto image i, that is, it is necessary to determine the corresponding location within image i of the first portion 810. The projection of the first portion 810 onto image i identifies image data 900 of image i which can be used in order to derive the image data of the first image portion 810.
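The inverse mapping and the subsequent projection onto image i can be sketched as follows; the matrix and co-ordinates are hypothetical, and a real inverse matrix would come from the stored camera data:

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def project_to_plane(p, focal_length):
    """Scale p = (x'', y'', z'') along the ray from the origin so that it
    lies on the image plane z = focal_length."""
    s = focal_length / p[2]
    return [p[0] * s, p[1] * s, focal_length]

R_i_inv = [[1.0, 0.0, 0.0],   # identity, purely for illustration
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
vp_world = [10.0, 20.0, 25.0]
vp_frame_i = mat_vec(R_i_inv, vp_world)         # Vp'' = Ri^-1 . Vp'
projection = project_to_plane(vp_frame_i, 50.0)
print(projection)  # [20.0, 40.0, 50.0]: the point pushed out to z = 50
```

Here the mapped point falls short of the plane (z'' = 25 against a focal length of 50), so it is scaled outward along its ray, as described for Figure 9.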
Referring to Figure 10, there is shown schematically a flow chart which implements the mapping of a first image portion 810 of a first image onto a second image portion 900 of a second image, for example image i, in order to determine second image data of the second image portion 900 from which first image data can be derived for the first image.
It will be appreciated from Figures 6 and 7 that any orientation of the co-ordinate system shown in Figure 8 within the world co-ordinate system may map the first image portion onto several images from which image data can be derived in order to produce the first image data for the first portion 810. Therefore, there will be several projections of the first image portion onto prospective images from which data can be derived. In such a case, the steps 1006 to 1004 of Figure 10 are repeated for all or a selectable portion of the eligible source images 206. An eligible source image is an image which is intersected by the vector, or the extrapolation thereof, produced by the co-ordinates of the first image portion 810 in terms of corresponding frame co-ordinates.
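The eligibility test described above can be sketched as follows: a source image is eligible if the ray through the first image portion, expressed in that image's frame co-ordinates, meets the image plane within the frame bounds. The frame dimensions below are hypothetical:

```python
def is_eligible(p_frame, focal_length, width, height):
    """True if the ray from the origin through p_frame intersects the
    image plane z = focal_length inside a frame of the given size."""
    if p_frame[2] <= 0.0:          # the image plane lies at positive z
        return False
    s = focal_length / p_frame[2]
    x, y = p_frame[0] * s, p_frame[1] * s
    return abs(x) <= width / 2.0 and abs(y) <= height / 2.0

print(is_eligible([10.0, 5.0, 25.0], 50.0, 100.0, 80.0))  # inside the frame
print(is_eligible([60.0, 5.0, 25.0], 50.0, 100.0, 80.0))  # misses the frame
```

Running this test against every source image yields the set of eligible images from which the first image data may then be derived.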
There are various embodiments or combinations of embodiments which can be used in order to determine or derive the image data for the first image portion 810 from the eligible source images. One embodiment determines the distance of each of the projections 900 of the first portion 810 onto the eligible source frames 206 and selects image data according to the distances.
For any given point or portion having co-ordinates P'' = (x'', y'', z''), given in terms of a frame co-ordinate system, the location of the point of intersection, or the projection of that point onto the image, is given by scaling P'' onto the image plane at the focal length l, that is,

P''' = (l/z'') (x'', y'', z'') = (l.x''/z'', l.y''/z'', l).

The distance of the projections 900 of the first image portion 810 onto the eligible source images from the corresponding centres of the eligible source images is determined. The image data of the projection 900 which is closest to its corresponding image frame centre is used as the basis for deriving first image data for the first image portion 810. In an alternative embodiment, the image data of the projections 900 of the first image portion 810 onto the eligible source frames 206 is averaged and that average value is used to determine or derive the first image data for the first image portion 810. In a preferred embodiment, the above calculated average is weighted according to the distance of any given projection from its corresponding image frame centre.
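The two selection strategies, nearest-to-centre and the distance-weighted average, can be sketched as follows. The candidate values are hypothetical colour samples, and the 1/(1 + distance) weighting is one plausible choice, not one specified by the application:

```python
import math

def centre_distance(proj):
    """Distance of a projection (x, y) from its image frame centre."""
    return math.hypot(proj[0], proj[1])

def nearest_to_centre(candidates):
    """candidates: ((x, y), value) pairs, one per eligible source image."""
    return min(candidates, key=lambda c: centre_distance(c[0]))[1]

def weighted_average(candidates):
    """Average the values, weighting projections near the centre more."""
    weights = [1.0 / (1.0 + centre_distance(p)) for p, _ in candidates]
    values = [v for _, v in candidates]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

cands = [((3.0, 4.0), 100.0),   # 5 units from its frame centre
         ((0.0, 1.0), 80.0)]    # 1 unit from its frame centre
print(nearest_to_centre(cands))  # 80.0
```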
The above process of selecting an image portion 810 of a first image to be generated and the determination of appropriate image data from eligible or given source images is repeated until a complete first image is formed.
Typically, the image data retrieved from the projections 900 represents the colour or RGB component at the location of the projections within the eligible source images. The colour data may be derived or sampled from an eligible source image using, for example, a bilinear interpolation technique or some other more sophisticated sampling technique. By sampling images and interpolating, the resolution of the generated image can be increased. This has the consequence of improving the image quality, that is, an increase in the resolution of the generated image as compared to the eligible source images can be realised. Furthermore, the noise of images which is attributable to the grain of, for example, a film medium or a video tape used to record the source images can be reduced by generating images using the present invention. Still further, the noise introduced during image capture can be reduced if the data for any given pixel, or portion of the image to be generated, is derived from corresponding portions of several images.
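Bilinear interpolation, offered above as one possible sampling technique, can be sketched as follows for a single-channel image; the tiny image is purely illustrative:

```python
def bilinear_sample(image, x, y):
    """Sample a 2D list of values at a non-integer (x, y) by blending the
    four surrounding pixels; image is indexed image[row][col]."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1.0 - fx) + image[y0][x0 + 1] * fx
    bottom = image[y0 + 1][x0] * (1.0 - fx) + image[y0 + 1][x0 + 1] * fx
    return top * (1.0 - fy) + bottom * fy

img = [[0.0, 10.0],
       [20.0, 30.0]]
print(bilinear_sample(img, 0.5, 0.5))  # 15.0, the blend of all four pixels
```

Because the projection of an image portion rarely lands exactly on a pixel centre, such sub-pixel sampling is what allows the generated image's resolution to exceed that of any single source image.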
Applications of the present invention to the processing of a sequence of images will now be described.
Clean plate generation

As mentioned above, it is often the case that a scene which contains only background information, that is, a scene which is completely free of actors, is required in order to produce special effects.
Referring to Figure 11, there is shown an image 1100 which has been generated from a plurality of eligible source images in which the image data has been selected from the projections which were closest to the centres of the corresponding eligible source images. It can be seen from within the defined area 1102 that this image data selection strategy results in a smearing or blurring of foreground or moving images. It will be appreciated that the source image sequence from which the image 1100 was derived illustrates a person walking past the steps of an entrance to a house. These individual eligible source images, or at least a selection of the individual eligible source images from which the image 1100 in Figure 11 can be derived, are shown in Figures 12, 13, 14 and 15. Referring more particularly to Figure 12, there is shown a region 1202 from which background image data can be derived in order to remove the smearing which is depicted in Figure 11 within box 1102. The image data derived from the portion of Figure 12 defined by the region 1202 will be sufficient to remove the smearing defined within box 1104 of image 1100, that is, the three rightmost images of the person can be removed by incorporating into the image 1100 the background image data contained within or derived from the identified region 1202 of Figure 12. Similarly, the leftmost person depicted in Figure 11 can be removed by copying into that region of Figure 11 the image data defined or contained within, for example, the region 1502 of Figure 15. Therefore, by firstly combining the image data of Figures 12, 13, 14 and 15 and then by combining the resultant image 1100 shown in Figure 11 with selected portions of the source images, an image 1600 as shown in Figure 16 can be produced.
The image 1600 shown in Figure 16 represents a clean plate, that is, an image which contains only background image data. Figure 17 represents the clean plate 1600 which results from the above processing without the defining box 1102.
Object removal, object placement
Referring to Figure 18 there is shown a sequence of four consecutive but non-contiguous source images 1800 to 1806. A selected portion of one frame 1800 is edited in order to remove, for example, the eagle 1808 using either an appropriate tool for editing or a combination of selectable portions of source image 1800 and, for example, source image 1806 in substantially the same manner as defined above with reference to the clean plate generation process. The edited frame 1810 is utilised in conjunction with the remaining source images 1802 to 1806 in order to generate new images 1812 to 1818. For each of the generated images 1812 to 1818 there is a defined region 1820 to 1826 in which the enclosed image data is derived from a corresponding portion of the edited source frame 1810. It can therefore be seen that the eagle 1808 has been removed from the source frame 1800 and that removal has been propagated throughout subsequently generated images 1812 to 1818. Similarly, a source frame could be edited to include some new artefact and that new artefact can be propagated readily throughout subsequently generated images in substantially the same manner as described above with reference to object removal.
Different camera formats
Using the present invention images can be generated which have a different camera format as compared to the format in which the source images were captured. Referring to Figure 18, again, there are shown four consecutive but non-contiguous source images 1800 to 1806 which have been recorded in a particular format, for example, having a particular aspect ratio. Referring to Figure 19, there is shown a single generated image 1900 having a significantly different aspect ratio as compared to that of the source images of Figure 18. The generated image 1900 of Figure 19 has been derived from several source images. In this way the aspect ratio of a generated image can be set. The smearing depicted in Figure 11 is also apparent in Figure 19. The image data for the area of the larger aspect ratio image 1900 contained within the box 1904 can be derived from at least one of any of the source images 1800 to 1806.
Therefore, the larger aspect ratio image comprises the background of several source images together with the foreground or action derived from at least one of the source images. The above derivation can be repeated in order to produce a plurality of larger aspect ratio source images or a sequence of larger aspect ratio images.
Scratch/blemish removal
Referring to Figure 20 there is shown a sequence of consecutive but non-contiguous source frames 2002 to 2006 in which one of the source images 2004 comprises a scratch 2008. An image frame 2010 is generated in which the image data within a pre-defined area 2012 of the source frame is derived from a corresponding area of, for example, a source image 2006 in substantially the same manner as described above. Therefore, the scratch 2008 can be removed from the image, thereby improving the quality of the images.
Clearing the streets

Referring to Figure 21, there are shown four consecutive but non-contiguous frames 2100 to 2106, each containing a defined region 2108 which, over a significant number of source images, would have substantially constant image data.
The exception to the substantially constant image data arises when a moving object passes through the region 2108.
An image or clean plate 2110 can be generated from all or a selectable subset of the source images as follows. For each pixel within the image 2110 to be generated, corresponding image data is derived or extracted from all or a selectable subset of the eligible source images. It will be appreciated that the derived image data for those regions of each source image which do not comprise moving images will remain substantially constant. The image data derived from the source images for any given pixel or portion of the image to be generated is arranged within, for example, a histogram according to, for example, predeterminable bands of the colour information. The histogram is used to determine the most frequently occurring colour or predeterminable range of colours within the corresponding portions of the source frames. That most frequently occurring colour is then used for the appropriate portion of the generated image 2110. In this way an image or clean plate 2110 is generated which is completely free of any moving or changing image data.
Clean sequence generation
It will be appreciated that repeated application of the above clean plate generation process or the "Clearing the streets" process can be utilised in order to produce a sequence of generated images which contain only background information. Such a clean sequence is again very useful for compositing and rotoscoping techniques when producing special effects for a film.
Although specific instances of the application of the above invention have been given, it will be appreciated that the present invention is not limited thereto. An image processing system can be realised in which any combination or all of the above features are provided.
The term captured refers to the generation or recording of digital images. It will therefore be appreciated that a CCD camcorder captures digital images and stores the captured images on a suitable tape. It will also be appreciated that capturing an image includes the generation of a digital image within or by a computer or other image/data processing environment.
The images which are generated as a consequence of the present invention may be output together with the image camera data for further processing such as, for example, incorporation within a virtual reality environment or an environment which combines computer generated images with the generated images. Similarly, camera data taken from a computer animation environment, together with the corresponding images, may be utilised in order to combine the animated images with the captured images.
The setting of the image camera data for an image includes retrieving the image camera data for an existing image or generating image camera data or retrieving image camera data from some storage medium or receiving image camera data via some transmission medium.
In certain situations the first image to be produced may in fact be the second image, that is, the first image and the second image are one and the same. In such a situation the term "producing an image" includes modifying that image.
The derivation of image data for an image from at least one other image, preferably two images, can take place either substantially concurrently or sequentially. Hence, although image data may be derived from several source images, each source image may be processed one at a time.
A suitable transformation for mapping data between images or for locating data within an image may be the rotation given below:

T = f.R

where T is the overall transformation, R is a rotation matrix and f is the projection onto the image plane, given, for a point r = (x, y, z), by

x' = x.λ/(r.z)
y' = y.λ/(r.z)
z' = λ

where r.z denotes the z component of r and λ represents the focal length of an image. R is given by

R =
[ cosθ + ux^2(1 - cosθ)      ux.uy(1 - cosθ) - uz.sinθ   ux.uz(1 - cosθ) + uy.sinθ ]
[ uy.ux(1 - cosθ) + uz.sinθ  cosθ + uy^2(1 - cosθ)       uy.uz(1 - cosθ) - ux.sinθ ]
[ uz.ux(1 - cosθ) - uy.sinθ  uz.uy(1 - cosθ) + ux.sinθ   cosθ + uz^2(1 - cosθ) ]

where u = (ux, uy, uz) represents the orientation of an arbitrary axis with respect to a reference frame and θ represents a degree of rotation about the arbitrary axis u with respect to a reference, for example a common reference or another image frame. The components of u, that is ux, uy and uz, are derived from the orientation angles α and β of the arbitrary axis within a world co-ordinate frame of reference. Therefore, ux = sin α.cos β, uy = sin β and uz = cos α.cos β, where α is the angle between the projection of u onto the xz plane and the z axis, and β is the angle between that projection and u.
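The rotation described above can be sketched as follows: the axis u is built from the angles α and β, and R is the standard axis-angle rotation matrix for the angle θ. The sample angles are illustrative:

```python
import math

def axis_from_angles(alpha, beta):
    """Unit axis u = (ux, uy, uz) from the orientation angles alpha, beta."""
    return (math.sin(alpha) * math.cos(beta),
            math.sin(beta),
            math.cos(alpha) * math.cos(beta))

def rotation_matrix(alpha, beta, theta):
    """Axis-angle rotation by theta about the axis given by alpha, beta."""
    ux, uy, uz = axis_from_angles(alpha, beta)
    c, s, t = math.cos(theta), math.sin(theta), 1.0 - math.cos(theta)
    return [[c + ux * ux * t,      ux * uy * t - uz * s, ux * uz * t + uy * s],
            [uy * ux * t + uz * s, c + uy * uy * t,      uy * uz * t - ux * s],
            [uz * ux * t - uy * s, uz * uy * t + ux * s, c + uz * uz * t]]

# With alpha = beta = 0 the axis is the z axis, so a quarter turn maps
# the x axis onto the y axis.
R = rotation_matrix(0.0, 0.0, math.pi / 2.0)
```

Note that u is a unit vector by construction, since sin²α.cos²β + sin²β + cos²α.cos²β = 1.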
The rotation may represent the mapping between a source image and a target image or may represent the positions of the source and target images within a frame of reference relative to some common reference. The rotation is used to identify the locations within source and target images of corresponding image data, that is, image data which represents the same object or part of an object or the same background feature.
The image camera data may be derived from a sequence of images using the invention which is the subject of UK patent application no. GB 9721592.5, the content of which is incorporated herein by reference for all purposes and a copy of which is included herein in the appendix.
annex
Image processing system, method and computer program product
The present invention relates to an image processing system, method, and computer program product.
There exists a significant quantity of recorded image data such as, for example, video, television, films, CD-ROM images and computer generated images. Typically, once the images have been recorded or generated and then recorded, it is very time consuming and very difficult to give effect to any changes to the images.
It is often the case that a recorded sequence of images may contain artefacts which were not intended to be included within the sequence of images. Alternatively, artefacts may be added to a sequence of images. For example, it may be desirable to add special effects to the images. The removal or addition of such artefacts is referred to generically as post-production processing and involves a significant degree of skill and time.
Producing a mosaic of or modifying images which have already been recorded is a very difficult task. The degree of effort required to modify or mosaic images or a sequence of images can be reduced if information is available which relates to at least one of the orientation or the position of a camera at the time when the images were produced. Deriving image camera data from a sequence of images is disclosed in, for example, "Creating Full View Panoramic Image Mosaics and Environment Maps", Szeliski R. , Shum, HY, Computer Graphics Proceedings, Annual Conference Series, pp 251-258, 1997. However, the processing required by the techniques disclosed within the above paper are very complex and require a significant degree of numerically intensive processing in order to solve equations containing eight unknowns .
Often, special effects may be added to an image or other edits may be made either digitally or manually. The placement of such special effects typically relies upon the judgment of the artist or operator giving effect to the edits. Clearly, there is scope for some error in the placement of the special effects as they are made to a clean plate .
It is therefore an object of the present invention to at least mitigate some of the problems of the prior art.
Accordingly the present invention provides a method for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing said first and second images, said method comprising the steps of storing in said memory the first and second images; and deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.
Advantageously, the camera image data for each image within a sequence thereof can be determined. The determined camera image data can then be used in order to give effect to further image processing such as, for example, producing a mosaic of selected images .
A second aspect of the present invention provides an image processing system for determining camera image data for at least one of first and second images of a sequence of images, the system comprising memory for storing said first and second images, means for storing in said memory the first and second images; and means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images .
It will be appreciated that the means referred to above and in the claims can preferably be realised using a
computer such as that schematically depicted in figure 1 and suitable program code.
Accordingly, a third aspect of the present invention provides a computer program product for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing said first and second images, the product comprising computer program code means for storing in said memory the first and second images; and computer program code means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images .
Embodiments of the present invention will now be described, by way of example only, with reference to the accompanying drawings in which: figure 1 illustrates a computer suitable for implementing an image processing system or method according to the embodiments of the present invention; figure 2 depicts a sequence of source images from which at least one generated image can be produced; figure 3 shows a world co-ordinate system having an image i positioned therein according to corresponding image camera data αi, βi, θi and λi; figure 4 illustrates several equations utilised by the present invention; figure 5 shows a flow chart for deriving camera image data αi, βi, θ', Λ = ln(λ) and μ; and figure 6 illustrates a further equation utilised by the present invention.
Referring to figure 1 there is shown a computer 100 suitable for implementing embodiments of the present invention. The computer 100 comprises at least one microprocessor 102, for executing computer instructions to process data, a memory 104, such as ROM and RAM, accessible by the microprocessor via a system bus 106. Mass storage
devices 108 are also accessible via the system bus 106 and are used to store, off-line, data used by and instructions for execution by the microprocessor 102. Information is output and displayed to a user via a display device 110 which typically comprises a VDU together with an appropriate graphics controller. Data is input to the computer using at least one of the keyboard 112 and associated controller and the mass storage devices 108.
Referring to Figure 2, there is illustrated schematically several images 1 to N of a sequence of images 200. The images may be derived from a film, video tape, CD-ROM, video disc or other mass storage medium. Typically, the image data varies between images as a consequence of a combination of motion of an object within an image and as a consequence of, for example, variations in the position or orientation of a camera during image capture. Typically, once the images have been recorded, the camera orientation information is lost.
Referring to figure 3 there is shown schematically a world co-ordinate system 300 comprising three axes 302, 306 and 308 which correspond to the x, y, and z axes respectively. The position of one image, image j, with reference to another image, image i, can be described in terms of the orientation of a unit vector, u, within the world co-ordinate system in conjunction with a degree of rotation, θi, about that vector. The orientation of the unit vector within the world co-ordinate system 300 can be specified using two angles, αi and βi. The combination of αi, βi and θi defines a rotation which maps image i onto image j or which can be used to identify corresponding portions of images i and j. The distance λi represents the distance from the origin to the centre of the image i. Similarly, the distance λj represents the distance from the origin to the centre of image j. Conceptually, the camera is positioned such that the optical centre thereof is as close as possible to the origin. The line of sight 308 of the camera is
normal to the centre of the image i, which represents in reality the field of view of the camera at a focal length determined by λi.
It will be appreciated that the camera image data αi, βi, θi and λi is not available when retrieving images from some mass storage media such as, for example, a video disc or CD-ROM or the like.
The relationship between a point Q = (x, y, z), in terms of world co-ordinates, within an image frame or field of view at a first orientation of the camera and the same point Q' = (x', y', z') within the field of view at a second orientation of the camera is given by the equations shown in figure 4, where ux, uy and uz are the components of the unit vector u, θ is the degree of rotation about the unit vector, Qr = (xr, yr, zr) represents the rotation of the point Q about the unit vector u and x', y' and z' represent the location, in terms of world co-ordinates, of the projection of point Qr onto the second field of view. It will be appreciated that f is the scaling required after rotation of Q by R to project the rotated point onto the field of view at the second orientation.
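By way of illustration only, the rotation and projection described above may be sketched as follows. The spherical parameterisation of the unit vector u by the angles α and β, and the use of the standard Rodrigues rotation formula, are assumptions made for this sketch; the authoritative equations are those shown in figure 4.

```python
import numpy as np

def unit_vector(alpha, beta):
    """Unit rotation axis u derived from the two orientation angles
    (an assumed spherical parameterisation)."""
    return np.array([np.cos(beta) * np.cos(alpha),
                     np.cos(beta) * np.sin(alpha),
                     np.sin(beta)])

def rotate_about_axis(q, u, theta):
    """Rodrigues rotation of point Q about unit axis u by angle theta,
    giving the rotated point Qr."""
    return (q * np.cos(theta)
            + np.cross(u, q) * np.sin(theta)
            + u * np.dot(u, q) * (1.0 - np.cos(theta)))

def project(qr, focal_length):
    """Scale the rotated point Qr by the factor f so that it lies on
    the second field of view at the given focal length along z."""
    return qr * (focal_length / qr[2])
```

For example, rotating the point (1, 0, 0) by 90 degrees about the z axis yields (0, 1, 0).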
Referring to figure 5 there is shown a flow chart for deriving camera image data αi, βi, θ' and Λ = ln(λ), describing the orientation within a world co-ordinate system of a camera which captured each of the images in the sequence of images shown in figure 2, wherein θ' decouples θ and λ using the equation shown in figure 6.
Essentially, the determination of the above parameters utilises a function which compares the colour differences between pixels of any two given images. The first image of the two images is addressed using conventional pixel row and column address information. The second image of the two images is addressed using co-ordinates which have been transformed
using the matrix T as shown in figure 4. The second image is preferably addressed using sub-pixels so that colour interpolation can be performed. The function which is used to measure the colour difference and hence derive optimal values for αi, βi, θ' and Λ is as follows:
ΔC = (1/N) Σ ((δr)² + (δg)² + (δb)²) for all N
where δr = ri − rj, δg = gi − gj, δb = bi − bj represent the differences between the RGB colour components of any given pixels i and j, and
N is the number of pixels for which the transformed coordinates fall inside the image frame boundary of the second image.
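The colour difference measure ΔC defined above may be rendered, purely by way of illustration, as follows; the pixel pairs whose transformed co-ordinates fall outside the second image's frame boundary are assumed to have been excluded already, so N is simply the number of pairs supplied.

```python
def colour_difference(pixels_i, pixels_j):
    """Mean squared RGB difference ΔC between corresponding pixel
    pairs. Each argument is a list of (r, g, b) tuples taken from
    images i and j respectively."""
    n = len(pixels_i)
    total = 0.0
    for (ri, gi, bi), (rj, gj, bj) in zip(pixels_i, pixels_j):
        # (δr)² + (δg)² + (δb)² for this pixel pair.
        total += (ri - rj) ** 2 + (gi - gj) ** 2 + (bi - bj) ** 2
    return total / n
```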
The minimisation process preferably uses the simplex method as is well known within the art (see, for example, "Numerical Recipes in C: The Art of Scientific Computing", second edition, by Press, Teukolsky, Vetterling and Flannery, CUP, ISBN 0 521 43108 5, the entire content of which is incorporated herein by reference).
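A minimal sketch of the downhill simplex (Nelder-Mead) method follows, with the standard reflection, expansion, contraction and shrink steps; the coefficients and fixed iteration count are illustrative assumptions for this sketch rather than the precise algorithm of the cited reference.

```python
import numpy as np

def nelder_mead(f, x0, steps=200, step=0.5):
    """Minimal downhill simplex minimiser: f maps a parameter vector
    (here, the camera parameters) to a scalar such as ΔC; x0 is the
    initial estimate. Returns the best vertex found."""
    n = len(x0)
    # Initial simplex: x0 plus n points offset along each axis.
    simplex = [np.array(x0, float)]
    for i in range(n):
        p = np.array(x0, float)
        p[i] += step
        simplex.append(p)
    for _ in range(steps):
        simplex.sort(key=f)
        centroid = np.mean(simplex[:-1], axis=0)
        worst = simplex[-1]
        # Reflect the worst point through the centroid of the rest.
        reflected = centroid + (centroid - worst)
        if f(reflected) < f(simplex[0]):
            # Try expanding further in the same direction.
            expanded = centroid + 2.0 * (centroid - worst)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            # Contract toward the centroid.
            contracted = centroid + 0.5 * (worst - centroid)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink the whole simplex toward the best point.
                simplex = [simplex[0]] + [simplex[0] + 0.5 * (p - simplex[0])
                                          for p in simplex[1:]]
    simplex.sort(key=f)
    return simplex[0]
```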
Referring more particularly to figure 5, two images i and j are selected at step 500 from a sequence of images
200. A portion or sample, P, of pixels from the first image i of the two images is selected at step 502. The sample P preferably comprises 2000 pixels which are expressed in terms of pixel rows and columns.
The initial estimates of the camera image data parameters are set at step 504. Preferably, at least three of the parameters αi, βi, θ', Λ and μ should be initialised. The initialised parameters are used to establish a first estimate of the matrix T illustrated in figure 4.
At step 506, the pixels of the sample P are transformed, using the current estimate of matrix T, into
the second image to produce a second set of sample pixels P'.
The two images i and j are compared at steps 508 and 510 to determine the degree of similarity or alignment therebetween. Preferably, effect is given to the comparison by extracting image data, such as, for example, RGB colour data, from the pixels at sample points P in image i and at sample points P' in image j and determining the differences between the colour data. Still more preferably, the differences are evaluated using the function given above for calculating ΔC.
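The sub-pixel colour extraction at the transformed sample points P' could, for example, use bilinear interpolation; the description specifies only that colour interpolation is performed, so the following is one possible sketch for a single colour channel.

```python
def bilinear_sample(image, x, y):
    """Sample one colour channel of `image` (a 2-D list of floats
    indexed [row][col]) at the sub-pixel position (x, y) = (col, row)
    by bilinear interpolation of the four surrounding pixels."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    # Interpolate along the rows, then between the two rows.
    top = image[y0][x0] * (1 - fx) + image[y0][x0 + 1] * fx
    bottom = image[y0 + 1][x0] * (1 - fx) + image[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy
```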
A determination is made at step 512 as to whether or not a minimum value of ΔC has been reached. If not, processing continues at step 514 in which the estimates or current values of αi, βi, θ', Λ and μ are altered in accordance with the simplex method and processing continues again at step 506. If a minimum has been reached, the current estimates for αi, βi, θ', Λ and μ are stored for the pair of images i and j at step 516. This defines the relationship between corresponding portions of images i and j.
A determination is made at step 518 as to whether or not all images of the sequence of images have been processed. If not, processing continues at step 500 with the selection of another two images from the sequence of images. If all source images have been processed, a subset of images or eligible images is selected from the sequence of images according to predeterminable criteria at step 520. Preferably, eligible source images are those pairs of consecutive images in which the relative motion, that is distance, therebetween is at least 0.2 in terms of frame co-ordinates and for which there are no closer pairs of frames in the sequence. The estimate of the distance between selected images utilises the estimated values of α, β, θ, λ and μ.
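The selection of eligible pairs at step 520 may be sketched as follows; for brevity this sketch applies only the 0.2 motion threshold and omits the further condition that no closer pair of frames exists, so it is a simplification of the stated criteria.

```python
def select_eligible_pairs(distances, threshold=0.2):
    """Given distances[i] = estimated relative motion, in frame
    co-ordinates, between consecutive images i and i+1, return the
    indices of the pairs whose motion is at least `threshold`."""
    return [i for i, d in enumerate(distances) if d >= threshold]
```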
The re-iteration of steps 500 to 518 for selected or
all pairs of eligible images is commenced at step 530. Once it is determined that all eligible images have been processed, average values for Λ and μ are determined at step
540. Preferably, the averages are weighted according to the reciprocals of the final colour difference ΔC between the pairs of images.
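The weighted averaging at step 540 may be sketched as follows, with the weight of each pair's estimate taken as the reciprocal of that pair's final ΔC so that well-aligned pairs contribute more.

```python
def weighted_average(values, delta_cs):
    """Average of per-pair estimates (e.g. of Λ or μ), weighted by
    the reciprocal of each pair's final colour difference ΔC."""
    weights = [1.0 / dc for dc in delta_cs]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```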
Once estimates for Λ and μ have been obtained, these values are fixed and a re-iteration of steps 500 to 518 is repeated for all or a selected portion of the images in the sequence of images using the fixed values of Λ and μ in order to produce refined estimates of α, β, and θ at step 550.
After the above iteration of the processing shown in the flow chart, each image pair has a set of camera image data governing the orientation, focal length and lens distortion of the camera at the instant the pair of images were captured.
For each pair of images a (3x3) relative rotation matrix, Rij, is calculated using the estimated values of α, β, θ and λ. A relative rotation matrix governs the relative rotation between the two images. The orientation of the first or a reference image of the sequence of images is selected and the relative rotation matrices are used to produce a set of matrices of each image relative to the first or reference image by combining or compositing the relative rotation matrices.
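The compositing of consecutive relative rotation matrices into rotations relative to the reference image can be sketched as follows; the multiplication order assumes column-vector conventions, which is an assumption of this sketch.

```python
import numpy as np

def absolute_rotations(relative):
    """Composite a chain of relative rotation matrices R(i, i+1) into
    rotations of every image relative to the first (reference) image,
    whose rotation is the identity."""
    absolute = [np.eye(3)]
    for r in relative:
        # Each image's rotation is the next relative rotation applied
        # on top of the previous image's accumulated rotation.
        absolute.append(r @ absolute[-1])
    return absolute
```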
The derived rotation matrices or the relative rotation matrices can be used to define the absolute or relative positions in a world co-ordinate system of corresponding images relative to a reference image or relative to an arbitrary common reference.
Preferably, once the relative rotation matrices Rij have been derived for each pair of images i and j, a set of non-contiguous images, which are preferably substantially evenly distributed throughout the sequence of images, is selected such that there is some overlap between pairs of the selected images. For each consecutive pair of selected images i and j an estimate of the camera image data, α, β, θ, λ and μ, is determined as above to produce rotation matrices Sij. The rotation matrix Rij between any two images i and j is calculated from the relative rotation matrices R of the images which lie between images i and j in the sequence. The difference between Sij and Rij represents an error matrix Eij such that Sij = Eij·Rij. Therefore, Sij·Rij⁻¹ = Eij. The inverse matrix Rij⁻¹ is calculated by negating the value of θ in the matrix Rij. The matrix Eij, that is the error, is then divided or spread across the relative rotation matrices for all or a selected number of the images which are between the two images i and j. Assuming that the error is to be spread across n images, the error Eij is expressed in terms of α, β and θ and a rotation defined by α, β and θ/n is applied to each relative rotation matrix for all images between images i and j. The value of n may represent all images which are between the two images i and j, or n may represent a selected number of the images between the two images i and j.
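The error-spreading step may be sketched as follows. The relationship S = E·R gives E = S·R⁻¹ (for a rotation matrix the inverse is its transpose, which is equivalent to negating θ); E is converted to axis-angle form, the angle is divided by n, and the fractional rotation is applied to each of the n intermediate relative matrices. The axis-angle conversion routines below are standard, but the exact conventions are assumptions of this sketch.

```python
import numpy as np

def rotation_matrix(u, theta):
    """Rodrigues formula: rotation by theta about unit axis u."""
    k = np.array([[0.0, -u[2], u[1]],
                  [u[2], 0.0, -u[0]],
                  [-u[1], u[0], 0.0]])
    return np.eye(3) + np.sin(theta) * k + (1.0 - np.cos(theta)) * (k @ k)

def spread_error(s_ij, r_ij, relative_mats):
    """Spread the error E = S·R⁻¹ between the direct estimate S_ij and
    the composited estimate R_ij across the n intermediate relative
    rotation matrices, applying a rotation of theta/n to each."""
    n = len(relative_mats)
    e = s_ij @ r_ij.T                 # R⁻¹ = Rᵀ for a rotation matrix
    # Axis-angle of E: angle from the trace, axis from the skew part.
    theta = np.arccos(np.clip((np.trace(e) - 1.0) / 2.0, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return relative_mats          # no error to spread
    axis = np.array([e[2, 1] - e[1, 2],
                     e[0, 2] - e[2, 0],
                     e[1, 0] - e[0, 1]]) / (2.0 * np.sin(theta))
    frac = rotation_matrix(axis, theta / n)
    return [frac @ m for m in relative_mats]
```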
The resulting matrices can then be used for further image processing, such as producing a mosaic of images, clean plate generation, improving the image quality or resolution and the like.
Image capture includes capture of images using cameras such as, for example, charge coupled devices, or computer generated images within, for example, a virtual reality context or computer animation context using a virtual camera.
1. A method for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing said first and second images, said method comprising the steps of storing in said memory the first and second images; and deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.
2. A method as claimed in claim 1, wherein the step of deriving comprises the steps of iteratively comparing selectable portions of the first and second images; and producing, in response to the comparison, a first estimate of the camera image data.
3. A method as claimed in claim 2, wherein the camera image data comprises a plurality of parameters, and the method further comprises the steps of producing a fixed estimate of at least a selectable one, preferably, two, of the parameters; and re-iterating the steps of deriving, producing and comparing in order to produce a refined estimate of the camera image data.
4. A method as claimed in claim 3, wherein the step of producing comprises the steps of repeating the step of deriving for all or a selected portion of the images in the sequence of images, preferably, the selected portion of images which satisfy predeterminable criteria; and calculating an average value, preferably a weighted average value, of said at least one of the parameters.
5. A method as claimed in claim 4, wherein the predeterminable criteria relates to the distance between the
centres of selected pairs of all or the selected portion of images.
6. A method as claimed in any preceding claim, further comprising the steps of selecting a subset of images from the sequence of images which satisfy further predeterminable criteria; and re-iterating the step of deriving using the subset of images as the sequence of images from which the first and second images are selected to produce for each pair or for selected pairs of images subset camera image data.
7. A method as claimed in claim 6, further comprising the steps of determining a set of error values from the subset camera image data; and refining the camera image data for the first and second images.
8. A method as claimed in claim 7, wherein the step of refining the camera image data for the first and second images comprises the step of dividing the error between any images which fall between the first and second within the sequence of images.
9. A method substantially as described herein with reference to, and/or as illustrated in, the accompanying drawings.
10. An image processing system for determining camera image data for at least one of first and second images of a sequence of images, the system comprising memory for storing said first and second images; means for storing in said memory the first and second images; and means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.
11. A system as claimed in claim 10, wherein the means for deriving comprises means for iteratively comparing selectable portions of the first and second images; and producing, in response to the comparison, a first estimate of the camera image data.
12. A system as claimed in claim 11, wherein the camera image data comprises a plurality of parameters, and the system further comprises means for producing a fixed estimate of at least a selectable one, preferably, two, of the parameters; and
means for re-iterating the steps of deriving, producing and comparing in order to produce a refined estimate of the camera image data.
13. A system as claimed in claim 12, wherein the means for producing comprises means for repeating the step of deriving for all or a selected portion of the images in the sequence of images, preferably, the selected portion of images which satisfy predeterminable criteria; and means for calculating an average value, preferably a weighted average value, of said at least one of the parameters.
14. A system as claimed in claim 13, wherein the predeterminable criteria relates to the distance between the centres of selected pairs of all or the selected portion of images.
15. A system as claimed in any of claims 10 to 14, further comprising means for selecting a subset of images from the sequence of images which satisfy further predeterminable criteria; and means for re-iterating the step of deriving using the subset of images as the sequence of images from which the first and second images are selected to produce for each
pair or for selected pairs of images subset camera image data.
16. A system as claimed in claim 15, further comprising means for determining a set of error values from the subset camera image data; and means for refining the camera image data for the first and second images.
17. A system as claimed in claim 16, wherein the means for refining the camera image data for the first and second images comprises
means for dividing the error between any images which fall between the first and second within the sequence of images.
18. An image processing system substantially as described herein with reference to, and/or as illustrated in, the accompanying drawings.
19. A computer program product for determining camera image data for at least one of first and second images of a sequence of images within an image processing system comprising memory for storing the first and second images,
the product comprising computer program code means for storing in said memory the first and second images; and computer program code means for deriving, as a consequence of a comparison between the first and second images, camera image data for at least one of the first and second images.
20. A product as claimed in claim 19, wherein the computer program code means for deriving comprises computer program
code means for iteratively comparing selectable portions of the first and second images; and producing, in response to the comparison, a first estimate of the camera image data.
21. A product as claimed in claim 20, wherein the camera image data comprises a plurality of parameters, and the product further comprises computer program code means for producing a fixed estimate of at least a selectable one, preferably, two, of the parameters; and computer program code means for re-iterating the steps of deriving, producing and comparing in order to produce a refined estimate of the camera image data.
22. A product as claimed in claim 21, wherein the computer program code means for producing comprises computer program code means for repeating the step of deriving for all or a selected portion of the images in the sequence of images, preferably, the selected portion of images which satisfy predeterminable criteria; and computer program code means for calculating an average value, preferably a weighted average value, of said at least one of the parameters.
23. A product as claimed in claim 22, wherein the predeterminable criteria relates to the distance between the centres of selected pairs of all or the selected portion of images.
24. A product as claimed in any of claims 19 to 23, further comprising means for selecting a subset of images from the sequence of images which satisfy further predeterminable criteria; and means for re-iterating the step of deriving using the subset of images as the sequence of images from which the first and second images are selected to produce for each pair or for selected pairs of images subset camera image data.
25. A product as claimed in claim 24, further comprising computer program code means for determining a set of
error values from the subset camera image data; and computer program code means for refining the camera image data for the first and second images.
26. A product as claimed in claim 25, wherein the computer program code means for refining the camera image data for the first and second images comprises computer program code means for dividing the error between any images which fall between the first and second within the sequence of images.
27. A computer program product substantially as described herein with reference to, and/or as illustrated in, the accompanying drawings.
Abstract
Image Processing System, Method and Computer Program Product
The present invention relates to an image processing system, method and computer program product for processing or generating camera image data from at least two given source images by iteratively estimating the parameters constituting the camera image data and using a predeterminable comparison between the first and second images as the basis for determining whether or not to continue said iterative estimating.