Multiple View Geometry
Richard Hartley and Andrew Zisserman
CVPR June 1999

Part I: Single and Two View Geometry

The main points covered in this part are:

• A perspective (central) projection camera is represented by a $3 \times 4$ matrix.
• The most general perspective transformation between two planes (a world plane and the image plane, or two image planes induced by a world plane) is a plane projective transformation. This can be computed from the correspondence of four (or more) points.
• The epipolar geometry between two views is represented by the fundamental matrix. This can be computed from the correspondence of seven (or more) points.

Imaging Geometry

Perspective projection

[Figure: perspective projection of a scene point X onto the image plane at focal distance f from the camera centre O, giving the image point x.]

$$\lambda \begin{pmatrix} x \\ y \\ f \end{pmatrix} = \begin{pmatrix} X \\ Y \\ Z \end{pmatrix}, \qquad \text{where } \lambda = Z/f .$$

This can be written as a linear mapping between homogeneous coordinates (the equation holds only up to a scale factor):

$$\begin{pmatrix} x \\ y \\ f \end{pmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

where a $3 \times 4$ projection matrix represents a map from 3D to 2D.

Image Coordinate System

Internal camera parameters

[Figure: image axes (x, y) and camera axes (x_cam, y_cam), with the principal point p at (x_0, y_0).]

$$k_x \, x_{cam} = x - x_0, \qquad k_y \, y_{cam} = y - y_0$$

where the units of $k_x$, $k_y$ are [pixels/length].

$$\mathbf{x} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \frac{1}{f} \begin{bmatrix} \alpha_x & & x_0 \\ & \alpha_y & y_0 \\ & & 1 \end{bmatrix} \begin{pmatrix} x_{cam} \\ y_{cam} \\ f \end{pmatrix} = \frac{1}{f} \, K \begin{pmatrix} x_{cam} \\ y_{cam} \\ f \end{pmatrix}$$

where $\alpha_x = f k_x$, $\alpha_y = f k_y$.

Camera Calibration Matrix

K is a $3 \times 3$ upper triangular matrix, called the camera calibration matrix:

$$K = \begin{bmatrix} \alpha_x & & x_0 \\ & \alpha_y & y_0 \\ & & 1 \end{bmatrix}$$

• There are four parameters:
  (i) The scaling in the image x and y directions, $\alpha_x$ and $\alpha_y$.
  (ii) The principal point $(x_0, y_0)$, which is the point where the optic axis intersects the image plane.
• The aspect ratio is $\alpha_y / \alpha_x$.
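The internal-parameter mapping above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and all numeric values (focal length, pixel density, principal point) are hypothetical and do not come from the slides. It builds K from $\alpha_x = f k_x$, $\alpha_y = f k_y$ and maps a point given in the camera frame to pixel coordinates.

```python
def calibration_matrix(alpha_x, alpha_y, x0, y0):
    """Return the 3x3 upper triangular calibration matrix K as nested lists."""
    return [[alpha_x, 0.0, x0],
            [0.0, alpha_y, y0],
            [0.0, 0.0, 1.0]]


def project_camera_point(K, point_cam):
    """Map a point (X, Y, Z) in the camera frame to pixel coordinates (x, y)."""
    # Homogeneous image point: K applied to (X, Y, Z).
    h = [sum(K[i][j] * point_cam[j] for j in range(3)) for i in range(3)]
    # Divide out the scale factor (the third homogeneous coordinate).
    return h[0] / h[2], h[1] / h[2]


# Hypothetical values, chosen only for illustration.
f = 0.05                 # focal length in metres
k_x = k_y = 20000.0      # pixels per metre
K = calibration_matrix(f * k_x, f * k_y, 320.0, 240.0)

x, y = project_camera_point(K, (0.2, -0.1, 2.0))
print(x, y)
```

Dividing by the third homogeneous coordinate is what makes the linear map agree with $x = \alpha_x X / Z + x_0$ and $y = \alpha_y Y / Z + y_0$.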
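World Coordinate System

External camera parameters

[Figure: world axes (X, Y, Z) and camera axes (X_cam, Y_cam, Z_cam) with origin O_cam, related by R, t.]

$$\begin{pmatrix} X_{cam} \\ Y_{cam} \\ Z_{cam} \\ 1 \end{pmatrix} = \begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

Euclidean transformation between world and camera coordinates

• R is a $3 \times 3$ rotation matrix
• t is a $3 \times 1$ translation vector

Concatenating the three matrices,

$$\mathbf{x} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = K \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix} = K \, [R \mid t] \, \mathbf{X}$$

which defines the $3 \times 4$ projection matrix from Euclidean 3-space to an image as

$$\mathbf{x} = P \mathbf{X}, \qquad P = K \, [R \mid t] = K R \, [I \mid R^\top t]$$

Note, the camera centre is at $(X, Y, Z)^\top = -R^\top t$.

In the following it is often only the $3 \times 4$ form of P that is important, rather than its decomposition.

A Projective Camera

The camera model for perspective projection is a linear map between homogeneous point coordinates:

$$\underbrace{\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}}_{\text{image point}} = P_{(3 \times 4)} \underbrace{\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}}_{\text{scene point}}, \qquad \mathbf{x} = P \mathbf{X}$$

• The camera centre is the null-vector of P, e.g. if $P = [I \mid 0]$ then the centre is $\mathbf{X} = (0, 0, 0, 1)^\top$.
• P has 11 degrees of freedom (essential parameters).
• P has rank 3.

What does calibration give?

• K provides the transformation between an image point and a ray in Euclidean 3-space.
• Once K is known the camera is termed calibrated.
• A calibrated camera is a direction sensor, able to measure the direction of rays, like a 2D protractor.

$$\mathbf{x} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{bmatrix} \alpha_x & & x_0 \\ & \alpha_y & y_0 \\ & & 1 \end{bmatrix} \begin{pmatrix} X_{cam} \\ Y_{cam} \\ Z_{cam} \end{pmatrix} = K \mathbf{d}$$

Angle between rays

[Figure: rays d_1 and d_2 from the camera centre C through the image points x_1 and x_2, separated by the angle θ.]

$$\cos\theta = \frac{\mathbf{d}_1 \cdot \mathbf{d}_2}{(\mathbf{d}_1 \cdot \mathbf{d}_1)^{1/2} \, (\mathbf{d}_2 \cdot \mathbf{d}_2)^{1/2}}$$

$$\cos\theta = \frac{\mathbf{d}_1^\top \mathbf{d}_2}{(\mathbf{d}_1^\top \mathbf{d}_1)^{1/2} (\mathbf{d}_2^\top \mathbf{d}_2)^{1/2}} = \frac{\mathbf{x}_1^\top (K^{-\top} K^{-1}) \, \mathbf{x}_2}{(\mathbf{x}_1^\top K^{-\top} K^{-1} \mathbf{x}_1)^{1/2} (\mathbf{x}_2^\top K^{-\top} K^{-1} \mathbf{x}_2)^{1/2}} = \frac{\mathbf{x}_1^\top \omega \, \mathbf{x}_2}{(\mathbf{x}_1^\top \omega \, \mathbf{x}_1)^{1/2} (\mathbf{x}_2^\top \omega \, \mathbf{x}_2)^{1/2}}$$

where $\omega = (K K^\top)^{-1}$.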
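As a rough numerical check of the relations above, the following Python sketch assembles $P = K[R \mid t]$, recovers the camera centre as $-R^\top t$, and measures the angle between two rays using $\omega = (KK^\top)^{-1}$. The helper names, the zero-skew closed-form inverse of K, and all numeric values are assumptions made for illustration and are not taken from the slides.

```python
import math


def matmul(A, B):
    """Multiply two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def projection_matrix(K, R, t):
    """Assemble the 3x4 matrix P = K [R | t]."""
    Rt = [list(R[i]) + [t[i]] for i in range(3)]
    return matmul(K, Rt)


def camera_centre(R, t):
    """Camera centre in world coordinates, C = -R^T t."""
    RT = transpose(R)
    return [-sum(RT[i][j] * t[j] for j in range(3)) for i in range(3)]


def inverse_calibration(K):
    """Closed-form inverse of a zero-skew upper triangular K."""
    ax, ay, x0, y0 = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1.0 / ax, 0.0, -x0 / ax],
            [0.0, 1.0 / ay, -y0 / ay],
            [0.0, 0.0, 1.0]]


def ray_angle(K, x1, x2):
    """Angle (radians) between the rays through homogeneous image points x1, x2."""
    Kinv = inverse_calibration(K)
    omega = matmul(transpose(Kinv), Kinv)  # K^{-T} K^{-1} = (K K^T)^{-1}

    def form(a, b):
        return sum(a[i] * omega[i][j] * b[j] for i in range(3) for j in range(3))

    cos_theta = form(x1, x2) / math.sqrt(form(x1, x1) * form(x2, x2))
    return math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp against round-off


# Hypothetical camera: 90 degree rotation about Z, arbitrary translation.
K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 2.0, 3.0]

P = projection_matrix(K, R, t)
C = camera_centre(R, t)
# The homogeneous centre (C, 1) should be mapped to the zero vector by P.
residual = [sum(P[i][j] * (C + [1.0])[j] for j in range(4)) for i in range(3)]
print(C, residual)
print(math.degrees(ray_angle(K, [320.0, 240.0, 1.0], [1320.0, 240.0, 1.0])))
```

The residual check illustrates the null-vector property of the camera centre, and the angle is computed directly from image points without forming the ray directions explicitly.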