[48] Koji Wakimoto, Mitsuhide Shima, Satoshi Tanaka, and Akira Maeda. An intelligent user interface to an image database using a (cid:12)gure interpretation method. In 9th Int. Conference on Pattern Recognition, volume 2, pages 516{991,1990. 27 shape. SPIE 1993 International Symposium on Electronic Imaging: Science & Technology, Conference 1908, Storage and Retrieval for Image and Video Databases, February 1993. [35] J. Nievergelt, H. Hinterberger, and K.C. Sevcik. The grid (cid:12)le: an adaptable, symmetric multikey (cid:12)le structure. ACM TODS, 9(1):38{71,March 1984. [36] NSF. Workshop on Visual Information Management Systems, Redwood, CA, February 1992. Chair Prof. R. Jain, Co-Chair W. Niblack. [37] Michael Otterman. Approximate matching with high dimensionality r-trees. M.Sc. scholarly paper, Dept. of Computer Science, Univ. of Maryland, CollegePark, MD,1992. supervised by C. Faloutsos. [38] WilliamK. Pratt. Digital Image Processing. John Wileyand Sons, Inc, New York, NY,second edition, 1991. [39] C. R. Rao. Linear Statistical Inference and Its Applications. Wiley Series In Probability and Mathematical Statistics. John Wiley & Sons, New York, second edition, 1973. [40] G.SaltonandM.J.McGill.Introduction to Modern InformationRetrieval. McGraw-Hill,1983. [41] H. Samet. The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1989. [42] D.G. Severance and G.M. Lohman. Di(cid:11)erential (cid:12)les: Their application to the maintenance of large databases. ACM TODS, 1(3):256{267,September 1976. [43] Dennis Shasha and Tsong-Li Wang. New techniques for best-match retrieval. ACM TOIS, 8(2):140{158,April 1990. [44] SPIE. Storage and Retrieval for Image and Video Databases, San Jose, CA, 1993. Vol. 1908. [45] Michael J. Swain and Dana H. Ballard. Color indexing. International Journal of Computer Vision, 7(1):11{32,1991. [46] Hideyuki Tamura, Shunji Mori, and Takashi Yamawaki. Texture features corresponding to visual perception. IEEE Transactions on Systems, Man, and Cybernetics,SMC-8(6):460{473, 1978. [47] SatoshiTanaka,MitsuhideShima,Jun'ichiShibayama,andAkiraMaeda.Retrievalmethodfor an imagedatabasebased ontopologicalstructure. In Applications of Digital Image Processing, volume 1153, pages 318{327. SPIE, 1989. 26 [22] M. A. Ireton and C. S. Xydeas. Classi(cid:12)cation of shape for content retrieval of images in a multimedia database. In Sixth International Conference on Digital Processing of Signals in Communications, pages 111 { 116, Loughborough, UK, 2-6 Sept., 1990. IEE. [23] H. V. Jagadish. Spatial search with polyhedra. Proc. Sixth IEEE Int'l Conf. on Data Engi- neering, February 1990. [24] H. V. Jagadish. A retrieval technique for similar shapes. In International Conference on Management of Data, SIGMOD 91, pages 208{217, Denver, CO, May 1991. ACM. [25] T. Kato, T. Kurita, H. Shimogaki, T. Mizutori, and K. Fujimura. A cognitive approach to visual interaction. In International Conference of Multimedia Information Systems, MIS'91, pages 109{120. ACM and National University of Singapore, January 1991. [26] Yehezkel Lamdan and Haim J. Wolfson. Geometric hashing: A general and e(cid:14)cient model- basedrecognitionscheme. In2nd InternationalConferenceon Computer Vision(ICCV),pages 238{249, Tampa, Florida, 1988. IEEE. [27] Suh-Yin Lee and Fang-Jung Hsu. 2d c-string: A new spatial knowledge representation for image database systems. Pattern Recognition, 23(10):1077{1087,1990. [28] Suh-Yin Lee and Fang-Jung Hsu. Spatialreasoning and similarityretrieval of images using 2d c-string knowledge representation. Pattern Recognition, 25(3):305{318,1992. [29] M.D. McIlroy. Development of a spelling list. IEEE Trans. on Communications, COM- 30(1):91{99,January 1982. [30] Rajiv Mehrotra and William I. Grosky. Shape matching utilizing indexed hypotheses genera- tion and testing. IEEE Transactions on Robotics and Automation, 5(1):70{77,1989. [31] David Mumford. The problem with robust shape descriptions. In First International Confer- ence on Computer Vision, pages 602{606,London, England, June 1987. IEEE. [32] David Mumford. Mathematical theories of shape: Do they model perception ? In Geometric Methods in Computer Vision, volume 1570, pages 2{10. SPIE, 1991. [33] A. Desai Narasimhalu and Stavros Christodoulakis. Multimedia information systems: the unfolding of a reality. IEEE Computer, 24(10):6{8,October 1991. [34] W.Niblack,R.Barber,W.Equitz,M.Flickner,E.Glasman,D.Petkovic,P.Yanker,C.Falout- sos, and G. Taubin. The QBIC project: Querying images by content using color, texture, and 25 [9] Zen Chen and Shinn-Ying Ho. Computer vision for robust 3d aircraft recognition with fast library search. Pattern Recognition, 24(5):375{390,1991. [10] S. Christodoulakis, M. Theodoridou, F. Ho, M. Papa, and A. Pathria. Multimedia document presentation,informationextractionand documentformationinminos: amodeland asystem. ACM TOOIS, 4(4), October 1986. [11] R. Duda and P Hart. Pattern Classi(cid:12)cation and Scene Analysis. Wiley, New York, 1973. [12] W. Equitz. Retrieving images from a database using texture | algorithms from the QBIC system. Research report, IBM Almaden Research Center, San Jose, CA, 1993. [13] W. Equitz, W. Niblack, and R. Barber. Retrieving images from a database using color | algorithms from the QBIC system. Research report, IBM Almaden Research Center, San Jose, CA, 1993. [14] C. Faloutsos. Signature-based text retrieval methods: a survey. IEEE Data Engineering, 13(1):25{32,March 1990. [15] EdwardA.Fox.Advancesininteractivedigitalmultimediasystems.IEEEComputer,24(10):9{ 21, October 1991. [16] K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, second edition, 1990. [17] William I. Grosky, Peter Neo, and Rajiv Mehrotra. A pictorial index mechanism for model- based matching. Data and Knowledge Engineering, 8:309{327,1992. [18] A. Guttman. R-trees: a dynamic index structure for spatial searching. Proc. ACM SIGMOD, pages 47{57, June 1984. [19] KyojiHirataandToshikazuKato.Querybyvisualexample.InAdvancesinDatabaseTechonol- ogy EDBT '92, Third International Conference on Extending Database Technology, Vienna, Austria, March 1992. Springer-Verlag. [20] G.M.Hunter and K. Steiglitz. Operations on imagesusing quad trees. IEEE Trans. on PAMI, PAMI-1(2):145{153,April 1979. [21] MikihiroIoka. A method of de(cid:12)ning the similarityof images on the basis of color information. Technical report RT-0030, IBM Tokyo Research Lab, 1989. 24 so it is clear that to minimize z~tA~z~ we must choose the smallest of the candidate (cid:21)'s. In other words, we must choose (cid:21)= (cid:21)1, the smallest eigenvalue, which tells us that min z~tA~z~= (cid:21)1C: z~:z~tW~z~=C Now,forany z~, z~tW~ z~takeson some value ((cid:21)0), and when thatvalue is strictlygreater than 0 we use this value as our C, and by the above we know that z~tA~z~(cid:21) (cid:21)1z~tW~ z~: Whenz~tW~ z~= 0,werelyonA~beingpositivesemi-de(cid:12)niteandthe inequalitystillholds. Consequently, 2 2 dhist (cid:21) (cid:21)1davg; where (cid:21)1 is the smallest eigenvalue of the generalized eigenvalue problem A~z~= (cid:21)W~ z~. References [1] ACM SIGIR. Proceedings of International Conference on Multimedia Information Systems, Singapore, 1991. [2] Franz Aurenhammer. Voronoi diagrams- a survey of a fundamental geometricdata structure. ACM Computing Surveys, 23(3):345{405,September 1991. [3] D. Ballard and C. Brown. Computer Vision. Prentice Hall, 1982. [4] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The r*-tree: an e(cid:14)cient and robust access method for points and rectangles. ACM SIGMOD, pages 322{331, May 1990. [5] ElizabethBinaghi,IsabellaGagliardi,andRaimondoSchettini. Indexingandfuzzy logic-based retrievalof colorimages. In Visual Database Systems, II, IFIP TransactionsA-7, pages 79{92. Elsevier Science Publishers, 1992. [6] W. E. Blanz, D. Petkovic, and J. L. Sanz. Algorithms and Architectures for Machine Vision. 1989. [7] C. C. Chang and S. Y. Lee. Retrieval of similar pictures on pictorial databases. Pattern Recognition, 24(7):675{680,1991. [8] Chin-Chen Chang and Tzong-Chen Wu. Retrieving the most similar symbolic pictures from pictorial databases. Information Processing and Management, 28(5):581{588,1992. 23 and a~ij = aij (cid:0)aiN (cid:0)aNj +aNN: (32) Now, if we let W = VVt, and de(cid:12)ne W~ analogously to A~, we have 2 t dhist = ~z A~z (33) = z~tA~z~ (34) 2 t davg = ~zVV ~z (35) t = ~z W~z (36) = z~tW~ z~; (37) and we are ready for the following theorem: Theorem 7.2 With dhist and davg de(cid:12)ned as above, and with A~ positive semi-de(cid:12)nite, we know that 2 2 dhist (cid:21) (cid:21)1davg; where (cid:21)1 is the minimum eigenvalue of the generalized eigenvalue problem A~z~= (cid:21)W~ z~: Proof: Wewillshowthatz~tA~z~(cid:21) (cid:21)1z~tW~ z~,(asimilarrelationisposedasexercise22.1in [39]). For the sake of simplicity,we will assume A is symmetric (with real eigenvalues), since a non-symmetric A can be decomposed to a symmetric part and an asymmetric part, with the asymmetric part contributing nothing to the quadratic form. For any C > 0, we set up the constrained minimization problem min z~tA~z~: z~:z~tW~z~=C Use Langrange multipliers to convert this to the unconstrained minimization problem (cid:16) (cid:17) minz~tA~z~(cid:0)(cid:21) z~tW~ z~(cid:0)C : z Setting the derivatives equal to zero yields A~z~= (cid:21)W~ z~; so a necessary condition for minimizing the function is that (cid:21) and z~ be generalized eigenvalues and eigenvectors for the problem A~z~ = (cid:21)W~ z~. Now from this necessary condition we can see that when the function is minimized z~tA~z~ = (cid:21)z~tW~ z~ = (cid:21)C; 22 2 3 A1 A2 (cid:1)(cid:1)(cid:1) AN 6 7 t 6 7 V = 6 B1 B2 (cid:1)(cid:1)(cid:1) BN 7 (17) 4 5 C1 C2 (cid:1)(cid:1)(cid:1) CN = (~c1;~c2;:::;~cN); (18) average color distance: t ~xavg = V ~x (19) t ~yavg = V ~y (20) 2 t davg = (~xavg (cid:0)~yavg) (~xavg (cid:0)~yavg) (21) t t t = (V ~z) (V ~z) (22) t t = ~z VV ~z (23) and histogram distance: aij = similarity between color i and color j (24) 2 t dhist = ~z A~z: (25) To prove the bound, (cid:12)rst change the N (cid:2) N problem to an (N (cid:0) 1) (cid:2) (N (cid:0) 1) problem, P removing the constraint that zi = 0. De(cid:12)ne z~as the truncated version of ~z, 2 3 z1 6 7 6 7 6 z2 7 z~= 66 . 77; (26) 6 .. 7 4 5 zN(cid:0)1 and de(cid:12)ne AN(cid:0)1;~a(cid:3)N;~atN(cid:3), and~1 appropriately so that you can expand ~ztA~z as: 2 32 3 h i ~ztA~z = z~t zN 4 AN(cid:0)1 ~a(cid:3)N 54 z~ 5 (27) t ~aN(cid:3) aNN zN 2 32 3 h i AN(cid:0)1 ~a(cid:3)N z~ = z~t (cid:0)z~t~1 4 54 5 (28) ~atN(cid:3) aNN (cid:0)~1tz~ = z~tAN(cid:0)1z~(cid:0)z~t~a(cid:3)N~1tz~(cid:0)z~t~1~atN(cid:3)z~+z~t~1aNN~1tz~ (29) h i = z~t AN(cid:0)1(cid:0)~a(cid:3)N~1t(cid:0)~1~atN(cid:3)+aNN~1~1t z~: (30) Now we can de(cid:12)ne the (N (cid:0)1)(cid:2)(N (cid:0)1) matrix A~ such that ~ztA~z = z~tA~z~, with h i A~= AN(cid:0)1(cid:0)~a(cid:3)N~1t(cid:0)~1~atN(cid:3)+aNN~1~1t ; (31) 21 database indexing tools work together. The result of this joint e(cid:11)ort is an operational prototype that is both e(cid:11)ective and e(cid:14)cient: the features and the similarity functions exhibit very good normalized recall, and the response time is much better than straightforward alternatives and is expected to scale-up well for even larger databases, thanks to state-of-the-art multikey indexing. For our application, we have identi(cid:12)ed two stumbling blocks that real feature and distance functions often present: (a) the distance function involves cross talk among features and therefore is not Euclidean, and (b) the dimensionality of the feature space is high. We proposed solutions to each of these by using transformations that introduce false hits, but no false dismissals. This allowsus to use fast multidimensionalindexing methods, by paying a small post-processing cost to eliminate the false hits. From a practical point of view, our experiments on real databases showed the following: (cid:15) for the Karhunen Loeve (KL) transform, a good cut-o(cid:11) value m for practical cases may be very low { in our case it was 2. That is, the (cid:12)rst 2 features of the KL transform are enough for indexing purposes. (cid:15) byusingthe \QuadraticDistanceBounding" theorem,wereplaced adistancefunctionwhich wasquadraticinthe featuredimension withaconstanttime(cid:12)lteringfunction, followedby se- lectiveapplicationoftheoriginaldistancefunction. Forlargecolorhistograms,thistechnique drops the CPU time, making the process I/O bound again. Acknowledgments: We would like to thank Jim Hafner and Harpreet Sawhney for their help with the proof of the QDB theorem. Appendix | Proof of Color Distance Theorem De(cid:12)ne image histogram vectors: X ~x = (x1;:::;xN); 0 (cid:20) xi (cid:20) 1; xi = 1 (13) i X ~y = (y1;:::;yN); 0(cid:20) yi (cid:20) 1; yi = 1 (14) i X ~z = ~x(cid:0)~y; (cid:0)1 (cid:20) zi (cid:20)1; zi = 0; (15) i color values (for example, A=Red, B=Green, C=Blue): 2 3 Ai 6 7 6 7 ~ci = 6 Bi 7 (16) 4 5 Ci 20 The second experiment determines how our method scales with database size. In this exper- iment we took a random subset of the \Large" database. Again we inserted the truncated features (cid:3) into an m-dimensional R -tree and the KL transformed version of the features into another m- (cid:3) dimensional R -tree. 1041 range queries were made with tolerance (cid:15)=200. Figure 4 plots the I/O operations for the naive approach and the KL approach for several values of m. 2000 1800 1600 ’m=2’ d ’m=4’ e ’m=6’ av 1400 s O I/ 1200 f o r 1000 e b m nu 800 e g ra 600 e v A 400 200 0 2000 3000 4000 5000 6000 7000 8000 9000 10000 11000 Database size Figure 4: Average Saved Disk I/O's per query, vs. database size N Asshowninthegraph,byKLtransformingthedatawegainalinearadvantageinthenumber of I/O's per query. Also, as m increases the slope of our linear advantage decreases. This can be explained by the fact that as more features are used the index becomes larger and more expensive to traverse while there is a diminishing amount of information in the remaining features. 7 Conclusions Largeonlineimagecollectionsarebecomingmoreandmorecommon,andnovelmethodstomanage, organize, and retrieve images from these collections need to be developed. The goal of our QBIC project was to provide e(cid:11)ective and e(cid:14)cient query by image content on very large image database. The focus of this paper is the design and implementation of such a system, coupling a good set of features and similarity functions from machine vision, with fast indexing methods from the databasearea. Themajorcontributionistheintroductionoftechniquestomakemachinevisionand 19 (cid:3) dimensions and inserted them into an m-dimensional R -tree. We then performed range queries against the databases. For the small database the queries consisted of the database itself. For the large database the queries were 1041 random records in the database. Since the CPU time was negligible, compared to the I/O time we present number of disk accesses in our graphs. Figure 3 showsthe numberofdiskaccessesasafunctionofm,the numberofretainedfeatures. Thisnumber (cid:3) includes both the accesses for the R -tree pages plus the random accesses todo the post processing to eliminate false hits. The tolerance (cid:15) was set to 200, resulting in an average of 22 hits/query for the large database and 2 hits/query for the small database. The conclusions are: (cid:15) The optimal m is (cid:25)2. This is in good agreement with the heuristic presented (Eq. 12). In both databases, 75% of the energy is in the (cid:12)rst two components. For larger tolerances the optimal is not as well de(cid:12)ned, but generally there is a (cid:13)at region for m between 2 and 8. (cid:15) Transforming the data achieves signi(cid:12)cant savings over non-transformed truncation. In, fact performance is better even if we miss the optimal m, i.e. we are better o(cid:11) with a 12-m KL index than with any non-transformed index. Note also that the absolute performance is better on an optimally created database than naively created database 6 times smaller. 10000 ’Large’ ’Large_KL’ ’Small’ ’Small_KL’ y r e u q r pe 1000 s ’ O / I f o r e b m u n 100 e g a r e v A 10 0 2 4 6 8 10 12 14 16 18 20 Number of dimensions Figure 3: Average Disk I/O's per query, vs. dimensions kept m; (cid:15)=200 18
Description: