AN ABSTRACT OF THE THESIS OF

Jeremy Gragg for the degree of Honors Baccalaureate of Science in Business Administration presented on August 22, 2008. Title: Exploration and Analysis of Information Visualization Techniques Applied to the TeachEngineering Digital Library.

Abstract approved: ______________________________________________________
Byron Marshall

The rapid development of computing technologies in recent decades has allowed data to be generated at unprecedented rates. As a result, users are challenged with the task of finding the precise information that they are looking for. Visualization techniques provide the ability to view and analyze data in different ways, helping users to find what they are looking for and gain additional insights.

TeachEngineering is a digital library system where K-12 educators can search for high-quality lesson plans that meet national and state-specific educational standards. Users can search by several types of criteria, including grade level, cost, time required, group size, keywords, source state, and educational standards. Being able to see how individual documents compare to specific variables can help identify additional results that are worthy of user consideration.

Following a survey of visualization techniques, specific methods (scatterplots, parallel coordinates, star glyphs, and hyperbolic trees) are qualitatively analyzed with regard to how they could be applied to TeachEngineering search query results. Each visualization type has advantages and limitations, and no single technique can be considered the definitive solution. Recommendations on conducting future experiments and research are included.
Key Words: Information Visualization, Digital Library, TeachEngineering, Search Relevance, Interface Design
Corresponding e-mail address: [email protected]

©Copyright by Jeremy Gragg
August 22, 2008
All Rights Reserved

Exploration and Analysis of Information Visualization Techniques Applied to the TeachEngineering Digital Library

by
Jeremy Gragg

A PROJECT
submitted to
Oregon State University
University Honors College

in partial fulfillment of the requirements for the degree of
Honors Baccalaureate of Science in Business Administration (Honors Scholar)

Presented August 22, 2008
Commencement September 2008

Honors Baccalaureate of Science in Business Administration project of Jeremy Gragg presented on August 22, 2008.

APPROVED:

________________________________________________________________________
Mentor, representing Business Administration

________________________________________________________________________
Committee Member, representing Business Administration

________________________________________________________________________
Committee Member

________________________________________________________________________
MIS Option Coordinator, College of Business

________________________________________________________________________
Dean, University Honors College

I understand that my project will become part of the permanent collection of Oregon State University, University Honors College. My signature below authorizes release of my project to any reader upon request.

________________________________________________________________________
Jeremy Gragg, Author

ACKNOWLEDGMENT

This project has been a long and sometimes painful process, and I have several people to thank for its successful completion. Foremost, thanks to Professor Byron Marshall, who served as my mentor on this project and helped keep me focused on the task at hand; every time we had a discussion, I came out of it with optimism and motivation.
Thanks to Professor Rene Reitsma for helping me identify my thesis topic, suggesting literature to start with, and sitting on my committee. Thanks to Richard Van Winkle for taking the time and effort to sit as the third member of my committee, particularly given the long distance that you had to travel. Thanks to all three of you for working with me on an accelerated schedule, providing feedback, and challenging me to create a quality product that I can be proud of.

Thanks to the Honors College for their guidance during my five years at Oregon State University. I am truly glad to have been a part of the program and to not have quit, although it sometimes seemed much easier to do so. Nearly all of my favorite professors and courses were part of my Honors College experience.

Lastly, thanks to my mother, Jane, for her unwavering support and patience throughout the years. If there is anyone I owe my success to, on this project or otherwise, it is you.

TABLE OF CONTENTS

                                                                    Page
INTRODUCTION ... 1
TEACH ENGINEERING ... 4
    General Considerations & Limitations ... 6
    Database Size ... 6
    Quantity of Search Results ... 7
    Document Visualization vs. Information Retrieval Techniques ... 7
    Variables ... 9
VISUALIZATION TECHNIQUES ... 11
    Scatterplot ... 11
        Scatterplot Matrix ... 13
        Hyperslice ... 15
    Parallel Coordinates ... 16
        Andrews Plot ... 19
        Parallel Coordinates Box-And-Whisker Plot ... 20
    Star Plot ... 21
        Parallel Star Glyph ... 23
        Data Meadow ... 25
    TileBar ... 27
    Tree Structures ... 29
        Cone Tree ... 30
        Tree Maps ... 31
        Hyperbolic Browser ... 33
    Brushing ... 34
CASE STUDY ANALYSIS ... 36
    Scatterplot ... 37
    Scatterplot Matrix ... 46
    Parallel Coordinates ... 47
    Starplot ... 52
    Parallel Star Glyph ... 55
    Hyperbolic Browser ... 58
    Additional Approaches ... 63
DISCUSSION & RECOMMENDATIONS ... 66
BIBLIOGRAPHY ... 71
APPENDIX: DATA TABLES ... 75

LIST OF FIGURES

Figure                                                              Page
1. Scatterplot matrix displaying six variables and brushing ... 14
2. Triangular scatterplot containing the iris dataset ... 15
3. The concept of Hyperslice for N = 3 ... 16
4. The principle of parallel coordinate plots ... 17
5. The principle of parallel coordinate plots applied to six variables ... 17
6. Box plots graphed onto parallel coordinates ... 20
7. Six data lines plotted on parallel coordinates and representing box-and-whisker plots ... 21
8. Example of a starplot with eight variables ... 22
9. Star glyph with seven variables ... 22
10. 3D star glyphs representing dimensions ... 24
11. Star Glyphs, each representing one object as opposed to one dimension ... 25
12. Sample DataRose visualizations ... 26
13. TileBar indicating the relative occurrence of topic segments within a single document ... 27
14. A typical set of TileBars for a collection of documents ... 28
15. Terminology associated with trees ... 30
16. Cone-Tree representation of hierarchical data ... 30
17. The mapping of a Tree Map from a tree ... 31
18. Tree map displaying the hierarchy of the NBA ... 32
19. Results of manipulating the Hyperbolic Browser ... 34
20. Brushed points are highlighted on all plots ... 34
21. Scatterplot plotting keyword relevance against grade level ... 38
22. Scatterplot demonstrating relevant results outside of specified range ... 39
23. Brushing for additional dimensions on a scatterplot ... 40
24. Icon and color used to represent four dimensions in a 2-D scatterplot representation ... 40
25. Clutter impacting the effectiveness of a scatterplot ... 41
26. Filtering scatterplot results using criteria ... 42
27. Attempt to encapsulate five-dimensions on a 2D scatterplot ... 42
28. Overlapping scatterplot points ... 44
29. Adjusting overlapping points to sit horizontal to each other in a data column ... 44
30. Using whiskers on scatterplots to specify documents’ appropriate grade range ... 45
31. Triangular scatterplot matrix using TeachEngineering variables ... 46
32. Three variables arranged in a scatterplot matrix ... 47
33. Data represented via improper parallel coordinates ... 48
34. Parallel Coordinates representation with axes scales adjusted ... 50
35. Fifteen data items plotted on a single parallel coordinates plot ... 51
36. Star plot of a single data item representing perfect fit across six dimensions ... 53
37. Star plot containing data of five series ... 53
38. Overcrowded star plot containing fifteen data items ... 54
39. Search results incorporating star plots ... 55
40. Front and side views of parallel star glyphs ... 56
41. Selecting a star glyph representing a document ... 57
42. Mapping TeachEngineering variables into a tree hierarchy ... 59
43. Using color brushing on tree branches to demonstrate document relevance ... 60
44. User-defined hierarchy structure of a hyperbolic tree using TeachEngineering variables ... 62
45. Library of Congress structure and items represented in the Hyperbolic Browser ... 63