
Data Mining With SQL Server 2005 PDF

483 Pages·2005·6.35 MB·English

Preview Data Mining With SQL Server 2005

Data Mining with SQL Server 2005
ZhaoHui Tang and Jamie MacLennan

Data Mining with SQL Server 2005
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2005 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN-13: 978-0-471-46261-3
ISBN-10: 0-471-46261-6
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
1O/SR/QZ/QV/IN

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Website may provide or recommendations it may make. Further, readers should be aware that Internet Websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services or to obtain technical support, please contact our Customer Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Trademarks: Wiley, the Wiley logo, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

To everyone in my extended family
- ZhaoHui Tang

To April, my kids, and my Mom and Dad
- Jamie MacLennan

About the Authors

ZhaoHui Tang is a Lead Program Manager in the Microsoft SQL Server Data Mining team. Joining Microsoft in 1999, he has been working on designing the data mining features of SQL Server 2000 and SQL Server 2005. He has spoken in many academic and industrial conferences including VLDB, KDD, TechED, PASS, etc. He has published a number of articles for database and data mining journals. Prior to Microsoft, he worked as a researcher at INRIA and Prism lab in Paris and led a team performing data-mining projects at Sema Group. He got his Ph.D. from the University of Versailles, France in 1996.

Jamie MacLennan is the Development Lead for the Data Mining Engine in SQL Server. He has been designing and implementing data mining functionality in collaboration with Microsoft Research since he joined Microsoft in 1999. In addition to developing the product, he regularly speaks on data mining at conferences worldwide, writes papers and articles about SQL Server Data Mining, and maintains data mining community sites. Prior to joining Microsoft, Jamie worked at Landmark Graphics, Inc. (division of Halliburton) on oil & gas exploration software and at Micrografx, Inc. on flowcharting and presentation graphics software. He studied undergraduate computer science at Cornell University.

Credits

Acquisitions Editor: Robert Elliot
Development Editor: Sydney Jones
Production Editor: Pamela Hanley
Copy Editor: Foxxe Editorial
Editorial Manager: Mary Beth Wakefield
Vice President & Executive Group Publisher: Richard Swadley
Vice President and Publisher: Joseph B. Wikert
Project Coordinator: Ryan Steffen
Graphics and Production Specialists: Carrie A. Foster, Lauren Goddard, Jennifer Heleine, Stephanie D. Jumper
Quality Control Technician: Joe Niesen
Proofreading and Indexing: TECHBOOKS Production Services

Contents

About the Authors
Credits
Foreword
Chapter 1 Introduction to Data Mining
    What Is Data Mining
    Business Problems for Data Mining
    Data Mining Tasks
        Classification
        Clustering
        Association
        Regression
        Forecasting
        Sequence Analysis
        Deviation Analysis
    Data Mining Techniques
    Data Flow
    Data Mining Project Cycle
        Step 1: Data Collection
        Step 2: Data Cleaning and Transformation
        Step 3: Model Building
        Step 4: Model Assessment
        Step 5: Reporting
        Step 6: Prediction (Scoring)
        Step 7: Application Integration
        Step 8: Model Management
