EMC® Data Computing Appliance
Appliance Version 2 / Software Version 3.0.0.0
Getting Started Guide
PART NUMBER: 302-002-349
REVISION: 02

Copyright © 2015, 2016 EMC Corporation. All rights reserved. Published in the USA.
Published June 2016

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

EMC², EMC, and the EMC logo are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners.

For the most up-to-date regulatory document for your product line, go to EMC Online Support (https://support.emc.com).
Contents

Preface
  Welcome
  About This Guide
  Document Conventions
    Text Conventions
    Command Syntax Conventions
  Getting Support
    Product information
    Technical support
Chapter 1: About the DCA
  About the DCA
    Two Appliance Versions
    DCA Module Types
    Racking Guidelines
    Rack Types
    Rack Density
    About the Network Configuration
  DCA Modules and Master Servers
    Master Servers
    GPDB Modules
    Data Integration Accelerator Modules
    HD Compute Modules
    Hadoop Master and Worker Modules
  GPDB Overview and Upgrade Tasks
    About GPDB
    About the Master Servers
    About the Segment Hosts
    GPDB Upgrade Tasks
Chapter 2: Supported Software Applications
  GPDB
  Pivotal Greenplum Command Center
  Pivotal Hadoop
  HAWQ
  Pivotal HD with EMC Isilon
  Pivotal Command Center
  Supported software application versions
Chapter 3: Preparing the Data Center Environment
  Confirming Site Requirements
    Floor Space Requirements
    DCA Rack Dimensions
    Connecting New Racks to the Power Supply
    Power Cord Specifications
    Environmental Requirements
    Air Quality Requirements
  Optional Securing Brackets
    Anti-Tip Bracket
    Anti-Move Bracket
    Seismic Restraint Bracket
  Cabinet Positioning
  Package Dimensions and Clearance
Chapter 4: Planning for a Multiple Rack DCA
Chapter 5: Gathering Site-Specific Information
  Site Requirements Checklist
  Plan for Hadoop Networking
  VLAN Overlay
  Planning for Remote Support - ESRS and Dialhome
Chapter 6: DCA Administration
  DCA utilities
    Description
    Options
    ConnectEMC Dial Home Capability
    Web-Based Management Options
    Pivotal Greenplum Command Center
    Pivotal Command Center
    GPDB Email and SNMP Alerting
  SNMP on the DCA
  DCA MIB information
    MIB Locations
    MIB Contents
    View MIB
  Integrate DCA MIB with environment
    Change the SNMP community string
    Set an SNMP Trap Sink
  General Database Maintenance Tasks
    Routine Vacuum and Analyze
    Routine Reindexing
    Managing GPDB Log Files
  Next Steps
Chapter 7: Power Down the DCA
Chapter 8: Next Steps
  Documentation Resources
  Providing User Access to GPDB
  Creating Databases and Loading Data
Appendix A: Red Hat Enterprise Linux End User License Agreement
Glossary

Preface

This guide is intended for EMC personnel, partners, database and system administrators, and customers to plan for installing a new Data Computing Appliance (DCA) into a data center. This guide provides an overview of the system, information on data center requirements, a checklist of items needed for software configuration, and links to relevant documentation for use in the next steps of deployment.
This guide also contains an overview of the appliance configuration. Verify that the requirements listed in this document are satisfied before performing a DCA installation.

• Welcome
• About This Guide
• Document Conventions
• Getting Support

Welcome

Welcome to EMC, and congratulations on acquiring your new EMC DCA product. To help you get started as a new EMC customer, please visit our online support Welcome Center at http://www.emc.com/support/new-customers/index.htm. There you will find information to help you gain access to the tools and resources you need to successfully support your EMC products.

You will also be introduced to our Online Support site (Support.EMC.com), which is your single destination for support and online access to numerous resources, including product-specific support information and downloads, software license activation, service request creation and management, self-help tools, and a single view of your entire EMC installed base. You can also access our lively Support Community and quickly connect with an EMC technical support specialist via Live Chat.

About This Guide

This guide assumes knowledge of Linux/UNIX system administration, database management systems, database administration, and structured query language (SQL).

This guide contains the following chapters and appendices:
• Chapter 1, “About the DCA” explains the architecture, components, and configuration of Pivotal™ Greenplum Database® (GPDB) on the DCA.
• Chapter 2, “Supported Software Applications” describes the optional software applications supported by the DCA.
• Chapter 3, “Preparing the Data Center Environment” describes site requirements for the DCA, securing brackets, cabinet positioning, and package dimensions and clearance.
• Chapter 4, “Planning for a Multiple Rack DCA” contains information required to plan for a multiple rack DCA.
• Chapter 5, “Gathering Site-Specific Information” contains a site requirements checklist, a plan for Hadoop networking, and information on remote support.
• Chapter 6, “DCA Administration” describes the general database maintenance tasks and the tools available to diagnose, monitor, and troubleshoot a GPDB system running on the Data Computing Appliance.
• Chapter 7, “Power Down the DCA” explains how to power down the DCA safely.
• Chapter 8, “Next Steps” explains the next steps for implementing your data warehouse requirements in GPDB.
• “Glossary” defines DCA components and terminology.

Document Conventions

The following conventions are used throughout the DCA documentation to help you identify certain types of information.
• Text Conventions
• Command Syntax Conventions

Text Conventions

Table Preface.1 Text Conventions

Text Convention: bold
Usage: Button, menu, tab, page, and field names in GUI applications
Examples: Click Cancel to exit the page without saving your changes.

Text Convention: italics
Usage: New terms where they are defined
       Database objects, such as schema, table, or column names
Examples: The master instance is the postgres process that accepts client connections.
          Catalog information for GPDB resides in the pg_catalog schema.

Text Convention: monospace
Usage: File names and path names
       Programs and executables
       Command names and syntax
       Parameter names
Examples: Edit the postgresql.conf file.
          Use gpstart to start GPDB.

Text Convention: monospace italics
Usage: Variable information within file paths and file names
       Variable information within command syntax
Examples: /home/gpadmin/config_file
          COPY tablename FROM 'filename'

Text Convention: monospace bold
Usage: Used to call attention to a particular part of a command, parameter, or code snippet.
Examples: Change the host name, port, and database name in the JDBC connection URL:
          jdbc:postgresql://host:5432/mydb

Text Convention: UPPERCASE
Usage: Environment variables
       SQL commands
       Keyboard keys
Examples: Make sure that the Java /bin directory is in your $PATH.
          SELECT * FROM my_table;
          Press CTRL+C to escape.
Command Syntax Conventions

Table Preface.2 Command Syntax Conventions

Text Convention: { }
Usage: Within command syntax, curly braces group related command options. Do not type the curly braces.
Examples: FROM { 'filename' | STDIN }

Text Convention: [ ]
Usage: Within command syntax, square brackets denote optional arguments. Do not type the brackets.
Examples: TRUNCATE [ TABLE ] name

Text Convention: ...
Usage: Within command syntax, an ellipsis denotes repetition of a command, variable, or option. Do not type the ellipsis.
Examples: DROP TABLE name [, ...]

Text Convention: |
Usage: Within command syntax, the pipe symbol denotes an “OR” relationship. Do not type the pipe symbol.
Examples: VACUUM [ FULL | FREEZE ]

Text Convention: $ system_command
                 # root_system_command
                 => gpdb_command
                 =# su_gpdb_command
Usage: Denotes a command prompt; do not type the prompt symbol. $ and # denote terminal command prompts. => and =# denote GPDB interactive program command prompts (psql or gpssh, for example).
Examples: $ createdb mydatabase
          # chown gpadmin -R /datadir
          => SELECT * FROM mytable;
          =# SELECT * FROM pg_database;

Getting Support

EMC support, product, and licensing information can be obtained as follows.

Product information

For DCA product-specific documentation, release notes, or software updates, go to the EMC Online Support site at http://support.emc.com, click Support By Product, and search for Data Computing Appliance.

Technical support

For technical support, go to http://support.emc.com. The Support page includes several support options, including an option to request service. Note that to open a service request, you must have a valid support agreement. Please contact your EMC sales representative for details about obtaining a valid support agreement or with questions about your account.

1. About the DCA

The Data Computing Appliance is a self-contained data warehouse solution that integrates all of the database software, servers, and switches necessary to perform big data analytics.
The DCA is a turn-key, easily installed data warehouse solution that provides extreme query and loading performance for analyzing large data sets. The DCA integrates GPDB, data loading, and Hadoop software with compute, storage, and network components. The DCA is delivered racked and ready for immediate data loading and query execution.

This chapter includes the following sections:
• About the DCA
• DCA Modules and Master Servers
• GPDB Overview and Upgrade Tasks

About the DCA

This section explains the hardware components and specifications of the DCA.
• Two Appliance Versions
• DCA Module Types
• Rack Types
• Rack Density

Two Appliance Versions

The DCA 3.0.0.0 software supports all DCAv2 hardware and the new DCAv3 GPDB hardware.

A DCAv3 System rack has two Python master servers and one to four GPDB modules, with each module consisting of four Hydra 24 segment servers. Each System rack also has an Arista administration switch and two Arista interconnect switches. Both server types have 256GB of memory and 1.8TB drives; the Python has six drives and the Hydra 24 has 24 drives. Aggregation and Expansion racks use subsets of the System rack components.

DCAv2 System, Aggregation, and Expansion racks have the standard 2.x configurations of servers, switches, drives, and memory. A DCAv2 appliance can have GPDB modules, Data Integration Accelerator (DIA) modules, and Pivotal Hadoop (PHD) modules.

Note: The DCA 3.0.0.0 software release provides separate sets of documentation for the DCAv3 and DCAv2 appliances. Both sets are available at http://support.emc.com.

DCA Module Types

The DCA is built from required switches, two master nodes for cluster management, and server increments called modules. Each DCA module consists of either two or four servers. EMC-supported servers for the DCA are named Dragon 12, Dragon 24, or Kylin. These names help customers and EMC Support identify servers easily.
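As a rough illustration of the DCAv3 System rack figures given under Two Appliance Versions, the following sketch tallies segment servers, drives, raw disk capacity, and memory for a rack with one to four GPDB modules. The function and constant names are hypothetical and are not part of any DCA utility. The result is raw capacity only; RAID, mirroring, and file system overhead are not described in this section and are not modeled.

```python
# Figures taken from the "Two Appliance Versions" text.
# All names below are illustrative assumptions, not DCA software identifiers.
HYDRA24_DRIVES_PER_SERVER = 24   # drives in each Hydra 24 segment server
DRIVE_CAPACITY_TB = 1.8          # capacity of each drive
MEMORY_GB_PER_SERVER = 256       # Python and Hydra 24 servers alike
SERVERS_PER_GPDB_MODULE = 4      # Hydra 24 servers per GPDB module
MASTERS_PER_SYSTEM_RACK = 2      # Python master servers


def dcav3_system_rack_totals(gpdb_modules: int) -> dict:
    """Return raw (pre-RAID) totals for a DCAv3 System rack."""
    if not 1 <= gpdb_modules <= 4:
        raise ValueError("a DCAv3 System rack has one to four GPDB modules")
    segment_servers = gpdb_modules * SERVERS_PER_GPDB_MODULE
    segment_drives = segment_servers * HYDRA24_DRIVES_PER_SERVER
    return {
        "segment_servers": segment_servers,
        "segment_drives": segment_drives,
        # Raw capacity of the segment servers only; master drives excluded.
        "segment_raw_tb": round(segment_drives * DRIVE_CAPACITY_TB, 1),
        # Memory across the segment servers plus the two master servers.
        "total_memory_gb": (segment_servers + MASTERS_PER_SYSTEM_RACK)
        * MEMORY_GB_PER_SERVER,
    }


if __name__ == "__main__":
    for modules in range(1, 5):
        print(modules, dcav3_system_rack_totals(modules))
```

Under these assumptions, a fully populated System rack has 16 segment servers and 384 segment drives.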
Read this section for server types that make up the three available modules:
• GPDB Module
• Data Integration Accelerator (DIA) Modules
• Hadoop Modules

GPDB Modules Server Types and Specifications

Table 1.1 lists the server types and specifications for the GPDB modules.

Table 1.1 GPDB Module Specifications

GPDB module type: GPDB Standard Module (Introduced in DCA version 2.0.0.0)
Server quantities / Drive Types / Memory: This module is comprised of four Dragon 24 servers.
  • Disks - Twenty Four 900GB drives per server
  • Memory - 64GB per server
Usage: GPDB

GPDB module type: GPDB Compute Module (Introduced in DCA version 2.0.0.0)
Server quantities / Drive Types / Memory: This module is comprised of four Dragon 24 servers.
  • Disks - Twenty Four 300GB drives per server
  • Memory - 64GB per server
Usage: GPDB

GPDB module type: GPDB Hi-Memory Module (Introduced in DCA version 2.0.2.0)
Server quantities / Drive Types / Memory: This module is comprised of four Dragon 24 servers.
  • Disks - Twenty Four 300GB drives per server
  • Memory - 256GB per server
Usage: GPDB

DIA Modules Server Types and Specifications

Table 1.2 lists the server types and specifications for the DIA modules.

Table 1.2 DIA Module Specifications

Type: DIA-Kylin 300GB Disk Module (Introduced in DCA version 2.0.0.0)
Server quantities / Drive Types / Memory: This module is comprised of two Kylin servers.
  • Disks - Six 300GB drives per server
  • Memory - 64GB per server
Usage: Business Intelligence Tools

Type: DIA 3TB Disk Module (Introduced in DCA version 2.0.2.0)
Server quantities / Drive Types / Memory: This module is comprised of two Dragon 12 servers.
  • Disks - Twelve 3TB drives per server
  • Memory - 64GB per server
Usage: Business Intelligence Tools

Type: DIA Hi-Memory Module with 24 HDDs (Introduced in DCA version 2.0.2.0)
Server quantities / Drive Types / Memory: This module is comprised of two Dragon 24 servers.
  • Disks - Twenty Four 300GB drives per server
  • Memory - 256GB per server
Usage: Business Intelligence Tools

Type: DIA-Kylin Hi-Memory Module (Introduced in DCA version 2.1.0.0)
Server quantities / Drive Types / Memory: This module is comprised of two Kylin servers.
  • Disks - Six 300GB drives per server
  • Memory - 256GB per server
Usage: Business Intelligence Tools
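For planning purposes, the module specifications in Table 1.1 and Table 1.2 can be captured as data and totaled per module. The sketch below is one possible way to do that; the ModuleSpec class and its field names are hypothetical, and it assumes decimal units (3TB treated as 3000GB). It reports raw disk capacity only, not usable database capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleSpec:
    """One row of Table 1.1 or Table 1.2 (field names are illustrative)."""
    name: str
    server_type: str
    servers: int
    drives_per_server: int
    drive_gb: int
    memory_gb_per_server: int

    @property
    def raw_disk_gb(self) -> int:
        # Raw capacity across the module, before RAID or file system overhead.
        return self.servers * self.drives_per_server * self.drive_gb

    @property
    def memory_gb(self) -> int:
        return self.servers * self.memory_gb_per_server


# Values transcribed from Table 1.1 and Table 1.2; 3TB is assumed to be 3000GB.
MODULES = {
    spec.name: spec
    for spec in (
        ModuleSpec("GPDB Standard Module", "Dragon 24", 4, 24, 900, 64),
        ModuleSpec("GPDB Compute Module", "Dragon 24", 4, 24, 300, 64),
        ModuleSpec("GPDB Hi-Memory Module", "Dragon 24", 4, 24, 300, 256),
        ModuleSpec("DIA-Kylin 300GB Disk Module", "Kylin", 2, 6, 300, 64),
        ModuleSpec("DIA 3TB Disk Module", "Dragon 12", 2, 12, 3000, 64),
        ModuleSpec("DIA Hi-Memory Module with 24 HDDs", "Dragon 24", 2, 24, 300, 256),
        ModuleSpec("DIA-Kylin Hi-Memory Module", "Kylin", 2, 6, 300, 256),
    )
}

if __name__ == "__main__":
    for spec in MODULES.values():
        print(f"{spec.name}: {spec.servers} x {spec.server_type}, "
              f"{spec.raw_disk_gb} GB raw disk, {spec.memory_gb} GB memory")
```

Keeping the table as data makes it straightforward to compare module types side by side when sizing a rack.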