Technical White Paper

Dell PowerScale: CloudPools and Alibaba Cloud
Architectural overview, considerations, and best practices

Abstract
This white paper provides an overview of Dell PowerScale CloudPools software in OneFS 9.4.0.0. It describes its policy-based capabilities that can reduce storage costs and optimize storage by automatically moving infrequently accessed data to Alibaba Cloud.

April 2022
H17745.6

Revisions
Date            Description
April 2019      Initial release
October 2019    Updated snapshot efficiency
June 2020       Updated best practices
October 2020    Updated CloudPools operations
April 2021      Updated best practices
October 2021    Updated performance
April 2022      Updated reporting

Acknowledgments
Author: Jason He ([email protected])

Dell and the authors of this document welcome your feedback on this white paper.

This document may contain certain words that are not consistent with Dell's current language guidelines. Dell plans to update the document over subsequent future releases to revise these words accordingly.

This document may contain language from third party content that is not under Dell's control and is not consistent with Dell's current guidelines for Dell's own content. When such third party content is updated by the relevant third parties, this document will be revised accordingly.

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable software license.

Copyright © 2019-2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, Dell and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners.
[3/31/2022] [Technical White Paper] [H17745.6]

Table of contents
Revisions
Acknowledgments
Table of contents
Executive summary
Audience
1 CloudPools solution architectural overview
1.1 PowerScale
1.1.1 SmartPools
1.1.2 SmartLink files
1.1.3 File pool policies
1.2 Alibaba Cloud
1.2.1 Cloud metadata object
1.2.2 Cloud data object
1.3 CloudPools operations
1.3.1 Archive
1.3.2 Recall
1.3.3 Read
1.3.4 Update
2 CloudPools 2.0
2.1 NDMP and SyncIQ support
2.2 Non-disruptive upgrade support
2.3 Snapshot efficiency
2.3.1 Scenario 1
2.3.2 Scenario 2
2.3.3 Scenario 3
2.3.4 Scenario 4
2.3.5 Scenario 5
2.4 Sparse files handling
2.5 Quota management
2.6 Anti-virus integration
2.7 WORM integration
3 Best practices for PowerScale storage and Alibaba Cloud
3.1 PowerScale configuration
3.1.1 CloudPools settings
3.1.2 File pool policy
3.1.3 Other considerations
3.2 Alibaba Cloud configuration
3.3 Protecting SmartLink files
3.3.1 SyncIQ
3.3.2 NDMP
3.4 Performance
4 Reporting
4.1 CloudPools network stats
4.2 Query network stats by CloudPools account
4.3 Query network stats by file pool policy
4.4 Query history network stats
4.5 Cloud statistics namespace with CloudPools
5 Commands and troubleshooting
5.1 Commands
5.1.1 CloudPools archive
5.1.2 CloudPools recall
5.1.3 CloudPools job monitoring
5.2 Troubleshooting
5.2.1 CloudPools state
5.2.2 CloudPools logs
A Step-by-step configuration example
A.1 Alibaba Cloud configuration
A.2 PowerScale configuration
A.2.1 Verify licensing
A.2.2 Cloud storage account
A.2.3 CloudPool
A.2.4 File pool policy
A.2.5 Run SmartPools job for CloudPools
A.2.6 SyncIQ policy
A.3 SmartLink files protection
A.3.1 Fail over to the secondary PowerScale cluster
A.3.2 Fail back to primary PowerScale cluster
B Technical support and resources
B.1 Related resources

Executive summary
This white paper describes how Dell PowerScale CloudPools in OneFS 9.4.0.0 integrates with Alibaba Cloud. It covers the following topics:
• CloudPools solution architectural overview
• CloudPools 2.0 introduction with a focus on the following improvements:
  - PowerScale NDMP and PowerScale SyncIQ support
  - Non-disruptive upgrade (NDU) support
  - Snapshot efficiency
  - Sparse files handling
  - Quota management
  - Anti-virus integration
  - WORM integration
• General considerations and best practices for a CloudPools implementation
• CloudPools reporting, commands, and troubleshooting

Audience
This white paper is intended for experienced system administrators, storage administrators, and solution architects who want to learn how CloudPools works and to understand the CloudPools solution architecture, considerations, and best practices.

This guide assumes the reader has a working knowledge of the following:
• Network-attached storage (NAS) systems
• PowerScale scale-out storage architecture and PowerScale OneFS operating system
• Alibaba Cloud

The reader should also be familiar with PowerScale and Alibaba Cloud documentation resources including the following:
• OneFS release notes available on Dell Support, containing important information about resolved and known issues
• Dell PowerScale OneFS Best Practices
• Alibaba Cloud

1 CloudPools solution architectural overview
The CloudPools feature of OneFS allows tiering cold or infrequently accessed data to lower-cost cloud storage.
It is built on the PowerScale SmartPools file pool policy framework, which provides granular control of file placement on a PowerScale cluster. CloudPools extends the PowerScale namespace to the public cloud, Alibaba Cloud, as illustrated in Figure 1. It allows applications and users to seamlessly retain access to data through the same network path and protocols regardless of where the file data physically resides.

Figure 1. CloudPools solution overview (diagram labels: Applications and Clients; SMB | NFS | HDFS | S3; Dell PowerScale; Extended OneFS namespace; Alibaba Cloud)

Note: A SmartPools license and a CloudPools license are required on each node of the PowerScale cluster. A minimum of Dell Isilon OneFS version 8.0.0 is required for CloudPools 1.0, and Dell Isilon OneFS version 8.2.0 for CloudPools 2.0.

Policies are defined on the PowerScale cluster and drive the tiering of data. Clients can access the archived data through various protocols including SMB, NFS, HDFS, and S3.

1.1 PowerScale
This section describes key CloudPools concepts including the following:
• SmartPools
• SmartLink files
• File pool policies

1.1.1 SmartPools
SmartPools is the OneFS data tiering framework of which CloudPools is an extension. SmartPools alone tiers data between different node types within a PowerScale cluster. CloudPools adds the ability to tier data outside of a PowerScale cluster.

1.1.2 SmartLink files
Although file data is moved to cloud storage, the files remain visible in OneFS. After file data has been archived to the cloud storage, the file is truncated to an 8 KB file. The 8 KB file is called a SmartLink file or stub file. Each SmartLink file contains a data cache and a map. The data cache is used to retain a portion of the file data locally, and the map points to all cloud objects. Figure 2 shows the contents of a SmartLink file and the mapping to cloud objects.
Figure 2. SmartLink file

1.1.3 File pool policies
Both CloudPools and SmartPools use the file pool policy engine to define which data on a cluster should live on which tier or be archived to a cloud storage target. The SmartPools and CloudPools job runs on a customizable schedule, once a day by default. If files match the criteria specified in a file pool policy, the content of those files is moved to cloud storage during the job execution. A SmartLink file that contains information about where to retrieve the data is left behind on the PowerScale cluster.

In CloudPools 1.0, the SmartLink file is sometimes referred to as a stub, which is a unique construct that does not behave like a normal file. In CloudPools 2.0, the SmartLink file is an actual file that contains pointers to the CloudPools target where the data resides.

This section describes the key options when configuring a file pool policy, which include the following:
• Encryption
• Compression
• File matching criteria
• Local data cache
• Data retention

1.1.3.1 Encryption
CloudPools provides an option to encrypt data before it is sent to the cloud storage. It uses the PowerScale key management module for data encryption, with AES-256 as the encryption algorithm. The benefit of encryption is that only encrypted data is sent over the network.

1.1.3.2 Compression
CloudPools provides an option to compress data before it is sent to the cloud storage. It implements block-level compression using the zlib compression library. CloudPools does not compress data that is already compressed.

1.1.3.3 File matching criteria
When files match a file pool policy, CloudPools moves the file data to the cloud storage. File matching criteria define a logical group of files as a file pool for CloudPools, and therefore determine which data is archived to cloud storage.
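As an illustration only, the idea of a file pool defined by matching criteria can be sketched in Python. The class names, the three criteria chosen (path prefix, minimum size, days since last access), and the rule that all criteria must hold together are assumptions made for this sketch; they do not represent the OneFS policy engine or its syntax.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    """Hypothetical description of a file on the cluster."""
    path: str
    size_bytes: int
    accessed: datetime


@dataclass
class ArchivePolicy:
    """Hypothetical policy; criteria are combined with AND (an assumption)."""
    path_prefix: str
    min_size_bytes: int
    min_idle_days: int

    def matches(self, f: FileInfo, now: datetime) -> bool:
        # A file belongs to the pool only if every criterion holds.
        return (
            f.path.startswith(self.path_prefix)
            and f.size_bytes >= self.min_size_bytes
            and now - f.accessed >= timedelta(days=self.min_idle_days)
        )


policy = ArchivePolicy(path_prefix="/example/projects/",
                       min_size_bytes=1024 * 1024,
                       min_idle_days=90)
now = datetime(2022, 4, 1)
cold = FileInfo("/example/projects/old.bin", 5 * 1024 * 1024, datetime(2021, 10, 1))
hot = FileInfo("/example/projects/new.bin", 5 * 1024 * 1024, datetime(2022, 3, 30))
candidates = [f.path for f in (cold, hot) if policy.matches(f, now)]
```

In this sketch only the file that has been idle for at least 90 days is selected as an archive candidate; the recently accessed file stays on the cluster.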
File matching criteria include the following:
• File name
• Path
• File type
• File attribute
• Modified
• Accessed
• Metadata changed
• Created
• Size

Any number of file matching criteria can be added to refine a file pool policy for CloudPools.

1.1.3.4 Local data cache
Caching is used to support local reading and writing of SmartLink files. It optimizes performance and reduces bandwidth costs by eliminating repeated fetches of file data for repeated reads and writes.

Note: The data cache is used for temporarily caching file data from the cloud storage on PowerScale disk storage for files that have been moved off cluster by CloudPools.

The local data cache is always the authoritative source for data. CloudPools looks for data in the local data cache first. If the file data being accessed is not in the local data cache, CloudPools fetches the data from the cloud. CloudPools writes updated file data to the local cache first and periodically sends the updated file data to the cloud.

CloudPools provides the following configurable data cache settings:
• Cache expiration: Specifies the number of days until OneFS purges expired cache information in SmartLink files. The default value is one day.
• Writeback frequency: Specifies the interval at which OneFS writes the data stored in the cache of SmartLink files to the cloud. The default value is nine hours.
• Cache read ahead: Specifies the cache read ahead strategy for cloud objects (partial or full). The default value is partial.
• Accessibility: Specifies how data is cached in SmartLink files when a user or application accesses a SmartLink file on the PowerScale cluster. Values are cached (default) and no cache.

1.1.3.5 Data retention
Data retention determines how long cloud objects are kept on the cloud storage.
There are three different retention periods:
• Cloud data retention period: Specifies the length of time cloud objects are retained after the files have been fully recalled or deleted. The default value is one week.
• Incremental backup retention period for NDMP incremental backup and SyncIQ: Specifies the length of time that CloudPools retains cloud objects referenced by a SmartLink file that SyncIQ has replicated or that NDMP has backed up using an incremental NDMP backup. The default value is five years.
• Full backup retention period for NDMP only: Specifies the length of time that OneFS retains cloud data referenced by a SmartLink file that NDMP has backed up using a full NDMP backup. The default value is five years.

Note: If more than one period applies to a file, the longest period is applied.

1.2 Alibaba Cloud
This section describes the following cloud objects in Alibaba Cloud:
• Cloud metadata object
• Cloud data object

1.2.1 Cloud metadata object
A cloud metadata object (CMO) is a CloudPools object in Alibaba Cloud that is used for supportability purposes.

1.2.2 Cloud data object
A cloud data object (CDO) is a CloudPools object that stores file data in Alibaba Cloud. To optimize performance, file data is split into 2 MB chunks before it is sent to Alibaba Cloud. Each chunk is a CDO. If the file data is smaller than the chunk size, the CDO size is equal to the size of the file data.

Note: The chunk size is 1 MB in CloudPools 1.0 and versions before OneFS 8.2.0.

1.3 CloudPools operations
This section describes the workflow of the following CloudPools operations:
• Archive
• Recall
• Read
• Update

1.3.1 Archive
The archive operation is the CloudPools process of moving file data from the local PowerScale cluster to cloud storage. Files are archived either using the SmartPools job or from the command line. The CloudPools archive process can be paused or resumed.
See section 5.1 for details.

Figure 3 shows the workflow of the CloudPools archive.

Figure 3. Archive workflow
1. A file matches a PowerScale file pool policy.
2. The file data is split into chunks (CDO).
3. The chunks are sent from the PowerScale cluster to Alibaba Cloud.
4. The file is truncated into a SmartLink file and a CMO is written to Alibaba Cloud.

More workflow details include the following:
• The file pool policy in step 1 (see section 1.1.3) specifies a cloud target and cloud-specific parameters. Example parameters include the following:
  - Encryption (section 1.1.3.1)
  - Compression (section 1.1.3.2)
  - Local data cache (section 1.1.3.4)
  - Data retention (section 1.1.3.5)
• When chunks are sent from the PowerScale cluster to Alibaba Cloud in step 3, a checksum is applied for each chunk to ensure data integrity.

1.3.2 Recall
The recall operation is the CloudPools process of reversing the archive process. It replaces the SmartLink file by restoring the original file data on the PowerScale cluster and removing the cloud objects in Alibaba Cloud. The recall process can only be performed using the command line. The CloudPools recall process can be paused or resumed. See section 5.1 for detailed instructions on commands.
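The archive and recall workflows can be pictured with a small Python sketch that splits file data into 2 MB chunks, attaches a checksum to each chunk, and later reassembles the data after verifying every chunk. This is a conceptual model only: the function and class names are hypothetical, SHA-256 is an assumption (the paper does not name the checksum algorithm), and no cloud transfer, SmartLink file, or CMO is modeled.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB chunk (CDO) size described for CloudPools 2.0


@dataclass
class Chunk:
    """Hypothetical stand-in for a cloud data object (CDO)."""
    index: int
    data: bytes
    checksum: str  # SHA-256 hex digest; the algorithm is an assumption


def archive(file_data: bytes, chunk_size: int = CHUNK_SIZE) -> list[Chunk]:
    """Split file data into chunks and compute a checksum for each one."""
    chunks = []
    for index, offset in enumerate(range(0, len(file_data), chunk_size)):
        piece = file_data[offset:offset + chunk_size]
        chunks.append(Chunk(index, piece, hashlib.sha256(piece).hexdigest()))
    return chunks


def recall(chunks: list[Chunk]) -> bytes:
    """Verify every chunk and reassemble the original file data in order."""
    restored = bytearray()
    for chunk in sorted(chunks, key=lambda c: c.index):
        if hashlib.sha256(chunk.data).hexdigest() != chunk.checksum:
            raise ValueError(f"checksum mismatch in chunk {chunk.index}")
        restored += chunk.data
    return bytes(restored)


data = bytes(range(256)) * 20000   # 5,120,000 bytes of sample data
cdos = archive(data)               # two full 2 MB chunks plus one partial chunk
roundtrip = recall(cdos)           # equals the original data
```

File data smaller than the chunk size yields a single chunk of the same size as the data, which mirrors the CDO sizing rule in section 1.2.2.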
Description: