Storage usage optimization

HCP uses these features to reclaim and balance storage capacity:

Compression service — The compression service makes more efficient use of HCP storage by compressing object data, thereby freeing space for storing more objects.

For more information on the compression service, see Compression service.

Duplicate elimination service — A repository can contain multiple objects that have identical data but different metadata. When the duplicate elimination service finds such objects, it merges their data to free storage space occupied by all but one of the objects.

For more information on the duplicate elimination service, see Duplicate elimination service.
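
The idea of merging identical data can be sketched in a few lines. This is only an illustration, not how HCP implements the service: the `deduplicate` function, the tuple layout, and the use of SHA-256 digests to detect identical data are all assumptions made for the example.

```python
import hashlib


def deduplicate(objects):
    """Keep one stored copy of each distinct piece of data.

    objects: iterable of (name, metadata, data) tuples (hypothetical layout).
    Returns (store, index, freed_bytes), where store maps a content digest
    to the single retained copy and index maps each object name to its own
    metadata plus the digest of the data it shares.
    """
    store = {}   # digest -> data, one copy per distinct content
    index = {}   # name -> (metadata, digest)
    freed = 0
    for name, metadata, data in objects:
        digest = hashlib.sha256(data).hexdigest()
        if digest in store:
            freed += len(data)      # this copy is redundant
        else:
            store[digest] = data    # first time this content is seen
        index[name] = (metadata, digest)
    return store, index, freed


objects = [
    ("report-a", {"owner": "alice"}, b"same bytes"),
    ("report-b", {"owner": "bob"}, b"same bytes"),
    ("notes", {"owner": "alice"}, b"different bytes"),
]
store, index, freed = deduplicate(objects)
```

In this sketch each object keeps its own metadata, and only the data is shared, which mirrors the "identical data but different metadata" case described above.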

Disposition service — The disposition service automatically deletes objects with expired retention periods. To be eligible for disposition, an object must have a retention setting that’s either a date in the past or a retention class with automatic deletion enabled and a calculated expiration date in the past.

For more information on the disposition service, see Disposition service. For more information on retention classes, see Managing a Tenant and Its Namespaces or Managing the Default Tenant and Namespace.
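
The eligibility rule for disposition can be expressed as a small predicate. The sketch below is illustrative only: the `Retention` and `RetentionClass` types, their field names, and the `eligible_for_disposition` function are hypothetical and do not correspond to an HCP API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class RetentionClass:
    """Hypothetical retention class; only the automatic deletion flag matters here."""
    name: str
    auto_delete: bool


@dataclass
class Retention:
    """Hypothetical retention setting.

    expires is either an explicit date or, when retention_class is set,
    the expiration date calculated from that class.
    """
    expires: Optional[datetime]
    retention_class: Optional[RetentionClass] = None


def eligible_for_disposition(retention, now):
    """Return True if the rule described above allows automatic deletion."""
    if retention.expires is None or retention.expires >= now:
        return False                      # no expiration, or not yet expired
    if retention.retention_class is None:
        return True                       # plain date in the past
    return retention.retention_class.auto_delete


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
expired = datetime(2020, 1, 1, tzinfo=timezone.utc)
print(eligible_for_disposition(Retention(expired), now))
```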

Version pruning — An HCP namespace can be configured to allow storage of multiple versions of objects. Version pruning is the automatic deletion of previous versions of an object that are older than a specified amount of time.

For more information on versioning and version pruning, see Managing a Tenant and Its Namespaces and Using a Namespace.
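
As a rough illustration of version pruning, the sketch below splits the versions of one object into those to keep and those to prune. It assumes that a version's age is measured from its creation time and that the current (newest) version is never pruned; the `prune_versions` function and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def prune_versions(versions, max_age, now):
    """Split version timestamps into (kept, pruned).

    versions: creation times of all versions of one object, in any order.
    max_age:  how long previous versions are retained (assumed setting).
    The newest version is treated as current and is always kept.
    """
    if not versions:
        return [], []
    ordered = sorted(versions)
    current, previous = ordered[-1], ordered[:-1]
    cutoff = now - max_age
    pruned = [v for v in previous if v < cutoff]
    kept = [v for v in previous if v >= cutoff] + [current]
    return kept, pruned


now = datetime(2024, 6, 1)
versions = [datetime(2024, 5, 25), datetime(2024, 1, 1), datetime(2024, 5, 10)]
kept, pruned = prune_versions(versions, timedelta(days=30), now)
```

With a 30-day limit in this example, only the version created on January 1 falls before the cutoff and is pruned.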

Garbage collection service — The garbage collection service reclaims storage space both by completing logical delete operations and by deleting objects left behind by incomplete transactions.

For more information on the garbage collection service, see Garbage collection service.

Capacity balancing service — The capacity balancing service ensures that the percentage of space used is roughly equivalent across all the storage nodes in the system. Balancing storage usage across the nodes helps HCP balance the processing load.

For more information on the capacity balancing service, see Capacity balancing service.
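
To make "roughly equivalent percentage of space used" concrete, the sketch below computes the percentage used on each node and picks a source and target node when the spread is too wide. The `tolerance` threshold, the function names, and the node layout are assumptions for illustration; the source does not describe how HCP decides what to move.

```python
def percent_used(nodes):
    """nodes: mapping of node name -> (used_bytes, capacity_bytes)."""
    return {name: 100.0 * used / capacity
            for name, (used, capacity) in nodes.items()}


def pick_move(nodes, tolerance=5.0):
    """Return (source, target) node names if usage is unbalanced, else None.

    Usage counts as balanced when the fullest and emptiest nodes differ by
    no more than `tolerance` percentage points (an assumed threshold).
    """
    usage = percent_used(nodes)
    source = max(usage, key=usage.get)   # fullest node
    target = min(usage, key=usage.get)   # emptiest node
    if usage[source] - usage[target] <= tolerance:
        return None
    return source, target


nodes = {"node1": (80, 100), "node2": (40, 100), "node3": (60, 100)}
move = pick_move(nodes)
```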

Service plans — Each namespace has a service plan that defines both a storage tiering strategy and a data protection strategy for the objects in that namespace. At any given point in the lifecycle of an object, the storage tiering strategy specifies the types of storage on which copies of that object must be stored and the number of copies that must be stored on each type.

By default, throughout the lifecycle of an object, HCP stores that object only on primary running storage, which is storage that’s managed by the nodes in the HCP system and consists of continuously spinning disks. However, you can configure HCP to use other types of storage for tiering purposes.

Every service plan defines primary running storage as the initial storage tier, called the ingest tier. The default storage tiering strategy specifies only that tier.

Primary running storage is designed to provide both high data availability and high performance for object data storage and retrieval operations. To optimize data storage price/performance for the objects in a namespace, you can configure the service plan for that namespace to define a storage tiering strategy that specifies multiple storage tiers.

Storage tiering service — HCP uses the storage tiering service to maintain the correct number of copies of each object in a namespace on the storage tiers that are defined by the storage tiering strategy for that namespace. When the number of object copies on a storage tier goes below the number of object copies specified for that tier in the applicable service plan, the storage tiering service automatically creates a new copy of that object on that tier. When the number of copies of an object on a storage tier goes above the number of object copies specified for that tier in the applicable service plan, the storage tiering service automatically deletes all unnecessary copies of that object from that tier.
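
The copy-count behavior described above amounts to a reconciliation step per tier. The following is a minimal sketch under stated assumptions: the tier names, the `reconcile_copies` function, and the dictionary representation of a service plan are hypothetical and are not part of HCP.

```python
def reconcile_copies(required, actual):
    """Compare required and actual copy counts for one object.

    required: tier name -> number of copies the service plan specifies.
    actual:   tier name -> number of copies currently on that tier.
    Returns tier name -> signed adjustment: a positive value means that many
    copies must be created, a negative value means that many must be deleted.
    Tiers that already hold the right number of copies are omitted.
    """
    actions = {}
    for tier, wanted in required.items():
        delta = wanted - actual.get(tier, 0)
        if delta != 0:
            actions[tier] = delta
    return actions


required = {"primary_running": 2, "s3_compatible": 1}
actual = {"primary_running": 1, "s3_compatible": 3}
actions = reconcile_copies(required, actual)
```

Here the object is one copy short on the first tier and has two surplus copies on the second, so the result asks for one creation and two deletions.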

Primary spindown storage — On a SAIN system, HCP can be configured to use primary spindown storage for tiering purposes. Primary spindown storage is primary storage that consists of disks that can be spun down when they are not being accessed. You can configure the service plan for any given namespace to define primary spindown storage as a storage tier for the objects in that namespace. Using primary spindown storage to store object data that’s accessed infrequently saves energy, thereby reducing the cost of storage.

HCP moves object data between primary running storage, primary spindown storage, and other types of storage that are used for tiering purposes according to rules that are specified in storage tiering strategies defined by service plans.

For more information on primary spindown storage, see Storage for HCP systems. For more information on service plans, see About service plans.

Economy storage — HCP can be configured to use economy storage, which is storage on external HCP S Series Nodes that are separate from the HCP system. The S Series Nodes are used for tiering purposes, and the HCP system communicates with them through the HS3 API and management API.

Extended storage — HCP can be configured to use extended storage for tiering purposes. Extended storage is storage that’s managed by devices outside the HCP system. HCP can be configured to use these types of extended storage:

- NFS — Volumes that are stored on extended storage devices and are accessed using NFS mount points

- Amazon S3 — Cloud storage that’s accessed using an Amazon Web Services user account

- Google Cloud — Cloud storage that’s accessed using a Google Cloud Platform user account

- Microsoft Azure — Cloud storage that’s accessed using a Microsoft Azure user account

- S3-compatible — Any physical storage device or cloud storage service that’s accessed using a protocol that’s compatible with the Amazon S3 access protocol

Moving object data from primary storage to extended storage frees up HCP system storage space so that you can ingest additional objects.

Note: Although all of the data for an object can be moved off primary running storage and stored only on extended storage, at least one copy of the system metadata, custom metadata, and ACL for that object must always remain on primary running storage.

In addition, you can optimize data storage price/performance for the objects in a namespace by configuring the service plan for that namespace with a storage tiering strategy that includes storage tiers for multiple types of extended storage.

HCP moves object data between primary running storage, primary spindown storage (if it’s used), and one or more types of extended storage according to rules specified in the storage tiering strategies defined by service plans.

For more information on extended storage, see Extended storage components. For more information on service plans, see About service plans.

Metadata-only objects — With multiple HCP systems participating in a replication topology, you may not need to store object data in every system. A metadata-only object is one from which HCP has removed the data, leaving the system metadata, custom metadata, and ACL for the object in place. HCP removes the data from an object only if at least one copy of that data exists elsewhere in the topology.

Metadata-only objects enable some systems in a replication topology to have a smaller storage footprint than other systems, even when the same namespaces are replicated to all systems in the topology.

HCP makes objects metadata-only according to the rules specified in service plans. If the rules change, HCP can restore data to the objects to meet the new requirements.

For more information on metadata-only objects, see Making objects metadata-only. For more information on service plans, see About service plans.
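
The safety condition for metadata-only objects can be written as a simple check. This sketch is illustrative only: the `can_make_metadata_only` function and the idea of tracking data locations as a set of system names are assumptions, not an HCP interface.

```python
def can_make_metadata_only(system, data_locations):
    """Decide whether `system` may drop its copy of an object's data.

    data_locations: set of names of the systems in the replication topology
    that currently hold a copy of the object data (hypothetical bookkeeping).
    The data may be removed only if another system still holds a copy.
    """
    holds_data = system in data_locations
    copy_elsewhere = len(data_locations - {system}) >= 1
    return holds_data and copy_elsewhere


locations = {"hcp-east", "hcp-west"}
allowed = can_make_metadata_only("hcp-east", locations)
```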

© 2017 Hitachi Data Systems Corporation. All rights reserved.