Microsoft Storage Spaces Direct virtualizes all storage devices attached to a server to create a highly available/scalable clustered software defined storage solution. Each server runs Windows Server 2016 Data centre or Windows Server 2019 Data centre and SSD manages all of storage attached to that server including NVMe, SSD, SAS, SATA drives to create a high performance clustered pool of storage that can be used to provide Hyper-V VM’s flexible storage at a fraction of the cost of traditional SAN or NAS arrays.
In essence Microsoft Storage Spaces Direct is a hyper-converged infrastructure with the ability to provide up to 13.7 million IOPS! The Microsoft Storage Spaces Direct architecture allows you to deploy a cluster in under 15 minutes. The architecture is completely self healing with the ability to add more nodes as and when required, with support for 16 servers and over 400 drives, for up to 1PB of storage per cluster!
Microsoft Storage Spaces Direct features a built-in server-side cache to maximise storage performance. It is a large, persistent, real-time read and write cache. The cache is configured automatically when SSD is enabled. In most cases, no manual management whatsoever is required. How the cache works depends on the types of drives present.
Resilient File System (ReFS) is Microsoft’s newest file system, designed to take advantage of new drive technologies, scale efficiently to large data sets across diverse workloads, provide data protection and self-healing against possible data corruption.
Once these tiers are configured, ReFS use them to deliver fast storage for hot data and capacity-efficient storage for cold data:
All writes will occur in the performance tier, and large chunks of data that remain in the performance tier will be efficiently moved to the capacity tier in real-time.
If using a hybrid deployment (mixing flash and HDD drives), the cache in Storage Spaces Direct will help accelerate reads, reducing the effect of data fragmentation characteristic of virtualized workloads. Otherwise, if using an all-flash deployment, reads will also occur in the performance tier.
In addition to providing resiliency improvements, ReFS introduces new features for performance-sensitive and virtualized workloads. Real-time tier optimization, block cloning, and sparse VDL are good examples of the evolving capabilities of ReFS, which are designed to support dynamic and diverse workloads:
Mirror-accelerated parity delivers both high performance and capacity efficient storage for your data.
To deliver both high performance and capacity efficient storage, ReFS divides a volume into two logical storage groups, known as tiers. These tiers can have their own drive and resiliency types, allowing each tier to optimize for either performance or capacity. Some example configurations include:
ReFS is designed to support extremely large data sets–millions of terabytes–without negatively impacting performance, achieving greater scale than prior file systems. SSD utilizes ReFS provides storage management across a maximum of 16 nodes and up to 400 drives. ReFS supports data deduplication with volumes up to 64TB and file sizes up to 1TB.
Supported drives:
Deploying ReFS on Storage Spaces with shared SAS enclosures is suitable for hosting archival data and storing user documents:
Integrity-streams, online repair, and alternate data copies enable ReFS and Storage Spaces to jointly to detect and correct corruptions within both metadata and data.
Storage Spaces deployments can also utilize block-cloning and the scalability offered in ReFS.