This content is brought to you from vRetreat day 2018.
Cohesity started their session with the title “Data Protection and then some!”, which for me set the tone for the remainder of the session. The next statement was that Cohesity is an enterprise product, built for enterprise businesses. This was then followed by a breakdown of customer logos.
My personal view on this was, “what is classed as an Enterprise business?”… a business that has a low user count but high data protection requirements may not fall into that bracket. Also, showing big customer logos is great, but show some smaller niche businesses too, otherwise you could alienate customers and automatically make them think you are priced for Enterprise customers only!
As you can see from the following Cohesity timeline image, the business was founded by the same guy who was involved in building the webscale architecture at Google and Nutanix! Based on that fact, if you look closely at the Nutanix and Cohesity GUIs, they are very similar. Obviously the scale-out architecture is the same, as it uses webscale!
Cisco & HP both invested in Cohesity in round C funding!! As far as Cohesity are aware, this is the first time in the market that the two vendors have backed the same technology.
Version 5 covers any hypervisor and any NAS, whereas version 4 did not.
What is the focus for Cohesity:
Firstly, they don’t do primary storage! Cohesity are focussing on secondary storage, meaning test/dev, backup, archive, file shares, analytics and cloud repositories. The biggest market for Cohesity right now is backup, but they do much more. The biggest competitor they could put themselves up against is Rubrik. At its heart, Cohesity is about bringing simplicity.
The image above outlines nicely what Cohesity see as their market share within the secondary storage market.
The traditional secondary storage implementation is something like:
Consolidation with Cohesity begins with phasing out other secondary requirements.
Cohesity consolidation continues:
You can take it as far or as limited as you see fit for your existing investments, and Cohesity support cloud integration with any NFS or S3 target. Data held on Cohesity is instantly cloneable/mountable due to its snapshotting functionality.
Cohesity version 4.0 doesn’t have a plugin for antivirus, whereas the 5.0 release supports AV integration.
The hardware is built and manufactured by Intel. Very low failure rate on hardware, supposedly. A little more high-end in quality than other providers using Supermicro (in their opinion).
Each node has 2 CPUs and 256GB of memory; a full box has 64 CPU cores. Each node has around 1TB of flash, with spinning disk in the background and flash as a tiering layer (2 flash cards and 3 spinning disks per node).
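To put those numbers together, here is a quick back-of-the-envelope sketch. It assumes a full box holds 4 nodes (the "4 node box" mentioned later in this post); the constant and function names are my own, purely for illustration.

```python
# Figures quoted in the session; NODES_PER_BOX is an assumption
# taken from the "per 4 node box" scaling unit mentioned in this post.
NODES_PER_BOX = 4
CPUS_PER_NODE = 2
CORES_PER_BOX = 64
MEMORY_GB_PER_NODE = 256
FLASH_TB_PER_NODE = 1  # "around 1TB", so treat the totals as approximate


def box_summary(boxes: int = 1) -> dict:
    """Rough totals for a given number of fully populated boxes."""
    nodes = boxes * NODES_PER_BOX
    return {
        "nodes": nodes,
        "cores": boxes * CORES_PER_BOX,
        "cores_per_cpu": CORES_PER_BOX // (NODES_PER_BOX * CPUS_PER_NODE),
        "memory_gb": nodes * MEMORY_GB_PER_NODE,
        "flash_tb": nodes * FLASH_TB_PER_NODE,
    }


print(box_summary(1))
```

On those assumptions a single box works out at 8 cores per CPU, 1TB of memory and roughly 4TB of flash.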
Architecture video: https://www.youtube.com/watch?v=juprGdkgaiA
Scale & Snapshot
Cohesity provide inline dedupe and compression, included and built into the product. Dedupe applies to all data on Cohesity regardless of the protocol (S3, SMB, NFS etc). By design, all metadata is held in flash for quick rehydration and recovery rates. There is no penalty as there is with products such as Data Domain, as it’s using webscale: more available CPU, memory etc.
Cohesity have tested up to 256 nodes and no performance hit has been seen!! Hence the claim of unlimited growth using the webscale architecture.
As you can see on the charts, there is a slight dip in performance and this is supposedly down to the way rebalancing occurs on the webscale architecture.
Cohesity have no limit on snapshots; they have customers with 20,000 snapshots!!!!! (dear god that sounds bad) With no degradation in performance. Snapshots with Cohesity use SnapTree.
SnapTree removes the need to read the whole chain when recovering from backups, for example. It also allows snaps to be checked in parallel. You back up your data once as a full backup, and doing so creates a SnapTree; each incremental then depends only on the full backup, not on the snapshot before it. Recovering a file from a snap therefore means reading only the full backup and 1 incremental, not every incremental in a chain, as no chain exists with SnapTree. The SnapTree YouTube video outlining this can be found here: https://www.youtube.com/watch?v=W14xr06oXh0
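To make the chain vs no-chain difference concrete, here is a toy Python model of the behaviour as it was described in the session. This is not Cohesity's implementation: every name here is hypothetical, and the "view per snapshot" is simply my way of modelling "each incremental only depends on the full backup".

```python
from typing import Dict, List, Tuple

Blocks = Dict[int, str]  # block number -> contents


def restore_chained(full: Blocks, incrementals: List[Blocks],
                    upto: int) -> Tuple[Blocks, int]:
    """Traditional chain: replay every incremental up to the target snap."""
    state = dict(full)
    reads = 1  # the full backup
    for inc in incrementals[:upto]:
        state.update(inc)
        reads += 1
    return state, reads


def build_views(incrementals: List[Blocks]) -> List[Blocks]:
    """Per-snapshot view: every block that differs from the full backup."""
    views: List[Blocks] = []
    changed: Blocks = {}
    for inc in incrementals:
        changed = {**changed, **inc}
        views.append(changed)
    return views


def restore_from_view(full: Blocks, views: List[Blocks],
                      upto: int) -> Tuple[Blocks, int]:
    """No chain: read the full backup plus one view, whatever the snap."""
    state = dict(full)
    reads = 1
    if upto > 0:
        state.update(views[upto - 1])
        reads += 1
    return state, reads


full = {0: "a", 1: "b", 2: "c"}
incrementals = [{0: "a1"}, {1: "b1"}, {0: "a2"}]
views = build_views(incrementals)

print(restore_chained(full, incrementals, 3))  # same data, 4 reads
print(restore_from_view(full, views, 3))       # same data, 2 reads
```

Both restores return the same blocks, but the chained version reads the full backup plus all three incrementals, while the view-based version reads the full backup plus one view. With 20,000 snapshots that difference is the whole point.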
Cohesity provide QoS to allow tiering of access to the storage. For example, you can prioritise your backup jobs or indexing/searching tasks.
Typical support is 5 years on the hardware. You can mix models in a cluster and it will auto-tier. Cohesity currently don’t advise mixing the archive units and backup units, but as of the 5.0 release I believe this is fine in the same cluster.
As with most HCI platforms, the minimum cluster node count is 3, and as it’s a webscale-based architecture there are no more controller replacements or implementations required; you scale per node/per 4 node box. Unlimited scale.
Compress, Dedupe & Encrypt
Cohesity use 7 different compression algorithms and based on the data the system will use the best algorithm for that dataset. Compression is the big win! Not Dedupe! But Cohesity will dedupe first, then compress, then encrypt. You can disable dedupe and compression on specific data sets if required for better performance.
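The dedupe, then compress, then encrypt ordering can be sketched in a few lines of Python. This is only an illustration of the ordering and of "pick the best algorithm per dataset": it uses three standard-library codecs rather than Cohesity's 7 (which weren't named in the session), picks the winner simply by smallest output, and `toy_encrypt` is a throwaway stand-in that must not be used as real encryption. All names are hypothetical.

```python
import bz2
import hashlib
import lzma
import zlib
from typing import Dict, List, Tuple

# Stand-ins for the product's algorithms, which were not named.
CODECS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def dedupe(chunks: List[bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Keep one copy of each unique chunk, plus an ordered list of references."""
    store: Dict[str, bytes] = {}
    refs: List[str] = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)
        refs.append(digest)
    return store, refs


def best_compress(data: bytes) -> Tuple[str, bytes]:
    """Try every codec and keep whichever gives the smallest output."""
    results = {name: fn(data) for name, fn in CODECS.items()}
    winner = min(results, key=lambda name: len(results[name]))
    return winner, results[winner]


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy XOR keystream, NOT real crypto. Running it twice decrypts."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], pad))
    return bytes(out)


def ingest(chunks: List[bytes], key: bytes):
    """Dedupe first, then compress, then encrypt each unique chunk."""
    store, refs = dedupe(chunks)
    stored = {}
    for digest, chunk in store.items():
        codec, packed = best_compress(chunk)
        stored[digest] = (codec, toy_encrypt(packed, key))
    return stored, refs


stored, refs = ingest([b"A" * 4096, b"B" * 4096, b"A" * 4096], b"demo-key")
print(len(refs), "chunks written,", len(stored), "unique chunks stored")
```

The ordering matters: encrypted data looks random, so it neither dedupes nor compresses well, which is why encryption has to come last.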
Handling data that is already encrypted at source isn’t possible; the route around this is to unencrypt it and re-encrypt it on the Cohesity. The default is to use Cohesity certificates / KMS, but you can use your own if you want. Cohesity will replicate the key to the cloud storage too, so if your local Cohesity fails you can still access your data. There is roughly a 2% penalty on performance when encrypting.
All management is HTML5 based. Looks very much like Nutanix Prism… kind of makes sense that the founder is the same for both organisations.
Mounting SQL Server at any point in time is possible.
SQL adapter info: https://www.youtube.com/watch?v=f33U3tZH1rE
Cohesity secondary storage silos..
Cohesity and Cloud Architectures
Cohesity and cloud integration info can be found here in a nice video: https://www.youtube.com/watch?v=VlP1MSUtDAs
Cohesity is available in AWS, Azure and GCP. It can archive to cloud providers and tier storage to the cloud where appropriate. Please note that tiering to the cloud is not really advised for now, as it will move the coldest data to the cloud. You don’t know what data is where at this point!
Asynchronous replication… this is not like Zerto replication! Backup > replicate, backup > archive.
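To show why "coldest data first" leaves you unsure what ends up where, here is a small sketch of a coldest-first tiering decision. It is my own guess at the general idea, not Cohesity's actual policy; the `Extent` fields, the names and the capacity figure are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Extent:
    name: str
    size_gb: int
    days_since_access: int  # higher = colder


def tier_coldest(extents: List[Extent],
                 local_capacity_gb: int) -> Tuple[List[Extent], List[Extent]]:
    """Move the coldest extents to cloud until the rest fits locally."""
    remaining = sum(e.size_gb for e in extents)
    cloud: List[Extent] = []
    for e in sorted(extents, key=lambda e: e.days_since_access, reverse=True):
        if remaining <= local_capacity_gb:
            break
        cloud.append(e)
        remaining -= e.size_gb
    local = [e for e in extents if e not in cloud]
    return local, cloud


extents = [
    Extent("backup-jan", 40, 200),
    Extent("fileshare", 30, 1),
    Extent("testdev", 20, 15),
    Extent("archive-2016", 50, 700),
]
local, cloud = tier_coldest(extents, local_capacity_gb=60)
print("local:", [e.name for e in local])
print("cloud:", [e.name for e in cloud])
```

Notice that the decision is driven purely by age and capacity, not by what the data is. That is exactly the concern raised in the session: you only find out what went to the cloud after the fact.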
Site to site architecture with Cohesity for private cloud can be seen below:
For ROBO deployments, Cohesity support the following:
Run virtual or low-end ROBO boxes at the remote site, then back up to the Cohesity box in the datacentre. This means a local backup with a replicated backup for resilience and protection.
Legacy vs Cohesity
The Cohesity view of legacy secondary storage approaches vs their own approach is:
A great overview session from Cohesity. There are some points to note around where they play best, but overall I think it provides a comprehensive platform. And who doesn’t want unlimited scalability with 20,000 snapshots!!!!!! :s