Storage
== Storage Options ==
There are a number of key alternative storage solutions available to you in your homelab, some of which depend on your compute choices, and some of which do not. They are listed below, roughly from cheapest to most expensive (depending on your drive choices and number of nodes!):
# [[Storage#Local Storage|Local drives]] in your [[VMware vSphere|ESXi]] or [[vInception]]/[[vTARDIS]] host
# [[Storage#Whitebox Storage|Whitebox storage]], typically running on a dedicated physical host (a perfect example of which would be an [[HP MicroServer]]), or in some cases another VM! This typically runs open source storage software such as [[FreeNAS]] or [[OpenFiler]].
# [[Storage#Software Defined Storage and HCI|Software Defined Storage (SDS)]] platforms, such as [[VMware VSAN]] or [[Nutanix CE]]
# [[Storage#Vendor NAS|Vendor NAS]], e.g. [[Synology]], [[QNAP]], [[ReadyNAS]], etc.
# [[Storage#Vendor Array|An actual vendor array]] (aka the eBay special)
# [[Storage#Server Side Caching|Server Side Caching]]
# [[Storage#Cloud Storage|Cloud Storage]]
Vendor NAS and Software Defined Storage will vary in terms of which is most expensive, depending on which drive types you buy, how many bays you have in your NAS, and which flash drive types you use (if any).

=== Local Storage ===
This is the simplest, least expensive option, and can range from something as simple as running a lab from a 5400 RPM SATA drive in your laptop, all the way to doing nested VSAN in a single host with a load of flash drives for optimum performance. This kind of configuration is often associated with a vInception-style setup, such as when you install a [https://en.wikipedia.org/wiki/Hypervisor#Classification Type 2 Hypervisor] onto a Windows, Linux or Mac operating system, then run VMs inside that, some of which may even be hypervisors themselves!
Taking this a step further, you could even install a [https://en.wikipedia.org/wiki/Hypervisor#Classification Type 1 Hypervisor] such as ESXi or Hyper-V onto the physical server and run as many VMs as you can fit. Alternatively, if you want to test running multiple datastores over iSCSI or NFS, you could even install a VM on top of your physical host and have it present datastores / LUNs to your other VMs using some [[Storage#Whitebox Storage|whitebox storage software]]. There will definitely be a performance hit, but if logical testing is all you want to do, it is perfectly acceptable and will run well with the addition of some flash.

[[File:SharkoonQuickPort4Bay.jpg|250px|border|right]]
The key benefits to using local storage are:
* Simplicity - you don't need to use any storage protocols, you don't need any additional network configuration, etc. You simply start carving VMDKs from your local drives. Couldn't be easier!
* You are free to buy whatever case and compute you like, so you can choose how big you want to scale your storage node, from a single drive up to potentially a dozen or more in a single chassis!
* Cost - this is the lowest cost solution, as you have to buy the drives anyway, so you have virtually zero further overhead.

The main drawbacks are:
* Although the cost is low, bear in mind that you are limited to the number of bays in the single host. It is possible to add additional disk bays depending on your physical chassis; for example, Icy Dock have a wide range of add-on SATA disk cages which typically fit into one or more 3.5" or 5.25" slots and allow you to add more [http://www.icydock.com/category.php?id=117 2.5"] or [http://www.icydock.com/category.php?id=113 3.5"] drives. Other vendors are of course available, including for example the [https://www.sharkoon.com/product/1686/12640#desc Sharkoon Quickport] line!
* To provide any decent resilience, to scale significantly beyond your motherboard's maximum number of SATA ports, or even to protect yourself from motherboard failures with something you can easily replace, you may require an additional SATA array card. These can be reasonably expensive, but decent enterprise ones such as [http://www.dell.com/learn/uk/en/ukbsdt1/campaigns/dell-raid-controllers Dell PERC] (which are usually rebranded [https://www.adaptec.com/en-us/support/raid/ Adaptec] cards anyway) can even be purchased on eBay and retrofitted in a PCI slot. They typically come in 4/8/16 port configurations, but remember that the secret is to have as much cache in them as possible if you want decent performance.
* Lack of flexibility and scalability; both are limited to whatever you can achieve in your single host. That said, it is still possible to add further physical hosts to your cluster, as long as you accept the single point of failure of your primary host and have some method for sharing that storage (e.g. by installing some storage software either in the OS or in a VM on that host). Which leads us neatly onto [[Storage#Whitebox Storage|whitebox storage]].
* Lastly, if you like to keep your lab running 24/7, you will need to take your lab down to patch and reboot the underlying OS, which is obviously a bit of a pain. No such thing as non-disruptive upgrades on local storage!

=== Whitebox Storage ===
At this point, things start getting interesting, and closer to the type of configuration you are likely to see in the real world! Using [[Homelab Storage Software|storage software]] installed on either a virtual or physical server, you can present NFS or iSCSI LUNs / datastores to your hypervisor hosts, and even object storage to your virtual machines. Once this [[Homelab Storage Software|shared storage]] is in place, it opens up a whole load of new testing and availability possibilities, including VMware vMotion and Microsoft Live Migration.
[[File:whitebox.jpg|200px|border|left]]
One of the biggest benefits to running whitebox storage is that it doesn't cost you a huge amount over just doing local storage, especially if you are using an inexpensive storage box (options such as [[HP MicroServer]]s are absolutely ideal for this, as they can hold at least 4, and up to 8, drives when you add a 5.25" disk expansion unit). However, any vendor or whitebox server with a handful of bays will work just fine. Once you have your kit and your drives, the next key thing is simply to choose and install your software. Here you have a mahoosive range of software options; we have listed a number of them on the '''[[Homelab Storage Software]]''' page.

Another key benefit of whitebox storage (assuming you are using VMware vSphere as a hypervisor) is the ability to take advantage of VAAI. Many of the storage software vendors, including FreeNAS, NexentaStor, etc, support it. VAAI has a load of very useful features (known as primitives) which will help your lab to fly, the most useful of which is the ability to offload cloning of VMs to your storage, and as such clone VMs from templates or other VMs in a few seconds!

The main drawback to running a custom storage software stack is that it is in direct contravention of the [[Keep It Simple Stupid|KISS principle]] of reducing complexity. You may end up spending a significant amount of time managing your storage software when you could be doing other things in your lab. Nothing wrong with this of course, if you want to learn more about that storage software, and some of the vendor trialware / free restricted software (e.g. [[EMC vVNX]], the [[NetApp Data ONTAP]] simulator, etc) can be absolutely ideal for this. Similarly to the [[Storage#Local Storage|local storage]] method, downtime will be required for all of your lab VMs every time you need to patch the storage host or storage software.
Finally, because this is basically just adding some more intelligent software on top of your physical kit, you get most of the same benefits and drawbacks as [[Storage#Local Storage|Local Storage]], with a lot more scalability and flexibility, but an equal helping of complexity to go with it!

=== Software Defined Storage and HCI ===
There are two key categories of software defined storage which you could feasibly use in your homelab, though in reality you are most likely to use the latter:
* Scale-Out Software Defined Storage
* Hyper-converged Infrastructure
A number of Software Defined Storage and HCI solutions are listed on the '''[[Homelab Storage Software]]''' page for further reading.

==== Scale-Out Software Defined Storage ====
Unless you have a seriously impressive budget, standard SDS solutions are likely to be something you play with in the lab, more than something which you run your lab on. The main reason for this is that they would require you to run multiple physical nodes dedicated to storage only, which most of us don't / can't afford to do. In an enterprise environment SDS is awesome, as it allows you to scale as you grow and make incremental hardware investments as you require additional capacity. Unless you are running many terabytes of data in your homelab, you are simply not going to need the scalability which SDS affords you. The one exception to this is probably object storage, where most object storage software is scale-out by default, and as such would be appropriate in a homelab environment, even if in reality you actually end up virtualising it anyway.

==== [[Hyper-converged Infrastructure]] ====
For more detailed information, see the '''[[Hyper-converged Infrastructure]]''' article.

[[File:vsan.png|350px|border|right]]
There are a great many benefits to running [[Hyper-converged Infrastructure]] ([[HCI]]) for small businesses, ROBO, etc, and these use cases can be directly equated to the requirements of many homelab users.
If you have sufficient budget and space to run multiple physical chassis in your lab, then perhaps HCI is an ideal solution for you, as it comes with the following key benefits:
* No need to invest in a separate physical storage device, saving on budget, power/cooling, and noise.
* Using a mixture of flash and spindle drives, for typical homelab workloads you can expect to get excellent performance, as most of the working set will live in flash (for which a reasonable rule of thumb is around 10% of your raw spindle capacity).
* Many of the [[HCI]] solutions include full support for all of the latest storage enhancements to hypervisors; for example VSAN supports both VAAI and VVols. This is ideal for helping you to learn these technologies early in their product lifecycles.
* Assuming you have a reasonable number of bays in each physical host, HCI can potentially scale ''mahoosively''. For example, even using small towers with just 4 bays per host would allow up to 36-40TB of raw space in a 3-node cluster using relatively inexpensive 4TB drives! Even assuming the use of 1x 2TB drive and 1x 250GB flash device per host, you still end up with 6.75TB of raw space, which is more than enough to run a very decent homelab!
* Lastly, one massive benefit if you like to keep your lab running 24/7 is the ability to take down individual nodes for maintenance, patching, etc, whilst your lab stays up! Most [[Storage#Local Storage|local storage]], [[Storage#Whitebox Storage|whitebox]], and even [[Storage#Vendor NAS|vendor NAS]] solutions are going to be built on a single controller architecture, meaning that to complete patching of your storage software you have to take down all of your lab VMs. For many of us this is a right pain in the rear, and use of HCI avoids this!
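The raw capacity figures quoted above are easy to sanity-check yourself. A minimal sketch, using the article's rules of thumb (the function names are illustrative, and the drive sizes and ~10% flash ratio are the rough figures from the text, not vendor sizing guidance):

```python
# Back-of-envelope raw capacity arithmetic for a small HCI cluster,
# using the figures quoted in the text above (hypothetical helper names).

def raw_capacity_tb(nodes, spindles_per_node, spindle_tb):
    """Raw (unprotected) spindle capacity across the cluster, in TB."""
    return nodes * spindles_per_node * spindle_tb

def flash_rule_of_thumb_tb(raw_tb, fraction=0.10):
    """~10% of raw spindle capacity as flash, per the rule of thumb above."""
    return raw_tb * fraction

# 3 nodes x 3 data drives x 4TB (one bay per host kept free for flash)
big = raw_capacity_tb(nodes=3, spindles_per_node=3, spindle_tb=4)
print(big)                          # 36 (TB raw)
print(flash_rule_of_thumb_tb(big))  # ~3.6TB of flash suggested

# Minimal build: 1x 2TB spindle + 1x 250GB flash per host
small = raw_capacity_tb(nodes=3, spindles_per_node=1, spindle_tb=2)
small_flash = 3 * 0.25
print(small + small_flash)          # 6.75 (TB raw, spindle + flash)
```

Note these are raw numbers; usable capacity after replication or erasure coding will be lower.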
[[HCI]] in a homelab is not without its drawbacks of course:
* It is generally best practice to keep capacity across all nodes roughly the same, so assuming a minimum of 3 nodes in a cluster, as you scale capacity in future you will need to buy at least 3 drives at a time.
* You will require chassis with sufficient drive bays to accommodate typically a minimum of two drives.
* To get decent scalability you probably won't want to use an ultra-SFF chassis, though people are already running VSAN on [[Intel NUC]]s. You just have to remember that with a maximum of two drives, if you want to increase storage capacity you either need to replace drives or add nodes to your cluster.
* There are fewer options available for HCI and SDS than other solutions; however, as the [[HCI]] market grows this can be expected to improve, both through additional competitors entering the market and through incumbents introducing free tiers, in the same fashion as Nutanix did recently with [[Nutanix CE]].
* Most [[HCI]] solutions require reasonably durable flash devices. On a consumer budget you are at greater risk of needing to replace drives if you use your lab a lot. If you are reasonably conservative in your workloads and use decent consumer drives, such as those tested and recommended in the [[VMware VSAN|Open Homelab VSAN]] article, you can expect to get a decent lifetime out of your flash devices and this becomes a non-issue.
* [[HCI]] can be reasonably intensive on your network, so if possible it is worthwhile considering the use of a dedicated NIC / port for your storage traffic.
* Some [[HCI]] solutions can require a minimum of 1-2 vCPUs and 2-8GB RAM from every host in your cluster. If you are using small hosts with minimal resources, you can end up dedicating significant capacity to your storage software and losing capacity for running VMs. Ideally for an HCI solution you would probably want to run a minimum of 32GB per host to counteract this.
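That last point is worth quantifying. A quick sketch using the RAM ranges quoted above (the controller VM sizes are the article's 2-8GB range, not figures from any specific product, and the helper name is hypothetical):

```python
# Share of a host's RAM consumed by an HCI storage controller VM,
# using the 2-8GB controller range quoted in the text (illustrative only).

def storage_overhead_fraction(host_ram_gb, controller_ram_gb):
    """Fraction of a host's RAM dedicated to the storage software."""
    return controller_ram_gb / host_ram_gb

# On a small 16GB host, an 8GB controller VM eats half your RAM...
print(storage_overhead_fraction(16, 8))  # 0.5
# ...whereas on the recommended 32GB host it is only a quarter.
print(storage_overhead_fraction(32, 8))  # 0.25
```

This is why the larger per-host RAM recommendation matters: the controller overhead is fixed, so bigger hosts waste proportionally less.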
=== Vendor NAS ===
The most popular vendor in this category is currently probably Synology, but other popular vendors include [[QNAP]], [[ReadyNAS]], etc. One of the main reasons for using this method (assuming you have the budget) is [[Keep It Simple Stupid|KISS]]. NAS devices are generally very simple to set up and maintain, so minimal feeding and watering is required. Contrary to popular belief, it is actually possible to get started in the NAS market for a relatively low cost, depending on your requirements and how much capacity you require.

[[File:ds415plus.jpg|250px|border|right]]
The key benefits to using a vendor NAS include:
* Typically very quick and easy to set up. Plug in the drives, run a wizard, and you can start presenting NFS shares or iSCSI LUNs within a few minutes.
* Easy to support, as any major issues can be escalated to the manufacturer.
* Generally very stable, meaning that you spend more time working in your lab than fixing storage issues.

The main drawbacks to a vendor-based NAS can be:
* Cost; they are not the cheapest devices on the planet, especially compared to local storage, where you have bought the drives anyway!
* Limited quantity of drive bays. Most NAS devices are 2-5 bays max, and the more bays you want, the more the price goes up too!
* Most vendor NAS boxes are single controller, and as such you will typically need to take down all of your lab VMs every time you need to do a software or firmware update of your vendor box. Many vendors release patches on a monthly basis, so you can choose either to take down your lab regularly, or to run with known security holes for periods of time.

=== Vendor Array ===
Not for the faint hearted, a full-on vendor array is rare in the wild, but they are out there! Commonly procured from eBay, they are also occasionally provided by vendors for PoCs, or by businesses throwing out their old kit.
If you are lucky enough to get your hands on one of these, and have the space to rack it where the noise won't drive you mad, then you can look forward to personal service from your account manager at the power company, as money starts leeching from your account to theirs in huge quantities! A typical enterprise array will drain around 200-300W per shelf, and another 300-600W for the head units! It's no wonder we keep having to upgrade power and cooling capacity in data centres!

The biggest benefits are:
* Very similar to a real enterprise environment, so excellent for learning.
* Devices are usually dual controller, so for 24/7 labs, very good uptime can be maintained.
* Usually a rich feature set and broad set of [[Storage#Data Services|data services]].

The biggest drawbacks are:
* Noise.
* Vibration (can be as annoying as noise if you have it sitting in the loft or similar).
* Power consumption (ouch!).
* May even require specific power connections or three-phase power, and could exceed the breaker limits on your consumer unit.
* Cost of spares if anything fails.
* Usually requires that you have an actual rack to house it in.
* Usually requires that it is run 24/7 - enterprise arrays don't often take kindly to being turned off, especially if they are a bit older.

Using a real vendor array is not for everyone, but there are plenty of people out there using them today! If you have managed to get hold of some real vendor enterprise storage to run in your homelab, you would ideally be best off buying a big box of chocolates for your boss or DC manager, and asking if your company can colo it for you for free in a comms room or data centre! If not, you had better hope you have a well ventilated basement or garage with space for a rack!
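To see why the power company will love you, it is worth doing the electricity arithmetic. A rough sketch using the wattage figures quoted above; the tariff is a placeholder assumption, so substitute your own price per kWh:

```python
# Back-of-envelope annual running cost for a vendor array left on 24/7.
# Shelf/head wattages are the rough figures from the text above;
# the price per kWh is an assumed placeholder tariff.

HOURS_PER_YEAR = 24 * 365  # 8760

def annual_kwh(watts):
    """Energy consumed per year at a constant draw, in kWh."""
    return watts * HOURS_PER_YEAR / 1000

def annual_cost(watts, price_per_kwh=0.15):  # assumed tariff
    return annual_kwh(watts) * price_per_kwh

# Two shelves at ~300W each plus a ~600W head unit
draw_w = 2 * 300 + 600
print(annual_kwh(draw_w))           # 10512.0 kWh per year
print(round(annual_cost(draw_w), 2))  # 1576.8 at the assumed tariff
```

Even a modest two-shelf array therefore runs to four figures a year in electricity alone, before you account for cooling.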
=== Server Side Caching ===
Although not strictly a standalone storage technology, server side caching can be used both to massively boost performance in your homelab, and to make sure that any existing investment you have in (for example) your whitebox or vendor NAS / array is maximised, so you get the most longevity out of your existing kit. Server side caches typically come in two flavours, read only and read/write, the former of which is becoming more and more rare. Some example server side caching vendors are:
* [[PernixData]]
* [[Intel Cache Acceleration Software]]
* [[VMware vSphere Flash Read Cache (vFRC)]]

The biggest benefits of server side caching in the homelab are:
* You simply provision a single SSD in each physical compute host in your lab, and you will immediately get some significant performance benefits. No need to go all-flash on your shared storage (or even use up expensive NAS slots with SSDs); just fill that full of a few spindles for plenty of capacity.
* Because you are unlikely to be running all of your homelab VMs at full utilisation all of the time, the VM storage working set is likely to revolve mostly around whatever you are testing at the time, so you don't have to have massive SSDs to achieve generally decent performance.
* You can retrofit server-side caching to an existing lab very easily, with the addition of an SSD to each host and a bit of software!

Disadvantages include:
* Missed read cache hits. 90% or more of your IOs may hit the server side cache and see sub-millisecond latencies, but anything which doesn't will be limited by the maximum latency of your spindle-based storage. In the real world this can be a significant issue for business critical applications, but in the homelab it is not such a big deal! The same applies to any form of caching, or even scale-out distributed multi-tier storage, if your working set exceeds your flash size.
* You typically require a minimum of 2-3 compute nodes to support server side caching, each of which will require an SSD, so this is not a low-budget option.

=== Cloud Storage ===
We go into this in more depth in the [[Cloud Labs]] section of the site, but cloud-based storage is useful for a multitude of functions, from primary storage for a cloud lab, to an inexpensive and flexible off-site backup target for your backup and replication solution.

'''NEEDS EXPANSION'''