A couple of years ago, Ivan Pepelnjak wrote an interesting blog post showing that all you need are two top-of-rack switches. Iwan Rahabok followed up on the systems side, with the perspective that 1000 VMs per rack is the new minimum.
Now, in mid-2017, I'd like to show you how we can fit 2M network routes, 1 PB of storage, and over 1000 VMs in a half rack. Don't worry, I included redundancy as well.
Disclaimer: My examples use specific vendor equipment to show how this is possible, but I'm sure you can find different devices to achieve the same goal. I am not employed by any of these vendors, and I only picked this equipment because it fits the design goal well.
(2) Arista 7280CR2K-30
(10) NetApp HCI 2U Servers - 13x Large Compute Nodes & 25x Large Storage Nodes
My network choice is fairly straightforward: dual-purpose switches with high-capacity ports that support 10/25/40/50/100 GbE and that also function as edge peering routers handling 2M routes each.
For compute/storage, I chose to go hyper-converged, as I believe the era of big storage controllers is over. Distributed, fast storage is here to stay, and something like NetApp HCI provides a controller-less architecture that lets you add capacity and performance without worrying about controller throughput. To be fair, I've been wary of hyper-converged products, but NetApp HCI retains the benefit of separate compute and storage upgrades, like a traditional compute/storage stack.
I currently run enterprise, dev, QA, and prod workloads in 100% virtualized environments. My VMs range from 1 vCPU and 512 MB RAM to 16 vCPU and 64 GB RAM. I took my total provisioned vCPU and RAM and divided by the total number of VMs to get my average, which is 2 vCPU and 4 GB RAM per VM.
1000 VMs = 2000 vCPU and 4 TB RAM
I do not overcommit RAM, but I overcommit CPU at 5:1 (vCPU:pCPU) with good results. This may vary in your environment.
2000 vCPU ÷ 5 = 400 pCPU
Each NetApp HCI large compute node contains 36 pCPU.
400 pCPU ÷ 36 = 11.1 NetApp large compute nodes; let's round up to 12. N+1 means we need 13 large nodes. This will also provide almost 10 TB of RAM, which is more than enough. Note: If you can push 10:1 overcommit on your compute, you can run 2000 VMs per half rack!
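The compute arithmetic above can be sketched in a few lines of Python so you can plug in your own averages. This is only a rough sketch: the function and parameter names are my own for illustration, and the inputs are the figures from this post (2 vCPU per VM, 5:1 overcommit, 36 pCPU per large node, one spare node for N+1).

```python
import math

def compute_nodes_needed(vm_count, vcpu_per_vm, overcommit, pcpu_per_node, spare_nodes=1):
    """Return the compute node count, including spare nodes (N+1 by default)."""
    total_vcpu = vm_count * vcpu_per_vm          # e.g. 1000 * 2 = 2000 vCPU
    pcpu_needed = total_vcpu / overcommit        # e.g. 2000 / 5 = 400 pCPU
    base_nodes = math.ceil(pcpu_needed / pcpu_per_node)  # round up to whole nodes
    return base_nodes + spare_nodes

# Figures from this post: 1000 VMs at 5:1, and 2000 VMs at 10:1.
baseline = compute_nodes_needed(1000, 2, 5, 36)
dense = compute_nodes_needed(2000, 2, 10, 36)
print(baseline, dense)  # 13 13
```

Both cases land on the same 13 nodes because doubling the VM count and doubling the overcommit ratio cancel out. The sketch only sizes CPU, so check RAM separately if you do not overcommit it.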
NetApp claims 44 TB of storage capacity per large storage node, assuming 10x space efficiency (dedup/compression/compaction). This may seem like a stretch, but it really depends on your workloads. My datastores see anywhere from 1.5x to 38x efficiency. My average is 23x, so 10x is actually a very reasonable number for me.
1 PB (1024 TB) ÷ 44 TB = 23.3 large storage nodes, round up to 24. N+1 means we need 25 large nodes.
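Since the storage count hinges on the efficiency ratio, here is a small Python sketch for testing other ratios. It rests on two assumptions of mine that are not vendor figures: 1 PB is taken as 1024 TB, and effective capacity per node scales linearly with the efficiency you actually achieve relative to the claimed 10x. The function and parameter names are illustrative only.

```python
import math

def storage_nodes_needed(target_tb, claimed_tb_per_node=44, claimed_efficiency=10,
                         actual_efficiency=10, spare_nodes=1):
    """Return the storage node count, including spare nodes (N+1 by default)."""
    # Assumption: effective capacity scales linearly with the efficiency ratio.
    effective_tb = claimed_tb_per_node * actual_efficiency / claimed_efficiency
    base_nodes = math.ceil(target_tb / effective_tb)  # round up to whole nodes
    return base_nodes + spare_nodes

at_claimed = storage_nodes_needed(1024)                    # 10x, as in this post
at_half = storage_nodes_needed(1024, actual_efficiency=5)  # a more pessimistic 5x
print(at_claimed, at_half)  # 25 48
```

At 5x efficiency the node count nearly doubles, which would no longer fit in a half rack, so it is worth measuring your own datastores before committing to the 10x figure.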
Compute and Storage Totals:
13 Compute + 25 Storage = 38 Total Nodes. Each 2U NetApp HCI server holds 4 nodes, so 38 ÷ 4 = 9.5, which rounds up to 10 2U servers. That gives us 2 free node slots for future use, and we use a total of 20U.
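The chassis math is simple enough to do by hand, but a short Python sketch makes it easy to rerun when the node counts change. The function name and defaults are my own for illustration; the 4 nodes per chassis and 2U per chassis come from the figures above.

```python
import math

def chassis_plan(compute_nodes, storage_nodes, slots_per_chassis=4, rack_units_per_chassis=2):
    """Return (chassis count, free node slots, rack units used)."""
    total_nodes = compute_nodes + storage_nodes
    chassis = math.ceil(total_nodes / slots_per_chassis)   # partial chassis still takes 2U
    free_slots = chassis * slots_per_chassis - total_nodes
    rack_units = chassis * rack_units_per_chassis
    return chassis, free_slots, rack_units

chassis, free_slots, rack_units = chassis_plan(13, 25)
print(chassis, free_slots, rack_units)  # 10 2 20
```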
Each NetApp HCI node has 2x or 4x 10/25 GbE ports (depending on whether it's a compute or storage node). For our configuration, we will just use 2x 25 GbE ports per node, for a total of 76 25 GbE ports for the servers. We also need 2x ports for internet peering and another 2x ports for WAN connectivity, for a total of 80 ports. Each Arista 7280CR2K-30 gives us 120 10/25 GbE ports, so we have plenty of capacity. The 7280CR2K-30 also handles 2M routes, so we can take full internet BGP feeds with no problem. Just separate your internal/external route tables appropriately.
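Here is a quick Python sketch of the port budget. The names are hypothetical, and the even split across the two switches is my assumption about the cabling (one port from each pair to each switch); the post only gives the totals.

```python
def port_budget(nodes, ports_per_node=2, peering_ports=2, wan_ports=2,
                switches=2, ports_per_switch=120):
    """Return (ports needed, ports per switch if split evenly, fits on one switch)."""
    needed = nodes * ports_per_node + peering_ports + wan_ports
    # Assumption: links are spread evenly across the switches for redundancy.
    per_switch = needed // switches
    # Worst case: a single surviving switch would have enough ports for every link.
    fits_on_one = needed <= ports_per_switch
    return needed, per_switch, fits_on_one

needed, per_switch, fits_on_one = port_budget(38)
print(needed, per_switch, fits_on_one)  # 80 40 True
```

With 40 of 120 ports used per switch, there is room to light up the extra ports on the storage nodes later without adding hardware.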
To keep everything to a half rack, we can use VMware NSX with distributed Palo Alto firewalls. If you prefer a physical firewall, the Arista switches have plenty of spare ports, but you will go over our half rack goal :)