First, let me tell you that I am by no means a SAN administrator. However, I was lucky enough to work on a project at my new job where we were searching for an all-flash SAN to replace our Fusion-io cards. The project involved a lot of testing, analysis and negotiation, and I learned quite a few things along the way, most of them about dealing with vendors. While there are some great articles that cover the technical details, I had a tough time finding information on what to look for when purchasing a SAN. After returning two major SANs because they didn't meet our expectations, I am going to list the things I learned throughout the journey; hopefully they will be helpful when you are in the market for a new SAN. I like to get deep and dirty with the details, so this list might be overkill for some institutions, but if you want the most bang for your buck you really need to understand how a SAN works. There are 4 pieces to this puzzle:
- Gather metrics. Have a very good understanding of what your current environment looks like, and back up your numbers with a good collection of perfmon data.
- Write down the features that are must-haves and the ones that are nice to have. For example, you might need excellent performance but not dedup, and adding dedup to the mix might degrade performance. Or you might need a lot of storage, in which case compression and dedup are necessary as long as you are OK with taking some hit on performance. Some of the main features are compression, snapshots, replication, dedup, QoS, backups, encryption and clones; there could be more.
- Benchmark your existing storage. I have used sqlio before, but with newer “intelligent” SANs sqlio might not be a good tool: it writes all zeros, which an array with compression or dedup can collapse to almost nothing, so most likely you would be testing just the controllers. Diskspd is a good replacement for sqlio, and David Klee has an excellent post about it. You need to benchmark your existing storage to understand what to expect from the new SAN; this will help you see how much you would gain or lose. You can also give the same sample script to the SAN vendor so that they can test their system and share their results.
- Shop around for a SAN. This is basically research: ask friends at other organizations what they use, and attend events like PASS and Super Computing. If you are in the SQL Server world, most of the leading vendors will be present at PASS events. Super Computing is also a very good place to learn about newer SANs.
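For the benchmarking step, here is a minimal sketch of how a repeatable diskspd run could be scripted so the same test can be handed to a vendor. The helper name `build_diskspd_args`, the target path and all the default values are my own placeholders, not anything from a real POC; the read/write mix and block size should come from your own perfmon baseline. The `-Z1M` switch asks diskspd to fill its write source buffer with random data, which is one way to avoid the easily compressible writes described above.

```python
def build_diskspd_args(target, block_kb=8, write_pct=30, duration_s=60,
                       threads=4, outstanding=8, file_size_gb=10):
    """Build a diskspd command line as a list of arguments (nothing is executed)."""
    if not 0 <= write_pct <= 100:
        raise ValueError("write_pct must be between 0 and 100")
    return [
        "diskspd.exe",
        f"-b{block_kb}K",      # block size per I/O
        f"-d{duration_s}",     # test duration in seconds
        f"-t{threads}",        # threads per target
        f"-o{outstanding}",    # outstanding I/Os per thread
        f"-w{write_pct}",      # percentage of I/Os that are writes
        "-r",                  # random I/O instead of sequential
        "-Sh",                 # disable software caching and hardware write caching
        "-L",                  # collect latency statistics
        "-Z1M",                # 1 MiB write source buffer filled with random data
        f"-c{file_size_gb}G",  # create the test file at this size
        target,                # hypothetical test file path
    ]


# Print the command so it can be reviewed or pasted into a script.
print(" ".join(build_diskspd_args(r"T:\diskspd_test.dat", block_kb=64, write_pct=25)))
```

Building the arguments as a list keeps the test definition in one place; the same list can be passed to `subprocess.run` on the server under test, or printed and sent to the vendor.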
1. Have a very good understanding of what you are looking for, and capture a baseline using perfmon. Let perfmon run for a couple of weeks on the server or servers that will use the SAN. A few key metrics that helped me:
- PhysicalDisk – Avg. Disk sec/Read (read latency)
- PhysicalDisk – Avg. Disk sec/Write (write latency)
- PhysicalDisk – Disk Reads/sec (IOPS): use this to understand what % of the activity is reads
- PhysicalDisk – Disk Writes/sec (IOPS): use this to understand what % of the activity is writes
- PhysicalDisk – Disk Transfers/sec (total IOPS)
- PhysicalDisk – Disk Bytes/sec (throughput)
After reviewing the perfmon data you should be able to answer a few important questions about your environment:
- What is the average IOPS (in total for all drives; if you have different types of drives, get it for each one)? What is the peak, and how long does it stay at peak?
- What is the average read/write latency? What is the read/write latency at peak IOPS?
- What is the average throughput, and what are the peaks?
The goal here is to have a good understanding of your averages, but also to know whether the SAN will fall over or be just fine when you hit the peak.
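To make those questions concrete, here is a rough sketch of how the answers could be pulled out of the samples. It assumes the counter values have already been loaded into plain Python lists (for example from a perfmon log exported to CSV), that samples are 15 seconds apart, and that "at peak" means within 90% of the highest sample; the function names and all of those choices are mine, so adjust them to your own collection.

```python
from statistics import mean


def summarize(values, peak_fraction=0.9, interval_s=15):
    """Return the average, the peak, and the longest continuous time spent near peak."""
    if not values:
        raise ValueError("no samples")
    peak = max(values)
    threshold = peak * peak_fraction
    longest = current = 0
    for v in values:
        # Count consecutive samples at or above the "near peak" threshold.
        current = current + 1 if v >= threshold else 0
        longest = max(longest, current)
    return {
        "avg": mean(values),
        "peak": peak,
        "longest_near_peak_s": longest * interval_s,
    }


def read_write_mix(reads_per_sec, writes_per_sec):
    """Return (read %, write %) of total I/O across all samples."""
    total_reads = sum(reads_per_sec)
    total_writes = sum(writes_per_sec)
    total = total_reads + total_writes
    if total == 0:
        return (0.0, 0.0)
    return (100 * total_reads / total, 100 * total_writes / total)


# Made-up samples of Disk Transfers/sec, Disk Reads/sec and Disk Writes/sec.
transfers = [100, 200, 1000, 950, 300]
print(summarize(transfers))
print(read_write_mix([600, 900], [200, 300]))
```

The same `summarize` call works for the latency and Disk Bytes/sec counters, so one pass per counter gives you the average, the peak and how long the peak lasted.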
2. So far you have spent enough time gathering details about your environment, so by now you should have a very good idea of what you are looking for. Just like with any other product, every vendor will say their SAN is the best; at the end of the day you really need to know what the $/GB value is, given that the SAN meets your requirements. Getting to an actual $/GB was really tricky, because each vendor calculates it differently. Some calculate it with estimated compression and dedup (what they think they can compress your data to; some claim 8X, but I have never seen more than 3X for SQL Server), and some calculate $/GB before RAID. I was thrilled when an upcoming SAN vendor told me I could get $3/GB; I was jumping up and down. After 1 month of testing and a lot of back and forth, we realized that for us it would be $12/GB (as a matter of fact, anyone using SQL Server on their SAN would be paying $12/GB). You might hear a lot of SAN vendors claiming 250M IOPS with sub-millisecond latency. Yeah, they get those numbers with a 4KB block size; in SQL Server everything is between 8KB and 64KB. There are some really good SAN vendors, you just need to do your research and ask a lot of questions. After dealing with so many vendors, I realized that had I asked them to fill out an Excel sheet up front, I would have saved 100 hours of my life... OK, I might be exaggerating a little, but certainly many hours. I made a simple Excel sheet and filled out the columns for each vendor as I went through my POC, which helped me make my decision.
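To put every quote on the same footing, the $/GB arithmetic can be sketched like this. The price, capacity, RAID efficiency and reduction ratios below are made-up numbers, chosen only to show how a quote moves from $3/GB to $12/GB when the assumed data reduction drops from 8X to 2X; they are not the actual figures from my POC. The second helper rescales a quoted IOPS number to a larger block size under the simplifying assumption that the array is limited by throughput, which gives a rough upper bound and not a measured result.

```python
def cost_per_gb(price, raw_gb, raid_efficiency=1.0, reduction_ratio=1.0):
    """Price divided by effective capacity after RAID and data reduction."""
    if raw_gb <= 0 or raid_efficiency <= 0 or reduction_ratio <= 0:
        raise ValueError("capacity, RAID efficiency and reduction ratio must be positive")
    usable_gb = raw_gb * raid_efficiency        # capacity left after RAID overhead
    effective_gb = usable_gb * reduction_ratio  # capacity after compression/dedup
    return price / effective_gb


def iops_at_block_size(quoted_iops, quoted_block_kb, target_block_kb):
    """Rescale IOPS to another block size, assuming constant throughput."""
    return quoted_iops * quoted_block_kb / target_block_kb


# Hypothetical quote: $240,000 for 12,500 GB raw, 80% left after RAID.
vendor_claim = cost_per_gb(240_000, 12_500, raid_efficiency=0.8, reduction_ratio=8)
what_you_get = cost_per_gb(240_000, 12_500, raid_efficiency=0.8, reduction_ratio=2)
print(f"vendor claim: ${vendor_claim:.2f}/GB, at 2X reduction: ${what_you_get:.2f}/GB")

# Hypothetical 1,000,000 IOPS quoted at 4 KB, restated at SQL Server block sizes.
for block_kb in (8, 64):
    print(block_kb, "KB:", iops_at_block_size(1_000_000, 4, block_kb))
```

Asking each vendor for the raw capacity, the RAID overhead and the reduction ratio they assumed lets you recompute the number yourself with the ratio you actually measured.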