By Pat Cimprich, Avanade
This post is a response to the following blog: http://www.communities.hp.com/online/blogs/datastorage/archive/2008/11/08/making-sense-of-wafl-part-2.aspx.
In that post, Karl Dohm of HP raised some criticisms and questions regarding the tests and paper I wrote about NetApp in April 2007. (Link here on Netapp.com and here on Avanade.com).
As Karl points out, Avanade and NetApp did indeed establish a partnership prior to the release of this paper. However, the paper was not created as part of a marketing campaign to push the relationship. The reality is that the paper was the result of extensive testing we undertook to validate that NetApp storage would indeed stand up to the punishing demands of Exchange workloads (in this case Exchange 2003).
We performed this testing to determine whether we were comfortable recommending NetApp solutions to our customers. It’s our job to make sure we're informed about the technologies we recommend to our customers. If these tests had not turned out positively, we would not have agreed to the partnership with NetApp.
Another data point I'd like to mention is that the version of the whitepaper available on NetApp's web site is a condensed version of a much longer paper. The original paper is 45 pages long and contains a much more in-depth review of the tests including full array configurations, Iometer configurations, I/O distributions, etc. That original paper dates from Q4 of 2006.
That paper also contains performance information for another storage vendor. The format of the test was a comparison of NetApp to another leading storage vendor. That full paper isn't released to the public as the intent of our testing wasn't to discredit one vendor vs. another; it was to confirm NetApp would work as compared to a platform we'd used successfully in Exchange 2003 deployments for years.
Unfortunately the condensed version of the paper available on netapp.com doesn't include full test disclosure. I wanted it in there, but I'm a techy and the marketing folks won this particular battle :-).
In the interest of full disclosure, here are the Iometer access specifications used during those tests.

| Transfer Request Size | % of Access Specification | % Random Distribution | % Read Distribution |
| --- | --- | --- | --- |
| 0.5 KB | 10 | 100 | 80 |
| 1 KB | 5 | 100 | 80 |
| 2 KB | 5 | 100 | 80 |
| 4 KB | 60 | 100 | 80 |
| 8 KB | 2 | 100 | 80 |
| 16 KB | 4 | 100 | 80 |
| 32 KB | 4 | 100 | 80 |
| 64 KB | 10 | 100 | 80 |

As you can see - this configuration is in no way representative of any type of Exchange workload (5.5, 2000, 2003, 2007). It is more of a wide distribution of requests intended to mimic file server workloads. For the record I took this configuration from here: www.bluesmoke.net/viewArticle.cgi?id=g6&page=6.
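To put a single number on that mix, the access-weighted mean request size can be worked out from the table. The snippet below is only an illustrative sketch: the function names are made up for this example, and it assumes the percentages are shares of I/O requests (which is how Iometer access specifications are normally read).

```python
# Access specification from the table above: (transfer size in KB, % of requests).
ACCESS_SPEC = [
    (0.5, 10), (1, 5), (2, 5), (4, 60),
    (8, 2), (16, 4), (32, 4), (64, 10),
]


def mean_request_kb(spec):
    """Return the request-weighted mean transfer size in KB."""
    total_pct = sum(pct for _, pct in spec)
    return sum(size * pct for size, pct in spec) / total_pct


def byte_share(spec, size_kb):
    """Return the fraction of all bytes moved by requests of the given size."""
    total = sum(size * pct for size, pct in spec)
    moved = sum(size * pct for size, pct in spec if size == size_kb)
    return moved / total


if __name__ == "__main__":
    print(f"Mean request size: {mean_request_kb(ACCESS_SPEC):.2f} KB")
    print(f"Share of bytes from 4 KB requests:  {byte_share(ACCESS_SPEC, 4):.1%}")
    print(f"Share of bytes from 64 KB requests: {byte_share(ACCESS_SPEC, 64):.1%}")
```

Under those assumptions the mean request works out to roughly 11 KB: 4 KB requests dominate by count, while the 64 KB requests carry most of the bytes.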
Let me also address your other questions about MPIO policy, HBA queue depth, and NetApp extents. No MPIO was used for any connectivity - Fibre Channel or iSCSI. Good question on HBA queue depth - it’s been a few years since this testing, but I believe we had it configured at a middle-of-the-road value of around 32 or 64. Extents weren't available in ONTAP when this test was run so that's an easy one - no extents.
Looking at your original post you comment on performance degradation of NetApp systems. In simple terms - yes - this is accurate. NetApp volumes can appear exceedingly fast - to the point of defying physics sometimes - when they are brand-spanking new. It's also true that the performance of those volumes does indeed degrade over time. However, that performance does not trend down indefinitely. It will flatten out and establish a very consistent level of performance.
This is a known phenomenon in NetApp and if you talk to their performance engineers (as I have), they will acknowledge that this happens. However as with everything, knowledge is power, and with this knowledge the NetApp folks are able to establish sizing tools and guidelines that yield the appropriate FAS configuration for a given workload. Now if NetApp were to ignore this issue and try to sell the performance of the array based on that initial performance, there'd be a problem. However their sizing models are based on the long-term equilibrium performance, not the overachieving fresh volume performance.
At the beginning of your previous post you also focus on what I agree are the important factors for storage administrators: performance, space efficiency, and ease of use. My personal opinion is that NetApp does a very good job in all of these categories.
I'll start with performance - we could all spend enormous amounts of time in the quest for perfection, but we almost always need to get back to what's 'good enough' for a given performance requirement. Am I capitulating by saying that the performance of NetApp is simply 'good enough'? No. I'm only stating things in pragmatic terms: NetApp works and performs well. This isn't based only on my observations nor solely on NetApp's literature; it’s also based on seeing what Avanade's done with our customers using NetApp technology over the past 3 years. I'm fortunate enough to be in a role where I get to see a good spectrum of what Avanade's global consulting force is doing for our customers and in our collective experience, good performance from NetApp systems is not an anomaly.
With respect to space efficiency, WAFL does have overhead. There's no denying that. We're not talking huge percentages here though and NetApp would contend, and I'd agree, that the benefits outweigh the costs. Let's look at a more comprehensive view of space efficiency though and consider the impacts of snapshots and de-dupe. NetApp doesn't have the market cornered on snapshots, but their approach is compelling and the results are impressive: space is required only for changed blocks and there is effectively no performance overhead. The benefits obtained through de-dupe are also impressive. Bear in mind this is de-dupe of live data and that this functionality is available for free from NetApp. The space savings realized from both of these technologies more than make up for the space overhead inherent to WAFL.
Now we get to ease of management - an area where I think NetApp shines. NetApp arrays are very simple to manage. This perspective comes not from just running a simple test in a lab. This comes now from 3+ years of running live systems on NetApp storage. Among other things I run Avanade's global labs used for all internal software development, hosted demo systems, customer proof-of-concept systems, hosted collaboration systems, etc. It’s not an enormous 24x7 production operation with thousands of servers, but we do have storage from multiple vendors in the lab totaling around 200-TB of space. Of all of the storage arrays, the NetApp systems see the bulk of the work because they are so easy to manage. Adding new hosts, reconfiguring systems for different workloads, supporting parallel performance tests, supporting simple backup and recovery; all of these are important in an environment that is fast-paced and constantly evolving. These systems are so easy to manage that I'm able to have college interns on my team manage day-to-day operations of these systems… I'd never attempt that with arrays from a number of other vendors.
Now I'm not saying NetApps are the only storage arrays that are easy to manage. I'm only saying that in my experience NetApp does a great job here and this is a significant differentiator for them.
Back to the test results. For grins I re-executed this test using a NetApp 3070 I currently have in my lab. This time I again used 20 spindles in a single aggregate (300-GB, 15k FC). The RAID config was RAID-DP and configured such that 18 spindles were for data and 2 were parity. I again created a 1-TB LUN on a Windows host (the LUN resided on a 2-TB NetApp Volume - the only volume on the Aggregate). I mounted that LUN in a Hyper-V VM running Windows Server 2008 and used iSCSI as the storage protocol. The VM connected to the array using a single 1-Gbps Ethernet link.
I let Iometer create a 100-GB test file and used the I/O access specification listed above. To allow the volume to become fragmented, I let Iometer run with that access spec at a queue depth of 64 for 6 hours. At the end of that 6 hour period a reallocate measure reported that the volume had an optimization measurement of 22 - which I'm sure you'll agree is woefully fragmented.
After all that I re-ran the test using the Iometer Cycling Option "Cycle # Outstanding I/Os -- run step outstanding I/Os on all disks at a time." I configured 30 second warm-up periods, 3 minute run times, and an exponential stepping for the # of Outstanding I/Os starting at 1 and going up to 256. I did this with a single Iometer worker.
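For reference, the cycling schedule described above can be sketched in a few lines. This is an illustration only: it assumes the exponential stepping doubles the outstanding I/O count each step (which matches the queue depths in the result tables) and that the 30 second warm-up applies before every step; the function names are hypothetical.

```python
def outstanding_io_steps(start=1, end=256, factor=2):
    """Return the exponential sequence of outstanding I/O counts, inclusive of end."""
    steps = []
    depth = start
    while depth <= end:
        steps.append(depth)
        depth *= factor
    return steps


def total_runtime_seconds(steps, warmup_s=30, run_s=180):
    """Total wall-clock time if every step gets its own warm-up plus run time."""
    return len(steps) * (warmup_s + run_s)


if __name__ == "__main__":
    steps = outstanding_io_steps()
    seconds = total_runtime_seconds(steps)
    print("Steps:", steps)
    print(f"Estimated duration: {seconds} s ({seconds / 60:.1f} min)")
```

With those assumptions the sweep has nine steps and takes a little over half an hour per pass.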
Here are the results of the test:

| Iometer Queue Depth | Throughput (MB/s) | IOPS | Avg. Response Time (ms) |
| --- | --- | --- | --- |
| 1 | 2.5 | 221 | 4 |
| 2 | 3.7 | 327 | 6 |
| 4 | 7.4 | 654 | 6 |
| 8 | 11.8 | 1038 | 8 |
| 16 | 19.3 | 1715 | 9 |
| 32 | 28.8 | 2541 | 13 |
| 64 | 37.7 | 3323 | 19 |
| 128 | 44.1 | 3885 | 33 |
| 256 | 44.2 | 3884 | 66 |

Those numbers were lower than I was hoping for so I ran a reallocate to get fragmentation back to a controllable level. After the cleanup, reallocate reported an optimization level of 1 - which is optimum. I then ran the test again and here are the results.

| Iometer Queue Depth | Throughput (MB/s) | IOPS | Avg. Response Time (ms) |
| --- | --- | --- | --- |
| 1 | 2.7 | 241 | 5 |
| 2 | 4.2 | 374 | 5 |
| 4 | 8.9 | 788 | 5 |
| 8 | 14.6 | 1280 | 6 |
| 16 | 26.3 | 2317 | 7 |
| 32 | 41.5 | 3663 | 9 |
| 64 | 58.1 | 5109 | 13 |
| 128 | 70.6 | 6230 | 29 |
| 256 | 68.8 | 6088 | 42 |

Quite a difference indeed. However, that run was more of academic interest because we all know that running with the system completely unfragmented like this is unrealistic. I fired Iometer up again and let it run for an hour to try and get a more moderate and realistic level of fragmentation in the volume. At the end of the hour I ended up with an optimization level of 10 - perfect. Here are the test results from that configuration:

| Iometer Queue Depth | Throughput (MB/s) | IOPS | Avg. Response Time (ms) |
| --- | --- | --- | --- |
| 1 | 2.7 | 232 | 4 |
| 2 | 3.9 | 347 | 6 |
| 4 | 8.1 | 712 | 6 |
| 8 | 13.2 | 1161 | 7 |
| 16 | 22.6 | 2008 | 8 |
| 32 | 34.9 | 3075 | 10 |
| 64 | 48 | 4220 | 15 |
| 128 | 57.1 | 5048 | 25 |
| 256 | 56.5 | 4985 | 50 |

Right in the middle of the road.
These numbers are better than those published in the report. However, a lot's changed between the two data points: storage array, drive size, drive age, virtual vs. physical server, operating system, etc. I also don't have the optimization level recorded from the original test volume. All in all though, these numbers are pretty close to the original and had I the exact same gear I used during the test I'm sure they'd be even closer. I'd call that reproduced.
What this test also highlights though is the importance of properly maintaining a NetApp volume. Yeah - there are some pretty stark differences between optimized and highly fragmented. However, 22 is way, way out of alignment and hopefully not something someone would see in normal day-to-day operations. Like maintenance required for any system, reallocate jobs should be appropriately scheduled to keep things optimized. This is standard NetApp practice.
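One quick consistency check on the three result tables: dividing throughput by IOPS gives the average request size, which should land near the roughly 11 KB mean implied by the access specification listed earlier. The sketch below uses the queue depth 128 row from each run. It assumes the reported MB means 10^6 bytes and 1 KB means 1024 bytes; that unit convention is an assumption, and the names are hypothetical.

```python
# Request-weighted mean of the access specification, in KB:
# (0.5*10 + 1*5 + 2*5 + 4*60 + 8*2 + 16*4 + 32*4 + 64*10) / 100
SPEC_MEAN_KB = 11.08

# (throughput in MB/s, IOPS) at queue depth 128, copied from the three tables.
PEAK_ROWS = {
    "fragmented (level 22)": (44.1, 3885),
    "optimized (level 1)": (70.6, 6230),
    "moderate (level 10)": (57.1, 5048),
}


def implied_request_kb(throughput_mb_s, iops):
    """Average request size in KB, assuming 1 MB = 10**6 bytes and 1 KB = 1024 bytes."""
    return throughput_mb_s * 1_000_000 / iops / 1024


if __name__ == "__main__":
    for run, (mb_s, iops) in PEAK_ROWS.items():
        size = implied_request_kb(mb_s, iops)
        print(f"{run}: {size:.2f} KB per request (spec mean {SPEC_MEAN_KB} KB)")
```

Under that unit assumption, all three runs come out within about one percent of the specification's mean, which suggests the throughput and IOPS columns are at least internally consistent with the workload definition.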
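To quantify the gap between the runs, the IOPS figures from the three tables can be expressed relative to the optimized volume at each queue depth. This is just a sketch over the published numbers; the dictionary and function names are made up for the example.

```python
# IOPS by Iometer queue depth, copied from the three result tables.
FRAGMENTED = {1: 221, 2: 327, 4: 654, 8: 1038, 16: 1715,
              32: 2541, 64: 3323, 128: 3885, 256: 3884}   # optimization level 22
OPTIMIZED = {1: 241, 2: 374, 4: 788, 8: 1280, 16: 2317,
             32: 3663, 64: 5109, 128: 6230, 256: 6088}    # optimization level 1
MODERATE = {1: 232, 2: 347, 4: 712, 8: 1161, 16: 2008,
            32: 3075, 64: 4220, 128: 5048, 256: 4985}     # optimization level 10


def relative_iops(run, baseline):
    """Return each queue depth's IOPS as a percentage of the baseline run."""
    return {qd: 100.0 * run[qd] / baseline[qd] for qd in sorted(run)}


if __name__ == "__main__":
    frag = relative_iops(FRAGMENTED, OPTIMIZED)
    mod = relative_iops(MODERATE, OPTIMIZED)
    print("QD   fragmented   moderate   (% of optimized IOPS)")
    for qd in frag:
        print(f"{qd:<4} {frag[qd]:>9.1f}% {mod[qd]:>9.1f}%")
```

At a queue depth of 1 the fragmented volume still delivers about 92% of the optimized IOPS, but at 128 outstanding I/Os it drops to roughly 62%, while the moderately fragmented volume holds at about 81%. The penalty grows with load, which is why it matters most on busy systems.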
You are right that the summary paper that's on netapp.com is loose in its details. For that I apologize and hope this response provides some of the missing information.
For Avanade our interest in this testing isn't to stand on a soapbox and proclaim that NetApp is the best thing since sliced bread. Our interest is validating that we can recommend systems for our customers and know they're going to work. Avanade is a partner of NetApp's, but we also have partnerships with many technology companies (including HP :-). We undertake this sort of testing with gear from lots of these partners - all with the intent to better equip ourselves to make informed decisions when we're building solutions for our customers.
As for the validity of the results of the Exchange tests, they're certainly valid and we've repeated them multiple times. We've also performed similar tests with Exchange 2007 and those tests yielded similarly positive results. However, I'd rather look at production deployments, not tests. Over the course of the past couple of years, Avanade has deployed more than a few production Exchange systems on NetApp storage. These include Exchange 2003 as well as Exchange 2007 systems. Sizes range from 4,000 users on the low end to well over 100,000 users on the high end. Performance has been great and our customers have been pleased. In my book, these production deployments are worth far more than some tests done in a lab.