By Pat Cimprich, Avanade
This post is a response to the following blog: http://www.communities.hp.com/online/blogs/datastorage/archive/2008/11/22/making-sense-of-wafl-part-3.aspx.
This is a continuation of some clarifications regarding a paper written in April 2007 describing storage testing Avanade completed with NetApp storage. In this most recent post, Karl had some questions about the test re-execution and clarifications I posted a few weeks ago: http://blog.avanadeadvisor.com/blogs/patc/archive/2008/11/10/12107.aspx.
One clarification was around the storage interconnect. I had actually described the server and I/O configuration I used in my prior blog post. Here's a copy of that text: "I mounted that LUN in a Hyper-V VM running Windows Server 2008 and used iSCSI as the storage protocol. The VM connected to the array using a single 1-Gbps Ethernet link."
Karl was speculating that I had possibly configured Iometer to use multiple paths somehow, or that I was using a single 2-Gbps FC connection. We have already clarified that I was using iSCSI and that the single link was a mundane 1-Gbps Ethernet connection. I'll also plainly state that there was no special configuration of Iometer done to multi-path the storage traffic in any way.
I didn't set up MPIO because I was lazy :-). From a technical perspective though, it’s not necessary. I didn't capture network counters in my test a few weeks ago, but I did monitor for bottlenecks during test execution and there were none with respect to network bandwidth. This makes total sense to me given that my peak storage traffic was 70.6 MB/s at a queue depth of 128. A 1-Gbps Ethernet connection has a maximum theoretical speed of 125 MB/s, so at 70.6 MB/s there's plenty of bandwidth left.
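As a rough sanity check on that headroom claim, the arithmetic can be sketched in a few lines of Python. The figures are the ones quoted above; the variable names are illustrative only, and the calculation ignores Ethernet, TCP/IP and iSCSI protocol overhead, so the usable ceiling in practice is somewhat below the theoretical one.

```python
# Back-of-the-envelope check of link headroom, using figures quoted in the post.
LINK_GBPS = 1.0        # single 1-Gbps Ethernet link
PEAK_MB_PER_S = 70.6   # peak storage traffic at a queue depth of 128

# 1 Gbps = 1000 megabits per second; 8 bits per byte gives 125 MB/s.
# Assumption: decimal units, and no allowance for protocol overhead.
link_mb_per_s = LINK_GBPS * 1000 / 8

utilization = PEAK_MB_PER_S / link_mb_per_s
headroom_mb_per_s = link_mb_per_s - PEAK_MB_PER_S

print(f"Theoretical ceiling: {link_mb_per_s:.0f} MB/s")
print(f"Utilization at peak: {utilization:.1%}")
print(f"Remaining headroom:  {headroom_mb_per_s:.1f} MB/s")
```

Even with a generous allowance for overhead, a peak of 70.6 MB/s sits well under what a single 1-Gbps link can carry.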
I hosted this virtual machine (again - Hyper-V) on a Sun X4600 M2 server. The VM had 4 Virtual Processors, and 2-GB of RAM. The VM was running Windows Server 2008 x64 and I used the native Windows iSCSI stack. The LUN was created and prepared with SnapDrive 6.0.1. ONTAP is 7.2.5.1.
One further clarification regarding the NetApp Aggregate configuration: I used 20 spindles in my test Aggregate. I also changed the default RAID group size to 20 disks (the default is 16). This configuration is mentioned in both the whitepaper and my previous blog post, but I'm making a special effort to call it out here because differences in this configuration could have a significant impact on performance.
On another note - in addition to the NetApp FAS 3070 array that I have in my lab, I also have a FAS 2050 - one of NetApp's lower-end arrays. Out of curiosity I replicated the 20-drive configuration described in previous posts and re-executed the same Iometer test. To keep things simple, I ran the tests with no fragmentation on the volume. Drives were again 300-GB, 15k rpm; however, they were a combination of SAS and Fibre Channel (drives in the 2050 frame are SAS and those in external shelves are FC). The drives in the 3070 (as they are in the 3050) are all FC.
Here are those results.
| Iometer Queue Depth | Throughput (MB/s) | IOPS | Avg. Response Time (ms) |
|---|---|---|---|
| 1 | 2.9 | 253 | 4 |
| 2 | 4.1 | 360 | 5 |
| 4 | 8.1 | 720 | 6 |
| 8 | 13.2 | 1165 | 7 |
| 16 | 22.2 | 1953 | 8 |
| 32 | 32 | 2839 | 11 |
| 64* | 39.1 | 3467 | 19 |
| 128* | 39 | 3447 | 38 |
| 256* | 38.8 | 3423 | 76 |
* CPU bound - see more below
These numbers start out pretty close to the un-fragmented FAS 3070 test results in my prior blog post. They then begin to fall off at a queue depth of around 16.
When comparing these results at a queue depth of 128 to the numbers Karl obtained from his 3050, we see that the 2050 far surpasses the 3050 (39 MB/s here vs. 25.5 MB/s from Karl's 3050). This doesn't really make sense, as the 2050 is significantly less powerful than the 3050. I agree that there must be a deviation somewhere that is preventing parity in our results.
Some elaboration on the numbers from the 2050 above: The curve these results follow is very similar to the one delivered by the 3070. With a maximum of only 39.1 MB/s, however, the 2050 falls far short of the peak 70.6 MB/s (at a Q depth of 128) that the 3070 delivered. This is very much due to the power of the 2050. The 2050 is much less powerful than the 3070, with only 1 proc (vs. 4), 1/4 the cache, etc. As denoted by the asterisk in the table above (*), the processor on the 2050 became pegged at 100% starting at a Q depth of 64. This explains why the results flatten at 64 and beyond. Even at a Q depth of 32, the CPU hovered around 97%, which no doubt began impacting performance. The point I'm trying to make is that this relatively low-end array produced some respectable results.
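One way to see the flattening in the 2050 numbers is through Little's Law: with a fixed number of I/Os kept outstanding, average response time should be roughly queue depth divided by IOPS. The sketch below applies that to the table above. It assumes a single Iometer worker against a single target, so that the queue depth setting equals the total outstanding I/Os; that is an assumption, not something stated in the test description.

```python
# (queue depth, IOPS, measured avg. response time in ms) from the FAS 2050 table
RESULTS = [
    (1, 253, 4), (2, 360, 5), (4, 720, 6), (8, 1165, 7), (16, 1953, 8),
    (32, 2839, 11), (64, 3467, 19), (128, 3447, 38), (256, 3423, 76),
]


def predicted_latency_ms(queue_depth, iops):
    """Little's Law: outstanding I/Os = IOPS x latency, solved for latency."""
    return queue_depth / iops * 1000.0


for qd, iops, measured_ms in RESULTS:
    predicted_ms = predicted_latency_ms(qd, iops)
    print(f"QD {qd:>3}: predicted {predicted_ms:5.1f} ms, measured {measured_ms} ms")
```

Once IOPS stops growing (from a Q depth of 64 onward), each doubling of queue depth simply doubles the response time, which matches the 19, 38 and 76 ms figures in the table.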
Back to matters at hand. I agree with Karl - there must be some other factors that are preventing us from ending up with similar results. I produced nearly identical results with Fibre Channel in my original set of tests, so that doesn't seem like a likely culprit. I have also used Windows 2003 vs. Windows 2008; both 32-bit and 64-bit versions of Windows; tested with 3 different servers; and have used 2 different versions of ONTAP. None of those have made a difference.
So what are we left with? Driver versions are one. I suspect that's not a problem given Karl's attention to detail. MPIO might be another. It might be worth trying a single link to see if things change, since at these levels bandwidth isn't an issue. If Karl's interested, he might try iSCSI… I honestly don't expect this to make a difference, as I've compared FC to iSCSI head-to-head for 3 years now and, unless bandwidth is constrained, I've never seen a significant difference in performance. Our servers are also different, but I've used plenty of HP servers and I'm wholly confident that's not a problem. Our NetApp Aggregate configuration may be different, and if indeed it is, that could have a substantial impact on performance. Hopefully the Aggregate configuration provided above will allow us to confirm we're using the same settings.
Beyond these things, the only thing I can think of is some difference in Iometer config. I did my best to state the configuration I used in my prior post. To possibly help clarify the config, I just copied the Access Specification definition lines from my Iometer config file:
'Access specification name,default assignment
FileServerConfig,NONE
'size,% of size,% reads,% random,delay,burst,align,reply
512,10,80,100,0,1,0,0
1024,5,80,100,0,1,0,0
2048,5,80,100,0,1,0,0
4096,60,80,100,0,1,0,0
8192,2,80,100,0,1,0,0
16384,4,80,100,0,1,0,0
32768,4,80,100,0,1,0,0
65536,10,80,100,0,1,0,0
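For anyone comparing configurations, a quick cross-check is to compute the average transfer size implied by this access specification and see whether IOPS times that size lands near the reported throughput. The sketch below does that for the eight lines above; it assumes the first two columns are transfer size in bytes and percentage of the mix (as the header comment indicates) and that the throughput figures are in decimal megabytes per second.

```python
# Size/percentage lines copied from the access specification above.
ACCESS_SPEC = """\
512,10,80,100,0,1,0,0
1024,5,80,100,0,1,0,0
2048,5,80,100,0,1,0,0
4096,60,80,100,0,1,0,0
8192,2,80,100,0,1,0,0
16384,4,80,100,0,1,0,0
32768,4,80,100,0,1,0,0
65536,10,80,100,0,1,0,0
"""

# Each row: size, % of size, % reads, % random, delay, burst, align, reply
rows = [tuple(int(field) for field in line.split(","))
        for line in ACCESS_SPEC.splitlines()]

total_pct = sum(row[1] for row in rows)
avg_bytes = sum(row[0] * row[1] for row in rows) / total_pct


def throughput_mb_per_s(iops):
    """Expected throughput in decimal MB/s for a given IOPS figure."""
    return iops * avg_bytes / 1e6


print(f"Mix adds up to {total_pct}%")
print(f"Average transfer size: {avg_bytes:.2f} bytes")
print(f"Expected throughput at 3447 IOPS: {throughput_mb_per_s(3447):.1f} MB/s")
```

The weighted average works out to roughly 11 KB per I/O, and at the 3447 IOPS measured on the 2050 that gives about 39 MB/s, in line with the table. A noticeably different ratio of MB/s to IOPS in someone else's run would suggest a different access specification.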
I appreciate the back and forth here and do honestly hope we can get some clarity and closure. I spend significant time and effort validating our test results and we take the results of our tests very seriously. I do want to close, though, by stating again that we think NetApp makes great storage and that our customers not only get systems that perform well for Exchange configurations, but also get them without requiring excessive iron (to use a word Karl used). This statement addresses a question that Karl posed in his last post: do customers get good value running Exchange on NetApp? What I am saying is yes, they do.
Value is absolutely a significant factor that customers care about when deploying new systems. To be clear, Avanade is a consultancy; we have partnerships with many different hardware vendors and we gladly work with any technology our customers want us to use. If our customers are open to considering alternatives though, we might present different options. In those instances there absolutely must be a compelling value component to the story or our customers simply wouldn't listen. Our experience has been that NetApp solutions for Exchange not only meet or exceed performance needs, but also do so with a clear value story that our customers appreciate.