Archive
Solution to the Clariion plaid issue
The problem was not with plaids as such, but rather with network congestion that plaids easily trigger.
The default configuration of PowerPath makes all available paths active. With two iSCSI ports per SP, there are four paths, each 1Gbps wide, 4Gbps in total. However, that is true only for the array side; on the host we have only two 1Gbps NICs. So every time the array starts firing on all cylinders (and using plaids built from LUNs owned by different SPs is guaranteed to do exactly that), it pumps twice as much data as the host interfaces can handle. The result: network congestion, frame discards, and severely degraded throughput.
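A quick back-of-the-envelope sketch of that mismatch, using only the nominal link speeds mentioned above (nothing here is measured, and the variable names are purely illustrative):

```python
# Rough arithmetic behind the congestion: nominal link speeds only.

GBPS = 1                 # every link involved is 1 Gbps

array_paths = 4          # 2 iSCSI ports per SP x 2 SPs visible to the host
host_nics = 2            # two 1 Gbps NICs on the host

array_bandwidth = array_paths * GBPS   # what the array can push: 4 Gbps
host_bandwidth = host_nics * GBPS      # what the host can absorb: 2 Gbps

print(f"array side: {array_bandwidth} Gbps, host side: {host_bandwidth} Gbps")
print(f"oversubscription: {array_bandwidth / host_bandwidth:.0f}:1")  # 2:1
```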
Solution: change the mode of two of the four paths from active to standby, choosing them so that each SP keeps exactly one active path and each NIC carries exactly one active path. Alternatively, add two more NICs so that host bandwidth matches array bandwidth (though this may require four NICs on all other hosts as well, since using multiple host NICs on the same VLAN is not recommended).
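To make the first option concrete, here is a hypothetical sketch of the path layout and of how the active/standby choice works out; the NIC and port names are invented, and the real change would of course be made with PowerPath's own management tools, not a script:

```python
# Illustrative model of the four iSCSI paths and the active/standby choice.
# Names like "nic0" and "SPA-0" are made up for the sketch.

# Each path is (host NIC, array port); one port per SP is reachable from each NIC.
paths = [
    ("nic0", "SPA-0"),
    ("nic0", "SPB-0"),
    ("nic1", "SPA-1"),
    ("nic1", "SPB-1"),
]

def choose_modes(paths):
    """Keep one active path per SP and one per NIC; everything else goes standby."""
    used_sps, used_nics, modes = set(), set(), {}
    for nic, port in paths:
        sp = port.split("-")[0]          # "SPA" or "SPB"
        if sp not in used_sps and nic not in used_nics:
            modes[(nic, port)] = "active"
            used_sps.add(sp)
            used_nics.add(nic)
        else:
            modes[(nic, port)] = "standby"
    return modes

for (nic, port), mode in choose_modes(paths).items():
    print(f"{nic} -> {port}: {mode}")
# Active bandwidth is now 2 x 1 Gbps on both sides, so the array can no
# longer overrun the host NICs.
```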
Expect a new Primus KB and some changes to EMC iSCSI documentation.
Clariion plaids on RHEL 4
We will begin with a quick mention of the problem currently under investigation with EMC support:
- Host: Dell PowerEdge R710, PowerPath, RHEL 4.8.
- SAN: Clariion CX4-120
- Connectivity: iSCSI – two ports per SP, two ports on the host, two VLANs on Nortel ERS5520.
Problem: if we present two LUNs to the host, put them into a single VG, and then create an LV with no striping (the LUNs get concatenated on that volume), everything is fine. If we take the same LUNs and put them into a striped LV, per various EMC Best Practices documents (since the LUNs themselves are striped across physical disks, EMC calls this layout a “plaid”), read performance suffers badly, dropping to about 4MB/sec. Write performance stays perfect at over 100MB/sec.
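To connect this back to the congestion explanation above, here is a toy model (invented extent counts and LUN names) of how a long sequential read maps onto the two LUNs in each layout; the only point is that the striped LV drives both SPs, and therefore all four array paths, at the same time:

```python
# Toy model of a sequential read over a concatenated vs. a striped LV.
# Extent counts and LUN labels are made up purely for illustration.

LUN_A = "LUN owned by SP-A"
LUN_B = "LUN owned by SP-B"
EXTENTS_PER_LUN = 4  # pretend each LUN contributes 4 extents to the LV

def concatenated(io_range):
    """Linear LV: fill LUN A first, then LUN B."""
    return [LUN_A if ext < EXTENTS_PER_LUN else LUN_B for ext in io_range]

def striped(io_range):
    """Striped LV ('plaid'): alternate between the two LUNs every stripe."""
    return [LUN_A if ext % 2 == 0 else LUN_B for ext in io_range]

seq_read = range(8)  # a sequential read across the first 8 extents
print("concat :", concatenated(seq_read))  # touches one SP at a time
print("striped:", striped(seq_read))       # hammers both SPs simultaneously
```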