I have found the following statement in the HP document "EVA4000/6000/8000 Configuration Best Practices". On page 13, it states:
Leveling and reconstruction. Leveling and reconstruction performance can be optimized with a minimum of 5 GB of free space per disk group.
Best practice to optimize availability. Set the occupancy alarm to the larger of the capacity required for PDM or the total HP Continuous Access EVA write history log capacity, plus 5 GB. This capacity is converted into a percentage of the raw capacity and then rounded to the next largest whole number. The pseudo-Excel formula would be (see footnotes for description of functions):
In the above formula, what is the meaning/equivalent of ceiling and PDM_capacity? Take the following example from one of our EVA6000 disk groups (taken from Command View EVA):
Total Capacity : 8473 GB
Vraid5 Capacity : 1471 GB
Total Occupancy : 6633 GB
Protection Level : 1
Size of disks used : 146GB
Would you be able to calculate the above Occupancy Alarm? We don't use Continuous Access.
What's the implication of using the minimum of 5 GB as mentioned above? In reality, is 5 GB OK?
Many Thanks,
Charlie Bulosan
5 GB is not enough; you should reserve space for PDM events too. Each PDM event needs twice the capacity of the largest disk in the disk group. In your case that would be 146 GB * 2 = 292 GB, plus the extra 5 GB, for a total of 297 GB.
Applying the HP formula, this leads to an occupancy alarm level of 96% (297 GB is about 3.5% of the 8473 GB total, which rounds up to 4%, and 100% - 4% = 96%). The reason for this alarm level is that when a disk fails, the EVA needs enough free space to reconstruct it so that Vraid1/Vraid5 protection stays in place. This is comparable to other storage systems that use a dedicated hot spare drive for the same reason.
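For anyone who wants to repeat the calculation for another disk group, here is a minimal Python sketch of the arithmetic described above. It is not the formula from the HP document (which is not reproduced in this thread); it only encodes the steps quoted and used here. The function name and parameters are made up for illustration, and multiplying the PDM reserve by the protection level is an assumption (one PDM event per protection level).

```python
import math

def occupancy_alarm_percent(raw_capacity_gb, largest_disk_gb,
                            protection_level=1, ca_log_gb=0.0,
                            leveling_reserve_gb=5.0):
    """Estimate the occupancy alarm level (in percent) for a disk group."""
    # Assumption: each PDM event needs twice the largest disk,
    # and the protection level is the number of events to reserve for.
    pdm_capacity_gb = 2 * largest_disk_gb * protection_level
    # Larger of the PDM reserve or the Continuous Access log capacity, plus 5 GB.
    reserve_gb = max(pdm_capacity_gb, ca_log_gb) + leveling_reserve_gb
    # Convert to a percentage of raw capacity, rounded up to a whole number.
    reserve_percent = math.ceil(reserve_gb / raw_capacity_gb * 100)
    return 100 - reserve_percent

# Figures from the question: 8473 GB total, 146 GB disks, protection level 1,
# no Continuous Access.
print(occupancy_alarm_percent(8473, 146))  # 96
```

With a protection level of 2 the same disk group would come out lower, since twice as much space is held back.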
PDM is proactive disk management. I have no idea why it is called proactive. It normally works reactively, when a disk fails or when you add or remove disks, and so on.
Have you experienced a situation on an EVA disk group where the occupancy alarm required for leveling and reconstruction was set to around 96% and a disk then failed in that disk group?
The alarm level itself is harmless; it only triggers an event message when the occupancy level reaches the alarm threshold, nothing more.
We once had such a situation with a very full EVA. Usage was about 95% (alarm level 96%) and then a disk failed. While reconstructing the disk, the EVA reached the alarm level (because the capacity of the failed disk was missing) and sent an event message. As soon as we replaced the failed disk, the occupancy alarm cleared again.
The only important thing to keep in mind is to always have enough free space to allow reconstruction of a failed disk; otherwise you would have no RAID protection until the disk is replaced.
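As a rough way to check that headroom, the sketch below compares the free space in a disk group with the "twice the largest disk" rule of thumb mentioned earlier in this thread. The function name is hypothetical, and it assumes the total capacity and occupancy figures from Command View EVA are in the same units; treat the result as an estimate, not a guarantee.

```python
def reconstruction_headroom(total_gb, occupancy_gb, largest_disk_gb):
    """Return (free space in GB, number of disk failures it could absorb)."""
    free_gb = total_gb - occupancy_gb
    # Rule of thumb from this thread: one failure needs twice the largest disk.
    per_failure_gb = 2 * largest_disk_gb
    return free_gb, int(free_gb // per_failure_gb)

# Figures from the question: 8473 GB total, 6633 GB occupied, 146 GB disks.
free_gb, failures = reconstruction_headroom(8473, 6633, 146)
print(free_gb, failures)  # 1840 6
```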
With occupancy at 98%, I had two disks fail simultaneously. It was so bad that the hosts lost access to the EVA for just over two days while leveling occurred. I now run at 90% and have far fewer issues with leveling times.
I have raised a case with HP about this, and their answer was that if there are two simultaneous disk failures, the disk group will not survive. Two non-simultaneous failures will be fine.
That's not necessarily the case. I've had two simultaneous drive failures in the same disk group. In that situation the VRaid-5 disks in the group did indeed fail. The VRaid-1 disks in the group were OK once the disks had been replaced.
This does of course depend on how the RSS is distributed, so YMMV !