Cluster systems also have scenarios where the central system can fail and IT services no longer work.
Cluster systems require shared storage with synchronised data mirroring in two locations to ensure the highest possible level of availability. If hardware components fail at one location, the applications running on them can continue to operate at the second location. However, there are still scenarios within the field of cluster technology where the central system can fail, with the result that business-critical IT services no longer function.
Did you know that if a power failure takes place at a location, your storage cluster is no longer able to automatically transfer services to its cluster partners?
The storage cluster cannot distinguish between a complete failure of the cluster partner and an interruption of the interconnects. In case of doubt, no services are taken over.
This is for technical reasons: in the event of a power failure, the storage cluster cannot tell whether the interconnects between the locations have been temporarily interrupted or whether the cluster partner has failed completely. To rule out the risk of cluster inconsistency, no services are taken over from the other location.
When the interconnects between two cluster partners are interrupted, this is referred to as split brain syndrome. If a cluster transfer is carried out in this state, the same identity can exist twice in the cluster, so that transactions are processed in different volumes. These cannot be recombined later, which leads to a loss of data. For this reason, no automatic cluster transfer takes place under split brain conditions, and your business-critical IT services fail.
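The ambiguity described above can be illustrated with a minimal Python sketch. It is only a simplified model, not how any particular storage cluster is implemented: the names SiteState, heartbeat_received and decide are hypothetical, and the assumption is that a node's only view of its partner is a heartbeat over the interconnects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteState:
    """Ground truth about the remote location (hypothetical model)."""
    partner_alive: bool
    interconnect_up: bool


def heartbeat_received(state: SiteState) -> bool:
    """A heartbeat arrives only if the partner is alive AND the link is up."""
    return state.partner_alive and state.interconnect_up


def decide(heartbeat: bool) -> str:
    """Conservative policy: a missing heartbeat is ambiguous, so do not take over."""
    if heartbeat:
        return "keep running"
    return "no automatic takeover"


# Two very different situations at the remote location...
partner_power_failure = SiteState(partner_alive=False, interconnect_up=True)
interconnect_cut = SiteState(partner_alive=True, interconnect_up=False)

# ...look exactly the same from the local node's point of view.
same_observation = (
    heartbeat_received(partner_power_failure)
    == heartbeat_received(interconnect_cut)
)
```

Because both situations produce the same observation, the local node has no safe basis for a takeover in this model, which is why the services stay down.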
A potential remedy would be to create a third data centre which monitors both locations and, in case of failure, initiates service transfer to the remaining cluster partner. However, this has drawbacks: high costs due to additional space requirements and redundant LAN and SAN connections, as well as increased complexity. A residual risk of inconsistency due to split brain syndrome also remains.
Do you operate online shops or production control systems, carry out online banking or use other business-critical applications which must be available around the clock? You're looking in the right place, as ClusterLion improves the availability of your essential applications!
Cluster technology offers excellent availability, but power failures can still bring down services even in mirrored cluster systems. You can protect yourself from these incidents. Find out how here.
The availability of your storage cluster and critical corporate applications is improved.
Only two data centre locations are required, which reduces costs and complexity. Installation takes place with no interruptions.
We develop innovative solutions for your security. Take a decisive step to ensure the availability of your cluster.
The ClusterLion concept is the only one of its kind on the market that requires just two data centre locations. Even in a split brain scenario, data consistency within the cluster is guaranteed at all times.
ClusterLion can be retrofitted at any time with no interruption to your existing storage cluster during operations.
After commissioning, ClusterLion permanently monitors the power supply, the interconnects and selected services of the storage cluster.
If cluster services are affected by a failure, ClusterLion detects this immediately and responds accordingly.
Firstly, ClusterLion cuts off the power supply to the affected cluster nodes, thereby creating a consistent state within the cluster. This also prevents unintentional restart of the cluster node.
Secondly, ClusterLion initiates the transfer of cluster services to the still-functioning location via its in-house UMTS connections.
Your services continue to run securely and in an orderly manner at the still-functioning location – excellent availability, courtesy of ClusterLion.
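The two-step order described above (cut power first, transfer services second) can be sketched in Python as follows. This is an illustrative sketch only, not ClusterLion's actual implementation: handle_site_failure and its power_off and start_takeover callbacks are hypothetical names, and the rule that the takeover is skipped when the power cut cannot be confirmed is an assumption of this sketch.

```python
from typing import Callable, List


def handle_site_failure(
    failed_site: str,
    surviving_site: str,
    power_off: Callable[[str], bool],
    start_takeover: Callable[[str], None],
    log: List[str],
) -> bool:
    """Fence the failed site first, then start the takeover on the survivor.

    Returns True if the takeover was started.
    """
    # Step 1: cut power to the affected nodes so they can neither keep
    # writing nor restart unintentionally.
    fenced = power_off(failed_site)
    log.append(f"power_off({failed_site}) -> {fenced}")
    if not fenced:
        # Assumption of this sketch: without a confirmed power cut, a
        # takeover could cause split brain, so it is skipped.
        log.append("takeover skipped")
        return False

    # Step 2: only now move the services to the still-functioning location.
    start_takeover(surviving_site)
    log.append(f"start_takeover({surviving_site})")
    return True
```

Called with a power_off callback that returns True, the function logs the power cut before the takeover; with a callback that returns False, it logs the failed attempt and starts no takeover. Cutting power before the takeover is what ensures that only one location can write to the data at any time.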
Due to a power failure in the mid-voltage grid, both of our data centres were without electricity for half an hour. Unfortunately, battery damage occurred simultaneously in one of the UPS units, which reduced the emergency power to 5 minutes… that was too short!
For our central data storage we rely on a NetApp MetroCluster which is monitored and controlled by ClusterLion.
Thanks to ClusterLion, the MetroCluster switchover happened fully automatically! Due to redundant GSM / LTE communication within ClusterLion, the MetroCluster status was always visible, even though we also had a partial network outage during this time.
The MetroCluster switchback was quick and easy! ClusterLion provided clear instructions (ONTAP commands), which were very helpful and gave our storage team confidence. ProLion supported us from beginning to end and we are very satisfied with their great support.
Result: No downtime for our mission-critical applications and very little manual work during switchback!
Steffen Loos, Helmholtz-Zentrum Potsdam