Practical, High Availability Strategies for Regulated Industries – Futurum Tech Webcast

On this episode of the Futurum Tech Webcast, hosts Randy Kerns and Krista Macomber are joined by SIOS Technical Evangelist Dave Bermingham, for a conversation on the importance of high availability solutions for regulated industries such as financial services.

Their discussion covers:

  • The complexity and criticality of ensuring uptime for applications and data in regulated industries
  • Exploring high availability (HA) options for regulated industries, including financial services, with a focus on cost and performance considerations
  • An introduction to SIOS DataKeeper as a solution for high availability needs

Learn more at SIOS, or request a demo here.

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.


Krista Macomber: Hello and welcome to this Futurum Tech Webcast. I’m Krista Macomber, Research Director on our team here at The Futurum Group, and we have a very exciting edition for you. Today we’re going to be talking about some practical high-availability strategies for regulated industries. I have two very special guests with me. The first is Randy Kerns, who we were actually able to encourage out of his semi-retirement to join us today. Randy, welcome.

Randy Kerns: Thank you.

Krista Macomber: And we also have Dave Bermingham with us today who is a technical evangelist with SIOS. Welcome, Dave. Thank you so much for joining.

Dave Bermingham: Yeah, thanks for having me, Krista.

Krista Macomber: Of course, of course. So I’m really looking forward to digging into this topic. I think when we think about high availability and that business continuity, certainly it’s always been a very pervasive issue, especially for mission-critical applications. And as we alluded to in our title for this webinar, regulated industries, as well. And we will certainly dig into that. But the first general topic that I’d like to start the conversation with today is this concept of the fact that it is becoming even more complex than ever before to guarantee that uptime for critical applications, and also access to data, as well. Randy, I know you and I were chatting a little bit about this yesterday when we had a little pre-show call. What are you seeing from that perspective?

Randy Kerns: Well, certainly the HA requirements have been somewhat pervasive, and they’ve gotten more complex as we’ve had different models coming onto the scene, people operating where their DR site is in a public cloud or maybe reducing the number of data centers they had. It’s also one of the issues that I think we don’t pay attention to enough. It’s no longer just very high-end enterprises. This is across the spectrum now, mid-size companies, et cetera, that operate in these environments. They have the same requirements, and yet they don’t necessarily have the resources to build these very complex environments.

I spend a lot of time working with large companies, and they have staffs and infrastructures and things. When you work with other companies that are either trying to save money or change their models or whatever, they may not have the expertise required to manage infrastructure where a lot of the resiliency is externalized from the application. They would really like to have the application owner or that administrator manage everything.

And it becomes a whole different argument for them about how to build the HA, yet the requirements are still there. In looking at these different HA requirements, even though they’re pervasive across different organizations, we looked at SIOS and it’s a solution that’ll be very economical and much less complex than some of these larger enterprise infrastructure elements that they have in place. This will be really important as dynamics change, as you move around to using the public cloud more for HA or DR-type capabilities.

Krista Macomber: Absolutely. There’s more options than ever before, which is great, but I think we’re still seeing that there’s a lot of solutions out there, as you alluded to, Randy, that are still very complex and very costly as well. And Dave, I know when we were chatting backstage, we were talking about the fact that this is a pervasive requirement. At the same time, it only further ups the ante when we start thinking about these highly regulated industries that have to answer to these regulatory bodies. You had actually brought up a couple of examples that had come across your radar just in your role at SIOS. So this might be a good time to talk about some of those.

Dave Bermingham: Yeah, absolutely. Of course, high availability and just keeping things online and available is important to just about any business. But when we talk about the regulated industries, it’s not only about serving their customers. They have regulations that require that they have reliable and continuous access to their operations and to their data, as well as data protection. So financial services will have Sarbanes-Oxley and the Payment Card Industry Data Security Standard, where in healthcare, you’re looking at HIPAA, or the Health Insurance Portability and Accountability Act. And even our utilities will have, here in the U.S., the North American Electric Reliability Corporation standards, which ensure reliability of the electrical grid and protect the infrastructure from cyber threats.

So they have these responsibilities to ensure that they are compliant with these regulations. What happens when things don’t go to plan? A couple examples come to mind. Back in 2014, Morgan Stanley had a $4 million penalty to the Securities and Exchange Commission, and this was… The exact penalty was due to their failure to adopt written policies and procedures to protect customer data. But what really happened was that someone had copied customer data to their own personal server and that server was hacked. And so 730,000 customer records were lost to these hackers.

So they were fined, but then they were also required to enhance their cyber security measures and compliance protocols. So providing high availability is certainly an important part of the overall picture, but there are so many more things that go into protecting your data, starting with having the right policies in place, making sure they’re implemented, that your employees are trained on those policies and are following them, and that the policies are constantly reviewed and updated to assure compliance takes place.

Krista Macomber: Yeah.

Randy Kerns: There’s one important thing that you’re hitting on, is that when you have some incident such that maybe data’s not available or data’s been exposed or whatever, everything comes under scrutiny. So if you haven’t taken the time to be diligent about setting up these environments, you’ll be called to account at some point.

Dave Bermingham: Exactly. Yeah. And data is so important to be protected and made available, but that data can simply be paper, as well. And so we saw that in the CVS Pharmacy case, where they had a $2.25 million HIPAA violation because they weren’t properly disposing of information that contained patient identifiable information. They were throwing it in the trash out back. So we think of these mega servers and you’ve got to have firewalls and all that, but it comes down to what are you doing with your trash, as well? So it really… You have to think about it from top to bottom. Beyond the security and data protection, there’s just the availability as well, and that can also result in downtime and fines.

I think the airline industry, they had a really bad couple of years back in 2016 with Delta Air Lines, and this was more of a traditional… I think of what we at SIOS would get involved with, where they had a power failure at their data center involving a surge to a transformer, and then the backup system didn’t switch over because the power control module was corrupt. And that resulted in what they estimated to be about $150 million due to the outage. Right after that was British Airways the following year, almost identical. They had a malfunctioning UPS that went offline, and then it came down to, again, procedures.

There was someone… There was human error, so the proper procedures for restarting all the systems were not followed, and that led to a total meltdown across British Airways’ global network, and that affected an estimated 75,000-plus passengers and cost upwards of 80 million pounds in compensation and additional expenses. So that just leads to really the need to make sure your IT infrastructure is robust, with proper backup systems and redundancies. And at SIOS, we’re often doing multi-data center failover, so even if your entire data center went away, we have a secondary data center with all your data ready to go in an instant.

Krista Macomber: And Dave, that might be a great segue into the second point that I really wanted to hit on. I think that we’ve really painted this picture around why a high availability solution is really needed, especially in these industries. So we talked about not only the fines and that component of the equation, but also just the ability to operate. So obviously there are a number of options that customers might have if they do choose to implement a solution for high availability. Some of the challenges are that some of these solutions, for example, might require additional software that might have added costs and administration requirements coming along with it, or maybe even some additional hardware that again, also might have additional specialized administration and complexities. So maybe I’ll kick it back over to Randy and say, what are some of the options that we’re seeing customers typically use? And certainly Dave, we’d love for you to chime in as well with what you’re seeing in terms of, again, some of those different options that might be out there for customers.

Randy Kerns: Well, those in the very large enterprise space typically have more than one data center, and they have some type of specialized hardware that does things like active-active stretch clusters and other things that are obviously expensive, but they also involve another dynamic around having different administrators with different sets of requirements. And the complexity factor goes way up with the more people you involve in that. And it’s very complex, very expensive in those systems. Software-wise, sometimes they’ll add in separate software or specialized software that they’ll have, very customized to their environments. A number of times I’ve gone in and worked with customers like Dave was talking about, that had issues, and we’ll talk about what they’re doing, and I am stunned at how complex the environment is. What they end up telling me is that they’ve iterated on things, they’ve added things, and that they’ve just grown more complex over time. And so a lot of times we try to unwind it and look for something that’s much simpler, but those environments are very expensive and the complexity is off the charts, so to speak.

So you really want to look at things that are much simpler, where you can limit the number of administrative tasks to a smaller set of people and where it doesn’t have the… I don’t want to say the ability, but doesn’t have the necessity to keep adding more complexity to it. So very important. Now the other part of it, and what you were getting at and what we talked about earlier, is you want to be able to do this across a different spectrum of sizes of customers. You want to get down to the point where, “Hey, I’m a small enterprise, but I’m in this industry so I need this environment.” Or “I may be a very large organization and I want to isolate my particular part of that to something simpler.” So it’s got to be pervasive across different sizes of organizations. I think understanding all that will help in reducing the complexity. You’ve got to work hard to make it simple.

Krista Macomber: Yep, absolutely. Dave, any comments from your end? I know that SIOS has been really working hard to address some of these challenges. So what are some of the alternatives? Because we do want to reserve some time to talk about your solution, DataKeeper in particular, but what are some of the other solutions? Maybe just building on Randy’s comments there.

Dave Bermingham: Like Randy was mentioning, traditionally the multi-data center cluster was reserved for the top-tier customer that had the resources to build out two data centers and manage those data centers and the expense involved with that. With the cloud, that’s open to anyone and everyone. The cloud has multiple regions and availability zones, and so when you’re looking at building an HA and disaster recovery solution, you want something that can automatically fail over between availability zones within a region to take advantage of the cloud’s SLAs of four nines of availability. And in addition, to protect against more natural disasters, something that might take out a whole cloud region, you want the ability to stretch to a remote region using asynchronous replication so that you have minimal to no data loss, and then also layer in some failover mechanisms so that at that remote data center, in the event of a disaster, you can be up in minutes rather than days.

We’ve seen some of the disasters we talked about earlier in the airlines. It’s hard for me to imagine that some power failure could bring things down for multiple days and to not have the ability to fail over to their backup data center, if they even had one in place at the time. Those are the solutions that SIOS provides, and that’s the kind of solution our customers are looking at now. They’re looking for that ultra resiliency. It’s all about redundancy, redundant systems, and then the data mirroring and then layering failover mechanisms to minimize the downtime. So high availability, we’re talking about four nines. You want to have less than five minutes of downtime per month, and then those systems have to be scalable as well. You could have the best plan for replication and failover, but if your systems can’t scale to meet unexpected workloads, then it doesn’t matter how much planning you put into your HA and DR. If your systems aren’t scalable enough, you’re going to have downtime as well.
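
To put the "four nines" figure above in concrete terms, the downtime arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not something presented in the webcast: the function name and the 30-day month are assumptions made here for the example.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 30.0) -> float:
    """Return the maximum downtime, in minutes, allowed over a period
    at a given availability percentage (for example 99.99 for four nines)."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100.0)


if __name__ == "__main__":
    # Assumed 30-day month; a 365-day year is shown for comparison.
    for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
        per_month = downtime_budget_minutes(pct)
        per_year = downtime_budget_minutes(pct, period_days=365)
        print(f"{label} ({pct}%): {per_month:.2f} min/month, {per_year:.2f} min/year")
```

Under these assumptions, four nines works out to roughly 4.3 minutes per 30-day month, which is consistent with the "less than five minutes of downtime per month" target mentioned above.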

Krista Macomber: Yep. Absolutely, Dave. And so I had a number of conversations with customers immediately following the pandemic a few years ago, when organizations had to shift almost overnight to working practically a hundred percent remotely, and that was one thing that just continued to come up: we did not have sufficient scale in place to be able to support this. So I think this would be a great segue to talk about DataKeeper. Dave, can you introduce us quickly to the solution, and maybe some of the key ways that SIOS has been looking to solve some of these problems?

Dave Bermingham: Yeah, so SIOS DataKeeper… Well, first off, SIOS is a software company. We are fully focused on the high availability and disaster recovery space, and I’ve been there 20 years, so I’ve talked to a lot of customers and seen a lot of things in my time here. But DataKeeper is our product that runs on Windows. We also have solutions on Linux, but DataKeeper specifically will run on Windows. And DataKeeper does two things. First, it does block-level volume replication, either synchronously or asynchronously, to keep that data in sync. So as we were talking, if you want to have failover across multiple data centers, you’re going to have copies of your data in all the different locations you want to be able to fail over to. Synchronous is going to be great for high availability. You’re not going to have any data loss. But those data centers have to be located geographically pretty close to each other.

But the asynchronous option gives you the ability to replicate over greater distances, across the country or wherever your DR site is, to protect against more regional disasters. But that’s just one component of the solution. The other part of the solution is that it integrates with Windows Server Failover Clustering. So in the Windows world, high availability has been taken care of by Windows Server Failover Clustering since the early NT 4.0 days, and a traditional failover cluster would have multiple cluster nodes, but they all share one copy of the data that might reside on your SAN, and of course that SAN is going to be highly redundant, and it’s going to have RAID and multiple controllers and multiple power supplies. It’s not necessarily a single point of failure, but it is located in a single data center in a single rack. So the concept of failing over across data centers becomes a much more complicated, much more expensive solution if you talk about array-based replication solutions.

And what DataKeeper allows is a much more flexible configuration, because we eliminate the need for that SAN, and instead let people use just locally attached disk. And whether it is physical servers or virtual machines or cloud instances or any combination of those, and all the various storage devices that can be attached, whether it’s virtual disk or EBS volumes in AWS or your managed disk in Azure or any of the disk solutions in any of the cloud providers, we can replicate those disks between all the nodes in your cluster. And so you think about traditional workloads in the Windows world, SQL Server comes to mind as probably one of the most prevalent applications that we’re protecting, but also SAP components, even Oracle on Windows, and in our Linux solution, we have solutions for Oracle… And so any enterprise-based application will typically plug into Windows failover clustering, and DataKeeper will allow you to build what we call a SANless version, or a SANless cluster, but still 100% Windows failover clustering and all the features that you know and love, while eliminating the need for a shared disk.
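
The trade-off between the synchronous and asynchronous replication modes described above can be illustrated with a toy Python model. This is only a sketch under stated assumptions: the class `MirroredVolume` and its methods are hypothetical names invented for this example, and nothing here reflects how DataKeeper is actually implemented. It only shows why a synchronous mirror loses no acknowledged writes on failover, while an asynchronous mirror can lose whatever is still queued.

```python
from collections import deque


class MirroredVolume:
    """Toy model of a block-level mirror between an active and a standby node."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.source = {}        # block number -> data on the active node
        self.target = {}        # block number -> data on the standby node
        self.pending = deque()  # writes acknowledged locally but not yet replicated

    def write(self, block: int, data: bytes) -> None:
        self.source[block] = data
        if self.synchronous:
            # Synchronous: the write is acknowledged only once the target has it.
            self.target[block] = data
        else:
            # Asynchronous: acknowledge immediately, ship to the target later.
            self.pending.append((block, data))

    def drain(self, max_writes: int) -> None:
        """Ship up to max_writes queued writes to the target (asynchronous mode)."""
        for _ in range(min(max_writes, len(self.pending))):
            block, data = self.pending.popleft()
            self.target[block] = data

    def fail_over(self) -> int:
        """Simulate losing the active node. Returns the number of acknowledged
        writes that never reached the target."""
        lost = len(self.pending)
        self.pending.clear()
        self.source = {}
        return lost


if __name__ == "__main__":
    sync_mirror = MirroredVolume(synchronous=True)
    async_mirror = MirroredVolume(synchronous=False)
    for mirror in (sync_mirror, async_mirror):
        for block in range(10):
            mirror.write(block, b"payload")
    async_mirror.drain(7)  # the link only kept up with 7 of the 10 writes
    print("sync writes lost on failover:", sync_mirror.fail_over())
    print("async writes lost on failover:", async_mirror.fail_over())
```

In this model the synchronous mirror pays for its zero data loss by making every write wait on the target, which is why distance between sites matters; the asynchronous mirror accepts a small window of potential loss in exchange for tolerating a slower or longer link.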

Krista Macomber: Got it. Okay. So I know that two of the big themes that we’ve really been double-clicking on in this conversation have been the cost and the complexity, right? Can you maybe help us put a little bow on that? When we talk about DataKeeper, can we talk about really how it maybe addresses some of those costs and complexities that are inherent in some of these other approaches that we’ve been talking about?

Dave Bermingham: Right, right. Yeah. So, of course anytime you’re talking about high availability and disaster recovery, there’s a cost associated with it. You need redundant systems, there’s going to be an initial investment, but you have to look at the initial investment versus the long-term savings. We already saw examples where people didn’t have adequate HA or disaster recovery systems in place. It’s a high price to pay. You don’t want to find out later that your HA or DR solution wasn’t able to recover when needed. DataKeeper helps minimize the impact of that overall cost. So we mentioned just a second ago about array-based replication solutions, which can be great, but if you’re talking about a million dollar SAN or whatever the price is, hundreds of thousands of dollars, SAN in data center A, and then you want to replicate to data center B.

Typically you’re going to be within the same vendor, probably the same exact array, and then add on additional cost for the array-based replication. And then that just really is data protection. How do I integrate with something like failover clustering for an automatic failover of not only my data, but the applications that are using that data? So what DataKeeper does is eliminate the need for the array-based replication by using our software-based replication. It also has the additional functionality to reverse the mirror direction so that now the data is active in your DR site or your secondary availability zone, and through the integration with failover clustering, the applications will come online automatically. And it’s a very cost-effective solution.

Really the only other option we talk about in the SQL Server world is a feature within SQL Server called Availability Groups. Availability Groups are very similar in functionality, but they require the Enterprise edition of SQL Server. If you don’t need any of the other features of SQL Server Enterprise and you’re just looking at Availability Groups, you’re going to save, depending on how many cores you’re protecting, anywhere from about 58% to 72% on the cost of the SQL Server licensing, and that’s including the additional cost of DataKeeper. So if you look at a 24-core system with Availability Groups in SQL Server Enterprise, and then the same 24-core system with DataKeeper and SQL Server Standard, that solution’s going to be 72% less overall cost in software licensing.
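
The shape of that licensing comparison can be sketched as simple arithmetic. The Python below is a hedged illustration only: the function name is hypothetical, the prices are placeholder units rather than real list prices, and the assumption that the replication software is a fixed cost independent of core count is made purely for the example. It does not attempt to reproduce the 58% to 72% figures quoted above.

```python
def licensing_savings_pct(
    cores: int,
    enterprise_per_core: float,
    standard_per_core: float,
    replication_license_total: float,
) -> float:
    """Percentage saved by pairing a per-core Standard license with a
    fixed-cost replication add-on instead of a per-core Enterprise license."""
    enterprise_total = cores * enterprise_per_core
    alternative_total = cores * standard_per_core + replication_license_total
    return 100.0 * (enterprise_total - alternative_total) / enterprise_total


if __name__ == "__main__":
    # Placeholder units, NOT real prices: Enterprise = 100 per core,
    # Standard = 25 per core, replication add-on = 150 flat.
    for cores in (8, 16, 24):
        pct = licensing_savings_pct(cores, 100.0, 25.0, 150.0)
        print(f"{cores} cores: {pct:.2f}% lower licensing cost")
```

With these placeholder numbers the saving grows as the core count rises, because the fixed add-on cost is spread over more per-core licenses. That is one plausible reason a savings range would depend on how many cores are being protected.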

Randy Kerns: Let me add one more cost factor that I think you tangentially got to here, but from a long-term standpoint, if I can isolate the administrative effort for that and not involve all these other infrastructure elements and administration, I can simplify that, but also the cost is contained, because once you get outside the application and application owner, there’s a lot of other charges associated with that that have to be there forever to maintain that high availability. So you really need to look at this from a TCO standpoint over time and say, “How much is this going to cost me one way versus another,” and pick a ten-year window.

Dave Bermingham: Yeah. A SAN comes with SAN administrators, right? This is always a running joke. Having a solution like DataKeeper puts you, the SQL Server admin or the Windows admin, in charge of deciding what storage you need. Whereas if you have a SAN, you might be told, “Nope, this is what you’re going to get and you’re going to love it and in five years we’re going to renew. Talk to me then.” When you are in charge of it, especially in the cloud, you can provision and change on the fly instantly to make sure that you have not only the availability protection, but the performance, as well. If it’s not performing and it doesn’t meet the customer’s needs, it doesn’t matter how available it is.

Krista Macomber: Absolutely. A lot more flexibility there. Yeah. Well, Randy, Dave, unfortunately we’re just about at the end of our time here. Any other closing comments from either of you?

Randy Kerns: One for me, and I think you hear us echo it, organizations really need to think about the long-term implications of what they do. You have the here and now, the HA, but some of the things you do can have financial impacts or one of the things Dave brought up is the ability to adapt, or “adopt” is a better word, new technologies as they go along. And so if you do some type of design that depends upon a particular technology, you may be limited in the future. So that’s why I really like what SIOS has done here about moving these with the Windows cluster example he was giving. It’s a great way to do that and move that into a more contained area, make it simpler, and I don’t really need to think about these long-term implications now.

Krista Macomber: Absolutely.

Dave Bermingham: Yeah. I’d say in addition to having a great plan and great policies and procedures, but really the ongoing maintenance, the training. It’s so often where I’m talking to a customer and they’re new and they might call and say, “What is this DataKeeper I see running on my system?” And so to leverage professional services or whatever it might be to come in and at minimum annually, just do a health check on my availability systems, make sure everything is still configured properly, things haven’t changed, there’s no issues that you need, and then training the staff to make sure, in the event of a disaster, your most junior staff member can open up the disaster recovery manual and get things back up and running because in a disaster, you can’t imagine what that disaster might be and who’s going to be able to bring things back online in those situations.

Krista Macomber: Absolutely. Well, Dave, Randy, thank you so much. I know I’ve really enjoyed the conversation, and I’m sure that our audience has enjoyed it as well. We also want to thank our audience for joining. Again, this is a Futurum Tech Webcast. Please make sure to comment and engage with us. We love to hear your feedback, and make sure to like and subscribe so that you don’t miss the next episode. Thank you so much, and we will see you then.

Author Information

With a focus on data security, protection, and management, Krista has a particular focus on how these strategies play out in multi-cloud environments. She brings approximately a decade of experience providing research and advisory services and creating thought leadership content, with a focus on IT infrastructure and data management and protection. Her vantage point spans technology and vendor portfolio developments; customer buying behavior trends; and vendor ecosystems, go-to-market positioning, and business models. Her work has appeared in major publications including eWeek, TechTarget and The Register.

Prior to joining The Futurum Group, Krista led the data center practice for Evaluator Group and the data center practice of analyst firm Technology Business Research. She also created articles, product analyses, and blogs on all things storage and data protection and management for analyst firm Storage Switzerland and led market intelligence initiatives for media company TechTarget.

Krista holds a Bachelor of Arts in English Journalism with a minor in Business Administration from the University of New Hampshire.

Randy draws from over 35 years of experience in helping storage companies design and develop products. As a partner at Evaluator Group and now The Futurum Group, he spends much of his time advising IT end-user clients on architectures and acquisitions.

Previously, Randy was Vice President of Storage and Planning at Sun Microsystems. He also developed disk and tape systems for the mainframe attachment at IBM, StorageTek, and two startup companies. Randy also designed disk systems at Fujitsu and Tandem Computers.

Prior to joining The Futurum Group, Randy served as the CTO for ProStor, where he brought products to market addressing a long-term archive for Information Technology and the Healthcare and Media/Entertainment markets.

He has also written numerous industry articles and papers as an educator and presenter, and he is the author of two books: Planning a Storage Strategy and Information Archiving – Economics and Compliance. The latter is the first book of its kind to explore information archiving in depth. Randy regularly teaches classes on Information Management technologies in the U.S. and Europe.

