The Main Scoop, Episode 28: Protect Your IT Investments With an Observability Strategy

With an observability strategy, businesses can quickly cut through the noise to see key actionable insights. In this episode of The Main Scoop, Greg Lotko and Daniel Newman are joined by Brett Dawson, Principal Mainframe OS Systems Programmer at Fidelity Investments, to discuss observability and how it fits in with AI and hybrid IT strategies.

It was a great conversation and one you don’t want to miss. Like what you’ve heard? Check out all our past episodes here, and be sure to subscribe so you never miss an episode of The Main Scoop™ series.

Disclaimer: The Main Scoop™ Webcast is for information and entertainment purposes only. During this webcast, we may discuss publicly traded companies and even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we do not ask that you treat us as such.

Transcript:

Greg Lotko: Hey folks, welcome back to the next episode of The Main Scoop. I’m thrilled to be back with you. I’m Greg Lotko joined here by Dan Newman. How you been, Dan?

Daniel Newman: Greg, it’s good to be back. It’s always good to be side by side. It’s been a minute, but it’s good to see you and all is going pretty well. Today’s show’s going to be great. I am really excited. Observability has been one of those topics that’s really risen to the top of many enterprises, many companies, and as we have so much data in our enterprise estates, there’s a lot of complexity. We hear a lot about AI. But in the end, if you’re running enterprise IT, you really have to pay attention to all those streams of data, all those sources of data, and you have to really figure out how do you manage it, right?

Greg Lotko: It is about paying attention. When you think about it, you want to know what’s going on. You want to be able to act, you want to be able to react, and how can you do that if you’re not aware of what’s going on? And that’s what observability is all about.

Daniel Newman: Well, what do you think? A lot of people out there like hearing from us, but I think we have a great guest today.

Greg Lotko: Let’s bring in Brett.

Brett Dawson: Hey.

Greg Lotko: Hey. We’ve got Brett Dawson with us from Fidelity, and actually also from the farm. I know we’ve gotten to talk before about this. So first of all, why don’t you introduce yourself, tell folks a little bit about you and what it is that you do.

Brett Dawson: Sure. My name is Brett Dawson. I’m a principal mainframe operating systems programmer at Fidelity Investments. I’ve been there for 10 years this year.

Greg Lotko: Wow. Congratulations.

Brett Dawson: And on the observability front, I’ve had the good fortune of growing into my career during this big socialization of observability that has been occurring for effectively the last 10 years.

Daniel Newman: Sorry, I want to hear about the farm. Can you go back for a second?

Greg Lotko: And I was going to tie those two together. So I want to hear on a more basic, on a fundamental level, because observability is a concept that doesn’t have to just apply to IT. And I know you think about observability throughout your life, and I know you’ve got a farm with a bunch of cattle running around. So what does observability mean to you?

Brett Dawson: Yeah, so observability to me means that I am able to look around at the stuff that I’m managing and I can actually take action on problems that are coming up before they become real problems. So on the cattle front, my family runs a show cattle business, Silent Night Farms. We’re getting ready for the Fort Worth Stock Show coming up in January, and two of our main cattle that we’ll be showing there this year have been showing a little bit lean. And in this whole pursuit of observability, we’ve been noticing that, and we’re working to correct it already by providing them with a higher-fat diet so that they are less lean during the show and they look right. Right?

Greg Lotko: It’s interesting. So you talk about observability, and we opened this by saying you can relate this across a bunch of things. So you talked about being able to realize that that cow isn’t really where I want them to be as far as their sense of being or their fitness. So you’ve got to know the types of things you’re looking for. You have to be able to see it, observe it, but then you’ve got to know what it means in order to do something about it. I’ll tell you honestly, I could look across your field of cows and I would be able to see everything you could see, but I wouldn’t know what I was observing: that that cow isn’t quite fit. So there’s intelligence in understanding what you’re looking for, and then there’s the reaction, the plan, what you do about it. And that transcends IT.

Daniel Newman: Yeah, I mean I could think of a lot of analogies where you look at your environment and are able to decide, maybe you’re looking at how you would rate property, stuff like that, someone that has an eye for it. Well, a lot of observability is doing that very quickly. But you have a thesis on the key business aspects of observability. So let’s take you off the farm for a moment.

Greg Lotko: Off the cattle business.

Daniel Newman: No, it’s okay. This is all in good fun. Let’s take you off the farm. I’ll take you back, put you back in the server room for a minute. And doing that job. What are the business aspects when you look at observability?

Brett Dawson: So what I notice a lot, whenever I’m looking at the issues that go on within our shop, and even what you might read in the newspapers about other organizations where it’s clearly an observability miss, is that today there’s a heavy reliance on small, coordinated teams of people that really, really understand the data. They’re getting asked by too many people to provide reports and information back on the data they understand, without a really good socialized way of providing those kinds of reports, so that people can get the answers they need in a timely manner and stop a lot of the problems that could come up.

Greg Lotko: So with that, when you talk about specialization or expertise, there’s this pattern that they seem to be asked, what’s the state of your niche? What’s the state of your territory? So observability is kind of taking a step back and seeing the whole forest. You want to observe what’s going on and have an understanding of the relationship across those things. So when you’re asking those questions, it’s not that somebody wants to know exactly necessarily what’s going on here, they’re asking about what’s going on here to think about how it relates to something else or how that aspect might impact something else.

Brett Dawson: Right. And a lot of the stuff that we encounter is trying to marry data between different platforms, different services, different applications, different programs in the same application in order to drive a lot of contextual information of all of this data that we get today. And try and make something sensible out of that data so that we can take action on things that show up that are strange.

Daniel Newman: Well, you’re looking at the performance of applications, you’re looking at root cause. If something’s been down, what does that mean? You’re looking at the ROI that you’re getting out of certain tools and applications that you’re invested in. These are all the typical areas. And of course there are all kinds of anomalies. And then observability flows into security, which ties into everything, from how is the app running to, of course, are there vulnerabilities, and how are we addressing those vulnerabilities? You’ve been at this, you said, a decade. But you’ve probably seen the sprawl. You have data all over the place now. You’ve got your data at the edge, you’ve got your data on device. But then of course you have your data centers, your cloud. We’ve seen multi-cloud proliferate. We’re seeing that go even faster with all that’s going on with AI. We’re trying to ingest more. And then of course we’re trying to build custom models, small language models, we’re trying to train and tune, we’re trying to run RAG. All these things create new observability challenges. So talk about the evolution over a decade and what you’re seeing. How did you end up in this role, and how much has it changed?

Brett Dawson: Well, I will say the biggest thing, when this observability socialization really started taking off, was that all of a sudden there were nine or 10 different observability solutions, and all of a sudden all your different business units said, “Oh, but I want to use this one and not this one.” And there was no real standardization for pulling that data together and really being able to look at it. It’s only been in the last 18 to 24 months that you’ve heard this small term OTel, OpenTelemetry, where all of a sudden it’s basically becoming a de facto standard for how you’re going to feed data, so that everybody knows how to look at the data the same way and there’s no confusion between two data points that are coming in, because it’s all known through an open format. And I think that’s been a really big trend over the last 10 years.
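Brett’s point about a shared telemetry format is easier to see concretely. The toy Python sketch below normalizes metrics from two hypothetical tools into one common shape; the tool names and field names are purely illustrative, only loosely modeled on OpenTelemetry’s resource/name/value data model, and this does not use the real OTel SDK:

```python
# Toy illustration of why a common telemetry format matters.
# Two hypothetical tools report the same CPU metric in different shapes;
# normalizing them into one schema lets one query cover both feeds.

def from_tool_a(record):
    # Hypothetical Tool A reports {"host": ..., "cpu_pct": ..., "ts": ...}
    return {
        "resource": {"host": record["host"]},
        "name": "cpu.utilization",
        "value": record["cpu_pct"] / 100.0,   # normalize percent to a 0-1 ratio
        "timestamp": record["ts"],
    }

def from_tool_b(record):
    # Hypothetical Tool B reports {"machine": ..., "cpu_ratio": ..., "time": ...}
    return {
        "resource": {"host": record["machine"]},
        "name": "cpu.utilization",
        "value": record["cpu_ratio"],          # already a 0-1 ratio
        "timestamp": record["time"],
    }

# With both feeds in one shape, a single check works across tools.
metrics = [
    from_tool_a({"host": "lpar1", "cpu_pct": 92, "ts": 1700000000}),
    from_tool_b({"machine": "cloud-vm-7", "cpu_ratio": 0.41, "time": 1700000005}),
]
hot = [m["resource"]["host"] for m in metrics if m["value"] > 0.85]
print(hot)  # hosts running hot, regardless of which tool reported them
```

The standardization Brett describes does this kind of normalization at the protocol level, so downstream tools never see the tool-specific shapes at all.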

Greg Lotko: Speaking the same kind of language or context so that you can compare across the hybrid environments.

Brett Dawson: Right? Yeah. Because that’s been a big thing. And it’s easy to go look at all that kind of stuff, where you moved from Splunk, to ELK becoming a thing, to now moving to OpenSearch. You’re just adding more tools for how do I look at my data, versus how am I ingesting that data in a known format?

Greg Lotko: And then making sense of it. Earlier on you were talking about all this data that’s coming at you and it made me think of the fire hose. So you got the fire hose coming at you, you want to be able to categorize it and standardize it so that you know you’re looking at things and being able to compare them the same way. But then you got to kind of pick through the needles in the haystack and figure out what’s important to you and then how those elements relate.

Daniel Newman: Yeah, it’s interesting too to hear about the examples of the different tools and the different business units. It reminds me of, are we using Zoom, are we using Webex, are we using Teams? Except for IT nerds that are trying to figure out what they’re going to use for these solutions. But how much more work is it, how much more focus is on it, how much more time is spent? Now, again, I’m not asking you to share anything proprietary, I know you can’t, but it’s a really big fund. It’s a very big business. The amount of data has to be mountainous. Is the effort, the work, becoming more complicated, or are the observability tools actually making this easier to handle? Are you able to do it all with one solution, or are you still running multiples?

Greg Lotko: Yeah, so is it just expanding or is it narrowing and you’re getting more focus?

Brett Dawson: I think we’re in, or over the hill of, that beginner stage of a lot of the trends that come along. We’ve now experienced every product, because every product under the sun came out within a two-year period, and there was no real governance around a lot of things. Now we know where the strengths and weaknesses of the different platforms are, and we can start driving. And you see it all over the place. People find their favorites after 10 years, and they start to pick the ones that serve them best.

Greg Lotko: And you also see, fundamentally, I could say what works and what doesn’t work, but that’s a little over the top. It’s what provides value, what’s meaningful, and what’s actionable, versus, yeah, I can see that, you can tell me that, and I can see that I have that information, but it doesn’t really provoke me to do anything. So you figure out, out of all this stuff that I can observe, what are the things that are most important? And I would imagine the experience within your business leads you there. You go, okay, this tool may be better suited to what we’re looking for than something else, even though everything across the board claims to get to the same endgame.

Daniel Newman: So resiliency has been a really important topic as well. We all know there’s this motion of proactivity, and a lot of what observability is about is actually being able to identify an issue ahead of the problem, before it takes down an application.

Greg Lotko: Identify that you’re trending to a state that would cause an outage or an issue.

Daniel Newman: Correct.

Greg Lotko: So let’s address it before we get there.

Daniel Newman: And so we’ve all seen resiliency issues. Sometimes they seem more controllable than others. But do you see the tie together? Does observability and resiliency, are they looked at in the same conversation or are they treated sort of differently?

Greg Lotko: Are they the milk and the beef?

Brett Dawson: So I tend to look at them in the same light, because in the Venn diagram of things, there are a lot of resiliency concepts that lend themselves well to observability concepts. The faster I can observe that a problem is about to occur, the faster I can shore up and make my platform more resilient to that issue actually occurring. So there’s a lot of overlap there. But from a resiliency perspective, it’s still playing with the unknown. How am I supposed to actually prevent any and all problems from occurring, which is the effective endgame of resiliency?

Greg Lotko: You’re trying to avoid a problem that might not have occurred or an environment or a situation that might not have occurred. There’s this cold front coming in, this warm front, a dust cloud, and that situation may never have presented itself before. A drought with something else going on. Fox or coyotes in your cattle field. Right? Yeah.

Daniel Newman: We’re going to keep coming back to that.

Greg Lotko: Yeah, it’s fun.

Daniel Newman: All right, cool. Seriously though, you say that it may not be, but to me a path of value is how AI can do this at scale. You can’t monitor and look at every single problem and have hands-on resolutions for every single problem, anomaly detection, root cause analysis, early detection, all seem very AI-centric problems. Machine learning, algorithms. How are you thinking about that? How much is that optimizing? How much are you investing in that?

Brett Dawson: That is a big trend that we’ve been playing around with and noticing: machine learning specifically is something that should and can bring value to an organization, as long as you’re using it for the correct data points. Not all data is important enough to indicate that a problem is going to occur. Right? That doesn’t mean I’m not interested in the data. Contextually, it could be very insightful if we see an anomaly occurring, because it could help us determine why the anomaly is occurring, but it’s not necessarily going to tell us that an anomaly is occurring, right? So yeah, I think it’s definitely right there in lockstep with observability, especially as you see a lot larger scale of data coming in. Think of solutions from a supermarket type of perspective, where you could be getting data back from your point-of-sale machines all the way to your back-end systems, and all of the data that traverses all of that. There are not enough humans with enough knowledge in the world that can take that kind of data and roll with it. But you throw it through some really good machine learning and anomaly detection engines, and all of a sudden you’re able to see that, yeah, the 5% of failed transactions that occurred in this region of the Northeast was because we saw a weird internet outage. Having that kind of observability at that scale lends itself to requiring the assistance of machines.
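The kind of anomaly detection Brett describes can be sketched at its simplest with a rolling statistical check. The following is an illustrative toy, not Fidelity’s setup or a production-grade model: it flags any point that sits several standard deviations from the mean of a trailing window, the way a 5% spike in failed transactions stands out against a steady 1% baseline.

```python
# Minimal sketch of statistical anomaly detection on a metric stream.
# Real observability platforms use far richer models; this just shows
# the core idea of comparing each point against its own recent history.
from statistics import mean, stdev

def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices of points more than `threshold` standard
    deviations away from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Failed-transaction rate (%) per interval: steady around 1%, then a
# spike like the regional outage described above.
failure_rate = [1.0, 1.1, 0.9, 1.0, 1.2, 1.0, 0.9, 1.1, 5.0, 1.0]
print(detect_anomalies(failure_rate))  # -> [8], the index of the 5% spike
```

At supermarket or brokerage scale the same comparison runs continuously over millions of series, which is exactly why it needs machines rather than analysts reading raw records.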

Greg Lotko: We’re in a world where we’re instrumenting everything. So you can effectively monitor everything. You could know everything that’s going on everywhere with everything, but then saying, hey, let’s apply AI to determine what are the most important things that I know and how do they relate? And then if you use AI and machine learning to say, here are the patterns or the things that are leading to something that would be bad or that I don’t want to have happen, or even opportunistically identify a pattern of something that I want to take advantage of, you can certainly do that with applying technology. And more and more it’s then a learning of having AI then recommend, okay, now what do I want to do about it? Not only do I see where it’s trending and that this could be good, bad or indifferent, but what’s the action that I want to take?

Daniel Newman: I certainly like NBA, the idea of next best actions. I like the idea of automations. These are pragmatic AI use cases. We love all the generative stuff, AGI and all of that, but there are things where you build algorithms, you get recommendations, you get actions, you get automations. These are things that bring a lot of value to platforms, a lot of value to keeping IT systems running.

Greg Lotko: All right, so when we think about the world today, we think about all the observability tools we have. And I just think back, go five, 10 years ago, it used to be you’d have senior executives or operations people checking in saying, “Hey, is everything running okay? Are we all right? Are we set for the weekend? Are we set for this big holiday?” With the tools and the capabilities out there, is there a higher expectation or a new expectation on how often people can check in and when they can get data and know things are okay, or hey, where are they trending?

Brett Dawson: Oh, definitely. Right? Especially coming into the mainframe space, where traditionally, a few years ago, a decade ago, it was, hey, how were we a week ago? How were we yesterday? That’s great. Now I need to know how we were an hour ago. Or what about right now? And that’s the trend.

Greg Lotko: And comparatively, right?

Brett Dawson: And comparatively. Exactly. I want to know what are we like right now.

Greg Lotko: So it’s not just are we okay? But how okay are we?

Brett Dawson: Exactly. I want to know how we are right now and how does that compare to how we were an hour ago, or yesterday, or a week ago. Help me understand where I’m trending to so that I can understand where we might be going.

Greg Lotko: So it used to be that you would think, wow, we were processing this many transactions and it’s okay because we have enough capacity. Now it’s, oh wow, we’re processing this many transactions and it’s 30% more than we were doing an hour ago or last week. But with dynamic capacity turned on, we still have 20% or 30% headroom. And don’t worry, we have enough hardware on the floor that we can grow that much more. So it’s the fan-out of the questions in time and dimensions.

Brett Dawson: Or even in the reverse. We’re all of a sudden seeing a 20% reduction in transactions. Why do we have so much headroom? Why can’t we draw that back so we can save some money? Right?

Greg Lotko: Right.

Brett Dawson: That sort of thing. So yeah, I think there definitely has been a huge trend into I need as close to real time as I possibly can get. And traditionally on the mainframe, that just wasn’t really a thing. You really relied on Jeff in the back room who could analyze and honestly read SMF binary and tell you that data, right?

Greg Lotko: So dynamically, and I know I keep bringing it back to the cattle, right, but it’s like you’ve got that spigot open and you’re filling up the trough with water, but it’s overflowing. Why keep that spigot open? You’re going to be paying for the water. So in a world today where you have dynamic hardware, you have dynamic software, hey, if my transaction rate is going down and I don’t need the capacity sitting there idle, I might as well turn down the machine and that’ll lower my software bill. Or at least leave me headroom that I can roll over when I do hit that spike.
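Greg’s spigot analogy maps to a simple control policy: target a headroom band, scale up when utilization eats into it, and scale down when capacity sits idle. The thresholds and actions in this sketch are purely illustrative, not any vendor’s dynamic capacity algorithm.

```python
# Toy sketch of a dynamic-capacity decision: maintain a target headroom
# band (25% +/- 10% here, chosen arbitrarily for illustration).

def capacity_action(used, capacity, target_headroom=0.25, band=0.10):
    headroom = (capacity - used) / capacity
    if headroom < target_headroom - band:
        return "scale up"    # trending toward an outage: add capacity early
    if headroom > target_headroom + band:
        return "scale down"  # idle capacity: turn it down, lower the bill
    return "hold"            # inside the target band: do nothing

print(capacity_action(used=88, capacity=100))  # little headroom left -> scale up
print(capacity_action(used=55, capacity=100))  # paying for idle -> scale down
print(capacity_action(used=75, capacity=100))  # inside the band -> hold
```

Fed by near-real-time transaction metrics like the ones discussed above, a loop like this is what turns observability data into the cost savings Brett mentions.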

Brett Dawson: Yeah, exactly.

Greg Lotko: Yeah, makes a lot of sense.

Daniel Newman: And Brett, I want to thank you so much for joining us. I think it was really interesting to hear from such a large organization, an institution with so much data.

Greg Lotko: And I learned about cattle.

Daniel Newman: That seemed to be a focus for you. But in all seriousness, it was a really fun conversation.

Greg Lotko: Absolutely.

Daniel Newman: We know observability is a big trend line, and this is something that every organization is really up against. And of course we’ve had this conversation on The Main Scoop I don’t know how many times, but with the hybrid path that everyone is on, the amount of data is only going to grow exponentially. Companies are trying to find a way to cull that data and utilize it, and of course make sure that systems are running, that applications are running, that their analytics are running. It’s complicated. And so for Brett, it’s a big job.

Greg Lotko: And I think hybrid should be expansive. As we went through this conversation, we talked about the totality of a lot of these things. We’ve used hybrid to talk about platforms and technology and environments, but it’s really hybrid beyond that: using the best tool for the right problem, the best platform, and the right person.

Daniel Newman: That’s the genesis of the show: hybrid isn’t just hybrid cloud. It’s so much more.

Greg Lotko: Absolutely.

Daniel Newman: And that’s why by the way, the mainframe remains an institution within the institutions.

Greg Lotko: It’s part of that mix, an important part.

Daniel Newman: And as an analyst that focuses a lot on the future, I genuinely believe what you do and what the mainframe does has a long and healthy future, helping companies achieve the most with AI and every other technology disruption that’s coming. For this episode, though, Greg, we’ve got to stop talking. We’ve got to stop. No more cows.

Greg Lotko: Thanks again, Brett. Appreciate having you.

Daniel Newman: No more observability.

Greg Lotko: Yep.

Daniel Newman: No more nothing. And everybody hit that subscribe button. Join us here on The Main Scoop for all of our episodes. We are almost 30 episodes in now. We’ve been doing this for a while and I look forward to doing this some more. We appreciate the community. We appreciate you being part of the show. We’ll see you all soon. Bye-bye now.

Greg Lotko: See you next time.

Author Information

Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.

From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.

A 7x best-selling author, his most recent book is “Human/Machine.” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.

An MBA and former graduate adjunct faculty member, Daniel is an Austin, Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.
