High-performance computing has some serious power requirements and finding sustainable solutions for this is a theme at this year’s SC24. Hosts David Nicholson and Keith Townsend are joined by Barcelona Supercomputing Center’s Operations Director Sergi Girona and Lenovo’s VP, ISG Product Group Scott Tease for this episode of Six Five On The Road at SC24. They discuss their exciting partnership to advance high-performance computing (HPC) and address global sustainability challenges.
Tune in for details on:
- The impactful collaboration between Barcelona Supercomputing Center and Lenovo on HPC advancements
- The role of Lenovo’s Neptune Water Cooling Technology in driving sustainability in supercomputing
- An overview of BSC’s MareNostrum 5 and its contribution to HPC and research
- Insights into Lenovo’s partner ecosystem in the European public sector and their efforts towards solving global challenges
- Future directions in HPC, demonstrating progress through sustainable technologies and partnerships
Disclaimer: Six Five On The Road is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors, and we ask that you do not treat us as such.
Transcript:
Keith Townsend: All right. Dave, we’re in Hotlanta, but it’s not so hot. But it is hot, because we’re at SuperCompute 2024 and we’re trying to figure out how to liquid cool, air cool, all the great things, Sergi, you and your team are doing at Barcelona’s Supercompute. Scott from Lenovo, ISG VP. Welcome to Six Five on the Road.
Scott Tease: Good to be here with you guys. Thank you.
Keith Townsend: All right. So Sergi, let’s start out with you. What is Barcelona’s Supercompute?
Sergi Girona: So, Barcelona Supercomputing Center is a national facility for Spain. We aggregate more than 1,100 employees, researchers that are trying to get the solutions for society on science and industry. So we have strong cooperations with industry, but one of the highlighted ones is the one with Lenovo. They are financing our research, and we are jointly investing in research in personalized medicine, in climate change, in processor design, in power savings on energy consumption of the systems. So we are one of the references worldwide of supercomputing on research and services.
David Nicholson: Is this a government private sector collaboration or funding-wise?
Sergi Girona: In our case, we’re a public institution composed of the Spanish Ministry of Science and Innovations, the Catalan Ministry of Science and Innovations, and The Technical University of Catalonia. We are publicly funded, so all our funds, except those coming from industry that want to cooperate with us, are public money. So that’s the reason we are driven by the interest of society.
David Nicholson: So from a Spanish perspective, sort of federal and state, if you will, funding? Yes?
Sergi Girona: Yes.
David Nicholson: Yeah. Okay.
Sergi Girona: So we have this strong cooperation in the Barcelona Supercomputing Center. You may have heard about the news of the Spain and Catalonia discussion some years ago, but no, we have had strong cooperation in science for BSC for more than 20 years.
Keith Townsend: So Scott, talk to us about this special relationship you have with Barcelona Supercompute Center.
Scott Tease: Yeah, so it’s been a long relationship. I’ve known this guy for a very, very long time. We’ve installed several computers there, including the new one, the newest one that just went in this year. The reason we love working with Barcelona is they do amazing things with the compute, but their researchers are working on all kinds of fields that make lives better for everybody. So that’s what drives our team to want to put the systems in there. The funding that he talked about, the research funding, it’s helping us look at all different kinds of things from medicine to how we take things to market and improve local economy, so really exciting to be a part of that.
Keith Townsend: So Sergi, talk to me about some of the users on this system. What are they doing?
Sergi Girona: Users, we’re covering all areas of science from artificial intelligence to medicine to biology to chemistry to astrophysics, cancer drugs, dissipations, automotive, wind energy, anything we are covering. So our access, as you mentioned, is public access. So they are competing to get access to the systems. They get access for free and they’re running for a year, for two years on the systems on the most powerful computing facilities to get the results.
David Nicholson: So there’s a famous saying from the part of Spain that Sergi is from, todo bajo el sol, everything under the sun. How do you keep these things cool in the Supercomputing data center?
Scott Tease: I like that.
David Nicholson: You like that?
Scott Tease: That was a good one, man. That was a good one.
David Nicholson: Yeah. But seriously, what are you doing at Lenovo to make sure that the amazing systems that are running here that are essentially transmogrifying electricity into heat, how do you do that?
Scott Tease: That’s exactly what they’re doing. So the systems are, they’re power intensive, but the amount of work they turn out is truly amazing, like head and shoulders better than what we had just a few years ago on a performance per watt or however you want to measure it, but it is still a lot of energy. So one of the big consumers of energy in a typical data center is the air conditioning. We’ve got to keep these systems cool. As they run, they give off heat. Doing that with traditional air movement and things like that is very, very costly. It’s also kind of bad for the planet because of all the power it uses. We’ve turned to liquid cooling. Our Neptune systems use warm water cooling actually, so we don’t have to chill the water that goes into the systems. It’s going in warm. In fact, it could be up to 45 degrees Celsius, which means I’m never chilling that water. I don’t need any fans. I don’t need traditional air conditioning. So we’re driving the power consumption used by the data center down dramatically versus an air cooled system. So your PUE is running at 1…
Sergi Girona: 1.06.
Scott Tease: … 1.06, which is industry. I mean, that’s where the industry needs to be to effectively run this gear and be efficient.
Keith Townsend: So a lot of this has been driven by accelerated compute. We’re moving away from CPU focused compute to accelerated compute. How has this transition impacted BSC?
Sergi Girona: So what we are having is we are serving all the scientific community. So we want to have systems for all the workloads, for all the different applications. So we are having a real compensated system, including accelerated partitions and a general-purpose partition that is driving the capacity to perform the analysis of ancillary codes that are necessary because not everything can be solved today in accelerated computing systems.
Keith Townsend: And where are you seeing workloads balance out? Where are the workloads that are CPU driven versus the ones that are accelerated compute driven?
Sergi Girona: It really depends on the domain. So for example, for climate change, the codes for doing the analysis on the weather predictions, on the transformation, on the movement of the air, also for the fluid dynamics on the airplanes. There are many codes that have been there for many years and especially those codes are approved by industry for certifications and those are not adequate.
Scott Tease: Accelerators are kind of dominating the top 500 when we look at it. That’s the newsmakers. But a lot of science is still being done on general CPU computing. Still a lot of research is being done on that.
Keith Townsend: So you mentioned earlier that your users compete for resources and the ability to run workloads. What is that process?
Sergi Girona: No, we have an open call. The call is closed on specific deadlines and then we have a committee external to us, which is analyzing all the proposals. So we have an oversubscription, three to one. Only two. One is sent in. Two are discarded. So we get the best research across Europe and Spain.
David Nicholson: In North America, in particular, there’s a lot of discussion about power grids, availability of electricity moving forward, the insatiable appetite for electricity that AI represents. Are you facing similar constraints in Europe?
Sergi Girona: So those systems are very power consuming, but we need the systems. So we need to move science forward for solving climate change, for solving the personalized medicines, for solving the smart city problems. For solving anything, this is required. So we don’t know, have the means to solve this problem, so we need to have limitations and restrictions on using any bot properly, but we have to make sure that those are available to scientists and industry.
Scott Tease: Yeah, yeah, these AI systems and the HPC systems that we’ve been investing in, they’re doing amazing work. So short term, yes, they consume power. Long term, the benefits for the planet for better health, better healthcare, more efficient buildings, more efficient cars is going to outweigh the power consumption we’re doing in the systems right now today. That’s the exciting part of what we’re working on.
David Nicholson: There’s an interesting map here where people who come in are asked to put a decal of where they’re from and where they’re working. And you see the truly global nature of this conference. Although it’s here in North America, it’s been here in North America for a long time. I know that Lenovo has a global reach just like the conference does. Tell us about how that kind of changes things from a European perspective in terms of where the systems are manufactured.
Scott Tease: Oh yeah.
David Nicholson: You have facilities in Hungary. What’s the story?
Scott Tease: Yeah, so we are a multinational company. We do business. We’re fairly fortunate. We’re big in the East. We’re big in the West, big in Europe. In fact, Europe is our largest single region. Wherever we compete, wherever we’re doing business, we want to be as local as we possibly can and part of being local is manufacturing the systems close to our customers, so we do. We manufacture, since 2022, we’ve been manufacturing in Hungary. We’ve shipped over a million servers that support over 1,000 different customers, but it’s nice to be able to do it. You visited for the build, so a pretty short flight from Barcelona into Hungary. So he was able to be there for the build out of those first racks, but it also means things like less shipping. So we’re being less impactful on the environment for all the shipment and the freight we’re doing from Hungary to Barcelona.
Keith Townsend: So talk to me about lessons learned for Lenovo as you service this huge use case. How has this helped you with down market, even up market solutions?
Scott Tease: Well, the best thing we knew is listen to him. [inaudible 00:09:23] I mean, we listen to him and we’re always in good shape. But what we’ve tried to do our very best to do is design systems that can be used at the highest end of supercomputing, like at BSC. But those same exact systems can be used by any university, any corporate environment with very little effort. So very small increments. You don’t need to have huge budgets. We’re fortunate, BSC has a very large budget. They buy lots of systems. In this case, what do we have? 6,400 Gen 4 Xeons in there? Really, really massive system. That same system can be bought by any user in an increment of one or two nodes. So that’s a big part of what we’re trying to do is bring exascale power to every scale, to every customer.
Keith Townsend: So we haven’t talked numbers and this is Supercompute and while this system may not be in the top 500, what are the numbers around this system? How big is it?
Scott Tease: It is in the top 500.
Sergi Girona: A big system. It’s in the top 500.
Keith Townsend: Okay, there you go. I’m corrected. How big is the system?
Sergi Girona: The system… big is relative there, okay?
Keith Townsend: Yes, it is.
Sergi Girona: It is using a total of 90 racks. It is placed in a location with 1,000 square meters. The importance is not the size but the density. So we have racks. Those racks are using 60 to 70 kilowatts each. So all the racks are connected with NDR200 networking that allows cooperation because supercomputing is cooperation. You want to perform one part of the problem, here’s another one, and you want all the 6,400 cooperating together. So you need fast networking. So this is part of the file system. So this is the largest system of this technology worldwide. There’s no other this size.
Scott Tease: Yeah. When you talk about non-accelerated compute, which again still supports a huge amount of research, the system hosted at Barcelona is the largest non-accelerated computer in the world.
David Nicholson: Is that all water-cooled at this point or is it a mix? In your particular case, are all of these systems water-cooled or is there a mix of water cool, air cool, or-
Sergi Girona: All the system, except some small components, for example, the networking, is based on DLC and this was a requirement for us for the installation. So we said in the RFP, the tender documents, that we require that the system is at the minimum an 85% DLC system. Otherwise, we cannot afford to do air cooling because it’s too expensive.
David Nicholson: Now with the warm water coming relatively, it’s all relative, right? Just like big, it’s relative.
Scott Tease: It’s warm. 45 is very warm. It’s hot.
David Nicholson: And so you have a temperature gradient of what, 10 degrees C in and out?
Scott Tease: Yeah.
David Nicholson: Is that still enough of a gradient to start considering recapturing that energy, recycling that energy at some point in the future, or is that gradient not enough to worry about? Do you start… yeah.
Sergi Girona: In the short distance, you can reuse. So we are in a building of 400, 500 people. In that building, we need to have hot water at 45, so we can reuse it. Easy, direct. For the amount, for the capacity, may not. For long distribution, you need to raise the temperature of the water, but that’s still doable and we are doing this in many places in Europe.
Scott Tease: And the concept that the heat in the data center is waste, we really want to combat that. We want to turn it into something that we can recycle. It is energy, as you said, it’s been converted from electrical energy to heat energy, but it’s valuable to us and we want to find a way to recycle it. Starting to do that, the first thing is start hot, make it even hotter, and that’s what we do with Neptune. No. Yeah.
Keith Townsend: Well, we really want to thank you two for stopping by, sharing your story. It is always amazing to see the work being done at the highest levels because it does work its way down into whether we’re talking about data centers, smaller research facilities. As you’ve mentioned, Scott, all the way down to two nodes. Supercompute HPC does amazing things and this balance between the energy costs and the benefits for society. We’ll be having this conversation all week here in Atlanta, SuperCompute 2024. For me, Keith Townsend, and my co-host, Dave, we’d like to thank you and continue to please watch our coverage here from SuperCompute 2024.
Author Information
Daniel is the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.
From the leading edge of AI to global technology policy, Daniel makes the connections between business, people and tech that are required for companies to benefit most from their technology investments. Daniel is a top 5 globally ranked industry analyst and his ideas are regularly cited or shared in television appearances by CNBC, Bloomberg, Wall Street Journal and hundreds of other sites around the world.
A 7x Best-Selling Author including his most recent book “Human/Machine.” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor.
An MBA and Former Graduate Adjunct Faculty, Daniel is an Austin Texas transplant after 40 years in Chicago. His speaking takes him around the world each year as he shares his vision of the role technology will play in our future.