AMD's MI300 and the Future of Accelerated Compute

Speaker 1:

Got a notification that's like, hey, do you wanna join Oxide and Friends? I was like, I guess. I don't know. What are they talking about?

Speaker 2:

I don't know, man. Those guys are kinda turkeys. They never intro their guests. That's what I've heard about that.

Speaker 1:

It's a Right. A persistent theme.

Speaker 2:

I gotta tell you, I feel disoriented. I'm just gonna make a little bit of a confession. Adam, as you know, I recently celebrated a bit of a milestone birthday. I am I'm 50, and, boy, I generally do not feel 50, except when I am in someone else's Discord, when I feel like I'm 95. I feel like I'm in a high school party looking for my teenagers and unable to find them. And I'm like, I I should

Speaker 1:

You getting looks?

Speaker 2:

Are you getting looks like I should not be here? I am seeing things that I cannot unsee, and I really just wanna, like, do my business and get out of here. So I apologize to all those. So this alright. Well, you know, Adam, I'm doing exactly the thing that you say I do.

Speaker 3:

How many okay boomer jokes are allowed in the next hour?

Speaker 1:

That okay. All of them.

Speaker 4:

Oh, see, that's that's already a hot take right there. I know.

Speaker 2:

Okay. All of them. Because I would just like for the record: there's a generation between millennials and baby boomers. It's the one that Adam and I are in. So I don't know.

Speaker 4:

Which is just that's the whole joke already. We've had the joke now. It's excellent.

Speaker 5:

The forgotten generation.

Speaker 1:

It's not the fault. Not that one.

Speaker 4:

Not that one.

Speaker 3:

But those that's No.

Speaker 2:

No. No. No.

Speaker 5:

There's a silent generation. And then there's you guys, the forgotten generation. Yes. What are you

Speaker 4:

what are you saying? They talk too much? Jeez.

Speaker 2:

Listen, in an effort to not forget ourselves, we actually gave our whole generation a name. A name that was then cribbed by subsequent generations. Classic Gen X. Anyway, I I'm I again, I'm I'm doing the thing. I'm so sorry.

Speaker 2:

So alright.

Speaker 3:

I would love an intro.

Speaker 2:

Yeah. What an intro. Welcome to Oxide and Friends, everybody. So, really excited to have some terrific guests here. So, last week we had the launch of the MI300, and a bunch of folks in town.

Speaker 2:

And among them were George Cozma of Chips and Cheese and Jordan Ranous from StorageReview. So, welcome, guys. It was great to see you last week. Both of you were up at Oxide, nerding out with a bunch of computers. And it's great to have you here to talk about this MI300 launch.

Speaker 2:

So thanks for coming by.

Speaker 3:

Yeah. Appreciate you having us. This is, a lot of fun to be here. It was a lot of fun to come out and visit too and see what you guys got going on firsthand. It was, recommended by our Discord actually.

Speaker 3:

And, they said, hey. Who are these guys? I came and found out.

Speaker 2:

What what a what a modern love story. Our Discords got us together. You know? Isn't that that's so, it's great. Adam, you were informing me that in Twitch parlance, this has, like, a name when when you get

Speaker 1:

so what what is it called? This is called this is a raid. So we've got

Speaker 2:

Raid, R-A-I-D.

Speaker 1:

There you go. A whole crew of someone's fan base showing up. Usually, it's like kind of a passing of a baton, not not sort of the the, Jetsons meet the Flintstones like we're doing here. But I think the raid still applies.

Speaker 2:

More more Jetsons plus Flintstones. I feel I feel this is more like a crossover.

Speaker 1:

Yeah. But that's but that's a Gen x reference that only you and I get. I'd like some clarification

Speaker 3:

on which one of

Speaker 5:

us is the Jetsons. Do you understand that reference?

Speaker 2:

Wait. Okay. We you can make a Jetsons reference though. Surely. Surely.

Speaker 2:

Did you I why am I doing this? Why am I doing this?

Speaker 5:

I need so

Speaker 2:

you're frustrated. So okay. So and I guess I can make a Kiss or Led Zeppelin reference. I'm trying to think of, like, the supergroup analog, I think. Anyway, whatever.

Speaker 2:

It's great to have you here. It's great to have our 2 tribes. Our 3 tribes, I guess, because George has got the Chips and Cheese crew as well. So we're bringing everyone together to talk MI300. So, it's a pretty interesting launch, something we've had our eye on at Oxide for a long period of time.

Speaker 2:

But, Jordan, do you maybe wanna kick us off in terms of what brought you down here for it and what you saw that was interesting in the launch on whenever it was, Wednesday?

Speaker 3:

Oh, man. There were actually some really subtle points that were really interesting. We, Kevin and I, Kevin's our lab director, he came out with me to San Jose for the AMD event. We went into it kind of expecting less than was actually delivered, and that's not a dig at AMD. It's more that, you know, AMD's always had kind of an underdog, good value, you know, good product for the money reputation.

Speaker 3:

And we weren't expecting them to come out of the gate swinging like that with such a robust lineup, I mean, between the X and the A. Right? So I think it's important for the audience to understand, when we say MI300, we're talking about a series of stuff. We're not talking about one specific product, because in my head at least there are 2 very distinct things with 2 very distinct applications. The X is more of what you would think of as a traditional GPU. Right?

Speaker 3:

The H100, the A100, the H200, that sort of thing. Where the A is closer to what would be the NVIDIA analog of a Grace Hopper, right, a CPU and GPU combined. And, you know, they came out of the gate with that. They came off the top rope, and they laid down something that wasn't expected. Let's just say that. It's looking really good on paper.

Speaker 3:

All the things that AMD is telling us about it, all the things that we're hearing from partners, are really good. This is gonna be really interesting to see as another competitor enters the space.

Speaker 2:

Yeah. It's gonna be really interesting. And I'm sure you've heard the same things we've heard: like, no. No. It's gotta be NVIDIA.

Speaker 2:

It's gotta be NVIDIA. Anything other than NVIDIA's you know, CUDA has got this incredible moat, which is obviously all true. But I do think that especially with the difficulty in getting parts and Jordan, you'd pointed out to me, it's like, hey, if I wanna rent an H100 by the minute, I can do that right now. If I wanna actually buy one, I'm gonna be waiting for a long time. Let alone if I wanna buy, you know, 10 or a 100 or a 1,000.

Speaker 2:

And I think people are realizing that, actually, it is good for the industry to have a choice here, that having a single dominant player with absolutely no competition whatsoever is really not good for us all collectively. And I just feel like and George, I'll get your take on this too, because it feels to me like the zeitgeist is beginning to shift. People were saying, like, it's gotta be NVIDIA, and now it's like, actually, you know what? Maybe this AMD stuff has actually got a shot, and, boy, the hardware is awfully interesting.

Speaker 2:

So maybe it's worth, you know, making sure that PyTorch works well, or some of these other things that work at parity.

Speaker 5:

So at least in terms of the hardware. Right? I would say that this is almost AMD's playbook at this point. If you look at EPYC, they did this exact same thing. So you can sort of analogize MI200 with EPYC Rome.

Speaker 5:

Sorry. EPYC Milan. Not Milan. Jeez. Italian city names.

Speaker 5:

Naples. Thank you.

Speaker 2:

Naples. Yeah. Yeah. Yeah.

Speaker 5:

Yeah. Sorry. Italian city names. I do apologize. But, yes, EPYC Naples.

Speaker 2:

And that is interesting. So Naples in that like, Naples is the first real I mean, this is where AMD is basically dead on the operating table when they come out with Naples. And Naples ends up being more of a study for these future architectures. Is that what you meant?

Speaker 5:

Yeah. And so you saw with Naples, the first chiplet design, where AMD was heading. Right? You could see that with MI200, where they were also heading for MI300. And both Naples and MI200 were sort of just AMD saying, hey.

Speaker 5:

We're here. Right? We exist.

Speaker 2:

We're we're alive. Don't bury me. Don't don't close the coffin lid. I'm alive. I have Naples.

Speaker 5:

Yeah. For Rome and for MI300, it was AMD saying, we're here. Pay attention.

Speaker 2:

Interesting.

Speaker 5:

So it's gonna be interesting to see what MI400 does, if it's going to be sort of what Milan was. Now granted, NVIDIA is not Intel. They're not sitting on their back end or stuck with process node issues. So it's interesting to see the competition heat up.

Speaker 2:

Do you think and what do you think about it internally? Because the software story has been the criticism for AMD. But do you agree that there seems to be more willingness to look at non-NVIDIA architecture, specifically AMD, from those folks that are kinda at the coalface of this stuff than there has been historically?

Speaker 5:

So I'm gonna take that in 2 parts. I'm gonna take the software story, and then I'm gonna take the looking at someone other than NVIDIA. For the software story, yes, that is the big crux for anyone adopting AMD. And it's that AMD software stack, it's not CUDA. Like, ROCm is I can't believe it's not CUDA, but worse

Speaker 5:

So to speak. That

Speaker 2:

That's harsh. Easy. Oh, no. But you also you're thinking I'm being harsh

Speaker 5:

because I care. Right? And why I say worse is that it's not universal. Right? Even if you write code for, say, RDNA 2 or 3, there are certain code paths that, for whatever reason, if you try and port that code up to the MI series, just won't port without recompiling it. Yeah.

Speaker 5:

And that's because of just different things between the architectures. So I know that AMD has been pushing for it, and a lot of people outside of AMD have been pushing for better and better software. And from what we've seen at the event and from what I was told, it looks like there's finally starting to be a push towards more software on the consumer side, which is very, very, very good.
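
For context on the "I can't believe it's not CUDA" line: HIP, the porting layer in ROCm, mirrors the CUDA runtime API closely enough that AMD's hipify tools can do most of a port by renaming symbols; George's caveat is that the result must still be compiled per GPU architecture (e.g., RDNA versus the MI-series CDNA parts). The following is a minimal illustrative sketch, not code discussed in the episode, written as CUDA with the HIP equivalents noted in comments.

```cuda
#include <cuda_runtime.h>   // HIP: #include <hip/hip_runtime.h>
#include <cstdio>
#include <vector>

// A toy SAXPY kernel: this source is identical under CUDA and HIP.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);
    float *dx, *dy;
    cudaMalloc(&dx, n * sizeof(float));   // HIP: hipMalloc
    cudaMalloc(&dy, n * sizeof(float));
    cudaMemcpy(dx, hx.data(), n * sizeof(float), cudaMemcpyHostToDevice); // hipMemcpy
    cudaMemcpy(dy, hy.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy); // same launch syntax under hipcc
    cudaMemcpy(hy.data(), dy, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);         // expect 4.0
    cudaFree(dx);                         // HIP: hipFree
    cudaFree(dy);
    return 0;
}
```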

Speaker 2:

And Jordan I'm sorry. Go ahead.

Speaker 3:

Oh, no. I was gonna say, I'd even go a step further and just say, you know, it's it's classic AMD, you know, the drivers and software will come.

Speaker 5:

Yeah. Yeah. And you do have to sort of give it time. A lot of people are like, well, AMD's making money. Where's my software? Where's my software?

Speaker 5:

It doesn't move that quickly. Right? Milan was really where AMD sort of started getting its foothold and true movement in the data center. Right?

Speaker 2:

I mean, to be fair, this is where the analogy breaks down a little bit, because, you know, Naples and Milan and Genoa, they all execute x86 code. Like, you can have some pretty old, crufty x86 code that will execute on those parts. Versus, like, NVIDIA: you've got CUDA, and you're porting it to ROCm. I mean, there's a lot more lift involved.

Speaker 2:

So, George, about I can't believe it's not CUDA, which I would like to point out do you know what that's a reference to? When you say I can't believe it's not CUDA, do you know what you're implicitly making a reference to? I'd just like to ask.

Speaker 5:

I can't believe it's not butter. Yes.

Speaker 2:

Okay. So the but how does he know I can't believe it's not butter? That's surely from the eighties.

Speaker 5:

Like I

Speaker 2:

can't make a Jetsons reference, but I can't believe it's not butter is fine? Or whatever. You know what, I'm not even gonna go there. But so do you feel that ROCm is beginning to get out from underneath that? Because it certainly felt that way with some of the enthusiasm that we saw from partners on Wednesday. But do you feel that they're beginning to get out from underneath some of that?

Speaker 5:

So at least in my community, right,

Speaker 3:

ROCm,

Speaker 5:

I believe, 5.5, 5.6, and 5.7, specifically 5.7, were actually very warmly received, because there were some very big updates there that were very, very, very welcome.

Speaker 2:

So let me ask you another thing, because Jordan, I wondered if this is one of the things that you found. You said you found something surprising with the launch. I found some things a little bit surprising with the launch too, even though, you know, obviously, like you, I've known about MI300 for a long time and knew roughly what was gonna be launched. But, you know, the way it's presented is always interesting. And I loved seeing, and this is not new for them, but it was great to see, the emphasis on open source for that software, which is a real contrast to what we see out of NVIDIA. Were you expecting that strong emphasis on open source?

Speaker 2:

Because I think that's part of the reason, you know, you said that with AMD, the drivers are gonna get better, but it ain't gonna get better, necessarily, if it's proprietary. Being open is a really important part of that.

Speaker 3:

Yeah. That's up there on my list. I would say it wasn't a huge surprise, but it's up there on the list of things that were okay, that's impressive. You know, it's a strong stance that Intel has taken as well, right, with OpenVINO, in going their route. Open sourcing this kind of stuff only has one effect, and that's bettering the community.

Speaker 3:

Right? I I'm a firm believer in open the code, open it up.

Speaker 2:

That does not feel like some great business insight, but somehow this eludes some other companies. Like, hey, if you sell hardware, open sourcing the software is only in your best interests, actually. But I'm glad to see AMD get that. So yeah. Okay.

Speaker 2:

That's interesting. So you were a bit surprised by that, because I felt the same way. Like, not shocked, but good to see. What were some of the things that surprised you?

Speaker 3:

Specifically around kind of the architecture and the I mean, the raw engineering prowess that went into the MI300 series, between both the A and the X. You know, we got some really technical deep dives on, like, thermal analysis, and you could tell they put a lot of time and effort into thinking about, okay, we can pump, you know, 500, 700, 900 watts through these chips, but how do we get the heat out? Right? That's a completely different conversation than the one we've been having, but it was something really impressive nonetheless, talking about the I can't think of the exact term, I'm sure George knows, but, like, the micro tunneling inside the silicon to help remove the heat.

Speaker 3:

It was like, wow, okay, these guys came out swinging. And then when you get to things like the MI300A, right, and you compare it to something like Grace Hopper, the MI300A, you know, has a single address space. It's not 2 distinct things. So you're talking now performance of the applications itself. So we're getting

Speaker 2:

So could you describe the 300A? Because I think this is a big part of why we're interested in this. I think it's a big part of why you're interested in this. Yeah. So just describe the architecture of the 300A a little bit.

Speaker 3:

So, George, keep me honest here. The MI300A is what AMD calls an APU. So a lot like their old desktop and laptop chips, but on super steroids and the Arnold Schwarzenegger workout program with some Joe Rogan mixed in there. It's got 24 cores that are all SMT enabled. They're Zen 4 cores, so they're kind of the Genoa iteration.

Speaker 3:

Those are hooked up directly to, I wanna say, 121 gigabytes. George, is that right?

Speaker 5:

So it's 256 megabytes of Infinity Cache, and then the memory the memory is 128 gigabytes.

Speaker 3:

128 gigabytes of

Speaker 2:

Yeah. And 128 gigabytes of HBM3.

Speaker 3:

Of HBM. Yeah.

Speaker 1:

Oh, wow.

Speaker 3:

Which, that in itself, right, from a programming perspective, when you're handling these apps and you're going through how you're doing different calls and how you're building the software stack. Right? This is why this is such a paradigm shift in my head. AMD has an uphill battle in getting the software ready so you can lift and shift, like, a model off of Hugging Face. But it's a very different way of thinking about how you interact with a very powerful accelerator like that. Right?

Speaker 3:

So you've got this CPU, an x86 CPU, that's essentially just hanging out in a hot tub of HBM with the accelerator. And, man, the options that can come out of that, like, just thinking about it is kinda wild.

Speaker 2:

You know, I'm still taken aback by the hot tub metaphor. It's not one that occurred to me, and just now my brain is going all sorts of different ways there with the, soaking in the hot tub of HBM.

Speaker 3:

of heat in there too, so it came naturally. Yeah?

Speaker 2:

Very hot. Yeah. It's actually well, it's a scalding tub, actually. It's actually, like, it's actually boiling. No.

Speaker 2:

Well, and that I think is part of why we are so interested in it, because historically, the GPGPU has kind of been this accelerator sitting on the other side of a PCIe bus, and it has really been a device. And it's been a super fancy device, and it's been one that's got this proprietary interconnect and proprietary way of talking to it and so on, but it's been a device. And I think actually pulling this on package and having this be a chiplet that's now sitting alongside your general CPU chiplets, to me, it's like the package becomes the computer. At least the high performance part of the computer is the package, and you begin to solve a bunch of problems that you have when you have stratified the system. I mean, like, Jordan, immediately, you're pointing out that you can now address this HBM from either the CPU side or from the accelerator side, which opens up a lot of opportunity to develop a truly unified system, which I think has been part of the challenge of GPGPUs as we have implemented them.

Speaker 2:

And, George, if you were to squint at this, and I think, you know, Jordan may have already set you off on this, you could look at the MI300A and say, okay, well, that's, like, pretty similar to Grace Hopper, NVIDIA's Grace Hopper. But when we were talking about this last week, you had some thoughts that maybe, like, actually, we shouldn't think of these things that way: maybe they have some architectural similarities, but there are a lot of differences too. Can you elaborate on some of their differences?

Speaker 5:

So at the highest of the high level, yes, you could say that they are competing products. However, their implementations are almost the polar opposite of each other.

Speaker 2:

Same. Yeah.

Speaker 5:

You have AMD taking the approach of a fully unified memory space that is cache coherent at the cache line level. Right? So you can actually share 128 byte cache lines between the CPU and the GPU. Whereas in Grace Hopper, it's not one memory space. It's 2 memory spaces.

Speaker 5:

And there's some trickery pokery with figuring out what's being accessed when and by whom more often, and it gets shuffled about.
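
To make that contrast concrete, here is a hedged sketch of the managed-memory pattern both parts target: one pointer that both the CPU and the GPU dereference. It uses the stock CUDA managed-memory API (the HIP equivalent, hipMallocManaged, is analogous); nothing here is MI300-specific. On a two-memory-space design like Grace Hopper, the access pattern below is serviced by page migration behind the scenes, whereas on a physically unified design like the MI300A there is nothing to migrate: both sides are simply loading and storing to the same HBM.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

__global__ void scale(int n, float a, float *x) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}

int main() {
    const int n = 1 << 20;
    float *x;
    cudaMallocManaged(&x, n * sizeof(float));    // one pointer, valid on CPU and GPU
    for (int i = 0; i < n; i++) x[i] = 1.0f;     // CPU writes...
    scale<<<(n + 255) / 256, 256>>>(n, 2.0f, x); // ...GPU reads and writes the same pages...
    cudaDeviceSynchronize();
    printf("x[0] = %f\n", x[0]);                 // ...CPU reads the result back
    cudaFree(x);
    return 0;
}
```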

Speaker 2:

Well, it sounds like it would just be harder to program too. I mean, is that not

Speaker 5:

Yeah. So if you go and you look at their programming guide, you can tell that this is not as elegant, so to speak.

Speaker 2:

There's a big c.

Speaker 5:

Yeah. It's not as elegant as the MI300A is. Now you can argue that Grace has a more powerful CPU, that's TBD, but you're likely not wrong. But you can also

Speaker 2:

more powerful than those Zen 4 cores. That's pretty it. That's

Speaker 5:

Oh, so what I mean is in terms of multithreaded performance here. So you have 72 cores on, Grace versus

Speaker 2:

Just the 24.

Speaker 5:

Yeah. Yeah. So now the MI300A is clocking them higher, if I remember correctly. I think Grace is at 2.6, and the MI300A is at 3.7.

Speaker 5:

But still, I'm pretty sure that if you ran Geekbench or Cinebench on them, Grace would probably end up winning. But
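
A rough sanity check on that hedge, using only the numbers quoted here: 72 cores at 2.6 GHz is about 187 aggregate core-GHz for Grace, versus 24 cores at 3.7 GHz, about 89 core-GHz, for the MI300A's CPU side. So even with the clock advantage, and setting aside IPC differences between the two core designs, Grace brings roughly twice the raw multithreaded capacity.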

Speaker 2:

So can you talk about power for a second, in terms of, like, power from the Grace side versus the Hopper side versus what we think power might look like in an MI300A? Because one of the things that's been kinda mesmerizing to watch, as we've brought up our compute sled, first with Milan and then again with Genoa, is watching how dynamic the power is on the part. It's really pretty remarkably good: when you've got a bunch of cores that are idle, you'll have a core whose clock will go way up, and the draw of that core will go way up. And then as you get more work, you'll see the clock drop, and the power is dissipated across all the cores. How does that work inside of Grace Hopper?

Speaker 2:

Presumably, there's gotta be a similar I mean, surely, you can't spin up, or maybe you can spin up all those cores simultaneously with all of the GPUs cranking, or is there some dynamic allocation of power across those?

Speaker 5:

So there's dynamic power allocation.

Speaker 2:

That's right.

Speaker 5:

Although I would have to go back and look at the guide. The MI300A, because I'm looking at our article right in front of me, so I can just check it, is 550 watts air-cooled, 760 watts water-cooled. If I remember correctly, I believe Grace Hopper out of the box is 950 watts.

Speaker 5:

It can go up to a 1,000.

Speaker 2:

Well, well, I Don't

Speaker 5:

quote me on that. I might be wrong. I would have to go check again.

Speaker 2:

But Yeah. That sucks. I mean, that's just a lot.

Speaker 3:

Even if we were in the neighborhood of 500 or 600, the package footprint on either of those is wildly different. The 300A, for reference, for maybe some people who will understand this, is the same exact socket and footprint as, like, a Genoa or a Bergamo CPU. It's actually really small, and it's putting a lot of power through it.

Speaker 2:

Yeah. Which I think is I mean, I think, Jordan, you were saying that part of the power of AMD is they kinda take this core innovation and then they package it in a bunch of different ways. And I mean, I'm still very impressed that they ship the same chiplet on the desktop more or less as they do on the server. The fact that they've got so much coming out they clearly have very different parts for different packaging.

Speaker 2:

But the fact that they are sharing a socket at least the physicality of the socket, the mechanics of the socket are the same socket.

Speaker 3:

It's not electrically compatible. Yeah. Right.

Speaker 2:

I mean, that's pretty remarkable, though, just in terms of, like, leveraging your investment on one side against another side. Another part of what we find really interesting with the MI300A is that it just feels like you're leveraging a bunch of existing investment. I'm kinda dying to know what the and you all do not yet have your hands on an MI300A, do you? Did you

Speaker 5:

I wish. But also, slight correction. Adam thankfully pulled up the spec sheet. It's between 450 and 1,000 watts for Grace Hopper. So that's

Speaker 2:

Between 450 and 1,000 watts. Oh, okay. Is this kinda like how California companies are required to list a salary range? So it's like, oh, yeah, it's between a 100,000 and

Speaker 2:

Yeah. Yeah. There you go. Like, oh, okay. Yeah.

Speaker 2:

That's a pretty wide range there. Yeah. So you do not have one. Because I think I saw did I see a video of you at least palming one? And you just Yeah.

Speaker 2:

Must've been tempting to just be like, you know, you're not gonna miss this one. I can just, like, kinda stick this galactic thing in my you can't really stick it in your pocket. That's right.

Speaker 3:

Yeah. I try to every time, and I usually get tackled by PR guys at the end of the video shoot to make sure that I do bring it back. But, you know, the actual chip itself is pretty small, and my meme post on Reddit was, AMD now selling heat sinks with a little silicon attached to the bottom. Get your MI300As.

Speaker 2:

Well, that is a monster heat sink. And Jordan, when you were in the office, you saw the heat sink that we have on Tofino in the switch. I mean, that thing is basically all heat sink; the heat sink is gigantic. In fact, the heat sink is so large, and I think we talked about this in a previous episode, that one of the mechanical challenges we had was a moment arm issue, where we were worried that a small amount of force on the outside of that heat sink could crack the PCB. So the heat sink engineering is not trivial.

Speaker 2:

These are hard mechanical problems. And that heat sink that you had was just it I mean, it was very large. That was a big heat sink and unsurprisingly on top of a pretty hot part.

Speaker 3:

Yeah. We've learned to call it colloquially, inside StorageReview, the lunchbox. It appears that both AMD and Intel have gone with that kind of similar design. And it kind of feels like it when you pick it up by that little lever too. It's got a lot of mass to it, but I can't help but think that if someone were to, you know, engineer a solution to go into a smaller than 4U form factor, you could definitely widen that out with a proper number of heat pipes.

Speaker 2:

Interesting. Yeah. I wonder if I wonder if someone could go do that.

Speaker 3:

I wonder where we can find any good engineers around here.

Speaker 2:

Exactly. I don't know. Try the next podcast. So again, you know, another question I've got for you, because, you know, we designed our rack to have a 15 kW power budget, and that really informed a bunch of decisions that we made. And, clearly, we are not looking at 15.

Speaker 2:

One is not looking at 15 kW per rack for one of these, or one is looking at a very small number of parts. You're gonna have, like you know, if you're drawing a kW for the part, you're gonna have, like, not even close to 15 parts before you hit 15 kW.

Speaker 3:

I think you can fit 15 kW under my desk if it was all these.

Speaker 2:

Well, so, I mean, I guess that's what I'm asking: are people changing the way they think the folks that are deploying these by, you know, the hundreds and thousands? I mean, you've gotta be looking at 30 kW per rack, 40 kW per rack, and beyond that. Are people actually doing that, or are they just deploying fewer of these things? I mean, kind of square this for me from a data center perspective, because we look at the enterprise DC, and when people hear 15 kW per rack, it's like, wow.

Speaker 2:

Like, wow, that's at the very outer limit of our facility. And then along come these things that are gonna draw a lot more.

Speaker 3:

But I think that's a bit of the paradigm shift, right, of what's going on right now, where we've got a lot of action in the liquid space. We at StorageReview are getting pretty hands-on in going further into the liquid side of things. That's more or less the direction you almost have to take. Whether it's direct-to-chip or a heat exchanger on the hot aisle, you're gonna run into thermal limits really quickly. Interestingly enough, this is something that a lot of, like, crypto GPU mining stuff more or less solved a handful of years ago, and it's just starting to find its way into enterprise and traditional, you know, HPC or AI computational workloads, where you're realizing that these chips are so powerful and, yeah, in a 40-some-odd-U rack, you can cram 40 of these things in, but then you're looking at 1200 watts per box, and how do I get that heat out of there? And then the easiest, simplest answer, especially if you're building net new, and this is the big asterisk on the entire equation, right, is if you're building net new, the answer is go liquid.

Speaker 3:

If you're retrofitting, that's where things get tricky. That question becomes a lot harder, because now we're talking about facility power or facility water, possibly retrofits to the power. You know, there are so many variables to that if you're doing a retrofit versus a net new build. And what we're seeing is a greater willingness to adopt liquid, but maybe not the perfect answer. Everybody wants to say, yes, I would love to have aisles and aisles of liquid cooled stuff, but I can't retrofit facility water. So what's the answer?

Speaker 2:

Well, yeah. And I guess I mean, this is a pretty exciting driver for folks: you get this potentially compelling new functionality. I mean, it's interesting you bring up crypto, because, clearly, in contrast to crypto, there's a there there with respect to these large language models and generative AI. It feels pretty unquestionable at this point.

Speaker 2:

I don't know. Adam, where do you net out on that? I mean, I think there's unquestionably Yeah. Unquestionably.

Speaker 1:

I think there is some hyperbole around this being the only future of computing, or at least, you know, it's unproven yet. But, like, there's a 100% a there there. It's gonna change lots.

Speaker 2:

It's huge. I mean okay. Well, I love Lisa Su. She did say it's the biggest revolution in 50 years. I'm like, 50?

Speaker 2:

You mean that number? It is a long time.

Speaker 3:

Well, I'll give you I'll give you this counter. When crypto first came out or I'm sorry. When crypto first hit the mainstream, you didn't have enterprises like banks jumping up and down and saying, yes, I need that now. We have that happening with AI.

Speaker 2:

Yeah. Yeah. Yeah. Yeah. Right.

Speaker 2:

I mean, you definitely had, like, a moment at Web3 when everyone was, like, oh, everything is gonna be on the blockchain, but no one actually knew what that meant, and no one was actually finding it. And no, I totally agree that there's something very real here. And the kind of question is how much of that is gonna translate into the price tag.

Speaker 2:

I mean, the thing that we've definitely noticed is it also feels like there's a stronger bias for this stuff to be on prem, because you kinda wanna let this thing loose on the data that you don't wanna put in the cloud, for some of these folks, when you talk about banks. But I don't know. I mean, George, as you're pointing out, it's way easier to rent one of these things than it is to buy one right now. What do you think the economics of buying versus renting looks like for these, for the kind of MI300A?

Speaker 2:

Well, I

Speaker 3:

guess the leading question is, is it going on premises or on premise?

Speaker 2:

Well, I mean, I guess You can't

Speaker 3:

tee me up like that.

Speaker 5:

It's it's

Speaker 3:

and you and I were having a conversation the other day that was really intriguing, and it was: you don't see any Snowmobiles leaving AWS. They're all going up to it. But there's still a lot of industry out there that either can't or won't or is just literally unable to move the amount of data that they have to the public cloud, or even a private cloud. Right? So your options become kind of a, which architecture do I invest in? It's not a, do I invest?

Speaker 3:

It's which one. And you start approaching this, like I said, this conversation of it's easier to build it net new. You know, HPC racks and that sort of, you know, where we're at with AI, it's not like anything we've ever seen in enterprise traditional compute. It's a very different animal. It's a different animal from the amount of throughput you need all the way up to what you need in the data center to support it.

Speaker 3:

It it's a complex question.

Speaker 2:

And so I guess it's kinda twofold, because, I mean, you obviously have those reasons to run on prem, because, like, hey, I've got data that is just not gonna go to the cloud or what have you. And as you say, like, Snowmobiles do not make return trips. The Snowmobile being the AWS, not pickup truck, semi truck that hoovers up your data and then drives off to an AWS data center. I was joking when they released that that NetApp should be, like, hijacking those things. I mean, I'm just like, you know, that's disappearing into Mad

Speaker 1:

Max style. Yeah.

Speaker 2:

Total Mad Max style. Dan Warmenhoven out there going, like, no, we're gonna go bring down another Snowmobile. But they did not, alas. So you've got that kind of privacy issue.

Speaker 2:

I'm actually curious, just because you all are so on top of the economics of this stuff: does it pay to rent versus own these things? I mean, I know they're hard to get. I mean, Jordan, as you were saying, the supply chain is a mess right now, and it's hard to actually physically get one. And they are expensive as hell. These things are so expensive.

Speaker 2:

I mean, somebody said what they had paid for GPUs. Like, holy mackerel. That's a lot of money. But I can buy all these for

Speaker 3:

less than a server full of

Speaker 2:

these things. Like, I mean, literally.

Speaker 3:

Yeah. So, I mean, right. So you look at it I think the correct answer to this, for any enterprise at any scale, is to prove it out in some sort of cloud. Right? Prove it out. There are very simple ways.

Speaker 3:

If your developers know a way to write a script to debug or create dummy data that mimics real data, prove out your stuff, prove out your models that way. Take your time and select something, you know, like, find the right chip, find the right manufacturer, find the right supplier. Take your time on it. Don't rush too hard on it. And that's probably antithetical to everything that everybody wants me to say, but it really is the truth. It takes a lot longer to develop. Even if you're building on a foundational model instead of just fine-tuning something, it takes so much longer to do that than just inference.

Speaker 3:

And now we're talking about crazy things like NVIDIA's got Retriever, AMD's got kind of crazy stuff coming out with ROCm. You know, we've got all these really wild, complete stack solutions happening, and those take so much time to actually integrate into an enterprise, especially if you're something that's so big and slow moving that you can't just go to the cloud. Right? As you get bigger, it's almost a feedback loop of, now I am so big and so regulated, I can't just go do this. But taking the time to slow down and invest in, okay, how can we develop a chat model that's able to respond to a customer calling in who's complaining about their medical bill, that can actually route them to the right department without leaking anything, you know, while being HIPAA compliant. How do we do that?

Speaker 3:

How do we do that with fake information? Because you can still do it. Right? It's not impossible. And that's kind of the biggest challenge right now with this entire industry: that training data creation, that first implementation of the entire thing, is so human capital expensive. It's unmatched.

Speaker 3:

You're going to spend 5x what you spend on the actual hardware in human cost to develop something that can fully take advantage of it. And I think that's even an understatement.
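
As one concrete, and entirely hypothetical, illustration of the dummy-data approach Jordan describes: generate synthetic records that have the shape of the real thing, an account ID, a complaint, a ground-truth routing label, with no real PII in them, and prove a routing model out on those before any production data is involved. A minimal sketch, with every name and field invented for illustration:

```cuda
// Hypothetical sketch of the "dummy data" idea: synthesize call-center
// records with realistic structure but no real customer information.
#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(42);  // fixed seed so the synthetic set is reproducible
    const char *departments[] = {"billing", "claims", "pharmacy", "scheduling"};
    const char *complaints[] = {
        "I was charged twice for my visit on %s",
        "My claim from %s was denied and nobody can tell me why",
        "I need a refill authorized before %s",
        "I have to reschedule my appointment on %s",
    };
    std::uniform_int_distribution<int> pick(0, 3), day(1, 28);
    for (int i = 0; i < 5; i++) {
        char date[16], text[128];
        int which = pick(rng);  // complaint type doubles as the routing label
        snprintf(date, sizeof date, "2023-12-%02d", day(rng));
        snprintf(text, sizeof text, complaints[which], date);
        // Emit (synthetic account, transcript, ground-truth department) rows
        // that a routing model can be trained and evaluated against.
        printf("ACCT-%06d\t%s\t%s\n", 100000 + i, text, departments[which]);
    }
    return 0;
}
```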

Speaker 2:

And that's also really interesting. I do agree with you that it's kind of classic advice, because, you know, there's kind of this mania out there. I mean, honestly, a mania that I feel we have not seen in a long time. Definitely reminds me of Internet mania, .com mania. I think it's, for good reason, much greater than kind of Web3 mania.

Speaker 3:

Whoa. Okay, boomer?

Speaker 2:

Damn it. You know, my kids do that. My kids love to call me a boomer to troll me. I mean, it's so effective. It's just, like, it's such an effective troll.

Speaker 2:

Yes.

Speaker 3:

I I just remember I asked for permission on how many I could make, and I was told all of

Speaker 2:

them. Fair. But I mean, we are seeing this kind of real mania out there, and there is this temptation to believe, like, I know, you need to move, like, right now. And I think, like, you need to take this really seriously. But, Jordan, I think what you're saying is really interesting.

Speaker 2:

It's, like, no. Be deliberate about it. Explore it. Start exploring it. Get it working with fake datasets and start kind of figuring it out.

Speaker 2:

Because I also feel that, like, if this is going to be as big as its proponents believe, and I'm not talking about the doomers here, but about the proponents, the people like Lisa Su saying that this is the biggest revolution in computing in 50 years. I don't know about that, but if it is that big of a revolution Adam, I will spare you the coronary, and I won't say that it will top spreadsheets. I mean, the spreadsheet is hard to beat. Right? I mean, as a computing revolution: the spreadsheet, word processing, personal computing, the Internet. I feel the Internet is tautologically unbeatable, because so many of these things are made possible by the Internet, and I think the Internet would be right to point to AI and say that, actually, you can't have the revolution you're having today in AI without the Internet.

Speaker 2:

So I think we're gonna give the Internet a special exception over AI. But if it's as big as that, and Adam, I think it probably is, potentially, maybe on par with the spreadsheet. I mean, kinda up there. Right? I mean, it's a big deal.

Speaker 1:

Yeah.

Speaker 2:

Potentially. Am I gonna look back on this podcast and just, like, puke? Because, my god.

Speaker 3:

The entirety of the United States of America, from the largest private company to the largest government entity to the smallest business, runs on Microsoft Excel, and you can ignore it all you want, but it's true.

Speaker 2:

Well, no. Yes. In terms of, like, the AI being that big, because I do think the spreadsheet is extremely important. And you should know that Adam is a spreadsheet maximalist, by the way, if I haven't already made that clear.

Speaker 1:

Totally. My love language.

Speaker 2:

It's for good reason. It's amazing. The spreadsheet is

Speaker 3:

It makes a lot of sense why you guys aren't LinkedIn friends now.

Speaker 2:

We're not LinkedIn Adam and I are not LinkedIn friends with one another?

Speaker 3:

No. We I think we are.

Speaker 1:

Oh. Adam and I, we're not? A 100% we are. But I think I

Speaker 2:

think we're Oh,

Speaker 1:

Not not only are we LinkedIn friends, I think, I think we need to remove recommendations for each other.

Speaker 2:

Recommend wait. Exactly. I know for a hard fact.

Speaker 3:

My LinkedIn's lying to me then because it says

Speaker 2:

Really good to find you. Okay.

Speaker 3:

So Pulling out my investigative journalism here.

Speaker 2:

Okay. No. You definitely need to. Because Adam's got a very good memory. Adam, can you bring up the recommendations that

Speaker 1:

we

Speaker 2:

wrote for one another? Because I

Speaker 1:

don't know that we wanna read them aloud.

Speaker 2:

We absolutely are the Oh, no. Well, maybe we wanna read them to ourselves quickly and determine

Speaker 3:

for the audience listening along, so so Brian had to beg me to become LinkedIn friends with him so

Speaker 2:

he could be the private person. That was done in a private moment. I feel that, like, we should not do we we're gonna drag this out in front of everybody? We're gonna we're gonna LinkedIn shame you like this? I even told you that I was embarrassed when I was asking you to be yes.

Speaker 2:

It's true. True. I mean And Elon Musk did this. Elon Musk brought this upon us. I am I am embarrassed about my engagement with LinkedIn, and he did it because he was the one that flew my social network into the side of a mountain, and now I'm forced to go to LinkedIn.

Speaker 2:

It's true. It's true. Makes sense. Did okay. So here's why I asked Jordan in what I thought was private to become LinkedIn friends.

Speaker 2:

And, also, what do we call it on LinkedIn? Oh, they're not friends.

Speaker 3:

They're like We're linked. I don't know.

Speaker 4:

Business business associates? I mean

Speaker 2:

Business associates. Exactly. To become business associates. Because in order to be able to tag someone in a post, you need to be connected to them on LinkedIn. So that's why, in what I felt was a low but private moment, I asked to connect with you. And also, George, I needed you both to, like, accept it immediately, which I felt really bad about, because I wanted to be able to tag you in the LinkedIn post, because LinkedIn is now the social network of choice.

Speaker 2:

And, again, I blame Elon Musk and the decline of human civilization. Adam, have you found these recommendations? Are these, I can't find

Speaker 1:

it. I found your recommendation for me. I'm happy to read it aloud, but I'll explain why I took it down too.

Speaker 2:

Oh, you took it down.

Speaker 1:

Yeah. Yeah. Yeah. Boy. Sometimes the so called doctors sure are wrong.

Speaker 1:

After the accident, the doctors told us Adam's brain damage was so extensive that he would be unlikely to ever care for himself again. If they could only see him now! Sure, his speech is slurred, and he drools a bit. But if the task is menial and short, and there's no time pressure, Adam's your man. So, this was actually, like, a response to a recommendation I provided for you, Brian.

Speaker 2:

That I can find out? I I'll I'll I'll dig it up. I'll dig it up.

Speaker 3:

I'm I'm clearly doing LinkedIn wrong.

Speaker 2:

I'm not saying That's not clear.

Speaker 1:

And I proudly, like this was in 2006. I proudly, you know, signed up for that, said, fine, yeah, tell everyone that that's who I am. Until I was interviewing someone, and we hired them.

Speaker 1:

And then months later, he had joined, and he said, you know, when

Speaker 2:

are you guys I've given you some menial and short tasks in which there was great time pressure, and you did great on them.

Speaker 1:

So the person I had hired came to me and said, you know, when you interviewed me, afterwards, I thought, you know, his speech wasn't slurred at all.

Speaker 2:

Oh, my God.

Speaker 1:

And it's odd that they made this guy CTO of this company. So, I mean, what it really meant is that I wasn't testing for any kind of humor filter, apparently, in the folks I was hiring. So that's on me.

Speaker 2:

But you should know so, Jordan and George, we are so old, Adam and I, that this was, like, in the first days of LinkedIn. That is how long we have been connected. And so, you know, with every new social network, you have to understand: LinkedIn was basically, like, Bluesky for a brief moment, and it was in this moment that Adam and I were basically, like, writing fake recommendations for each other. And Yeah.

Speaker 2:

And then I I Do

Speaker 4:

you remember when endorsements showed up? You could endorse

Speaker 2:

Yes. You couldn't. But then Yes. Yes.

Speaker 4:

We were trying to bulk endorse people for things that obviously they would find laughable. But unfortunately, I think I feel

Speaker 2:

like you could endorse anyone for anything. And so Yeah. Well, eventually, you

Speaker 4:

had to accept the endorsement, which is unfortunate. Because then, like, no one's gonna accept, like, an endorsement for animal husbandry or whatever that they've been

Speaker 2:

cut out. You know?

Speaker 5:

But yeah.

Speaker 2:

It was, like, 24 hours of really a lot of delight on LinkedIn. So, yeah, you didn't realize that you were walking into a LinkedIn lore session here, but that's where we

Speaker 1:

we submit. Before we move off of this, Josh, you can endorse people for things. My brother, who just became the VP of a different company today, he's endorsed for 4 skills: coffee, Seinfeld, walking and talking, and lunch. And apparently, that's enough to get you VP of engineering at his new company. He practically

Speaker 2:

is a VP of engineering. You got the whole thing.

Speaker 4:

Did he not have to tick some press a button to allow that to be?

Speaker 1:

I mean, this seems carefully curated. I see.

Speaker 3:

Since this conversation has started, I've gotten multiple LinkedIn requests to connect. I don't know how this is gonna go.

Speaker 2:

Well, you brought this on yourself. They anyway, let's just say, long story short, Adam and I are very much connected on LinkedIn. So I don't know what LinkedIn told you.

Speaker 1:

I do it. I do actually

Speaker 3:

we had no mutual friends. It was bizarre.

Speaker 4:

I do find it really weird when people connect with you on LinkedIn. And, like, I feel like maybe they've deleted all this language, but the language around the site used to really suggest that you knew the person you were connecting to. It's like, I don't know you. Why would I? And it's inevitably people trying to sell something. So it's like, no, I don't know you.

Speaker 4:

Go away. Ignore. Like

Speaker 3:

So my favorite thing, obviously, the name of the site is storage review. My favorite thing is when people try and come into my, DMs on LinkedIn and try and sell me storage.

Speaker 2:

Adam, even though it took me a long time to get it, I have the review. So you gave me this review, which I received but then did not approve. Well, I didn't think you were gonna approve mine. I had no idea.

Speaker 2:

I thought we were just, you know so this is October 3, 2006. That's a long time ago, man. Like Yeah. George, Jordan, no comment from you in terms of where you were.

Speaker 4:

To be clear, this is something that was not on the Internet before, and you're now putting it on the Internet by reading it.

Speaker 2:

That's right. Okay. Yes. Yes.

Speaker 1:

Yep. Thank you.

Speaker 2:

Just checking. Just checking. If you could please move aside, you're blocking my route to the window. Brian is a fine engineer, and we're all delighted that he was released from prison so quickly. And, of course, we do enjoy gathering around while he describes in detail the time he was shanked in the gut.

Speaker 2:

So it would be interesting to know, Adam. And then I, of course, replied with my recommendation for you, which you then accepted, and I never accepted yours. So, really, there's a deep asymmetry there that I know has plagued our relationship. So I'm really sorry.

Speaker 1:

I'm glad we finally got this out.

Speaker 2:

We finally got this out. I think this is, ultimately, this was very therapeutic. This was this was healthy.

Speaker 1:

Yes. Thanks, everyone.

Speaker 2:

And thank you, everyone. This has been obviously going on for, you know, almost 20 years, so we finally got that out. I swear LinkedIn was not much older than that. I don't know when LinkedIn came out, but October 3, 2006, is, like, maybe it was a year old. So, anyway, wait. How did we get here?

Speaker 2:

Jordan, why are we talking about LinkedIn? How did we get to LinkedIn?

Speaker 3:

Well, I started talking a bunch of crap about it, and then all of a sudden we went to a family therapy session. So I just kinda stayed in the background from there.

Speaker 2:

But that's how we do it at Oxide, you know? Pass the tissues, please. And use "I feel" statements. But in terms of the I think you've got very prudent advice. I don't know how we got here, so I'm just gonna try to get us back.

Speaker 2:

The, Jordan, very prudent advice in terms of actually starting out renting GPUs, to get your feel for it. And sorry, this is what we were talking about in terms of the scope of the revolution. If this revolution is as big as many people think it is, it's gonna be with us for a long time. If it's as big as the spreadsheet, then the year is, like, 1982 with respect to spreadsheets.

Speaker 2:

And, like, there's a lot that's gonna happen in the next 10, 15 years, and it's not like you're trying to, like, make the train here. You're trying to tap into something that's pretty fundamental, and that should hopefully take some of the panic out of it, you know, as people are trying to figure out, what do I do here? It's like, you've got some time, because this is gonna be a big deal. Is that a fair statement?

Speaker 3:

Yeah. I think it's not only fair, I think it's an accurate judgment. Maybe possibly a little too optimistic on the whole thing, but it's not gonna go anywhere. Right?

Speaker 3:

Like, it's not gonna go away. It's not Netscape. It's of the same business impact, you know, as the spreadsheet. And once enterprise really does figure this out in a meaningful way, we're going to see, hopefully, and this is probably the optimist, glass-half-full in me, we're gonna see really positive impacts to kind of the entire customer experience, end to end. Right?

Speaker 3:

Nothing's more frustrating than calling into a call center, or getting to a call tree, and trying to say, I need to talk to billing. I'm sorry, I didn't understand. I need to talk to billing. Like, if we can get past that kind of menial crap in enterprise, just the surface level stuff.

Speaker 3:

Forget what it's gonna do for the actual innovation behind the scenes, for an engineer who can go and talk to decades of documentation and, you know, knowledge that's been put down by other folks. Just that surface level customer service side, it's gonna be wonderful.

Speaker 2:

It'd be a big deal. That's gonna be transformative. And so what do you see in terms of the current impediments to this? Because I do feel that this has been a little bit like programming your calculator, even as good as CUDA is. It's a very different model.

Speaker 2:

I mean, when I, you know, kind of assume the ubiquity of accelerated compute, it's hard not to see a model like the MI300A or like Grace Hopper, where you've got general purpose CPUs living alongside accelerators. What is your take on that, in terms of what this means for the future of accelerated compute?

Speaker 3:

That that's complex. Right? Because there's you, you look at it from an architecture standpoint. It's really interesting. It's super compelling.

Speaker 3:

George and I could probably talk for 3 hours on that. The article's coming soon. Yeah, I believe he's got one. You move a layer up from that, and you get to kind of that, you know, okay, well, how do we employ this as a tool thought process?

Speaker 3:

And you start looking at, okay, we've got, you know, some really noisy models, or algorithms that really need to kind of go back and forth with that general purpose compute, and you're leaning into some really interesting topics there. And out from there even further, you kinda touch on the stuff I just did, which is, how does this actually affect people? I think there's a lot of fear mongering and a lot of they're-taking-our-jobs type thoughts around this kind of thing. But I think we're so far off from something like an AGI that actually could do that, that could actually replace warm bodies, that it's not anything folks need to worry about.

Speaker 3:

It's more of a, how does this help everybody get a little bit better at their day to day? You know? You got the monologue in.

Speaker 2:

Oh, yeah. Well, I've got some data. That's fine.

Speaker 5:

If I could interject here. No. You bet. I kinda have to agree with what's being said in the live chat right now. And if you look at the percentage accuracy for smaller models, which is realistically what you're looking at, because trying to train these billion-plus, trillion-plus parameter models

Speaker 5:

Very, very intensive. Right? Not just in terms of compute, but also in terms of memory bandwidth and capacity. Only really the big players can do that right now. So if you start looking at, sort of, the 7-billion-parameter models, their accuracy is something in the 40% range.

Speaker 5:

Like, it's maybe a help if you can train it to be a very specific sort of model. But

Speaker 2:

Yeah. That's why I'm not

Speaker 5:

trying to train a general purpose model on just 7 billion parameters is, I would argue, a fool's errand.

Speaker 2:

A fool's errand. Okay. This is where I'm searching, George, because I agree with you. It does feel like, you know, we, humanity, for the last 3 years, have taken a very, very, very brute force approach of just, like, more parameters, more parameters, more parameters. It feels like the breakthroughs recently have been in the stuff that's got smaller parameter counts and is a little more focused. It feels like there's a lot of focus on efficiency, which kinda leads me to my question for you: ultimately, the power bill did catch up with a lot of the crypto mining. Is the power bill going to be part of the calculus here?

Speaker 2:

In terms of, like, well, training that model: there's a huge amount in terms of physical plant for those 7 billion parameters, but then there's also just a huge amount of power that went into it. And are we going to begin to really because I think, Adam, you and I have been waiting our entire careers for people to really start thinking about the power consequences of their software decisions. And it feels like for the GPGPU, or for accelerated compute and AI workloads, that moment may be here, just because it's so cost prohibitive. George, do you think people are gonna really start thinking about the power dissipation of training a given model?

Speaker 5:

Oh, absolutely. I mean, you sort of already hear talk about just how much power OpenAI needed to train GPT-4. Right? Just staggering amounts of power were needed. And while GPT-4 is good, can it replace a human?

Speaker 5:

Not, in my opinion, no. And if that's sort of where you're having to go to replace a person, the power budget just isn't there. Consider that a human consumes about 100 watts, and what a human can do in that 100 watts is absolutely staggering compared to what compute can do in many cases. Yeah. Yeah.

Speaker 5:

So it's...
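To put the 100-watt comparison in perspective, a sketch of the arithmetic. The training-run figure below is a pure placeholder, since OpenAI has not published GPT-4's actual energy use:

```python
# Illustrative only: compares a human's metabolic power draw to a
# hypothetical large training run. The 50 GWh figure is an assumed
# placeholder, not a disclosed number for GPT-4 or any real model.
HUMAN_WATTS = 100
HOURS_PER_YEAR = 24 * 365

human_kwh_per_year = HUMAN_WATTS * HOURS_PER_YEAR / 1000    # ~876 kWh

assumed_training_kwh = 50e6    # 50 GWh, hypothetical run energy

human_years = assumed_training_kwh / human_kwh_per_year
print(f"a human-year is ~{human_kwh_per_year:.0f} kWh; "
      f"the assumed run is ~{human_years:,.0f} human-years of energy")
```

On those assumed numbers, one training run costs tens of thousands of human-years of metabolic energy, which is the gap the speaker is gesturing at.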

Speaker 2:

It's very clear that our brains have got a very different compute model. I mean, we're obviously not silicon, and not air cooled, or goo cooled, water cooled.

Speaker 3:

Gross.

Speaker 2:

You know, I think we need to get past this doomerist nonsense. It's just such nonsense. While accepting, as you mentioned, Jordan, with HIPAA compliance and so on, that safety needs to be a part of the way you think about these things, especially if you're gonna have them be generative.

Speaker 2:

But I just feel like... I mean, Adam, I know you and I have both got our little problems that we're interested in. I'm so interested in letting this stuff loose on technical documentation, to use it as a much more informed search, effectively. And I mean, you were

Speaker 1:

we're kind of rolling it out on some of our internal problems. Like, we've written I don't know how many words, a couple million words. What do you think?

Speaker 2:

A lot.

Speaker 1:

Yeah. In terms of the specification of all the different aspects of the machine. I mean, I think you're speaking for me as well when you think, like, couldn't an AI kinda help me figure out the truth here? And maybe even, like, have some citations, so it couldn't invent too many hallucinations that I couldn't fact-check?

Speaker 2:

Yeah. So what Adam was referring to is, we've got our technical documentation in these things that we call requests for discussion, or RFDs. And I would love to let an LLM loose on these things to help us find inconsistencies. And there's a bunch of stuff where, actually, hallucinations are fine, because we're just gonna fact-check this stuff.

Speaker 2:

It's not a huge waste of time to be told that, and if it can find some stuff that's otherwise hard to find, that's kind of interesting. Sorry, Jordan. I cut you off there.

Speaker 3:

No. I was gonna kinda interject just for a second in the conversation. Right? And I think this is because of my own personal biases. I view things like ChatGPT, you know, 3.5, 3, 4, any version.

Speaker 3:

I view those more as a tool. Right? They're not an intelligence. They're a tool. I was wondering, I know we're kind of at the top of the hour here, but if everybody's okay with it, we could go down kind of a thought experiment route with this whole thing.

Speaker 3:

I just tuned into the chat for the first time, and it's triggered me. For the boomers in the crowd, that means I'm getting energized by their feedback. Lordy, lordy. Just to go down a thought experiment.

Speaker 2:

Jordan, if you keep this up, I'm gonna have to actually... well, the way to get a millennial back: millennials don't realize that they've actually kinda quietly aged, and there's an entire generation behind you that has some comments on the way. So we'll do a Gen Z slang quiz later.

Speaker 3:

No. I was just curious. Like, if we look at this as more of a thought experiment. Right? I'm sure it's been talked about and brought up a hundred times.

Speaker 3:

But if you look at it more as a tool, right, how many tools have been invented throughout mankind's history that people have been fearful of, either the cost or the harm that could potentially happen? Any new tool that has been invented by mankind, there's always been a certain amount of pushback. And for me personally, aviation is a big connection for that. You know, in the very early days of lighter-than-air flight, when they were using balloons, it was, oh man, man's not meant to go more than 7 feet off the ground. There was just wild stuff being written and printed and talked about.

Speaker 3:

I'm just curious, if we look at it historically through the lens of that caliber of a game changer, like aviation or electrification, how does this parallel those types of innovations? And if we look

Speaker 1:

at those as the playbook

Speaker 2:

Yeah. I mean, I actually think there is gonna be less. I think the demographic that is concerned about AI doom is a pretty small demographic, like a very small demographic. You get outside of technologists and start talking about doomerism, and you sound like you're off your rocker.

Speaker 2:

You know, go talk to your kid's basketball coach about AI doomerism. They'd be like, what are you talking about? Can you just get your kid to practice on time? That's what I'm actually asking you.

Speaker 1:

Right. Are they in the room with you right now, these doomers?

Speaker 2:

Absolutely. I mean, it just sounds so unhinged. And I think you can easily make the argument that this will not be... I mean, there are other innovations that are broader in terms of their scope. Electrification is one, obviously, and rail and aviation, a bunch of things that very materially impacted people's lives. And, you know, this could be there, maybe, in certain regards.

Speaker 3:

I mean,

Speaker 2:

I can tell you in some regards, it's already there. I've said this before, but when my father-in-law passed away, the ability to go to Google Photos and instantly pull up every photo that I'd ever taken of my father-in-law, and that was in 2017, that was a moment for me. It was like, okay, there is something real here, and it's gonna have a real impact on people's lives.

Speaker 2:

So I would say also, Jordan, the flip side of that, just for whatever it's worth, to give you the other side of it: yes, every technological revolution has been feared at some level, but just about every technological revolution has also been abused. And I wrote a blog entry on this recently, about what punch cards can teach us about AI safety. I've read Edwin Black's IBM and the Holocaust, on the role that punch cards played, the very important role that punch cards played, for the Nazis. So these technologies can be abused, and that's not a reason to not develop them. You've just gotta be really clear-eyed that with these technologies, some folks are gonna wanna use them with reckless abandon.

Speaker 2:

Some folks are gonna fear them. And we wanna be kind of in the moderate middle, especially for the revolutionary ones, where we appreciate their power and we understand and learn how to use them responsibly. I'll get off my soapbox on that one. So, yeah, needless to say, we actually got an entire episode on the AI doomerists that, Adam, I think we've done a good job resisting the temptation of going back to the well on.

Speaker 1:

I think we don't make every episode all about that, so I think we're doing pretty well, for us.

Speaker 2:

I just love your pause. It's like, I think we're doing pretty okay. You know, it's great to hear that you believe that. That's important. So

Speaker 1:

Well, I tell you I was looking for the mute button, but we both know that's not true.

Speaker 2:

We both know that's true. And actually, Adam, maybe that's a good tee-up: we're gonna have a predictions episode. So Jordan and George, we've got a little Oxide and Friends tradition that we started... when did we start this, Adam? In 2022, maybe. Something like that.

Speaker 2:

Yes. That must be right.

Speaker 1:

Yeah. Yeah. January '22. Yeah.

Speaker 2:

Of doing predictions. So you can go check out our past predictions from 2022 and 2023. And I think we can safely say, Adam, that we're gonna have some AI-related predictions in 2024.

Speaker 1:

That's a safe prediction for sure. Barely a prediction. Exactly. So what

Speaker 3:

what bookie is underwriting us?

Speaker 2:

Yeah. Well, you should go listen to the past recordings. We got some good ones out there. And so, this is gonna be... we're out next week. Right?

Speaker 2:

We're out for the next

Speaker 1:

Couple weeks. We're out for the next couple weeks. Right.

Speaker 2:

So this is it for 2023. We are bidding adieu to 2023. George and Jordan, you are here with us as we wave goodbye to 2023. I think it's safe to say that it has been... SVB was this year, Adam. Doesn't that feel like...

Speaker 1:

That's wild. Yeah. I can't believe that. I mean, it feels like it was at least 3 years ago, but, yeah, it's been a busy year.

Speaker 2:

It has been, yeah. So for George and Jordan: SVB, the Silicon Valley Bank failure, was a notable event in 2023. It feels like it was a notable event in, like, 1923, because it feels like it was so long ago, but that was this year. So it's been a hell of a year, there's been a lot this year, and I'm hoping everyone will bring their predictions with them for what will be the first episode in January.

Speaker 2:

Right? Do you have the date on that in front of you, by any chance? I think you put it on the calendar already.

Speaker 1:

I said if you go up to the events tab, it will even tell us. It must be generated.

Speaker 2:

Right? It'll be generated.

Speaker 1:

There we go. January 8th, 5 PM Pacific. Also, the 100th episode of Oxide and Friends. So a doubly special milestone.

Speaker 2:

That's right. That's a big milestone.

Speaker 1:

Yeah.

Speaker 2:

So, Jordan, sorry, you may have kicked off more than you intended to there. But I would encourage folks to be noodling on their predictions. Go listen to the past predictions from 2022 and 2023. We'll be doing a checkup on those, see how RISC-V in the data center is doing.

Speaker 2:

I'm trying to think of what some of the other big themes were. Adam...

Speaker 5:

Or RISC-V in the data center.

Speaker 2:

Exactly. So I think Laura's still got some years to run on that, though. That's right. So I think

Speaker 1:

We had some Discord predictions, Web3 predictions. Web3 was very popular last year, I think, in terms of... in

Speaker 2:

2022 predictions.

Speaker 3:

Yes. Oh, yeah. Is that right?

Speaker 1:

Yeah. Jeez. Okay. Oh, yeah. You're right.

Speaker 1:

Jeez.

Speaker 2:

I know. We're in a time warp over here. But our 100th episode, that's gonna be a lot of fun. So, George and Jordan, any kind of closing thoughts for us on the MI300? We had asked, in Betteridge's-law-of-headlines style, if this is the future of accelerated compute. It does feel like we did get a glimpse of the future here, both in terms of this and Grace Hopper from NVIDIA.

Speaker 2:

It does feel like, the future's upon us. What do you think?

Speaker 3:

Yeah. I mean, we definitely took a little bit of a turn there and got down the AI road, but I think we're all in consensus here that this is a very intriguing product, and it's got a lot to offer.

Speaker 2:

Well, at Oxide, as folks know, we do not have an accelerated compute product, but this is a building block that we obviously have our eyes on. Very excited that it's open, excited that AMD has embraced Ethernet as the interconnect as opposed to proprietary interconnects. That's a great move, I think, and it's exciting stuff. So

Speaker 5:

Yeah. Actually, just on the last comment that I saw in chat: personally, I think MI300A is gonna be more in the sort of traditional HPC realm than in the AI realm.

Speaker 2:

Oh, interesting.

Speaker 5:

That's sort of my prediction, if you want to jot that down as a prediction: I think MI300A is going to be much more popular in the HPC space, and MI300X is going to be the sort of

Speaker 2:

AI go-to. That's interesting. Can you elaborate on that? What do you see as the difference between those workloads, and why is the A a particular fit for one and not the other?

Speaker 5:

So if you look at a lot of HPC code, a lot of it is stuff that was written before I was born, before probably a lot of folks in the chat were born, back in the seventies and eighties. A lot of that code not only can't be moved forward to GPUs, it's also very memory bandwidth hungry. So that CPU component on the A will do wonders for a lot of that code.
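A minimal sketch of why that kind of code is bandwidth-bound, using a STREAM-triad-style loop as the stand-in; the bandwidth figure is the launch-announced ballpark for MI300-class HBM and is treated here as an assumption:

```python
# Roofline-style check for a bandwidth-bound kernel such as the
# STREAM triad (a[i] = b[i] + s * c[i]) common in old Fortran codes.
# In fp64 the triad does 2 flops per 24 bytes moved (read b, read c,
# write a), so the memory system, not peak compute, sets the ceiling.
hbm_bw_bytes_per_s = 5.3e12    # ~5.3 TB/s, announced ballpark, assumed
flops_per_byte = 2 / 24        # triad arithmetic intensity

achievable_tflops = hbm_bw_bytes_per_s * flops_per_byte / 1e12
print(f"triad ceiling: ~{achievable_tflops:.2f} TFLOP/s, "
      "well below the part's peak compute")
```

The kernel can only ever run as fast as memory can feed it, which is why bandwidth, and a CPU sharing that bandwidth on the A, matters more to this code than raw FLOPS.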

Speaker 2:

But for AI-based problems, it feels like that's also a win. Is it not also a win to have your general-purpose compute sitting next to your accelerated compute?

Speaker 5:

But for them, it's less of a concern. They're already accelerated, so they don't care as much about what's happening on the CPU side.

Speaker 2:

Interesting.

Speaker 5:

They're almost all just GPU-dependent code. Right? And what they care about is memory capacity and memory bandwidth. Now, in terms of memory bandwidth, the A and the X are equal. But the X has more memory capacity, which is what the AI folks really, really want in order to fit all those big models in.

Speaker 2:

And are you talking about the training side or to the inference side or both?

Speaker 5:

Both. Both the training and the inference side. Yeah. So I see MI300A as the HPC part and MI300X as the AI part.
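As a concrete version of the capacity argument, a quick fit check; the per-part HBM capacities below are the launch-announced figures (roughly 192 GB for MI300X, 128 GB for MI300A) and should be treated as approximate:

```python
# Can a model's raw fp16 weights fit on one part? Ignores KV cache,
# activations, and framework overhead; capacities are announced figures.
HBM_GB = {"MI300X": 192, "MI300A": 128}

def fit_check(params_billions: float, bytes_per_param: int = 2) -> None:
    """Report which parts can hold the bare fp16 weights."""
    weights_gb = params_billions * bytes_per_param
    for part, cap_gb in HBM_GB.items():
        verdict = "fits" if weights_gb <= cap_gb else "needs sharding"
        print(f"{params_billions:g}B @ fp16 = {weights_gb:g} GB on {part}: {verdict}")

fit_check(70)   # 140 GB: fits on MI300X, not on MI300A
fit_check(7)    # 14 GB: fits on either
```

On these assumed capacities, a 70B-class model squeezes onto a single MI300X but has to shard across two MI300As, which is the practical force behind the HPC/AI split George describes.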

Speaker 2:

Interesting. Yeah. It feels to me that if this is gonna be a ubiquitous part of compute, we want it to be as close as possible to our general-purpose compute, and having it sit out there as a peripheral, the split between these, is gonna continue to get in the way. But maybe not.

Speaker 2:

Maybe the ending point for these AI workloads will be to sit on the kind of disjoint accelerators, just for the reasons you mentioned, George: getting a bit more memory, or what have you, or being able to dedicate all of that to the workload. Alright. Well, I think this is a pretty good roundup here. Interesting part, for sure.

Speaker 2:

Something to keep an eye on. A lot of interesting trends out there, and a great 2023. So thanks for spending your 2023 with us, y'all, and we will see you in 2024 with lots of great predictions.

Speaker 3:

Thanks for having us.

Speaker 2:

Thank you, guys. Alright. Adam, see you in 2024.

Speaker 1:

See you, Brian.
