The neXt Curve reThink Podcast

Silicon Futures for February 2025: The State of NVIDIA (with Karl Freund)

Leonard Lee Season 7 Episode 11


Karl Freund of Cambrian-AI Research joined Leonard Lee of neXt Curve to recap February 2025, another action-packed month in the world of semiconductors and accelerated (and non-accelerated) computing, on the neXt Curve reThink Podcast series, Silicon Futures. Karl and Leonard share their thoughts on the state of NVIDIA and its Q4 FY25 earnings call.

This episode covers:

➡️ Karl's initial impressions from the NVIDIA Q4 FY25 earnings call (2:31)

➡️ The DeepSeek factor: factored in or not? (6:00)

➡️ The distilled future of AI: cheaper tokens, more models (10:03)

➡️ How will the gravity of AI computing shift and can NVIDIA shift? (11:21)

➡️ The diverging economics of AI supercomputing (16:50)

Hit both Leonard and Karl up on LinkedIn and take part in their industry and tech insights. 

Check out Karl and his research at Cambrian AI Research LLC at www.cambrian-ai.com.

Please subscribe to our podcast, which is featured on the neXt Curve YouTube channel. Check out the audio version on Buzzsprout, or find us on your favorite podcast platform.

Also, subscribe to the neXt Curve research portal at www.next-curve.com for the tech and industry insights that matter.

Karl Freund:

neXt Curve.

Leonard Lee:

Hi everyone. Welcome to this neXt Curve reThink Podcast episode, where we break down the latest tech and industry events and happenings into the insights that matter. I'm Leonard Lee, executive analyst at neXt Curve, and in this Silicon Futures episode we'll be talking about what's happening in the month of February. In this segment, we're going to be focusing, probably overly, on NVIDIA. They just had this call, so

Karl Freund:

we just got off the call.

Leonard Lee:

Yeah. So as always, I'm joined by the always accelerated Mr. Karl Freund of Cambrian-AI Research. I was going to say the always artificial. No, that would be rude. That would be evil. I'm not that bad. Unfortunately, our friend Jim McGregor of Tirias Research isn't able to join us, at least for this segment, but he'll join us tomorrow, and we're going to stitch all of this together to give you a perspective on the month of February, which has really turned out to be quite another insane month. Before we get started, please remember to like, share, react, and comment on this episode, and also subscribe here on YouTube and on Buzzsprout, or listen to us on your favorite podcast platform. Opinions and statements by my guests are their own and don't reflect mine or those of neXt Curve. We're doing this to provide an open forum for discussion and debate on all things related to AI, and semiconductors in general, but we do a lot of AI on this show, because this is really what Karl loves to talk about. So, yeah, how are you? I'm good. Thanks. Really? A little sunburned. Oh, a little sunburned. Yeah. That is weird. Usually sunburns don't really go well with colds, right?

Karl Freund:

Yeah, I got a sunburn and a cold. I think the red is from the sunburn. The nose is probably from drinking too much alcohol in Mexico last week.

Leonard Lee:

Oh yeah, you know, there's always a price to pay. Some variation of Montezuma's revenge. But yeah, well, you just got into it.

Karl Freund:

I think, you know, clearly all the radical prognostications about DeepSeek certainly haven't come to pass yet. I don't think they will. I still think that cheaper is good, and if you can build your AI model cheaper, you'll build more AI models. NVIDIA's revenue didn't show any kind of slowdown whatsoever. Interestingly, you contrast that with AMD, and I think the gap is widening. I'm coming to the opinion that, at least for the next year — I can't see anything past a year — NVIDIA's leadership is not only solid but improving against its commercial silicon vendor competitors. They're probably a little more worried about Google, but that's just one cloud. It's just one cloud; NVIDIA is on all clouds, and that's a good thing. And of course, the same would be true for Trainium 2, which is actually looking pretty good, but again, that's only on AWS. Yeah, all in all, really good. They increased revenue. They're guiding to 43 billion next quarter — I said 44; I got it wrong. But I bet you they hit 44. What do you think? Not going to happen? So 43 billion is pretty damn good. The growth rate is obviously coming down sequentially, but compare it year over year and their growth is phenomenal. They're up tremendously.

Leonard Lee:

Yeah,

Karl Freund:

Yeah. So anyway, I think Jensen made some really interesting comments on the call. First of all, he says Blackwell is in full production and will fully ramp over the next three quarters. He also made some comments I thought were interesting about the impact of AI on software development, and the impact of AI on software in general. And if he's right — and I haven't seen him be wrong for about seven years — that would imply that every computer's got to run AI, whether it's through an NPU, on a CPU, or on an attached accelerator. We'll find out. It may not happen by next earnings, but we talked about this last time too, right? The whole idea that he's going to build the best AI PC on the planet. And if he's right about software, that would imply his AI PC could do quite well.

Leonard Lee:

Yeah, we'll see how it's positioned further downstream, because it is really a workstation, right?

Karl Freund:

A PC? 3,000 bucks. That's a low-end workstation.

Leonard Lee:

Yeah, it's a low end workstation, which is oddly incredibly small.

Karl Freund:

yeah, it's the size of a deck of cards.

Leonard Lee:

Yeah.

Karl Freund:

I don't want to go back and repeat our last broadcast, but that's what I'm looking forward to in May when they start doing that. And at GTC later in March, I expect we'll hear more about it.

Leonard Lee:

Oh yeah, definitely. And I think this was a really interesting earnings report to go into GTC with. Obviously, the outlook that they provided is very, very positive. But yeah, I don't think the DeepSeek thing has been factored in yet. I mean, this is from a quarter ago. DeepSeek, believe it or not, only happened, what, a month ago?

Karl Freund:

right?

Leonard Lee:

You know, I think people are still figuring it out. We hear a lot of talk about R1, and there are a few things he said that didn't quite reconcile with what I'm seeing. Number one, a lot of the rhetoric around distillation and how that's really bringing the cost of post-training down. There are conflicting narratives there: I think he's assuming there's going to be more compute required for distillation, but then we're hearing about folks who are distilling models at, like, double-digit figures, right? 20 bucks, 30 bucks, 250 bucks. So how does that square with what he's assuming? That might be one of the interesting things we investigate, or talk about and ask the NVIDIA folks about: how is post-training shifting? And I think a lot of folks aren't paying enough attention to what happened with V3. There are still a lot of questions that are not answered about V3. R1 is pretty well known because they published a lot of transparent and, quite frankly, replicable methods in that particular white paper. But if you look at V3, there are still a lot of unknowns that people have not addressed. And I think one of the biggest challenges is going to be figuring out how they pre-trained this model to have this ridiculous sparsity. Right? I mean, basically it's ten times more efficient than a Llama model — at least that's what I've read in the research I've done. You really still have to look at V3, the implications on pre-training, and how they were able to get to that level of sparsity where it's just incredibly efficient. A friend of mine at Intel sent me a study that a team at Intel is doing on exactly that: the efficiency of the token routing within the V3 MoE architecture. They're examining how this model can be so efficient — orders of magnitude more efficient than anything else.
I have not read a good treatment of this problem other than that paper, and it's only the first step in this investigation — or what I would consider the first step in an investigation that pretty much no one has done. So I think they're on step one, and there are still about five or six steps before you really have a sense of what DeepSeek — and quite frankly, the Chinese model builders — have done. Because they didn't just take stock open-source stuff and implement it to build this model. There's this misconception that they somehow just copied, or post-trained on top of, a Llama model. That's not true. They used the Llama architecture, something similar, and they tweaked the crap out of it. They even designed and built their own tokenizer. So there are all these things that have not been addressed or are not known, yet we have people out there saying this is a nothingburger. And it's like, well, if you haven't asked these questions, if you haven't gone down these paths to investigate what these guys did, how do you know what the implications are for the market? Because the economics they're introducing, even with R1, are disruptive, right? So I think there are still a lot of questions that need to be answered, but we'll see. I think this is going to play out over the course of the next two quarters. We won't see the impacts of it for a while. But yeah, I think you're right. I mean, I

Karl Freund:

also know this: Alibaba announced a text-to-video equivalent this morning.

Leonard Lee:

really?

Karl Freund:

Yeah, using the same kind of distillation techniques. I think it's going to be pervasive. I think it's going to help a lot of people that couldn't afford to use AI, and allow them to use AI. The question is, where's that balance point? Right? Yeah. Jevons paradox is a general thing. It doesn't mean that if you cut the price by two, the number of units increases by two. It doesn't mean that. It's a general trend observation.

Leonard Lee:

Yeah. And definitely NVIDIA — or at least Jensen; it's probably all Jensen — they're really banking on this whole inference-time scaling. He cited the three scaling laws — they're not laws, they're observations — with that being one of them. How that really plays out and factors into the economics, I think, is going to be important, right? Especially as token prices continue to collapse, which is different from...

Karl Freund:

Yes,

Leonard Lee:

and people get confused.

Karl Freund:

Yeah, right. I think we will definitely see data center training growth slow down. There's no question in my mind. The question I have is, okay, how fast is Jensen going to ramp up things like physical AI, agentic AI, and AI PCs? How fast will that ramp up to take the place of the phenomenal growth they've enjoyed from data center training? So where that balance point will be — I agree, I think we will know a lot more in the next two quarters as we see this begin to play out.

Leonard Lee:

Yeah, but already that outlook, the 43 billion, is an indication that a lot of this stuff is already baked in. Like, he mentioned visibility of the pipeline. But then, as you know, Karl, I'm always looking at the end markets and that situation. I think those are factors that sometimes are not, let's say, well treated in the AI discourse, especially the supercomputing discourse. But I think people do need to consider that. When you listen in on the Microsoft call, Satya will say, look, the data center build-outs can be depreciated over a long period of time. Those are investments we can make where the expenses will be spread over a decade or 15 years or so, but we're going to be flexible, and we'll build out the compute part based on demand. I think as we look at the investments going into the data center, not all parts are fixed, and not all things are prioritized in the same way. When you look at it from a risk perspective, the likes of Microsoft, Google, and others are probably going to adopt a similar kind of strategy in terms of how they manage the risk going forward, because there's always risk. It's just a matter of what your mitigation is, or what sort of controls you're going to put in place to respond to either the unexpected or the inevitable.

Karl Freund:

I think that makes sense.

Leonard Lee:

Yeah, I really think you hit the nail on the head with these other parts of their business. Outside of gaming — and gaming took a 20-percent-plus quarter-on-quarter decline — professional visualization, I think, was up. Wasn't it like 16 percent? It's a relatively small market. So I think the one that caught my

Karl Freund:

Automotive started to grow again. Their automotive business had been fairly stagnant, and I don't think you can pin all that on Toyota. They haven't even shipped those vehicles that use NVIDIA technology yet, so it's not from Toyota. That was interesting to see, because I had nearly written them off for auto. Qualcomm has a much stronger position, much lower costs, and a very good customer list that is pulling on Snapdragon today. So that was a bit of a surprise to me, to see automotive grow so well.

Leonard Lee:

Yeah.

Karl Freund:

if it's sustainable.

Leonard Lee:

Yeah, exactly. Those are the segments we need to keep an eye on, especially as, like you mentioned, things shift away from large data center training toward inference. But I do think there's still going to be a lot of training. It might be federated across the edge; it's going to be a different variety of training, probably more of the post-training, fine-tuning type of stuff — the whole life cycle of the AI application or the model itself, because those things need to be maintained, right? Those segments will probably be a leading indicator of how their business shifts from the data center and diffuses as AI workloads start to spread across the edge, I suppose.

Karl Freund:

right?

Leonard Lee:

yeah,

Karl Freund:

definitely.

Leonard Lee:

Yeah. I also

Karl Freund:

noticed IBM launched another wave of Granite models. It's just, every month or two, they come out with a whole new wave of Granite models, so IBM clients can take advantage of more efficient, more advanced models.

Leonard Lee:

Yeah, I'm sure there are going to be even more models now that everyone's jumping on this distillation bandwagon, right? They've discovered distillation. But ultimately, I think the benefit is going to be the opportunities outside of the large data centers, especially as these more robust model capabilities can now be brought down into a smaller model footprint and then deployed even on an AI PC, for instance, right? Oh, that's cool. Yeah, I mean, the edge is getting more capable, which is kind of cool, and it's sort of democratizing data center capabilities across these different edge environments. So I think that's probably one of the big pluses. And can you believe it's only been a month? Right. It's just the month of February. I'm almost afraid of GTC around the corner. I know. And then we have GTC in March. So, what else? Anything else stand out for you on the call?

Karl Freund:

No, I kind of covered it. Yeah.

Leonard Lee:

The only other thing I can think about is when Jensen talks about the useful life of these systems they build, whether it's in the data center or some of the older systems based on older GPUs — he mentions the fungibility. I think you have to reconcile that with the one-year cadence, right? Because, what is it, doubling the performance — is it every year, or is it more than double, right?

Karl Freund:

Yeah, I don't know. I haven't seen him say whether it's doubling or tripling or whatever. I think he was asked that question on the call, and his response was, hey, look, it's just another Blackwell with more memory and better networking — and he did say improved compute cores, something to that effect — which I'm not sure how valid that is. We'll have to wait and see. We'll probably hear a lot more in about three weeks. I know. GTC. Yeah. But I think you're right. I think the annual cadence is going to be hard for the industry to digest. I don't know what people will do. But if you've got two generations of Blackwell ramping within six months of each other,

Leonard Lee:

I

Karl Freund:

think it's really up to NVIDIA to correctly position each of these platforms for specific use cases and models. And maybe, I don't know, maybe Blackwell becomes more of an inference platform and Ultra becomes the training platform. I don't know; I'm just guessing it's something along those lines. You've got to have a lead dog in the race, and you can't have two lead dogs. Well, you could, but then you're competing with yourself. It's interesting. It's hard to forecast.

Leonard Lee:

Yeah, but then, you know, it is interesting, because we may be at an inflection point. Even based on some of the things we've cited in just the last 20 minutes or so, there are sort of countervailing or contradictory trends. I think there's some friction forming, right? Whether it's what I call the token-dumping dynamic, where prices are collapsing for tokens; the end market, where monetization is still a problem — I just got off a whole series of calls this week, and real monetization is really nowhere in sight, which is actually disturbing; or the economics of supercomputing, which may be impacted by what DeepSeek V3 did. How that plays out, we still have to see. If there was ever a really interesting time in this whole cycle we've been witnessing with generative AI, it's probably right now.

Karl Freund:

Hang on tight.

Leonard Lee:

I know. I know GTC is going to be crazy. So any other last thoughts?

Karl Freund:

No, I think we pretty much covered the waterfront.

Leonard Lee:

So, yeah, I know. And Daisy took off, so that means we took off something — something's going on. Yeah, I know. It must be dinner time. So, with that — hey, Karl, thanks a lot. It's always a pleasure. Appreciate you sharing. And before we take off, remember to like, share, and subscribe to the neXt Curve YouTube channel, and also follow our research at www.next-curve.com. Also follow Karl Freund's research at Cambrian-AI Research at

Karl Freund:

Cambrian-ai.com.

Leonard Lee:

Yeah. And follow us, and check out our Silicon Futures show, current and past episodes. Stay tuned, and check us out for the tech and industry insights in the semiconductor industry, and all this AI stuff that matters. We'll see you next time. Thanks a lot.

Karl Freund:

Take care.
