The EDGECELSIOR Show: Stories and Strategies for Scaling Edge Compute

PART 2: Navigating the Realms of Tiny Edge AI with Industry Leaders Evgeni Gousev of Qualcomm and Gopal Raghavan of Renesas

February 06, 2024 Pete Bernard Season 2 Episode 3

Prepare for an exhilarating ride as we navigate through the intricate realms of Edge Compute with industry leaders Gopal Raghavan from Renesas and Evgeni Gousev from Qualcomm. Get ready to unravel the technologies that empower the Tiny Edge, the people fervently pushing their development, and the revolutionary transformations it can trigger. 

Our guests share their profound insights into the potential of microcontrollers (MCUs), the challenges of executing workloads on ultra-low-power devices, and the remarkable advancements in connectivity, algorithms, and software development tools. We shed light on the pivotal role of the three V's of data (Volume, Velocity, and Variety) in making edge computing a success story, particularly in the automotive industry. Gopal and Evgeni also highlight the importance of a balanced mix of low-power silicon, smarter workloads, and efficient connectivity in reducing the energy footprint of AI.

As we bring our enlightening conversation to a close, we anticipate the future of technology and connectivity, the inevitability of smart devices, and the evolution of the software stack. We emphasize the need for robust deployment tools and stringent security measures and explore the potential repercussions of poor coding on energy dividends. Join us as we dissect Edge Compute to foster learning, growth, and acceptance, and propel ourselves towards an exciting future. Don't miss out on this opportunity to gain insightful perspectives from industry leaders and expand your knowledge horizons in Edge Compute!

Want to scale your edge compute business and learn more? Subscribe here and visit us at https://edgecelsior.com.

Speaker 1:

When you ask people what Edge Compute is, you get a range of answers: cloud compute and DevOps, with devices and sensors, the semiconductors outside the data center, including connectivity, AI and a security strategy. It's a stew of technologies that's powering our vehicles, our buildings, our factories and more. It's also filled with fascinating people that are passionate about their tech, their story and their world. I'm your host, Pete Bernard, and the EDGECELSIOR Show makes sense of what Edge Compute is, who's doing it and how it can transform your business and you. So let's get started.

Speaker 1:

I get things rolling here and I can even see sort of the jitter and packet loss of our session. It's pretty good, so I'm pretty excited. Cool, yeah, so well, this is going to be kind of a very special episode of the EDGECELSIOR Show because we are going to have a three-person conversation with Evgeni and Gopal, who I'll introduce in a minute. Oh, so one other thing, just as a housekeeping thing: try to silence your alerts or any of your beeps and bloops, because we can't really edit this out. No, don't want to do that. Yeah, I mean, it's not Sgt. Pepper or anything here, but we're trying to keep it professional, yeah.

Speaker 1:

Try to keep it professional. Cool, and this is an audio only recording, so don't worry about how you look.

Speaker 2:

Okay, I found it oh good.

Speaker 1:

Yeah, I had someone I was recording and they were all set up, with the hair done, and it looked really good, and I said, you know, I appreciate you getting dressed up, but this is an audio-only thing, so whatever makes you feel good.

Speaker 2:

So what is your typical audience?

Speaker 1:

Well, you know, so this goes out, I mean, it's through iTunes and Spotify and all that, and I'm actually getting a lot more views, or listens, on YouTube now. Probably 10x on YouTube what I'm getting on, like, iTunes and Spotify, just in the past month or month and a half.

Speaker 2:

Is it like a general public or some engineering? Like, what kind of...

Speaker 1:

Well, I don't know exactly who the audience is. I mean, I'm assuming they're self-selecting to be interested in edge computing. I would find it hard to believe that, if you were not into it, you'd actually listen to the whole thing.

Speaker 2:

So people are somewhat familiar with the area.

Speaker 1:

Yeah, I think so. We've had Intel on the show, we've had different analysts on the show, Leonard Lee has been on the show, we've had Dave McCarthy from IDC on the show, we've had Microsoft on the show. So lots of different players in the edge computing space. So, yeah, it's all around the edge stack. Cool. So today I thought we could talk about (I'll introduce you guys in a second here) the tiny edge, the edge of the edge, which is something that a lot of people don't focus on as much. But before we get into it and all the different topics, maybe you guys can introduce yourselves. We have Gopal Raghavan from Renesas calling from Southern Cal. So, Gopal, do you want to give a two-second... I mean, not two seconds, give yourself 15, 20 seconds: a quick background and who you are.

Speaker 3:

Thanks, Pete. You know, I've been working on ML on edge devices for about the last 10 years, first in a company that I started, and then I joined Microsoft, where I was working for Pete, actually, doing the same thing. And now I'm at Renesas, where I'm coordinating the AI strategy across a range of edge devices, from MCUs, which are tiny devices, to MPUs and some beefier AI accelerators.

Speaker 1:

Great, great. Appreciate that. And Evgeni, do you want to give everyone your info?

Speaker 2:

Absolutely, and it is a pleasure to be here, Pete. Thank you for inviting us to share our thoughts about this exciting area. So I've been with Qualcomm since 2005, for almost 20 years, on a lot of different projects, and more recently, in the past 10 years or so, we've been working on embedded compute, low-power compute, machine learning, edge AI applications, hardware, software, basically the full stack. And I also serve as the chairman of the board of the tinyML Foundation, which is a nonprofit organization of many global companies doing business together in the area of edge compute, or the edge of the edge compute, like what you say. TinyML software, hardware, applications: a very diverse, very interesting ecosystem, so it's actually fun to be part of it.

Speaker 1:

Cool, yeah, no, that's great, and we've done some work together in the past as well, some good projects. So, yeah, I wanted to get both of you together because I think we've all occupied sort of a similar space over the years. You know, a lot of times when people talk about edge computing they tend to talk about maybe some of the heavier edge stuff, you know, servers and gateways and big things like that. But in fact, as many people may know, in terms of high volume, billions of devices out there are running much lighter-weight, lower-power compute capabilities, a lot of those, in the past, being MCUs. Hopefully everyone who's listening to this podcast knows what an MCU is: a microcontroller. They're everywhere, they're in everything these days, everything from your bathroom scale to your toothbrush to your, you know, whatever, toilet seat. Beyond the bathroom, even in the fridge and everywhere, healthcare.

Speaker 1:

But what's been happening interestingly is that MCUs over the past several years have become much more capable.

Speaker 1:

So they're not just doing very simple functions, but they're actually able to do compute and they're able to communicate and they're actually driving workloads. Because of that (and we can talk a little bit about some of that architectural advancement) these platforms are becoming really interesting, useful platforms, including for doing, you know, AI and other kinds of edge compute. So that's kind of a frontier, and the challenge, as I'm sure we'll talk about too, is how do you do that in an architecture that's really designed to be ultra low power, many times battery operated, and also very low cost. These are typically single-dollar-or-so chips in very low-cost things. So that's kind of the frontier, and I'd love to get both of your takes on what has been changing in this MCU space and this tiny edge space that's now making it a lot more usable and feasible to do things with. So either of you guys can chime in.

Speaker 2:

Yeah, I think a couple of fundamental things have happened in the past, I would say, five years or so, and it's all driven by technology roadmaps, but also innovations in the systems area and also talent development. Specifically, silicon is becoming more and more capable. So, as one point of comparison: a small microcontroller today, which is maybe just five square millimeters, has as much horsepower as a Pentium computer had, what, 10, no, 15 years ago. So you can imagine a big desktop. You can run that much workload on this tiny piece of silicon a couple of millimeters square. That's one.

Second, algorithms and models are becoming more sophisticated, and they're becoming both more capable and more lightweight in terms of model size. And there are other techniques for making them even smaller, like quantization, pruning and so on. People build these small models. And the third driving force is tools, software tools. There are many companies who offer the software tools. As a result, there is a big developer community around this. It's easy for people to use, all the way to no-code programming: you can develop your own application without knowing programming languages. So I think all these forces (silicon, innovations, algorithms and tools) drive this massive development in this area. That's on the technology side, and we're seeing more pull on the end-user and application side, which I'm sure we're going to address later in this session as well.
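As a rough illustration of the quantization technique mentioned above, here is a minimal sketch in plain Python. It assumes symmetric per-tensor int8 quantization, which is only one of several common schemes and is not something the speakers specify; the function names and the weight values are made up for the example.

```python
import struct

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale for all-zero weights
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

# Hypothetical weights, for illustration only.
weights = [0.82, -1.27, 0.03, 0.5, -0.41, 1.1]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 weight versus 1 byte per int8 weight.
float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {float32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"largest rounding error: {max_error:.5f} (scale {scale:.5f})")
```

The storage drops by a factor of four, and each weight moves by at most about half a quantization step. Whether that error is acceptable depends on the model, which is why real toolchains typically check accuracy after quantizing.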

Speaker 1:

Gopal, what's your take on what's changed with good old-fashioned MCUs to make them all new and exciting?

Speaker 3:

Before we get into MCUs specifically, one of the points you made, Pete, which was a very good point, is that there have been advancements in connectivity as well as compute. What this does is transform the AI problem into a continuum. So, given a problem, you have a choice of where you want to solve it. And why would you want to move towards smaller devices or towards the cloud? That depends on the three V's of the data you have. Depending on the data volume, the data velocity and the variety of data, it makes sense to do AI either on the tiny device or, let's say, a more capable MPU, or even an on-prem data center or a server. Next up would be the service provider edge, and then finally the cloud.

Speaker 3:

So there is this whole continuum, and what happens is, as more capabilities become available in MCUs, to me it only moves your problem up or down. If your MCU is not very capable, then you move to the next level up, right? You keep moving. So, having said that, MCUs and silicon for machine learning, all the NPUs (there's a whole host of companies doing that) and the software that Evgeni mentioned, all these are pushing the compute axis. But without the really big changes in connectivity we have seen recently, I don't think the edge market would be as interesting.
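The "move up a level when the device is not capable enough" idea can be sketched as a small placement routine. This is an illustrative sketch only: the tier names follow the conversation, but every number (uplink bandwidth, round-trip time, model budget) and every identifier is a hypothetical placeholder, and a real decision would also weigh cost, energy and data variety.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Tier:
    name: str
    uplink_kbps: float      # bandwidth available to ship raw data to this tier
    round_trip_ms: float    # time to get an answer back to the device
    model_budget_kb: float  # largest model the tier can host

@dataclass
class Workload:
    data_rate_kbps: float   # data volume and velocity produced at the sensor
    max_latency_ms: float   # how quickly the loop has to close
    model_size_kb: float

# Ordered from closest to the sensor to farthest away. All values are assumed.
TIERS = [
    Tier("tiny device (MCU)", float("inf"), 1, 1_024),
    Tier("MPU / gateway", 10_000, 10, 512_000),
    Tier("on-prem server", 100_000, 20, 16_000_000),
    Tier("cloud", 5_000, 150, float("inf")),
]

def place(workload: Workload) -> Optional[str]:
    """Return the lowest tier that can host the model, receive the data and meet the latency."""
    for tier in TIERS:
        if (workload.model_size_kb <= tier.model_budget_kb
                and workload.data_rate_kbps <= tier.uplink_kbps
                and tier.round_trip_ms <= workload.max_latency_ms):
            return tier.name
    return None  # no tier satisfies the constraints; the workload must change

keyword_spotter = Workload(data_rate_kbps=256, max_latency_ms=100, model_size_kb=10)
transcription = Workload(data_rate_kbps=256, max_latency_ms=500, model_size_kb=40_000)
huge_model = Workload(data_rate_kbps=256, max_latency_ms=2_000, model_size_kb=50_000_000)

print(place(keyword_spotter))  # stays on the MCU
print(place(transcription))    # too big for the MCU, moves one level up
print(place(huge_model))       # only the cloud can host it
```

Making the MCU tier more capable (a larger model budget) pulls workloads down toward the sensor, while better connectivity (higher uplink, lower round trip) makes the upper tiers reachable for more workloads, which is the interplay described above.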

Speaker 1:

Right, well, I mean, it's a good point. The edge solutions are defined by a continuum of devices, everything from the hyperscaler to, potentially, the tiny edge sensor and maybe a few hops in between. That actually distinguishes it from maybe traditional IoT solutions, which may be sensors sending data one way up to a cloud or something like that. So you're right. I think the ability to now think of these solutions as a continuum of compute (and I would throw in that management and orchestration of those workloads is also pretty important) matters. But, yeah, getting these things connected, and we could talk a lot about connectivity for the tiny edge, whether that's LPWA or LoRaWAN things or NB-IoT, but also, like you were saying, just thinking about the ability to plop the workload on the right piece of the edge at the right time to get the right result. I think that's the thing that's now making a lot of the tiny edge capabilities much more valuable in completing these types of solutions. Right, exactly.

Speaker 2:

I think that's an excellent point. We are not talking about tiny versus cloud. It's not either-or; it's distributed compute, depending on the workloads. The winners in this game are those people and companies who know how to partition the system in such a smart way that you get the most out of your compute in terms of energy efficiency and cost, because that's basically the rule of the game today.

Speaker 1:

Yes, exactly. In my experience working with customers, they want to solve the problem for as little cost as possible and as fast as possible. Anything that solves those problems is good for them, because they don't really want to spend any extra on it. But yeah, no, it's interesting, I think, also for our listeners. OpenAI just had their DevCon, their developer conference, and so much of the oxygen in the room has been taken up talking about generative AI, which is fantastic. It's wonderful. I use it too. But the type of AI that is happening on the lighter and tinier edges is not generative AI, it's other types of AI: it's anomaly detection, it's vision AI, it's object detection. I mean, can you guys talk about what are some of the scenarios and use cases that really snap in well with the lighter edge of compute?

Speaker 2:

I think there are also probably two driving forces here, Pete, like what you said. On one hand we have all these cloud-based AI techniques and technologies being developed. I think what you see is that these types of approaches and techniques are also being adapted to edge devices by making models smaller, smarter, more application specific. That's one driving force. On the other hand, we have a bottom-up type of evolution: sensor companies like STMicroelectronics, Bosch, all these companies. They try to make their sensors smarter by adding AI and ML capabilities there.

Speaker 2:

So those are the two things that are connecting the big AI and the tiny AI worlds. This is a general comment, but as for specific applications, I think there are many areas, basically anywhere you need to bring more intelligence to the edge and do it in a way that does not violate privacy, especially, and is beneficial in terms of latency, because you don't need to connect to the cloud, or sometimes you cannot connect to the cloud. For example, in some applications in the industrial IoT, yes, you connect through Wi-Fi, but Wi-Fi in a big industrial building may not be reliable. It may be an on-and-off type of thing. So I think the latency and the reliability are also important. That's where this edge AI, tinyML, comes to shine, because you don't have these types of constraints in terms of energy, in terms of dependency on networks and so on.

Speaker 2:

So those are kind of general things, and if you look by verticals, I think we are at the very beginning of this. I think it's kind of hard to make big predictions yet, but you definitely see some very interesting trends. For example, in the industrial IoT there are many applications, like you mentioned. Predictive maintenance, for example, is one of them. How do you make your machines smarter, more reliable, more predictable in a way? And anomaly detection is kind of related to this. How do you detect things that are about to happen before they happen? So you replace a motor based on some vital signs from this engine.

Speaker 2:

That's one big vertical. Another one is consumer electronics. There are many applications of these types of devices in consumer electronics. For example, we at Qualcomm just released a product in the laptop business using these types of technologies, and wearables are a big application. Augmented reality and mixed reality, XR, too, because all these devices have constraints in terms of battery size or energy. I think that's where these technologies come to play a big role. That's a second vertical. And the third one, I would say, is healthcare and medical applications. I think that's another big opportunity. But there are more than these.

Speaker 2:

But those are kind of the three where we see quite a bit of traction.

Speaker 1:

You mentioned about some of the kind of privacy security things.

Speaker 1:

I mean, as we know, in some of these, especially industrial solutions, they're air gapped for a reason, and also in sort of military and government things.

Speaker 1:

So there is no cloud access, or there's occasional cloud access, and so the kind of compute and AI that needs to happen needs to happen at the edge without relying on a cloud. Then, as you mentioned before, there's also just the latency and, frankly, the cost of ingress of all the data. You know, back in the old days, maybe years ago, healthcare stuff would maybe collect some data, dump it up to the cloud, get some analysis and get the data back at some point. But now a lot of that stuff, especially in the healthcare and medical fields, can happen on device, instantaneously, for doctors and patients to see without having to go off-prem and do calculations. So yeah, no, it's interesting. I think we'll probably eventually see, with a lot of these MCUs and the very light edge, that rather than AI being sort of an interesting added feature, it'll just be a standard way of running compute. But Gopal, any thoughts on that? What's your take on the hot areas?

Speaker 3:

My take on the hot areas: I think the ones which are closer in. Part of the problem has been that a lot of these AI edge applications die after the POC level and, depending on who you believe, it's somewhere between, let's say, 70 and 90 percent.

Speaker 1:

Yeah, sounds about right.

Speaker 3:

I think we're both familiar with that too, Pete. Yeah, so the question is: when we try to force AI into applications where there is no, how would I say it, concrete or provable economic benefit, it usually tends to die out. So, while we've seen a lot of applications, I think there are two big verticals. One of them, as Evgeni mentioned, is of course industrial, and the other one, where MCUs do play a role, though not as big, is automotive, because automotive is absolutely the edge. And again, to me, the way I identify things at the edge is based on the three V's of data: volume, velocity and variety.

Speaker 3:

So if you have a huge volume of data coming, it obviously does not make sense to send it all to the cloud. Similarly, if the velocity is very high, you're not sending it to the cloud. The only other thing is when you close the loop: you need low latency if you need to act on this data. And so those are the two that I feel sort of warmish about. I'm not really hot about any of these, because I'm not seeing the money, but I feel warm. Automotive certainly is a big one.

Speaker 1:

Yeah, yeah. Well, I was gonna say automotive. I mean, that whole segment, and I mean Qualcomm and Renesas are both doing big business in automotive these days. The rise of the software-defined vehicle and consolidating all these legacy ECUs into these central vehicle computers and stuff, it's a huge engineering project for everybody, and there's a lot of real-time, you know, safety-critical stuff that needs to happen there. So that's actually a really good example, because it has to be deterministically real-time.

Speaker 1:

I mean, we've all driven cars or maybe not everybody, but you know they go pretty fast, they have brakes and things, and so you need to sort of make sure that all that stuff is it's instantaneously actionable. So, yeah, no, that's a really good example of one of the most complex and interesting Edge platforms that are out there, right?

Speaker 2:

Right, and actually some of the other AI features may actually be mandated, because they will be safety- or mission-critical, and they can't be dependent on any kind of cloud connectivity to work. Like, for example... yeah, go, go.

Speaker 3:

Actually, the funny thing about this is that one of the reasons we tout edge is that it works well with intermittent connectivity. But I think, if you saw the news over the last, what, six months, these automated cabs in San Francisco, I forget which one, they lost connectivity and all of them went to this one street in San Francisco.

Speaker 1:

Oh really.

Speaker 3:

Yeah, and they all jammed it. So I think behavior with intermittent connectivity needs to be worked on some more.

Speaker 1:

Yeah, yeah, yeah, there's a whole other thing going on there. But yeah, no, it's fascinating to see that, and I know Renesas just introduced this Cortex-M85 chip. I saw that today, by the way. And I know Qualcomm has a lot of stuff out there. Qualcomm had their Snapdragon Summit recently, talking about all of the on-device AI, and everyone's jamming a lot of acceleration silicon into their chips and stuff. And so one of the things people have talked about with on-device AI is squeezing down, as you mentioned, models getting smaller. To give our listeners an idea, what are the sizes of some of these models that we're seeing on what we consider the tiny edge or the lighter edge of things?

Speaker 2:

Yeah, I think that's a great question. I think it depends on the application, more specifically on the sensing modality. For example, some of the audio models, like wake-word or keyword detection, can become quite small, on the order of 10 kilobytes. Those are really small models, and this is just basic functionality: distinguishing, I don't know, maybe a class of a hundred different keywords. I mean, still a useful use case. And there are more sophisticated models. For example, people can run Whisper-type networks, which are actually pretty sophisticated networks for audio recognition and transcription, not on microcontrollers, more like on microprocessors. Those models are about 40 megabytes. This is really the high end, but 40 megabytes is actually quite doable. So those are for audio applications. For vision applications, models are typically a little bit bigger, because you have to deal with larger images and many pixels.

Speaker 2:

So typically you can get good models, like face recognition or face detection models, on the order of hundreds of kilobytes, which is actually not bad, because microcontrollers typically have on the order of one megabyte of memory, a couple of megabytes of memory. That's kind of the typical range. So basically you can fit several models, even vision-based models, on the microcontroller.

Speaker 2:

And then there is a whole zoo of models using other sensing modalities, for example accelerometers, gyros, temperature sensors, whatever. Those are even smaller models, because those kinds of sensing modalities don't send too many bits or bytes of data, so they're also on the order of a few kilobytes. So if you think about this, the hardware is becoming very, very capable and the model sizes are quite small. You have a lot of memory on your embedded device. You have quite a bit of horsepower. You mentioned the M85, and even before this the M55, and then, speaking specifically about Arm, Arm has dedicated accelerators for ML, the U55 and U65, which can accelerate matrix multiplication and different types of operations.
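The arithmetic behind "you can fit several models on the microcontroller" can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the 10 KB keyword model and the 1 MB device come from the conversation, while the 300 KB vision model, the 5 KB accelerometer model and the 256 KB firmware reserve are assumed values picked from the ranges mentioned. It also treats memory as a single pool, ignoring the flash/RAM split and the working memory that inference needs.

```python
KB = 1024
MCU_MEMORY_BYTES = 1024 * KB    # "on the order of one megabyte"
FIRMWARE_RESERVE = 256 * KB     # assumed space for application code and runtime

# Model sizes: the keyword figure is from the discussion, the others are assumed.
models = {
    "keyword spotting (audio)": 10 * KB,
    "face detection (vision)": 300 * KB,
    "accelerometer classifier": 5 * KB,
}

def fits_on_device(model_sizes, memory_bytes, reserved_bytes):
    """Return (fits, remaining_bytes) for hosting every model at the same time."""
    remaining = memory_bytes - reserved_bytes - sum(model_sizes.values())
    return remaining >= 0, remaining

fits, remaining = fits_on_device(models, MCU_MEMORY_BYTES, FIRMWARE_RESERVE)
print(f"all three models fit: {fits}, {remaining // KB} KB to spare")

# A roughly 40 MB speech model is out of reach for the same device.
big_fits, _ = fits_on_device({"speech transcription": 40 * 1024 * KB},
                             MCU_MEMORY_BYTES, FIRMWARE_RESERVE)
print(f"40 MB speech model fits: {big_fits}")
```

Under these assumptions the three small models use 315 KB and leave 453 KB free, while the 40 MB model needs a microprocessor-class device, which matches the split described above.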

Speaker 2:

So they're super, super capable, in terms of both power and memory footprint, and what you can do with this type of devices Right, it's a great time to be in this field.

Speaker 1:

And also, I think at the Snapdragon Summit they showed some generative AI models running on Snapdragon as kind of a proof of concept, and maybe we'll see this whole AI PC thing come to life at some point. But those are big. I mean, with the generative AI models, it's a challenge to fit them effectively even into a pretty beefy Snapdragon.

Speaker 2:

You need to have the right type of memory. It's all about memory. So you need to have extra memory and then fast access to this memory, very high memory bandwidth.

Speaker 1:

Yeah, yeah, that's a different beast. You were mentioning keyword detection. Another interesting use case that I'm familiar with, too, is audio anomaly detection: gunshot detection, glass-breaking sounds, you know. So there's a lot of work that's been done to have those audio sensors out there detect those things on the edge and then send a signal saying, hey, check this out, or whatever the alert is on the back end.

Speaker 2:

So those can be pretty tiny models and, by the way, those use cases you just mentioned, Pete, are not exotic proofs of concept. Those are products. I mean, there are products out there from big companies already.

Speaker 1:

Yeah, yeah, it's doable. And then also some of the anomaly detection I've seen is around, as you mentioned, motors. You could measure the current of a motor and see if it's drawing more current over time because it's about to wear out. Or you can check vibration and say, oh, this vibration pattern, or even this noise pattern, is different, so the ball bearings might be wearing out. So there's all kinds of interesting tiny AI models that can happen there to head off a potential disaster.
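The motor-current example above can be sketched with a simple statistical baseline. This is a minimal illustration rather than a tinyML model: it assumes the first readings come from a healthy motor, flags any recent window whose average sits more than a chosen number of standard deviations above that baseline, and uses made-up current values, window sizes and threshold.

```python
from statistics import mean, stdev

def detect_drift(samples, baseline_len=20, window=5, threshold=3.0):
    """Return the sample indices at which the recent average current exceeds the
    healthy baseline by more than `threshold` standard deviations."""
    baseline = samples[:baseline_len]
    mu, sigma = mean(baseline), stdev(baseline)
    alerts = []
    # Only score windows that lie entirely after the baseline period.
    for end in range(baseline_len + window, len(samples) + 1):
        recent = mean(samples[end - window:end])
        z = (recent - mu) / sigma if sigma > 0 else 0.0
        if z > threshold:
            alerts.append(end - 1)
    return alerts

# Hypothetical current readings in amps: a steady motor, then a slow upward creep.
healthy = [2.0 + 0.01 * ((i % 5) - 2) for i in range(20)]
wearing = [2.0 + 0.02 * k for k in range(1, 11)]

print(detect_drift(healthy * 2))        # no alerts for a motor that stays steady
print(detect_drift(healthy + wearing))  # alerts once the creep clears the threshold
```

A detector this small easily fits on a microcontroller. What to do once it fires is a separate, system-level question.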

Speaker 3:

Yeah, Pete, I think one of the problems I've seen, at least in industrial, is that you can detect that a motor is getting, say, worn out or something, but it's not clear what you do, because if it's part of a big process line, you don't just shut off the motor, right? So what you need to do is go up to level one or level two from level zero in order to take a decision.

Speaker 1:

Right right.

Speaker 3:

And so that's been one of the big challenges: while you can detect that a motor is wearing out with some degree of confidence, what do you do with it? And how confident are you about bringing a whole line to a standstill?

Speaker 1:

Yeah, well, that's a challenge too with these kinds of brownfield implementations, right, where you bolt some things on to some legacy equipment and then you don't really have a way of managing that equipment anyway, right?

Speaker 3:

Yeah, exactly, exactly. So closing the loop is turning out to be a major issue.

Speaker 1:

Interesting. Yeah, you can't just kind of turn the engine light on and you know, hope for the best right.

Speaker 3:

Yeah exactly.

Speaker 1:

But yeah, no, that's true, that's a good point. It's like you said. I mean, we started talking about how these are parts of holistic systems, right? They're not taken by themselves. It's now a very viable tool in the toolbox for an edge system to do AI on the tiny edge, and it needs to plug into, like you said, what are the actions that are taken once the anomaly is detected?

Speaker 2:

Yeah, it's pretty important, and that's actually another excellent point, because if you compare to cloud-based AI, that is more homogeneous, in a way: on one hand you have people who build data centers, and they know how to do it, and you have a different camp of people who know how to build models within the data center, right? But when you talk about edge applications, edge AI, tinyML, it's all about holistic. You really need to look at this from the system perspective. It's not like you have software people doing one thing and hardware people doing another thing. In the cloud space, they're, not completely, but kind of more distinct groups of people. But in this case it's all about holistic: how do you find the best solution and, like what Gopal said earlier, how do you use this distributed continuum of compute to solve a particular problem?

Speaker 1:

Yeah, yeah. Well, and also one of the things that distinguishes edge computing is that it's typically connecting real-world processes to the software, right? Whereas sometimes in cloud computing you're doing some interesting things in the cloud, but it's all happening in the cloud. But typically, when you're connecting sensors to things like engines and motors and amusement rides and car braking systems, there's real-world impact that you have to think about.

Speaker 3:

Yeah, sorry, but I think that's an excellent point you make, because, you know, with all the generative AI, the models are in the trillions, but the input is Unicode, right, or some other binary picture data. It's very well controlled, as opposed to taking analog data, which is affected by your environment. I mean, think of a camera which has got a thin layer of grime built up on it over a year, right? How accurate is your model going to be when your physical transducer itself is corrupted? And how do you know the physical transducer is corrupted?

Speaker 3:

Right, and this is where you need the full deployment model with the cloud behind it, so that you can collect some raw data once in a while and actually study input data statistics and see if there have been data shifts or concept shifts, and calibrate. So it's a fairly more complex problem. Like Evgeni said, you have to think of it at the system level, and it sometimes requires a multidisciplinary approach.
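One simple form of the input-statistics check described above is to compare occasional raw samples against a reference captured at deployment time. The sketch below is only an assumption about how such a check might look: it uses mean image brightness as the statistic, made-up readings, and an arbitrary alert threshold of three reference standard deviations. It detects a shift in the input data, not a concept shift.

```python
from statistics import mean, stdev

def input_shift_score(reference, recent):
    """How many reference standard deviations the recent mean has moved."""
    ref_sigma = stdev(reference)
    if ref_sigma == 0:
        return 0.0
    return abs(mean(recent) - mean(reference)) / ref_sigma

SHIFT_THRESHOLD = 3.0  # assumed alert level

# Hypothetical mean-brightness readings (0-255) from occasionally uploaded frames.
at_deployment = [118, 122, 120, 119, 121, 120, 123, 117]
clean_lens = [119, 121, 120, 122, 118, 120]
grimy_lens = [104, 106, 103, 105, 107, 105]   # a dirty lens darkens every frame

clean_score = input_shift_score(at_deployment, clean_lens)
grimy_score = input_shift_score(at_deployment, grimy_lens)

print(f"clean lens: score {clean_score:.1f}, recalibrate: {clean_score > SHIFT_THRESHOLD}")
print(f"grimy lens: score {grimy_score:.1f}, recalibrate: {grimy_score > SHIFT_THRESHOLD}")
```

A score like this cannot say why the input changed, only that the sensor no longer sees what the model was trained on, which is the cue to inspect the transducer or recalibrate.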

Speaker 3:

Yeah, and so that makes it much more complex and more fun, as opposed to just...

Speaker 1:

Yes, it is. It's fun when it works. Yeah, I was going to ask you guys about energy. One of the hot topics these days is energy conservation, and, you know, water conservation.

Speaker 1:

You know, as we have this appetite for more and more AI and you know more cat poems and cool Illustrations that we want to create. You know data centers are starving for power and electricity to to run all these workloads and really there's kind of no end at this point. I mean the the curve is exponentially going up and to the right in terms of appetite, and I know that a lot of folks are struggling to find the power to to drive these things, even if they could get the chips. And so the lot of folks have been talking about you know this. I call it the energy dividend for the edge.

Speaker 1:

It's this idea that, you know, if some of these workloads could be distributed and run on the right silicon at the right time, there's a benefit in energy conservation. So, for example, I would not go to ChatGPT and ask it what two plus two is, because that's a horrible waste of a very nice NVIDIA H100 GPU cluster that I just used to add two numbers together. So can you talk a little bit more about where things are heading in terms of energy efficiency and low power, and how that can benefit this more systemic approach we're talking about in terms of energy usage?

Speaker 2:

I think you touched upon a couple of very important points, Pete. One of them is definitely that the way current AI goes, especially gen AI, is not sustainable. Everyone acknowledges this, and I don't think there is a clear solution. I mean, there is an overconsumption of AI. People don't know what to do with this. People just do some bounding boxes on their Facebook pictures, and it consumes energy, right? Why are people doing this type of thing? So, to the point, I attended a workshop at MIT last year, and there was a presentation there saying that, the way we're going with AI today, by 2040 we're going to be running out of electricity, period. All our electricity will be going to feed data centers, and 2040 is not that far away. It's tomorrow.

Speaker 2:

Right, and fundamentally that's because cloud-based AI is super inefficient. We are talking about single-digit numbers there. That's kind of what it is today. And do you really need it, and for what? I think it's all about being actionable, and I think this distributed compute gives us an opportunity to do things in a smarter way. You offload low-power tasks to the low-level intelligence and, as needed, you keep bringing it up and up and up. And as you go up, obviously your energy footprint goes up, but you don't do it every time, right?
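The "offload to the low level and bring it up as needed" pattern can be sketched as a two-stage cascade. This is a minimal illustration under stated assumptions: `tiny_model` and `big_model` are hypothetical stand-ins for a low-power on-device classifier and a more capable, more power-hungry model further up the continuum, and the feature names and confidence threshold are invented for the example.

```python
def tiny_model(sample):
    """Hypothetical low-power classifier: returns (label, confidence)."""
    score = sample["vibration"]
    label = "anomaly" if score > 0.5 else "normal"
    # Confidence is 0 at the decision boundary and 1 far away from it.
    confidence = abs(score - 0.5) * 2
    return label, confidence


def big_model(sample):
    """Hypothetical heavier model that looks at more features."""
    combined = sample["vibration"] + sample["temperature"]
    return "anomaly" if combined > 1.0 else "normal"


def classify(sample, min_confidence=0.6):
    """Answer locally when confident; escalate only the uncertain cases."""
    label, confidence = tiny_model(sample)
    if confidence >= min_confidence:
        return label, "tiny"
    return big_model(sample), "big"


print(classify({"vibration": 0.9, "temperature": 0.2}))
print(classify({"vibration": 0.55, "temperature": 0.3}))
```

Only the second, borderline sample pays the energy cost of the larger model, which is the sense in which the footprint goes up "but you don't do it every time."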

Speaker 2:

So that's kind of the key here. And if you're talking about numbers, typically these tiny AI, edge AI devices consume energy in the milliwatt type of range, some in the microwatts. I think there are many applications using microwatts, so you're talking about devices that can operate on a coin cell battery for four years, for example.

Speaker 2:

I think Gopal brought up a very important point before about connectivity and the duty cycle of your connectivity. It doesn't have to be every second, right? To the point that your device is autonomous: you do the analytics on device for a long time, you have a little bit of on-chip memory, you store all this information, and once a day you send it out, right? One example here: there is a company, you probably know them, Shoreline IoT, and they do predictive maintenance, and here is what they told me.

Speaker 2:

They've been able to ship a product with this predictive maintenance for oil refineries just due to this on-chip intelligence, on-chip ML, on-device ML, because otherwise, if you do connectivity all the time, they say, the battery would have been gone in five minutes, right? This type of on-device analytics enables them to have a device that can last for five years, and that's basically a deal breaker type of thing. So I think the energy implications of doing AI at the very edge are enormous, if you do it in a smart way, as a system, right? How do you partition your system? How do you do connectivity? What kind of duty cycles do you use there?
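The effect of the connectivity duty cycle on battery life can be shown with some back-of-the-envelope arithmetic. The sketch below assumes a simple two-state model (radio active versus everything else), and every number in it is hypothetical. It does not reproduce the figures quoted in the conversation; it only shows the shape of the trade-off.

```python
SECONDS_PER_DAY = 86_400
HOURS_PER_YEAR = 24 * 365


def average_current_ma(active_ma, sleep_ma, active_seconds_per_day):
    """Duty-cycle-weighted average current draw in milliamps."""
    duty = active_seconds_per_day / SECONDS_PER_DAY
    return active_ma * duty + sleep_ma * (1 - duty)


def battery_life_years(capacity_mah, avg_ma):
    """Idealized battery life, ignoring self-discharge and derating."""
    return capacity_mah / avg_ma / HOURS_PER_YEAR


# Hypothetical figures, chosen only for illustration.
CAPACITY_MAH = 1000.0  # battery capacity
RADIO_MA = 50.0        # current while the radio is transmitting
SLEEP_MA = 0.01        # average draw for sensing plus on-device analytics

always_on = average_current_ma(RADIO_MA, SLEEP_MA, SECONDS_PER_DAY)
once_a_day = average_current_ma(RADIO_MA, SLEEP_MA, 10)  # 10 s upload per day

print(f"always connected: {battery_life_years(CAPACITY_MAH, always_on):.4f} years")
print(f"daily upload:     {battery_life_years(CAPACITY_MAH, once_a_day):.1f} years")
```

Under these assumptions the radio dominates the budget, so shrinking its duty cycle changes battery life by orders of magnitude, which is the system-level partitioning question raised above.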

Speaker 2:

And it all boils down to what kind of problem you're trying to solve, and, right, you definitely have to be very, very efficient. As I mentioned earlier, for AR glasses, for example, you don't have tons of battery and you don't want to recharge your glasses every five minutes, right? You want to make sure you can use them for a day or so. So those are some very practical examples, and in my opinion edge AI, tinyML, is the only way to enable this type of edge intelligence without kind of breaking the energy.

Speaker 1:

Yeah, yeah, which also requires some pretty sophisticated orchestration and, like you said, systemic thinking about, you know, when do I ladder up my AI workloads to higher performance or ladder them down to lower power to get the job done. But Gopal, you were going to say something.

Speaker 3:

Yeah, Pete. So first I want to make a comment about generative AI. I don't know whether you saw this, but every time you query generative AI with, I don't know, something like 50 tokens, you require half a liter of water to cool it, and so you can imagine. So generative AI is going to have its own track and, as I've said, by 2040 we may be running out of electricity just for generative AI. But the more interesting thing is that, I forget the exact number, but say 70% or 80% of enterprise data is dark, so it doesn't come to the cloud.

Speaker 3:

So, as enterprises are trying to go towards a more AI-based or data-driven management style, all this data needs to be processed and looked at, right? And then the question arises: where do you do this? So, if your approach is "I'm going to move it all to the cloud," where there's already a severe problem, that's a non-starter for energy. And so edge plays a huge role and will help mitigate that growth.

Speaker 1:

Yeah, I think, just to play devil's advocate too, I mean, if you look at the history of the telco infrastructure, we went from distributed RAN to centralized RAN to now vRAN and O-RAN. One of the things they found was that there was a lot of compute at cell towers that was not being used, right? So there was a big distribution of equipment out there that was, for 95% of the time, doing basically nothing. So they decided to consolidate it into these kind of base station hotels and get more horsepower for their hardware dollars. So that was an example where centralizing worked.

Speaker 1:

But, you know, like you said, there's also this idea that if you can actually run these workloads on these edge platforms and keep them optimized, and maybe save the cloud for more training and other compute-intensive things and do more inferencing on the edge, then there's some balance there. I think one of the things that's also missing, frankly, is this: when we have this conversation a year or two from now, hopefully we have more telemetry data on the power consumption of these workloads, because I have not seen anyone with a real solution that says this system is using this much energy. It's just not metered right now, it's just not measured, and the tools aren't there to really even have a conversation to compare the efficiency of some of these designs, I think.

Speaker 3:

That's true. That's true, because we don't know what is being consumed. But I mean, we have a good idea. But again, the other interesting thing you brought up was virtualization, Pete, which is what makes these more efficient and the cloud more efficient. For sure, right? Without virtualization there's no cloud. And you won't get virtualization in these tiny devices or even on on-prem servers and stuff. The question is, how well are they virtualized so that you're energy efficient, because you may not have enough workload?

Speaker 1:

Right, right, exactly.

Speaker 3:

And so it's not going to be as efficient as the cloud is. So you're going to give up a few points in efficiency. I don't know how many points, but certainly a few.

Speaker 1:

Right, yeah, no, it's an interesting trade-off and, yeah, I actually have one of my viral videos now on my YouTube channel that talks about this. I think it's called AI's Environmental Armageddon. It's a little bit of a clickbaity title, but it basically walks through this issue of there's not enough power and not enough water, and how many liters, and all that stuff. And on the solution side of the video, toward the end, after it gets very dark, it asks: how do we think about measuring? You can't really improve things you can't measure, and so I'm optimistic that we'll get some more tools in place and more awareness. So, for example, when people are, like you said, using bounding boxes and doing cat poems and things like that, they probably don't realize the environmental impact. If they did, maybe they wouldn't do it as much, maybe they would save it, and they'd understand that this is, you know, a couple of liters of water here or a tree there.

Speaker 1:

But, you know, we don't have that yet. We don't have that built into the systems for awareness. I think people would be smarter with the resources if they knew what they were using.

Speaker 2:

People just do it for fun, or simply because they don't realize that this is a default setting, right? Maybe they need to change the setting. Don't do bounding boxes.

Speaker 1:

Maybe, or they've already paid for their Copilot, so they might as well use it for something, you know. But yeah, no, interesting.

Speaker 1:

I think there's a lot of work ahead, but it's exciting to see this stuff evolve, and if folks haven't been to the tinyML website, they should definitely go and learn more about all this stuff as it becomes more and more capable. You mentioned the semiconductors are faster and lower power, the tools are better, the orchestration software is better. This becomes really a critical part of the whole solution, not only solving problems for customers faster and cheaper, but also with a much smarter use of the scarce resources that we have available. So, any closing comments from either of you that you'd like to leave listeners with on this topic?

Speaker 3:

Yeah, I was thinking about that. You know, five years from now, where do we see tiny edge, right? So on the normal development side, I would think the focus is on more efficient deployment across the whole stack, not just one piece of it. How do you build a solution and deploy it? Number two would be more improvements in model sizing. And a third one would be: can we start doing training on these small devices at the edge, so that we could go for some kind of federated learning or something where each one does a little bit of training and sends data back, as opposed to collecting all the data in one place and then training like we do now, right? So these would be the normal things. Some disruptive things that could happen could be neuromorphic computing, which has been there for the last, I don't know, 20 years and continues to not show much promise. But you never know, it takes one step, and I think that can really revolutionize this whole story.
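The federated learning idea, where each device does a little training and only sends an update back, can be sketched with a toy example. This is a minimal illustration of federated averaging on a one-parameter linear model; the device data, learning rate, and function names are hypothetical, and production systems add client sampling, secure aggregation, and much larger models.

```python
def local_update(w, data, lr=0.1):
    """One gradient step for y = w * x using only this device's samples."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def federated_round(global_w, devices):
    """Devices train locally; the server averages weights by sample count."""
    updates = [(local_update(global_w, data), len(data)) for data in devices]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Hypothetical per-device datasets, all drawn from y = 2x.
devices = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.0, 2.0), (1.5, 3.0), (2.0, 4.0)],
]

w = 0.0
for _ in range(50):
    w = federated_round(w, devices)

print(f"learned weight: {w:.4f}")
```

Only the updated weight leaves each device, never the raw samples, which is the contrast with collecting all the data in one place before training.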

Speaker 1:

Yeah, that's a good point. Evgeni, what do you think?

Speaker 2:

Yeah, I think, as we learned in the past three years, the world tomorrow is not gonna be the same as the world today. Yeah, that's for sure. It keeps changing every day in all dimensions. And talking about dimensions, in this edge AI compute I see kind of three vectors. I think one vector is definitely the technology vector. As all of us technologists know, technology is gonna evolve fast. There will be new capabilities, new features: faster, smaller, more memory, more on-device learning, compute in memory. I mean, there is a long list of technology innovations that are already in progress, I think. So the silicon is gonna be much, much more capable, much more intelligent, with a better software stack.

Speaker 2:

That's kind of one dimension there. The second dimension, or second vector, is end users and applications. I think we are going to see more and more end users and applications in this space, to the point that these types of devices will become invisible and they will be everywhere. Just think about smart TVs, right? Ten years ago, when you went to Best Buy, you had a TV and you had a smart TV, and the smart TV cost twice as much simply because it had Wi-Fi.

Speaker 1:

You can do connectivity.

Speaker 2:

These days you go to Best Buy, there are no smart TVs, because they're all smart TVs, right, that's right, that's right.

Speaker 1:

It kind of became the norm.

Speaker 2:

So I think we're gonna see a similar type of thing there. We're gonna see a lot of smart devices around us. They will all have ML and AI capabilities, tinyML, edge AI capabilities, and it will be just normal to have this type of functionality around us. And the third dimension there is like what Gopal brought up: to make this all happen, you really need to have all the tools, deployment, security, all these things really developed to the point where it's super easy for people to use and deploy, specifically for end users, because not every company, like a gas and electric company, can have enough software engineers to do this kind of thing. We need to have these robust, scalable deployment tools available, and this investment is already happening in this space. But I think it's going to be great, because all of these devices are going to make a better world for all of us, in consumer electronics, industrial, environmental, smart cities, smart homes, smart environments. Yeah, so it's going to be exciting.

Speaker 1:

Yeah, yeah, no, definitely. Cool. Yeah, there are a lot of things to look forward to, so we will have to regroup at some point and see how much progress we can make.

Speaker 3:

Pete, I just have one more point to make. Yes, sir. Talking of your energy, what do you call it, dividend? Energy dividend, yes. So one concern is, with all these devices out there, billions of devices, hopefully we have good coders programming them, or gen AI programming them. Else, you put in bad code and we could see the power, the energy, get totally bloated.

Speaker 1:

Yeah, that's true, although that's where all of our incredible new telemetry and measurements of energy will help us identify the bad actors, right, and say, hey, that sensor is really out of whack out there. And yeah, all the code will be written by some sort of GitHub Copilot thingy, right?

Speaker 3:

I hope so. Yeah, we'll see. Cool.

Speaker 1:

No, I appreciate both of you taking the time. I think it's good to see you virtually, and hopefully we'll be able to get together in person at some point in the near future too.

Speaker 2:

Yeah, well, thank you, Pete. Great talking to you, Gopal.

Speaker 3:

Thanks for having me. Good to see you, Pete. Always good to talk to you.

Speaker 1:

Alright, thank you both. Thanks for joining us today on the EDGECELSIOR Show. Please subscribe and stay tuned for more, and check us out online to learn how you can scale your edge compute business.

Speaker 3:

The EDGECELSIOR Show.

Understanding Edge Compute in Business Transformation
Advancements in MCU and Edge Computing
AI in Automotive and Industrial Industries
AI and Energy Efficiency at Edge
The Future of Technology and Connectivity