Bringup Lab Chronicles: A Measurement Two Years in the Making

The Oxide electrical engineers share their experience bringing up a 100Gb link--it's got everything from a purpose-built probing station to a 100Ω resistor that proved to be the difference between life and death (of the company)
Speaker 1:

Alright. Well, we got a good crew here.

Speaker 2:

So let's, let's go ahead

Speaker 1:

and get going. And, so, you know, the origin of this: Arjen, I thought you might kick us off, because I think the origin of this is your tweet, which I loved. I mean, "a measurement 2 years in the making," which really gets to the odyssey we've gone on here. So do you wanna describe, at the outset, what we've actually built and what some of the challenges were?

Speaker 3:

So, in the beginning... no. What did we build? Well, so, super quick: what I showed in the tweet was our measurement of the backplane cabling that we're using between our switch and our compute sled. And, in particular, what it shows is a TDR measuring the full channel of the cable connected to 2 circuit boards through the connectors, using probes touching both printed circuit boards, one on each side, so you can measure through the entire stack, like, everything.

Speaker 3:

And what that lets us do is extract some performance data about how well the signal will travel through the PCB, through the connectors, into the cables, through the cables, out the connectors, and then into the PCB, and then ultimately through the pads to where the chips would be that are then connected to each other.

Speaker 1:

So that's pretty exciting. So I'm gonna and and I'm just gonna apologize in advance for, starting with a bunch of questions. You mentioned a TDR measurement. You wanna describe what that is briefly?

Speaker 3:

How about, Eric, who's perhaps a TDR measurement? Sure.

Speaker 4:

I'll I'll I'll I'll go see back. Exactly.

Speaker 5:

Action items?

Speaker 4:

There's 2 ways of measuring a channel. A channel is basically just defined as a set of wires that goes from some source to some destination in a high speed link, in our case. So the two main ways of measuring how good or bad that link is are an S-parameter measurement using a VNA, which is a vector network analyzer, and a measurement using a TDR, which is a time domain reflectometer; the one we were using is from Teledyne LeCroy and is known as a WavePulser. So a VNA basically sends out a sine wave, and it looks at how much of the sine wave gets through and how much of the sine wave gets reflected back, because some of it, you know, like a wave hitting a rock in a pool or something, tends to send a wave backwards because it hits a discontinuity. So the machine measures that reflection and the transmission, how much of that sine wave gets through, and that tells you how good or bad the channel is.

Speaker 4:

A TDR sends a pulse. So, for those of us who remember that, and I remember just enough to be dangerous: you send an impulse function, so a delta function, which is an infinitely narrow unit pulse, down a transmission line. The FFT of that is basically all frequencies. And so a TDR tries to approximate that using a known pulse, and it sends that pulse down.

Speaker 4:

And it looks at how much of that pulse is reflected back and what shape it is, and how much of the pulse gets through. And so there's benefits and detractions either way, but the way we're using is a TDR method from Teledyne LeCroy using a WavePulser. And that thing is good up to 40 gigahertz or so, but our probes we've limited to 26 gigahertz because of probing limitations. But, basically, it's a way of measuring how good a channel is using a little voltage wiggle that we send down the line, and we see how it returns and see how much of it gets through.
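The remark that the transform of an impulse is "basically all frequencies" can be checked numerically. The following is a minimal sketch, not anything from the lab setup: it uses a plain discrete Fourier transform on a 16-sample toy signal (the sample count, pulse width, and function name are all assumptions chosen for illustration) to show that a one-sample impulse has a flat spectrum, while a wider pulse loses energy at some frequencies.

```python
import cmath


def dft_magnitudes(samples):
    """Return |X[k]| for every bin k of a plain O(n^2) discrete Fourier transform."""
    n = len(samples)
    magnitudes = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        magnitudes.append(abs(acc))
    return magnitudes


N = 16  # assumed toy record length
impulse = [1.0] + [0.0] * (N - 1)          # one-sample "delta" pulse
wide_pulse = [1.0] * 4 + [0.0] * (N - 4)   # a pulse four samples wide

impulse_spectrum = dft_magnitudes(impulse)   # flat: every bin has the same magnitude
wide_spectrum = dft_magnitudes(wide_pulse)   # not flat: some bins drop to zero

if __name__ == "__main__":
    print("impulse   :", [round(m, 3) for m in impulse_spectrum])
    print("wide pulse:", [round(m, 3) for m in wide_spectrum])
```

The narrower the pulse, the more evenly it excites the channel across frequency, which is why a TDR tries to approximate an impulse with a known, very sharp pulse.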

Speaker 1:

So, obviously, a lot of meat on that bone, and I want us to kinda work up to that measurement. But in doing so, maybe it's worth backing up a little bit. This is a 100 gigabit backplane, and I think most folks may be coming to this from higher up in the stack. And this is really high speed stuff, and it is really physically challenging to get a cable backplane to operate at 100 gigabit, or the 28 gig NRZ in this case. Arjen, do you wanna talk about some of the challenges that we had in the earliest ways we were kind of thinking about this?

Speaker 3:

Yeah. So I wanna, like, put a little asterisk. When someone hears 100 gig, what we're talking about is 4 channels of 28 gigabit signal, a 28 gigabaud signal. And for our discussion here, we're talking about roughly 14 gigahertz per signal, because you're measuring at, you know, half the frequency, the Nyquist frequency. Think back to the information theory you might have encountered in a computer science undergrad.
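The "half the frequency" arithmetic here can be written down directly. This is a small sketch assuming NRZ signaling, where each symbol carries one bit and the fastest possible pattern (0101...) completes one full cycle every two symbols; the function and constant names are illustrative only.

```python
def nyquist_frequency_hz(symbol_rate_baud):
    """Fundamental frequency of the fastest alternating NRZ pattern (0101...).

    One full cycle of that pattern spans two symbols, so the frequency
    is half the symbol rate.
    """
    return symbol_rate_baud / 2.0


SYMBOL_RATE_BAUD = 28e9  # the 28 gigabaud per-lane rate mentioned in the discussion

if __name__ == "__main__":
    ghz = nyquist_frequency_hz(SYMBOL_RATE_BAUD) / 1e9
    print(f"Nyquist frequency: {ghz} GHz")
```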

Speaker 3:

So what we're concerned about is a signal, let's say, up to about 28 gigahertz, because you want to get the fundamental frequency plus then the first harmonic potentially, maybe even the second. But by the time you get through a little bit of the channel, all that is gone already. And maybe we can get to that later. But every little disturbance in the impedance profile causes a little ripple or reflection. And you can see all those reflections in the pictures that are in the thread.

Speaker 3:

But you were asking about the challenges of getting this done. Well, the major challenge, if I understand the question you're asking correctly, is that the receiver on a chip has a noise floor, up to which it can measure, you know, this signal coming in, and you basically want to send a signal down a channel, down a cable. But it's more than that: it's a cable plus the connectors plus the PCB. And you need to keep that signal above that noise floor, because otherwise the receiver can't receive that signal anymore.

Speaker 3:

And it can't make out whether you sent a 0 or a 1. Let's keep it simple for now. And then it does a bunch of signal processing to artificially lower that noise floor even further. So there's a full-on signal processing chain at the receiver end. But the manufacturers of these chips, these devices, give you a specification that you need to live within.

Speaker 3:

And in our case, for a 28 gig NRZ signal, they say a ball to ball measurement, meaning between where the 2 devices that you're transmitting between are soldered to the circuit board. You can tolerate up to 20 dB of loss at the Nyquist frequency that we're interested in, which is roughly 14 gigahertz, meaning that a lot of the signal can be absorbed at that frequency. If your signal that is received is above that 28 dB level, the receiver will still receive at least something. Whether or not you can get a completely valid link out of that, that's a little up in the air. But
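To get a feel for what a loss figure in dB means for the signal that actually arrives, here is a short sketch of the standard decibel conversions. The two loss values are the ones mentioned in the discussion; the function names are illustrative.

```python
def voltage_ratio_after_loss(loss_db):
    """Fraction of the transmitted voltage amplitude remaining after loss_db of loss."""
    return 10 ** (-loss_db / 20.0)


def power_ratio_after_loss(loss_db):
    """Fraction of the transmitted power remaining after loss_db of loss."""
    return 10 ** (-loss_db / 10.0)


if __name__ == "__main__":
    for loss_db in (20.0, 28.0):
        v = voltage_ratio_after_loss(loss_db)
        p = power_ratio_after_loss(loss_db)
        print(f"{loss_db:.0f} dB loss: {v:.4f} of the amplitude, {p:.5f} of the power")
```

At 20 dB of loss, only a tenth of the amplitude (one hundredth of the power) reaches the receiver, which is why the budget is treated so carefully.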

Speaker 1:

And, Arjen, at each level... I mean, the term for this is one that I had not heard before Oxide. As I've said many times, and I'll say it again many times in this conversation: every day, I think I know how computers work, and the next day I learn I actually did not know how computers work. And I have definitely learned a lot at Oxide. The term that is used for the loss that is induced at each of these steps is insertion loss. Right?

Speaker 1:

Yes. Yes. Which is kind of a I don't know. I don't know if you heard of the term insertion loss. I Never.

Speaker 1:

No. And and I guess I mean, like okay. Look. A lot of things are poorly named, but it feels like insertion just feels like I don't know. I feel like you're inserting something, and the the the insertion is kind of the and, Eric, maybe you can kind of explain the origins of of the term to those of us who are new to it.

Speaker 4:

Yeah. Insertion loss and return loss are the 2 main parameters that you look at. So you want insertion loss to be low, and you want return loss to be high. That means that your insertion loss means I I stuck a signal into some channel, and a lot of it got through. So the loss to me inserting that signal into a channel is low.

Speaker 4:

And my return loss is high because I want all of the signal I insert into it to go somewhere else and not reflect back at me. If I have a low return loss, like, you know, a bad return loss, I have a lot of that signal returned back to me. And that comes from my impedance mismatching. That's the, like, you know, 50,000 foot view. RF guys are rolling in their, you know, rolling in their chairs.

Speaker 1:

Yeah. Exactly.

Speaker 4:

But I

Speaker 3:

know who's doing it.

Speaker 4:

All these terms and the techniques and the methods used for high speed signal integrity all come from RF, because they did it way before any of the high speed data. So this is all based on RF stuff. So, like, a VNA is fundamentally a measurement system intended for RF. So if you're measuring, you know, radios, that's what you use a VNA for. You'd never use a TDR because it doesn't have nearly the range.

Speaker 4:

We can tolerate, like, 20 dB of loss. Your typical wireless channel can tolerate, like, 80 or 100 or 120 dB of loss without flinching. But in circuit board land, because our receivers are much lower power per bit, joules per bit or whatever, versus a wireless modem, we have much lower tolerance for loss. Yeah. Insertion loss and return loss are both RF terms.

Speaker 4:

Looking at how much signal gets through and how much gets kicked back at you.
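Insertion loss and return loss are both conventionally computed from S-parameter magnitudes: insertion loss from the transmitted fraction (|S21|) and return loss from the reflected fraction (|S11|). A minimal sketch, with made-up magnitudes purely for illustration:

```python
import math


def insertion_loss_db(s21_magnitude):
    """Insertion loss in dB from the linear magnitude of S21 (what got through)."""
    return -20.0 * math.log10(s21_magnitude)


def return_loss_db(s11_magnitude):
    """Return loss in dB from the linear magnitude of S11 (what bounced back)."""
    return -20.0 * math.log10(s11_magnitude)


if __name__ == "__main__":
    # Hypothetical channel: half the amplitude gets through, a tenth is reflected.
    print(f"insertion loss: {insertion_loss_db(0.5):.2f} dB (lower is better)")
    print(f"return loss:    {return_loss_db(0.1):.2f} dB (higher is better)")
```

This matches the direction described above: a good channel has low insertion loss (most of the signal gets through) and high return loss (very little comes back).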

Speaker 6:

Well, you you would use that too if you're talking about any reflections, though, like, in any, you know, channel. So I mean, it's technically correct no matter what.

Speaker 1:

You you right. No. And I think it's I and, like, Eric, I guess that I was so the insertion the the thing that's being inserted is the signal itself that's being inserted. Got it. Okay.

Speaker 1:

That makes a lot more sense. And, Adam, have you been hit by impedance mismatch and reflections? This is like our, I feel like, adventures with SPI. So there is one of the ways... I saw Matt Keeter here. One of the ways that in software you get burned by this is you can set the speed of a pin, which is how fast it will transition. And the speeds you can set are, like, very slow, like, ridiculously slow, slow, and all the way up to, like, super fast.

Speaker 1:

You're like, I'm a software engineer. I wanna set it super fast. Right? Exactly.

Speaker 4:

I I

Speaker 2:

I did I I fell for that same trap, and

Speaker 1:

and Matt told me how how that was wrong. Yeah. It not. No. It is wrong.

Speaker 1:

Bad.

Speaker 4:

Bad. Go slow.

Speaker 1:

Go slow because

Speaker 4:

work with slow user.

Speaker 1:

And you end up with a signal coming back at you that you think is the is the the other end sending to you, but it's like, no. No. That's you sending to you, and it gets very confusing.

Speaker 4:

Yeah. So, random anecdote: at my previous company, we had a 20 megahertz clock that was getting sent a bunch of places. And we needed a lot of those clocks sent, and they had to be aligned with each other for various reasons. But the only chip that would do that in a reasonable, you know, number of chips was a, like, one to 28 or something fan out buffer that could drive, like, 1.6 gigahertz signals, which are, you know, absurdly fast. And we have a 20 megahertz clock.

Speaker 4:

We're like, okay. Fine. It'll be fine. 20 megahertz. It'll handle that.

Speaker 4:

No problem. Right. Yes. It will. But it'll send that out with a, like, 1.6 gigahertz edge rate, you know, edge bandwidth.

Speaker 4:

So we're sending a 20 megahertz clock with the sharpest edges known to existence. And amazingly, that caused us problems because, you know, when you're sending a 20 megahertz clock, you don't need 1.6 gigahertz harmonics coming in. Right.
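The gap between a clock's frequency and the bandwidth of its edges can be estimated with the common rule of thumb that bandwidth is roughly 0.35 divided by the 10-90% rise time. The sketch below applies that rule to the numbers in this anecdote; the rule is an approximation, and the function name is illustrative.

```python
def rise_time_from_bandwidth(bandwidth_hz):
    """Approximate 10-90% rise time for a given edge bandwidth (0.35 / BW rule of thumb)."""
    return 0.35 / bandwidth_hz


CLOCK_HZ = 20e6            # the 20 megahertz clock from the anecdote
EDGE_BANDWIDTH_HZ = 1.6e9  # the buffer's 1.6 gigahertz capability

if __name__ == "__main__":
    period_s = 1.0 / CLOCK_HZ
    rise_s = rise_time_from_bandwidth(EDGE_BANDWIDTH_HZ)
    print(f"clock period: {period_s * 1e9:.1f} ns")
    print(f"approx. edge rise time: {rise_s * 1e12:.2f} ps")
    print(f"the edge takes about 1/{period_s / rise_s:.0f} of a clock period")
```

The edge is a couple of hundred times shorter than the clock period, so the traces have to behave well at frequencies far above 20 megahertz even though the clock itself is slow.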

Speaker 7:

In fact, your confusion bad.

Speaker 3:

Right. So who here in the space has a good explanation for what impedance actually means? Because it's a term that I use daily, and I have some amount of intuition for what it means, but it's actually kind of fuzzy to describe. There's a mathematical term for it, but that's not intuitive.

Speaker 6:

I mean

Speaker 1:

yeah. Go for it, Rourke.

Speaker 6:

Impedance in a non math way: you can think about it as the amount of information that travels from one place to the next in terms of, like, how much you send in, and you're analyzing the amount that you get out. So it can be a voltage: I send in something at 5 volts and, over the course of whatever period I am sending it through, it comes out a little bit lower. Or I'm trying to talk to you and it's noisy, and I say, you know, 5 words and you get 3.

Speaker 6:

Or, you know, so it's any any characterization of the amount lost.

Speaker 3:

Yeah. So the way I sort of try to grasp it is, like you said, you insert a certain amount of energy into a channel. And that is characterized by a voltage and an amount of current. But these are just sort of ways in which we can measure impedance; they're not the actual thing. It's a measure of energy going into that channel.

Speaker 3:

And when we talk about this edge rate being too fast, you're basically inserting energy in a really small time delta; you're pushing a lot of energy into the channel. And if the channel is not able to absorb that energy at that exact rate, then it is going to immediately reflect that back at you. So that's where you talk about an impedance mismatch or an impedance discontinuity. It's where the amount of energy that the channel can absorb or push forward, like, transmit for you, is different from one place to the next. And that's, like, at an instant; there's like a sharp edge there.

Speaker 3:

And then at that point where the channel is not matched, where basically there are 2 different impedance values, so to speak, you're gonna get a reflection that is going to come back to you, because that energy needs to go somewhere: the channel can only absorb the amount of energy that it can handle. And the rest, since it is preserved, needs to go somewhere. So it will come back to you. That's the only place it can go. And then
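The size of the reflection at a boundary between two impedances is given by the standard reflection coefficient, (Z2 - Z1) / (Z2 + Z1). The sketch below evaluates it for a few impedance pairs; the pairs are illustrative choices, not measurements from the system being discussed.

```python
def reflection_coefficient(z_from_ohms, z_to_ohms):
    """Fraction of the incident voltage wave reflected at a z_from -> z_to boundary."""
    return (z_to_ohms - z_from_ohms) / (z_to_ohms + z_from_ohms)


def reflected_power_fraction(z_from_ohms, z_to_ohms):
    """Fraction of the incident power reflected at the boundary."""
    return reflection_coefficient(z_from_ohms, z_to_ohms) ** 2


if __name__ == "__main__":
    # Illustrative impedance pairs only.
    for z1, z2 in [(50.0, 50.0), (50.0, 75.0), (92.0, 85.0)]:
        gamma = reflection_coefficient(z1, z2)
        power = reflected_power_fraction(z1, z2)
        print(f"{z1:.0f} -> {z2:.0f} ohms: gamma = {gamma:+.4f}, reflected power = {power:.4%}")
```

A matched boundary reflects nothing; the larger the step in impedance, the more of the wave is sent back toward the transmitter.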

Speaker 6:

you can be a jerk about it and just say, it's entropy,

Speaker 1:

and then be right at the time. Oh, that guy. Oh, that guy.

Speaker 4:

Yeah. I mean, you know, the Wikipedia definition of characteristic impedance is the ratio of voltage to current, you know, of a wave propagating through a channel, which is great and probably, you know, mathematically correct and everything. But it gives you basically zero intuition on what the hell you're trying to do. And so the only way I could think of it is, okay, the impedance is, you know, whatever trace width and trace spacing to the ground plane we have to hit so that the impedance in ohms matches both the transmitter source impedance and the destination receiver impedance. And that varies based on the standard; like, PCIe is, I think, 85 ohms or something.

Speaker 4:

A lot of the newer stuff is 92 ohms. Single ended stuff is always, you know, like 50 ohms, or 75 ohms if you're talking video broadcast. And there are reasons for those impedances. But, basically, you have to match whatever you need to based on whatever transmission protocol you're using. And in our case, we're using somewhere in the 90 ohm ish, 92, something like that range.
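For an idealized lossless transmission line, the characteristic impedance follows from the inductance and capacitance per unit length as the square root of L over C, which is why trace width and spacing to the ground plane (which set L and C) determine the impedance. This sketch uses assumed per-metre values picked to land on a round number; they are not taken from any real stackup.

```python
import math


def characteristic_impedance_ohms(l_per_m, c_per_m):
    """Lossless-line characteristic impedance: sqrt(L / C)."""
    return math.sqrt(l_per_m / c_per_m)


def propagation_velocity_m_per_s(l_per_m, c_per_m):
    """Lossless-line propagation velocity: 1 / sqrt(L * C)."""
    return 1.0 / math.sqrt(l_per_m * c_per_m)


# Assumed, illustrative per-unit-length values (henries and farads per metre).
L_PER_M = 350e-9
C_PER_M = 140e-12

if __name__ == "__main__":
    z0 = characteristic_impedance_ohms(L_PER_M, C_PER_M)
    v = propagation_velocity_m_per_s(L_PER_M, C_PER_M)
    print(f"Z0 = {z0:.1f} ohms, velocity = {v / 1e8:.3f}e8 m/s")
```

Widening a trace or moving it closer to the ground plane raises the capacitance per unit length and lowers the impedance, which is the knob a layout engineer is turning when hitting a target such as 85 or 92 ohms.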

Speaker 3:

But but those are all those are all conventions. So we all build our receivers around this idea of, like, okay. Let's all let's everyone try and build a receiver that's in 10 seconds.

Speaker 4:

There's a reason that 50 ohms was chosen. I don't remember enough of it to go into the explanation for it, but there is a reason for that that you can yeah. The the interested reader can look up.

Speaker 3:

Yeah. Yeah. But, I guess the point I'm trying to make is that the 50 ohm, it's not a totally arbitrary number, but it is a number we chose. There's nothing necessarily fundamental about why it couldn't be, say, 40 or 60 or whatever. It's just that we all started to use 50, because you need to match all these devices; you need to match all the pieces in the channel to ideally all be at 50 ohm so that you have no reflections.

Speaker 3:

You can make this perfect channel that has no discontinuities and is impedance matched, so that, basically, as the energy travels through this, there is nothing that will get reflected back.

Speaker 1:

And, Arjen, when you say we chose 50 ohms, a totally naive question: do you mean we Oxide or we humanity?

Speaker 3:

No. We humanity. Okay. Okay. We electrical engineering humanity back in, I don't know, the thirties, forties, or whenever they started doing this. But in this case, the broadcast industry picked 75.

Speaker 3:

So a lot of rocket or or definitely older broadcast equipment is at is at 75. So I mean, of course.

Speaker 1:

Sure. I mean, of of I just like I feel like there's certain truism that transcend every engineering domain, and the fact that 2 different bodies picked it. If there's an if there's an arbitrary number being picked, you know that 2 different groups picked different numbers then peed

Speaker 4:

it out of the.

Speaker 3:

Well, the so the but this this whole concept of impedance and, like, Eric is absolutely right. When you're building a system, all you think about is this this impedance number. It's expressed in ohms, and you do your best to try and match it everywhere. And you get a there's a bunch of equations that you're using, and some of them are, you know, more detailed or complex. And then there's some shorthands that you use to get to those numbers.

Speaker 3:

And then, as a person writing software for, you know, pretty much forever and then transitioning into electrical engineering a couple years ago, this was a very tough concept for me to grasp, because I struggled with the mathematical perspective of this. And I had to sort of transition a little bit into the physical aspect, because you can look at this from two different directions. You can look at it from a hardcore, like, equations perspective, and it will tell you absolutely nothing, or at least it didn't do anything for me. And so I started reading some books that had a different perspective. And then I was like, oh, okay.

Speaker 3:

There's I started to build some intuition. Although, like I said, the reason I was asking this question earlier, can we define it is because it's so fuzzy. No one seems to really be able to like, intuitively tell us what it is, which is frustrating to a degree.

Speaker 1:

Well, I I mean, it feels like it and I I'm I I it's so great to hear you kind of ask these super basic questions because it does feel like some of these things are like, I'm I'm just having a very hard time with the intuition around a, like, a a 28 gigahertz or 14, I guess, gigahertz signal is just I mean, I I'm so accustomed to that being so much faster than anything we could run our clock on in in IC, in an ASIC. To think that, like, we've got anything at 14 gigahertz Just and I I get this I guess this gets to your point, earlier, Eric, about, like, this is all RF. And when you're at that point, like, it's it's it's a radio that's in a channel, basically. This may be the better way to think about it.

Speaker 4:

Yeah. High high speed serial links are crappy radios that only do 2 lengths. I mean, they're they're really just crappy radios.

Speaker 1:

Yeah.

Speaker 4:

And they're crappy radios because they can have idealistic channels. There's no external interference from, you know, your neighbor's Wi-Fi. There's no crap from the local airport blasting. The high speed serial links are fundamentally an RF channel in a very, very controlled environment.

Speaker 3:

Well, and we wanna make some crappy radios because we don't wanna spend all that silicon real estate on on expensive, like, analog front ends for this.

Speaker 4:

Our RAM chip space.

Speaker 8:

Is So we

Speaker 3:

want expensive. We wanna make this as simple as we can can make it. And that that's why, for example, with every step in generate like, every step in technology generation, say, for PCI Express, these these receivers become more complex because we need to go to faster speeds, which means that that noise floor needs to go down, etcetera. So we're starting to those those receivers borrow more

Speaker 9:

and more and more out

Speaker 3:

of, like, actual radio receivers. And as a result, they become more complex, more power hungry.

Speaker 4:

Yeah. Like, 100 gig and 400 gig are using things like FEC, you know, forward error correction, because they know they're gonna get bit errors. You know, back in, you know, 10 gig land, bit errors were like, no, they're so rare that you just handle them at a higher level. No worries.

Speaker 4:

But when you get up to 28 gig, you know, payment code is like, oh, no. You you're gonna see errors. It's just how many. And you just wanna make sure that rate's not so high that your forward error correction or your, you know, CTLE or DFE and all that stuff can't correct you.
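"It's just how many" can be made concrete by multiplying a bit error rate by the line rate. The bit error rates below are assumed values for illustration only, not figures quoted in the discussion or taken from any specification.

```python
def errors_per_second(bit_error_rate, bits_per_second):
    """Expected number of bit errors per second on one lane."""
    return bit_error_rate * bits_per_second


def seconds_between_errors(bit_error_rate, bits_per_second):
    """Expected time between bit errors on one lane."""
    return 1.0 / errors_per_second(bit_error_rate, bits_per_second)


LINE_RATE_BPS = 28e9  # per-lane rate used in the discussion

if __name__ == "__main__":
    # Assumed, illustrative bit error rates.
    for ber in (1e-12, 1e-8, 1e-5):
        rate = errors_per_second(ber, LINE_RATE_BPS)
        gap = seconds_between_errors(ber, LINE_RATE_BPS)
        print(f"BER {ber:.0e}: {rate:.3g} errors/s, one every {gap:.3g} s")
```

At these speeds even a very small error rate produces errors regularly, so the link has to correct them in place rather than leaving them to a higher layer.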

Speaker 3:

And the radio protocols that we use for, let's say, your cellular network or whatever use relatively complicated encoding schemes exactly for that, because you don't want to retransmit in that channel ever, because that's expensive. So you want to recover as much of that signal as you can. So the radio world has been doing forward error correction for a long, long time because they were forced to. But slowly, more and more of that shows up in these serial receivers for, you know, wired Ethernet and PCI Express and whatnot.

Speaker 1:

Yeah. I mean, they're crazily sophisticated. And maybe this is a good segue, Tom, to you, since we've got some folks from Ansys here as well to talk about this: how do you think about the system in advance to know that, when we build it, we are gonna have any chance of having something that actually works? And maybe, Arjen, you wanna kinda start us off. Like, how did we model the system?

Speaker 3:

Well, we paid Ansys money to get software that will... And it

Speaker 4:

lets us spend many, many hours.

Speaker 8:

Yeah. Well, but also, like, fundamentally, it starts with, like, a piece of paper. Let's say you're going over some length of coax, and you know how much loss there is in, you know, coax per meter. And so you start building up. You know that the chip vendor's told you you've got this many dB of insertion loss for your margin, and then you start sort of, like, papering it out. Can this architecturally work?

Speaker 8:

For instance, if you decided early on that there's no way you could meet that insertion loss spec with the amount of PCB that you had to go across and the cable length, then we'd be, you know, looking at, like, retimers somewhere, and then the retimer would have to fit in there. Right? So, I mean, fundamentally, it starts with a piece of paper and just, like, looking at the specs and making sure the specs sorta work out. And then you could delve into it and, like, actually put it to the test and look at the physics, you know, make sure it actually works.
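The "piece of paper" exercise described here amounts to adding up the loss of each segment of the channel and comparing the total to the budget the chip vendor allows. A minimal sketch follows; every segment name and loss value is hypothetical, chosen only to show the arithmetic, and the 20 dB budget is the figure mentioned earlier in the conversation.

```python
def loss_budget(segments, budget_db):
    """Sum per-segment insertion loss (dB) and compare it against the allowed budget.

    Returns (total_loss_db, margin_db, needs_retimer).
    """
    total_db = sum(loss_db for _name, loss_db in segments)
    margin_db = budget_db - total_db
    return total_db, margin_db, margin_db < 0


# Hypothetical per-segment losses at the Nyquist frequency, in dB.
HYPOTHETICAL_SEGMENTS = [
    ("switch PCB trace", 3.0),
    ("connector A", 1.0),
    ("cable", 6.0),
    ("connector B", 1.0),
    ("sled PCB trace", 3.0),
]

total_db, margin_db, needs_retimer = loss_budget(HYPOTHETICAL_SEGMENTS, budget_db=20.0)

if __name__ == "__main__":
    print(f"total loss {total_db:.1f} dB, margin {margin_db:.1f} dB, retimer needed: {needs_retimer}")
```

If the margin comes out negative, that is the point at which a retimer (or a shorter or better cable) has to enter the architecture, before any detailed simulation is worth running.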

Speaker 3:

Yeah. Because the back of the envelope gives you sort of the rough estimate for, like, is this even worth trying, which we did.

Speaker 4:

Right. So we do everything right.

Speaker 3:

We went to, you know, the cable manufacturer. In this case, we've been working with Samtec. And they gave us an estimate for, like, hey, this is what our cable system can do. This is modeled, this is implemented. We've tested that.

Speaker 3:

And then yes, but that assumes that you're doing everything right. That assumes that all your, you know, everything on your PCB is correct, that you're not basically losing more of that signal than you absolutely have to.

Speaker 8:

And that's where ANSYS comes in.

Speaker 3:

And that's exactly. Yeah. That's where we start to really dial that in so that we build simulations of the actual PCB, the actual trace on a PCB, the dielectric material between all the copper pieces, the vias, all that. And then you start to run simulations, and you start to slowly carefully tweak all these pieces so that you improve that that or you basically reduce that loss until you are as close to that optimal that you can that you can hit. And then hopefully, you you, by then, have have met the specifications and your link actually works.

Speaker 4:

Yeah. So so speaking of via, like, we spent a lot of time looking at looking at via, like, accounts. Like, every can you give a little bit of an overview?

Speaker 8:

Yeah. So, difficult. Well, if you were to look at a cross section of a PCB trace: there's 2 classes of traces you could run on a PCB. There's what they call microstrip, which is any externally routed traces, which are half referenced to air, and the other half is... well, not exactly 50%, but it's not a uniform dielectric. Part of it's air, part of it's whatever your PCB FR4 is. And then stripline is where you're totally embedded within, let's say, 2 shielded ground planes.

Speaker 8:

And your signals have a uniform dielectric. So, Tom, are those

Speaker 1:

two different kinds of vias?

Speaker 8:

Well, I'm getting to that. The, the idea is that as we're routing across a PCB, the the the cross section of the fields are really well behaved. We can understand them even, like, with the 2 d field solver. But as soon as you approach a via, the via is basically a drill going from one layer to the next, and all of a sudden where your return currents were really nice and uniform on your ground plane, now they they go into another ground via connecting the 2 ground planes together. The problem becomes really complicated.

Speaker 8:

You've got physics that are not simple, and that's why we need 3D field solver software to actually put the geometry together and actually investigate what is going on. In other words, you know, you can look at the different inductances and capacitances as the signal transitions from TEM mode in the stripline PCB to this via, which is like a really funky... you're trying to make, like, a coax in the z axis of your board going from the top to the bottom, for instance.

Speaker 3:

So you're going from something that is, like, nice 2D math that you can sort of still approach with your high school math, to having to solve, like, full-on differential equations for a 3D, like, not an arbitrary 3D structure, but a very complicated 3D structure. And that quickly, like, surpasses what you can do by

Speaker 8:

By the way, I'll say this is a very hard discussion to have without, like, showing diagrams in this because you got you know, like, pictures are really helpful here.

Speaker 1:

Yeah. Well and I yeah. And, Adam, I don't know. In terms of, like I mean, do first of all, Adam, don't you just feel just, like, listening to this? Like, we've entrusted all of this, like, rocket science to the pile of idiots at the top, wrecking driving this with software.

Speaker 1:

I mean, I'm just like I I I I I mean, I I I I just feel like about all of these software that I've seen or written that has just, like, abused the network or behaved terribly. You're like,

Speaker 3:

don't We're we're creating this beautiful thing, this this physical piece of art, and then and then and then we hand it to you.

Speaker 1:

And then we hand it I know. For for some dumbass time out that was, like, specified in, like, in, like, seconds. I'm supposed to say milliseconds or whatever. It's like, I'm almost done.

Speaker 2:

That's right. You know, back back in the day in Solaris, if you unplug

Speaker 4:

the cable, it would say link down cable problem.

Speaker 2:

And I feel like I mean, even then, it felt like a a pretty unsophisticated diagnosis of what, like, is going on. Yeah. And now it's just like, what why did we think cable problem was I mean, there's so many problems that could that could arise.

Speaker 1:

I know. It's just amazing that, like and and Tom, just to go to, like, when you're when you are kind of visualizing these vias where you are because, unfortunately, these are not our circuit boards have multiple layers. Unfortunately, reality is a real pain in the ass.

Speaker 8:

Yeah. Yeah. The challenge is, like, in 2 areas, basically. You have to get from the BGA, which, you know, everyone's probably familiar with the BGA: there's a bump, and it has to have a circular pad on the top side of the PCB, or the bottom; in our case, the top. And you've gotta basically get from that to one of the inner layers, and then you go across your board to the next connector, or in this case, the Samtec, the RF 6 connector we're using.

Speaker 8:

And you've gotta have another via there to

Speaker 4:

be able to get go

Speaker 8:

to the connector. So that's what the vias are, really; as long as you can, you only have the 2 vias, the via transition areas in the BGA and then at the connector. And so those are the 2 areas we focus on the most. And, in fact, you know, it gets really complicated, because as you put a lot of vias through in tight proximity to one another inside of the BGA, and you have traces routed through next to those, and the voids, and there's backdrill clearances and all this stuff, you actually can have some of the RX layers coupling to the TX layers that are, you know, maybe half a millimeter away. You can actually have some coupling through the void, which, again, this would be great to have a picture.

Speaker 8:

But but there's, like, this really complicated way of seeing seeing the way the coupling modes can happen. And, unfortunately, that isn't really clear until you can visualize it in 3 d. The 2 d CAD programs, you know, they give you like a 2 d down view of, like, polygons. But but there's a lot more to it because you need to really understand what is going on on the z axis, how how much space is between this and that and and the other thing. And that's where once you can extract a, 3 d geometry and then look at it in that world, or you get good at it after you do it enough, you start to, like, have a

Speaker 2:

3D view just from looking at it in 2D. And this might be getting ahead of ourselves. But as we do the math and do the simulation, do we just get back yep or nope? I mean, how do you debug this thing? Like, how do you figure out

Speaker 4:

You get back an inscrutable waveform that has a bunch of little spikes all over it, and you have to assess whether or not it's good enough.

Speaker 8:

And it's it's even trickier. Absolutely. It you have to know you have to sort of test whether or not your simulation was valid. Like, did you actually set it up correctly? Because there's an art to this as well.

Speaker 8:

You have to put what are called ports. When you have a 3D geometry, you assign a port. Then when it solves for those ports, it'll basically look at energy in one port and see how it comes out on all the other ports, and then it solves that whole set of equations so you have a set of S-parameters that tells you what happened. If you only have 4 ports, that's one thing. But, like, we did an extraction on a larger chunk of the chip that literally took 2 or 3 weeks on a 512 gig machine and had 48 ports.
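As a rough illustration of why the port count matters: an N-port S-parameter matrix has N x N complex entries at every frequency point, because energy driven into each port is observed at every port. The function name below is hypothetical, and the count says nothing about actual solver runtime or memory.

```python
def s_parameter_entries(num_ports: int) -> int:
    """Complex S-parameter entries per frequency point for an N-port network."""
    if num_ports < 1:
        raise ValueError("num_ports must be at least 1")
    # Each port is driven in turn while every port (including itself) is
    # observed, so the matrix is N x N.
    return num_ports * num_ports


if __name__ == "__main__":
    for ports in (4, 48):
        print(f"{ports} ports -> {s_parameter_entries(ports)} entries per frequency point")
```

Going from the 4-port case to the 48-port extraction mentioned above multiplies the number of entries per frequency point by 144.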

Speaker 8:

So, yeah, I think, similarly, when we were doing the Samtec extraction, which Samtec did for us because they have their connector model, I think that similarly took, like, a week or something like that. So these are non trivial things, and you gotta really know how to set it up correctly and make sure that, like, the answer you're gonna get back is the right one because, unfortunately, you've spent all this time, and it could be, oops. My port was, you know, put to the wrong reference, and now the data is invalid. Right?

Speaker 1:

Right. I mean, that must be very nerve wracking. Obviously, in any simulation, you are trying to find as many ways as possible to check your model against reality. But, Thomas, you mentioned these things run for a really long period of time, with super sophisticated software. I mean, Adam, I feel like the kind of simulation we do in software generally is done with, like, an awk script, usually of, like, 2 or 3 lines.

Speaker 1:

I feel like I mean, I think the modeling we do is so unsophisticated by comparison to where we've got it's actually a physical thing, and these electrons just don't behave very well. I mean, they it's like we we kinda have this idea of, like, oh, these two wires are connected logically. So, clearly, it's going to, like, travel along this path. And it's like, no.

Speaker 9:

No. No. I was in

Speaker 2:

a meeting earlier today where we're doing some performance benchmarks. Were we using the right SSD? Well, we were using an SSD. So, I mean, it's, like, pretty gross, by comparison.

Speaker 1:

I know. I know. And I know we've got Larry and Robert, who joined us from ANSYS. I don't know, you're getting your first, like, real exposure inside of Oxide.

Speaker 1:

You may just be like, oh my god. These guys are turkeys. We need to

Speaker 9:

just don't ask us to create, a network switch because we can't do that.

Speaker 1:

There we go. Okay. Fair enough. So, I mean, do you first of all, is is Oxide's use case is this, like, a a a common use case for folks that are are I assume that everyone has to do these kind of simulations before they actually build this stuff. Right?

Speaker 1:

Because it's so high consequence to get it wrong.

Speaker 9:

Well, back in the old days, if you're moving, you know, something at 1 gigabit per second, not so much. But now, today, you know, every little discontinuity on the line, as you've mentioned, conspires against you. And having something that solves for the physics, well, what are those interactions and how are they gonna pile up on top of one another? Using our simulators really makes a big difference.

Speaker 4:

And Yeah.

Speaker 9:

And so you you meant

Speaker 1:

Well, I think you there's a really good point here that is and, Eric, I know you mentioned this to me as well. Tom, you've said this as well, that, like, when you have loss, like, you're never gonna get it back. So every little bit in this like, what of it matters? All of it matters. Like, all of the details matter enormously in this.

Speaker 9:

Especially at these frequencies. So, you know, the interconnect, like, you can look it up on the datasheet, the semiconductor vendor tells you you can stand 20 dB of loss or whatever it is. In fact, it's frequency dependent loss. The interconnect acts like a low pass filter. So DC, you said earlier.

Speaker 9:

Right? A simple, hey. These are just connected together. So DC, sure. It marches right through.

Speaker 9:

But the higher frequencies, 14 gigahertz, 28 gigahertz, are diminished a lot more than the lower ones are. It's a low pass filter. So any sharp edge that you'd like to send, like transitioning that NRZ signal from 0 to 1, gets rolled over. It gets smeared out. And how much does it get smeared out?

Speaker 9:

And then it gets worse. You stick a via structure in between or an IC package. Some of the energy goes forward towards the the receiver. Some of it bounces back. Some of it bounces back to the transmitter and then bounces forward yet again to the transmitter

Speaker 4:

all sitting upon one another. So the bounce back and forth things can have some interesting effects because, you know, like, let's say your fundamental is 14 gigahertz, and let's say you have 2 structures, like two sets of vias, that are spaced just far enough apart, then you get a standing wave at 28 gigahertz. And now you have this massive energy suck out at 28 gigahertz, and it basically looks like a notch filter. And it just sucks out any possible information there and gives you the middle finger.

Speaker 1:

Yeah. I mean, a standing wave at 28 gigahertz. I'm just trying to, like, wrap my brain around that. Like, that feels bad.

Speaker 8:

Well, and and you if

Speaker 4:

you move those vias, like, 5 millimeters apart, it's gone, and you won't even see it.

Speaker 8:

So and and have those at

Speaker 4:

just the wrong spacing. It just screws

Speaker 8:

you so bad. And I think you're exactly right in the sense that one of the things that's important, that I've always found in my flow of simulation, is that, like, you start by looking at it in the frequency domain so you can at least get an idea of, like, at what frequency the loss is. But then, because there are all these time dependencies to a particular channel, where the via actually lives within your data rate, that's yet another level of simulation, and it's really helpful to look at this in the time domain to see what the eye actually is, because you might find some of the discontinuities. Like, this is where via backdrilling comes in, and you can really see it clearly, where there's some frequencies at which that little stub structure resonates, and with the wrong data rate at the wrong stub, you know, it could actually, you know, kill your line.
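To put a rough number on the stub resonance being described: a common first-order estimate treats the unused via stub as an open-ended quarter-wave resonator, so the first notch lands near f = v / (4 * L), where v is the propagation velocity in the board material. The stub length and effective dielectric constant below are made-up values for illustration, and the estimate ignores the capacitive loading of real via barrels and pads, which pulls the resonance lower.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def stub_resonance_hz(stub_length_m: float, eps_eff: float) -> float:
    """First quarter-wave resonance of an open stub (idealized estimate)."""
    if stub_length_m <= 0 or eps_eff <= 0:
        raise ValueError("stub length and eps_eff must be positive")
    velocity = SPEED_OF_LIGHT_M_PER_S / math.sqrt(eps_eff)
    # The stub is a quarter wavelength long at its first resonance.
    return velocity / (4.0 * stub_length_m)


if __name__ == "__main__":
    # Hypothetical numbers: 1.5 mm stub before backdrilling, 0.25 mm after.
    for length_mm in (1.5, 0.25):
        f_hz = stub_resonance_hz(length_mm * 1e-3, eps_eff=3.6)
        print(f"{length_mm} mm stub -> first notch near {f_hz / 1e9:.1f} GHz")
```

Under these assumptions the longer stub resonates in the mid-20 GHz range, uncomfortably close to the frequencies discussed here, while the backdrilled stub pushes the notch far above them.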

Speaker 8:

So

Speaker 1:

And and, Tom, can you just describe the eye? Because that's something it does come up.

Speaker 8:

Oh, it's just taking any data transmission, and, like, if one UI is one bit symbol, so at the 25 gig that we're running at, it's 40 picoseconds roughly. So, basically, you overlay every 40 picosecond chunk on top of one another so you can actually get the aggregate of what every single up-down transition looks like over one another. So it ends up creating this nice diagram that looks like an eye, and whether it's open or closed is the terminology. And it gives you an idea of, like, how much margin you have for your receiver thresholds.
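The folding described above can be sketched in a few lines. This is a toy, assuming an ideal 25 Gb/s NRZ stream sampled a whole number of times per unit interval; the waveform values and function names are invented for illustration, and a real scope also has to recover the clock before it can slice the waveform.

```python
def unit_interval_seconds(bit_rate_bps: float) -> float:
    """Duration of one bit symbol (one UI)."""
    return 1.0 / bit_rate_bps


def fold_into_eye(samples, samples_per_ui):
    """Cut a waveform into consecutive UI-long traces to be overlaid."""
    full_uis = len(samples) // samples_per_ui
    return [samples[i * samples_per_ui:(i + 1) * samples_per_ui]
            for i in range(full_uis)]


def eye_height(traces, sample_index, threshold=0.5):
    """Vertical opening at one sampling instant: lowest 'one' minus highest 'zero'."""
    column = [trace[sample_index] for trace in traces]
    highs = [v for v in column if v > threshold]
    lows = [v for v in column if v <= threshold]
    if not highs or not lows:
        raise ValueError("need both logic levels to measure an eye opening")
    return min(highs) - max(lows)


if __name__ == "__main__":
    print(f"UI at 25 Gb/s: {unit_interval_seconds(25e9) * 1e12:.0f} ps")
    # Hypothetical waveform: six bits, 4 samples per UI, each with a level error.
    bits = [0, 1, 1, 0, 1, 0]
    level_errors = [0.05, -0.1, 0.0, 0.1, -0.05, 0.0]
    waveform = []
    for bit, err in zip(bits, level_errors):
        waveform.extend([bit + err] * 4)
    traces = fold_into_eye(waveform, samples_per_ui=4)
    print(f"{len(traces)} overlaid traces, eye height {eye_height(traces, 2):.2f}")
```

The smaller the eye height, the less margin the receiver has around its decision threshold.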

Speaker 8:

Right? So you just get to overlay everything in one simple view, over one

Speaker 9:

Yeah. And and it's

Speaker 3:

to drive home the point, the open eye means there's a large difference between what is considered a 0 and a 1. So that's easier for the receiver to make out, and therefore

Speaker 8:

With PAM4, you've now got 4 levels. So, you know, it just gets worse.

Speaker 4:

You've got this triclops.

Speaker 1:

Yeah. Adam, have you seen the PAM 4 eye diagrams? No. Alright. So if you look at an eye diagram, you're like, okay.

Speaker 1:

Like, I can appreciate that. That's a good clean eye. This is an open eye. It's like, but what happened over here? This one looks totally wrong.

Speaker 1:

It's like, no. No. That's a PAM4 signal

Speaker 4:

because That looks really good. Yeah.

Speaker 1:

Yeah. It's like, oh, now it looks great because, as Eric mentioned, it's a triclops effectively. And you have that because you are layering on multiple voltage levels. Apparently, digital has taken us as far as digital is gonna take us. So we're not just binary. If

Speaker 4:

you look at, like, Ethernet is, I think, a high low, something like that, signaling.

Speaker 8:

And there's 2 of these with that.

Speaker 7:

But gigabit Ethernet, 1000BASE-T, is still using PAM-5, and then I think 10 gig over twisted pair jumps up to, I think it's PAM-16.

Speaker 4:

Yeah. You're coming to What?

Speaker 8:

Yeah. I did a 10GBASE-T PHY.

Speaker 1:

Oh, and it's it's 16 different voltage levels?

Speaker 7:

Yeah. 4 bits per Holy.
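The arithmetic behind the level counts is simple enough to sketch. The function names are hypothetical, and the bit count is the raw figure per symbol; real standards spend some of it on coding overhead.

```python
import math


def raw_bits_per_symbol(levels: int) -> float:
    """Raw bits carried by one PAM symbol with the given number of levels."""
    if levels < 2:
        raise ValueError("PAM needs at least 2 levels")
    return math.log2(levels)


def stacked_eyes(levels: int) -> int:
    """Number of vertically stacked eye openings between adjacent levels."""
    if levels < 2:
        raise ValueError("PAM needs at least 2 levels")
    return levels - 1


if __name__ == "__main__":
    for name, levels in (("NRZ", 2), ("PAM4", 4), ("PAM-16", 16)):
        print(f"{name}: {raw_bits_per_symbol(levels):.0f} raw bits/symbol, "
              f"{stacked_eyes(levels)} eye(s)")
```

Four levels give three stacked eyes, which is the "triclops" being joked about, and sixteen levels give the 4 raw bits per symbol mentioned above.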

Speaker 1:

Man. I I I feel like it's like everything's a lie. Like, what's not a lie,

Speaker 4:

right?

Speaker 8:

It's lower frequency, though, so you have that going for you. Yeah. You know? Yes.

Speaker 2:

Even those eyes are a lie because what you're looking at is something that's already been equalized. So, really, what's going into the receiver? You don't even know what it looks like.

Speaker 4:

It doesn't

Speaker 6:

even look like an eye.

Speaker 4:

No. I'm just if you broke it, it looks like magic.

Speaker 5:

Oh, it

Speaker 3:

go this is garbage. Yeah. Absolutely.

Speaker 8:

We we couldn't look at that on a scope we had because it's so encoded. We we had had to rely on DSP and the slicer to basically tell us how good we did. That's, you know, it's a that instrumentation of this is, you know, all another part of the problem, which is

Speaker 1:

Yeah. Yep. So and alright. So before we get off the simulation piece because, Larry, I've got a question for you. When you got the input into the simulation, which I mean, consists of things that you're getting out of these vendors, they presumably, there's obviously materials consequences, I assume.

Speaker 1:

Are you getting and and how do you validate that that information? I mean, surely, I just based on the number of errors that I have found in data sheets, we have found in data sheets, truly that information is not always correct. How do does one validate that that information is correct?

Speaker 9:

Well, we start we start with the ground based physics. So what we do is we go from the bottom up. I should explain to the listeners what, you know, what we do. We have these remarkable electromagnetic field simulators, like this thing called the high frequency structure simulator, which does finite element. So you mentioned differential via structures.

Speaker 9:

Well, what happens when the signal approaches it? We look at a differential via. We input the geometry, and you bring it in from your layout. You specify material properties like copper traces and whatever material you're using for the printed circuit board, what dielectric, you know, permittivity it is, what loss tangent. And then from first principles, we solve all those wave equations.

Speaker 9:

And, again

Speaker 4:

Oh, wow.

Speaker 9:

We were talking earlier about, you know, if you were to do this analytically on pencil and paper, it would take you 6 pages of math and a PhD degree in electromagnetic field theory to do it. But we do it on the computer. And so we're solving for the field. If a signal approaches that differential via, it can bounce off it and go back. If it if the impedance is not correct or some of the energy can go forward with the direction you want it to go.

Speaker 9:

And we can, you know, take a look at how those happen. So that we can do that for an IC package, for the escape routing from the IC, IC package. We can do it for vias, connectors, the cables that you have, and then we cascade all of those together into a circuit model so we can look at frequency domain. As you mentioned before, what does it look like versus frequency, Low versus high? How much signal can get in?

Speaker 9:

What is the so called insertion loss versus frequency? And then we can also look at a TDR, a time domain reflectometry, so you can say, well, where are the discontinuities down the line? Right? TDR is really cool because you send in a pulse and you wait for the signal to come back. Whatever it bounces off of, you can determine what the impedance of that discontinuity is and how much energy is coming back.

Speaker 9:

And so TDR, we can do in in simulation also just as you can do it on the, in the lab.

Speaker 1:

Yeah. That's really neat. It actually has given me a flashback to our, my algorithms professor who taught the class in FFT. This is a computer science class, and and, and I remember, you know, a lot of math in an FFT, and, this guy was a pure, computer scientist. Adam, we may have had the same professor.

Speaker 1:

You may have

Speaker 2:

this is I know exactly.

Speaker 1:

This Phil Klein, god bless you, but you are a pure not an engineer at all, not overly pragmatic. And someone in the class asked, like, what is an FFT for anyway? And he's like, I don't know. I don't know that they're that useful. And I was talking to what and, of course, we went over to the engineering department, and they just about declared war on the computer science

Speaker 4:

department. I think

Speaker 1:

they were gonna, I think the engineers were actually gonna march on the computer science building and just torch it because it's like, no. An FFT is only the most important thing in humanity. All humanity rests on FFT.

Speaker 2:

FFT. No. No. I mean, his specialty is planar graphs, which is pretty far from FFTs, or

Speaker 1:

or it's understandable why he would not be that interested.

Speaker 4:

That's right.

Speaker 1:

And he was forced to teach us.

Speaker 3:

Yeah. He was not. So so they forced him to give back all the tech that relied on FFTs, and then he had to walk home?

Speaker 1:

That that's it. That's it. I I I believe, Arion, there were several similar kinds of suggestions that he needed to be as punishment, he needed to be deprived of all innovations that relied on an FFT. But, Larry, it sounds like when you're when you're sending that pulse back, part of the reason you can figure out where it's coming from is because of the frequencies you're getting back. Is that a is that correct?

Speaker 9:

Certainly, you can perform processing on it like that, but the TDR method is actually really old. It was used with phone lines, telephone lines, and cable TV. The guy with the cable TV truck that shows up at your house has a TDR device. And what they used to do is it would literally just send a voltage that goes from low to high quickly. And that's gonna basically, you know, sort of crack the whip and send that energy down the line. And however long it takes to get down the line and back, that round trip distance, you could divide by 2 and say, that's where the discontinuity in the line is, and they send a service technician down there to figure out where the break is

Speaker 4:

in the line. Because you know how fast the speed of light is in that line, so you can calculate the distance. So in the TDR, sometimes it'll say, like, I have a distance resolution of 5 millimeters, and that's based on how fine of a time you can resolve and the propagation velocity of light in that medium.
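The divide-by-two arithmetic can be sketched as follows. The effective dielectric constant and timing values are assumptions picked for round numbers, not figures from any board or instrument discussed here.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def propagation_velocity(eps_eff: float) -> float:
    """Signal velocity in a medium with the given effective dielectric constant."""
    if eps_eff <= 0:
        raise ValueError("eps_eff must be positive")
    return SPEED_OF_LIGHT_M_PER_S / math.sqrt(eps_eff)


def distance_to_discontinuity_m(round_trip_s: float, eps_eff: float) -> float:
    """One-way distance to a reflection, given the round-trip time of the echo."""
    # The pulse travels out and back, so only half the time is outbound travel.
    return propagation_velocity(eps_eff) * round_trip_s / 2.0


if __name__ == "__main__":
    eps_eff = 4.0  # assumed; gives roughly half the speed of light
    d = distance_to_discontinuity_m(1e-9, eps_eff)
    print(f"1 ns round trip -> discontinuity about {d * 100:.1f} cm away")
    # Round-trip time difference that corresponds to 5 mm of distance:
    t = 2.0 * 5e-3 / propagation_velocity(eps_eff)
    print(f"5 mm of distance is about {t * 1e12:.0f} ps of round-trip time")
```

The same function applied to the smallest time step an instrument can resolve gives its distance resolution, which is why a few-millimeter resolution implies resolving tens of picoseconds.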

Speaker 1:

That is so cool. Right. That of course, that makes

Speaker 4:

no sense.

Speaker 9:

Came about.

Speaker 4:

So you're like debate you're

Speaker 1:

the cable guy being like, there's somewhere around here. There's a discontinuity.

Speaker 4:

I got So

Speaker 2:

do I just start there and start counting 5 Mississippi until I get to the discontinuity?

Speaker 4:

Yeah. Just Yeah. Really fast. Five Mississippi in 5 nanoseconds.

Speaker 9:

And then send the guys with a shovel to dig at that location, and they dig up the cable and fix it. That's how they

Speaker 4:

use it. Well And they they still use that for, like, underground power lines. Do you wanna know where a fault is? You send a you send a thumper out there,

Speaker 9:

and you thump the power line,

Speaker 4:

and you measure how long it takes to come back.

Speaker 9:

So that's what we do with, you know, high speed interconnect also, but you you can get a lot more information. Right? You use a time domain measurement to to develop the understanding of what's down the line. And you can back calculate with simple mathematics. You can back calculate.

Speaker 9:

If you know the transmission line impedance and you see what voltages are coming back, you can calculate, well, what must be the impedance at that location. So, say, you send a signal down your differential transmission line in your printed circuit board and you hit that via. The amount of energy in the signal that comes back, you can use that to calculate what must be the impedance at that location.
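The back-calculation being described is the standard reflection-coefficient relation: with rho = V_reflected / V_incident on a line of impedance Z0, the impedance at the discontinuity is Z = Z0 * (1 + rho) / (1 - rho). A minimal sketch, assuming a lossless line and a single reflection (later discontinuities are partly masked by earlier ones, which this ignores); the voltages are invented values.

```python
def reflection_coefficient(v_incident: float, v_reflected: float) -> float:
    """Ratio of reflected to incident voltage at a discontinuity."""
    if v_incident == 0:
        raise ValueError("incident voltage must be non-zero")
    return v_reflected / v_incident


def impedance_from_reflection(z0_ohms: float, rho: float) -> float:
    """Impedance seen at the discontinuity, from the line impedance and rho."""
    if not -1.0 <= rho < 1.0:
        raise ValueError("rho must be in [-1, 1); rho = 1 is an open circuit")
    return z0_ohms * (1.0 + rho) / (1.0 - rho)


if __name__ == "__main__":
    z0 = 100.0  # differential line impedance in ohms
    for v_ref in (0.0, 0.02, -0.02):  # hypothetical echoes of a 0.2 V step
        rho = reflection_coefficient(0.2, v_ref)
        z = impedance_from_reflection(z0, rho)
        print(f"rho = {rho:+.2f} -> about {z:.1f} ohms")
```

A positive echo means the signal hit something higher in impedance than the line, a negative echo means lower, and no echo means a match.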

Speaker 1:

And and, Larry, are you doing that in simulation? That's happening once

Speaker 3:

it is.

Speaker 1:

Oh, that is really

Speaker 4:

cool. And then, so,

Speaker 1:

Tom, are you are you using that information to kinda, like, alright. We need to change the layout here.

Speaker 8:

Every day. Now what most mostly, it's the way way the way I use it are, like, a couple of different ways. When I'm doing a via structure, if I'm looking for matching, one way to do that is to look in return loss, and, another is to do it in the time domain. So you can look at the time domain and do a TDR on your via transition. And, you know, it's it's, it's a little tricky to do that because you've gotta make sure that you mesh your structure to a good enough frequency that your time domain edge can you know, so that that the data is well formed for the the speed at which you're going to ping it at.

Speaker 8:

And, you know, so there's like but the the tools help you with that. You can actually set up meshing based on what you plan the TDR with, which is really cool. I like that feature. And, yeah. So, basically, it's it's a really helpful diagnostic when you're trying to just kind of, like, learn a little bit more about your circuit, and you're trying to look for a way to tune it.

Speaker 1:

Yeah. Interesting. So I I was because, Tom, I was gonna ask you, like, how you are acting on the simulation data you're getting back. But and it sounds like you were doing that a lot. You were to

Speaker 8:

It's a very iterative process. We'll run through, I mean, and that's part of why, for instance, before we ran that 2 week simulation, we had spent a lot of time looking at little things and getting everything dialed in, you know, on quicker sims, and then you kinda build

Speaker 1:

up to it. Right. Okay. So then that would make sense. It's like, okay.

Speaker 1:

Now we're ready. We think that this thing is basically where we want it. We're not gonna use simulation data to iterate, but we wanna validate effectively with our kind of final simulation run and get an idea. And so based on that, I mean, I know we were optimistic going in to actually getting this thing physically in hand, but there must always be some apprehension about something that has been forgotten in the simulation, something we've neglected to account for and so on.

Speaker 8:

No. We're perfect, and we never make mistakes.

Speaker 1:

Nice. That's awesome.

Speaker 4:

I was scared as hell. Right.

Speaker 8:

Totally.

Speaker 1:

So then, Eric, we gotta tell the actual story of getting this backplane actually up, because we get our Rev B sleds, Gimlet Rev B sleds, and those of you who listened to our tales from the bring-up lab remember our tales of getting the Chelsio NIC brought up and the 499 ohm resistor.

Speaker 2:

And the gimlet sleds are that's our compute sled into which the NIC

Speaker 4:

is being plugged. Yeah.

Speaker 1:

Definitely an odyssey for a bunch of folks on that one, so go listen to that recounting there. But so we get the NIC up, and we're at 40 gigs, which is to say, I think, 10 gigabits per channel, but we can't get it to 28.

Speaker 4:

Yeah. It just bombs when it tries to go to 25 per lane, which is 100 gig. And, look, this is a tale of, like, data sheets and stuff screwing us up yet again.

Speaker 6:

Well, no. Because they were like, don't do this. And we said, okay. Because you said don't do this, we won't do it.

Speaker 4:

Well, it's not

Speaker 1:

alright. So we yeah. Go ahead. Tell the story, Eric.

Speaker 4:

It turns out there was a lot of jitter coming out of it, and we can measure that jitter using the very nice LabMaster scope we had from Teledyne. And we saw this just horrific jitter, like, way more than we expected. And jitter is essentially the difference between when we expect an edge to show up, you know, a transition, and when it actually shows up, and that's measured in, you know, seconds. And that's, you know, usually picoseconds if it's bad and femtoseconds if it's pretty good.
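That definition of jitter, expected edge time versus actual edge time, is usually called time interval error. A minimal sketch, assuming an ideal reference clock anchored at the first measured edge; the edge timestamps are invented, and a real scope recovers its reference clock from the data rather than being handed a period.

```python
import math


def time_interval_error(edge_times_s, period_s):
    """Actual minus ideal edge time for each edge, using the first edge as the anchor."""
    t0 = edge_times_s[0]
    errors = []
    for t in edge_times_s:
        # Nearest ideal clock edge to this measured edge.
        k = round((t - t0) / period_s)
        errors.append(t - (t0 + k * period_s))
    return errors


def rms(values):
    """Root-mean-square of a list of values."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def peak_to_peak(values):
    """Spread between the largest and smallest value."""
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical 1 GHz clock whose edges wander by a few picoseconds.
    edges = [0.0, 1.002e-9, 1.999e-9, 3.003e-9]
    tie = time_interval_error(edges, period_s=1e-9)
    print("TIE (ps):", [round(e * 1e12, 3) for e in tie])
    print(f"RMS jitter: {rms(tie) * 1e12:.2f} ps")
    print(f"Peak-to-peak jitter: {peak_to_peak(tie) * 1e12:.2f} ps")
```

Both summary numbers are times, which is why jitter gets quoted in picoseconds or femtoseconds.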

Speaker 7:

So femtoseconds. Femtoseconds. Yes. Yeah. My Yeah.

Speaker 7:

So Yeah. If you think this

Speaker 1:

is an expensive device to measure this, by the way, you're right.

Speaker 4:

It is. If you wanna mortgage your house, you could probably get a down payment.

Speaker 1:

Yeah. So the and so this is a Teledyne Teledyne LeCroy scope, right, if I remember correctly?

Speaker 4:

LabMaster, 36 gig LabMaster, I think.

Speaker 1:

And these are, like, half a million dollars plus, basically.

Speaker 4:

Oh, yeah. They're it's they're gorgeous. They're they're the cat's meow of high speed scopes.

Speaker 1:

And these are things that, I mean, I think we're still saving up our allowance to buy, but you rent or, in this case, I think Teledyne was very helpful in terms of lending us a unit, and we Yeah.

Speaker 3:

This was this was the first hit of crack they gave us, and now and now we're

Speaker 4:

hooked. I laughed exactly.

Speaker 1:

Well, they did a good job.

Speaker 3:

Oh, they did this deliberately.

Speaker 1:

But these things are super, super expensive. I mean, these things are Yeah. The this is the kind of equipment that you're looking for.

Speaker 4:

I mean, that's super expensive. You have to have to look at a signal like this in real time. And so we looked at this and saw a bunch of jitter. We're like, okay. What the hell did you ever get?

Speaker 4:

I mean, is it power? Because, of course, everybody goes to power first because, you know, of course, it's power. So we looked at power, and it's like, well, it's not really showing us anything that's really that bad. And it's like, okay. What the hell?

Speaker 4:

Let's look at the input clock, and we thought the input clock was pretty good. And we measured it, and it's absolutely that crazy then. Like, what the hell? Like, I thought we fixed this the last one. And so we're looking at the clock source.

Speaker 4:

So we're looking at the power of the clock source for the port point of the power gun. And it was looking fine. Like, there's some non idealities which we're improving on the next one. And I'm like, okay. Yeah.

Speaker 4:

There's, like, no, you know, smoking gun here. What the hell is going on? And, like, something was always bothering me in the back of my head. Why in the hell doesn't this thing have a 100 ohm diff term? And a lot of chips have a 100 ohm diff term internally for something like a clock input or even, like, a high speed input.

Speaker 4:

A lot of them just have 100 ohm terminations as part of their receiver structure. But this one, the T6, if you put it into LVDS mode, it did not. What it did have is 4 k bias resistors, which are also needed for LVDS, to power and ground. It did not have a 100 ohm diff term. And, like, it's not particularly clear in the datasheet, like, when you put it in this mode, here's what the input structure looks like.

Speaker 4:

Because, of course, they never tell you that. I think it's probably IP from some other company and whatever. So it's not spelled out. And it's like, well, alright. Screw it.

Speaker 4:

I don't know what the hell it is. Alright. So I'm just gonna try putting a 100 ohm diff term on there. I put a 100 ohm diff term on there, and the clock, well, still didn't look great, but it looked a lot better from a jitter standpoint. I'm like, okay.

Speaker 4:

Let's see. Let's call break up. I'm like, great. You wanna try firing this thing up? So he fires up.

Speaker 4:

He's like, it just linked. Like, what? It trained. I'm like, shit. That's right.

Speaker 4:

That

Speaker 1:

so so the I mean, we the 499 ohm resistor that was the difference between life and death has now been trumped by the 100 ohm resistor that is the difference between life and death.

Speaker 4:

And the thing is, like, if you look at the dev board, their NIC, it does not have a 100 ohm diff term on its LVDS clock. And my hypothesis for this, and this gets into transmission lines again, is the clock source on their NIC, on their physical, you know, like, PCIe x16 NIC, is about a half an inch from their chip. Not even. Maybe, like, a centimeter from their chip. And that, I think, is close enough that the edge rate of their source clock is not fast enough for the trace to be electrically long.

Speaker 4:

Right. So even though they don't have termination, they never actually see the reflection because the they don't actually have enough length to get a reflection developing. That is a poor explanation. But, basically, something short enough and, you know, electrically short like, if you put 2 completely mismatched things right next to each other, like, not all the energy will get there, but the signal will still look okay ish. And

Speaker 1:

And so to be clear, this is on the clock. Correct?

Speaker 4:

Yes. It's using the clock input to the Right. Ethernet NIC chip.

Speaker 1:

Right. So this is what and so the clock and so I I don't know. I'm not sure if you got you got what the actual issue was. The the clock was was missing a terminating resistor. So it it which I kind of view as a kind of an off ramp.

Speaker 1:

I mean, this is a bad way of of of thinking about it, but an off ramp for the signal as opposed to just kind of bonking in, and so we were getting effectively reflection on the line.
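The "electrically short" hypothesis can be put into rough numbers with a common rule of thumb: a trace starts behaving like a transmission line, and so needs termination, once its one-way delay exceeds some fraction of the signal's rise time. The one-sixth fraction, the rise time, the dielectric constant, and the longer trace length below are all assumptions for illustration; only the roughly one centimeter distance comes from the discussion.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def one_way_delay_s(length_m: float, eps_eff: float) -> float:
    """Propagation delay along a trace of the given length."""
    return length_m * math.sqrt(eps_eff) / SPEED_OF_LIGHT_M_PER_S


def is_electrically_long(length_m: float, rise_time_s: float,
                         eps_eff: float, fraction: float = 1.0 / 6.0) -> bool:
    """Rule-of-thumb check: delay greater than a fraction of the rise time."""
    return one_way_delay_s(length_m, eps_eff) > rise_time_s * fraction


if __name__ == "__main__":
    rise_time_s = 500e-12  # assumed clock edge rate
    eps_eff = 4.0          # assumed board material
    for label, length_m in (("1 cm trace", 0.010), ("10 cm trace", 0.100)):
        delay_ps = one_way_delay_s(length_m, eps_eff) * 1e12
        verdict = "long" if is_electrically_long(length_m, rise_time_s, eps_eff) else "short"
        print(f"{label}: {delay_ps:.0f} ps delay -> electrically {verdict}")
```

With these assumed values, the centimeter-scale route falls under the threshold while the longer route does not, which is consistent with the same unterminated clock input working on one board and reflecting on another.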

Speaker 2:

And getting jitter. And the spec sheet said specifically not to put a resistor there.

Speaker 3:

They said

Speaker 4:

They said, oh, no. You don't need that.

Speaker 6:

Yeah. So, I mean, this is

Speaker 9:

one of the things where it's like Oh,

Speaker 4:

well like production. You don't use that because it's so short. Well,

Speaker 1:

and this is where you get to something that I feel we've learned again and again and again. When the when you don't necessarily build from first principles and you have a reference design, you don't really know what's working by accident there. And in this case, like, they were speaking kind of their truth, which is, hey. In this in the designs that we have built where the clock is super close to the part, we haven't needed it. What they don't know is, like, well, actually, you may have needed it.

Speaker 1:

You just were kinda getting away with it. It was kinda working by accident.

Speaker 3:

Well, not just that. If they actually added it on their NIC, their input clock might have less jitter, meaning their PLL does better. Their SerDes will do better. And, therefore, actually, you can extend your cable a little bit more because you're eating less into the budget of your bit periods.

Speaker 1:

And so And so I so I feel like this this is kind of a punchline we gotta work out too. And then, Andrew, I wanna get you in here as well because I know you've been looking at some of the the the the the scope output that we've been able to get you. The so you yeah. What what did it look like once we got the we we we've done all the simulate we've done the paperwork to to roughly verify it. We've done the simulation, the iteration with the simulation on the layout, the big simulation multi week to verify that it's gonna be we think this is plausible.

Speaker 1:

We find the missing 100 ohm resistor on the clock. We get that resolved. And what does it look like end to end?

Speaker 4:

It's gorgeous. Like, it's still better from Tofino. Like, that chip is still putting out gorgeous, gorgeous signals. But even, you know, even from the T6 going out with our, you know, some non-ideality, it's like, you don't expect to see an eye, honestly, looking at these things, because the signal is usually so degraded and so crappy that the only prayer you have of seeing anything reasonable is by enabling every equalization known to man on a receiver. And

Speaker 1:

Sorry to ask. I what what is equalization? I'm sorry.

Speaker 4:

Equalization is basically a way, there's a couple of mathematical things that they can do, to boost the high frequencies, essentially, in a very poor man's description. Like they're saying, the channel is a low pass filter. So you send a nice high frequency signal out through your channel, which acts as a low pass filter, and you get a much more attenuated signal coming out at the receiver. Well, the receiver knows that there's generally some, you know, characteristic of that, and there's mathematical ways of boosting that lost high frequency information.

Speaker 4:

Got it. And that's one of the methods. There's DFE, decision feedback equalization, or FFE, feed forward equalization. I think that might be all kind of better. But that's about the limit of my expertise.

Speaker 3:

You you basically try to amplify specific frequencies again that you do care about that are in relation to your fundamental frequency in the hopes that you can buy buy basically artificially, making these these values, like, bigger by amplifying this, you can recover more of the original signal.
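A toy version of feed forward equalization shows the idea of boosting what the channel took away. The two-tap "channel" below is an invented stand-in for a low pass interconnect, and the three equalizer taps are simply the first terms of its inverse; neither represents any real SerDes or scope setting.

```python
def convolve(a, b):
    """Discrete convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def gain_at_dc(taps):
    """Filter gain for a constant (zero frequency) input."""
    return sum(taps)


def gain_at_nyquist(taps):
    """Filter gain for an input alternating +1, -1 every sample."""
    return abs(sum(t * (-1) ** k for k, t in enumerate(taps)))


if __name__ == "__main__":
    channel = [1.0, 0.5]          # each symbol smears half its value into the next
    ffe_taps = [1.0, -0.5, 0.25]  # truncated inverse of the toy channel
    print("channel  DC/Nyquist gain:", gain_at_dc(channel), gain_at_nyquist(channel))
    print("FFE      DC/Nyquist gain:", gain_at_dc(ffe_taps), gain_at_nyquist(ffe_taps))
    print("combined response:", convolve(channel, ffe_taps))
```

The channel passes low frequencies more strongly than high ones, the equalizer does the opposite, and the combined response is close to a single clean pulse with only a small leftover tail.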

Speaker 4:

Yeah. And part of that really fancy scope is having things like CTLE, DFE, etcetera, that can do those kinds of equalizations, right, like the chip does. Because you can't actually probe, well, it's impractical to probe on the die. Right.

Speaker 4:

So you have to probe somewhere else and then figure out what it looks like at the die based on that measurement. And,

Speaker 1:

Eric, maybe now is a good time to mention the, the probe station that you that you built,

Speaker 7:

which I think is pretty neat.

Speaker 4:

Sure. Yeah. The probe station is just a collection of off the shelf parts from either Thorlabs or Misumi or McMaster-Carr. And its entire purpose in life is to hold microwave probes, which are basically pieces of coax that are very, very small, that have very pointy tips that are gold plated, that touch down on a circuit board to try and measure the channel. And they're certainly delicate.

Speaker 4:

They're very, very sensitive to movement and vibration and things because they are very, very delicate. And they're basically designed for probing wafers that are perfectly flat and, you know, perfect. Pretty bright. But we're, you know, using them to probe this, like, horrific landscape of a PCB. Right.

Speaker 4:

And so we need some sort of mechanism to hold the board steady, support it, and then hold the probes steady and get them in whatever position you need them to interface with your BGA pattern or your connector pattern or whatever it is, and hold it very still for extended periods of time while the system does its thing and does all its measurements. Right. And so

Speaker 1:

And this is the actual, literal probe effect that we are... I mean, right. Not distorting the system while we probe it is really, really tough.

Speaker 4:

And also getting a good, like, connection between the probe and the system is really damn difficult. And so, you know, there's all sorts of stuff you have to deal with there. Basically, the systems that we looked at that were commercial were either... Expensive or expensive? Well, expensive or expensive. And even if they were really expensive, they were limited to, like, one board at a time, and we needed 2.

Speaker 1:

Right.

Speaker 4:

And so I'm like, screw this. This is stupid. Like, somebody should just make one of these things. So I just made one. And at my previous job, I was on the verge of making one anyway, so I kinda had a little plan of my own.

Speaker 4:

And there's so many things to improve upon in it, and it's, you know, not great, not perfect. But, eventually, when I get a free moment, which is in short supply lately, I will, you know, publish the design and open source it, and people can do with it what they will. There's some feedback that Tom has given and such to make it better, but I haven't been able to get to it quite yet either. And well, and just like

Speaker 1:

I think this is a very common Oxide theme, that "I was on the verge of making one in my previous job." I think it's, like, true for many different kinds of "one" and many different kinds of jobs. But, Ari, do you think that that's kind of a common theme across the company, that everyone kinda came here with a chip on their shoulder about the kinds of things that they are either sick of paying too much for or wanna do on their own?

Speaker 3:

That is definitely a theme. I sort of was guilty of that too.

Speaker 1:

Oh, no. Oh, and I mean, I'm not speaking euphemistically. I mean, this is true of, you know... we've done our own operating system. We've done our own, like, lots of things.

Speaker 2:

I think, almost to a person

Speaker 5:

in in an office with a host of stuff like

Speaker 2:

this where it's like tooling, where at no other company and in no other position could I make the justification of, like, we'll spend too much and get the wrong thing. And this is a great example where we're now building this thing that is totally fit to purpose. And in places where it's not, you know, it's our design, and we can fix it.

Speaker 1:

Yeah. I gotta say, if I had one piece of technology leadership advice: if anyone on your team wants to build their own tooling, you should always let them. Because in general, the people who want to build their own tooling have already thought through the problem well enough that it is almost certainly not ill advised, just because they're asking the question. I mean, we know the things that we're not gonna build on our own. Right?

Speaker 1:

We know the things that, you know, if it does exist, there are things we're not gonna build on our own. Not yet. Not yet. Exactly. Larry, don't worry.

Speaker 1:

Not simulation. With the probe station, we are able to actually because the thing I love about the probe station is it is strictly mechanical. There's no kind of a fancy robot arm here. This is like

Speaker 4:

No. That can't cost too much.

Speaker 1:

Right. No. Right.

Speaker 4:

It's a mechanical system, and there's a little, you know, microscope that plugs into a TV that I use, and you can lower it down and use knobs on three-axis things on it. There's one feature on it that is, like, the one unique feature of the entire system, and that's the fact that you can adjust the planarity of the probe. Basically, you have these 4 very sharp, delicate points that have to touch down at the exact same time. And inevitably, nothing is in the same plane. So you have to adjust those 4 points to be at the right angle to touch down on your board all at the same time.

Speaker 4:

And the method I use adjusts that planarity, that angular mismatch, without actually moving the tip, which is not very common. And I haven't been able to find it in any other probe station that I've looked at. But that's the one salient feature, which is

Speaker 1:

Yeah. Well, that is cool. And I'm looking forward to... I mean, it's gonna be fun to get your plans out there and to open source it or whatever. It'll let people go wild. So we're gathering the data, and it's looking good.

Speaker 1:

And, Andrew, I know we were feeding a bunch of this data that was coming off the Teledyne LeCroy scope to you for some of the stuff that you've done on your glscopeclient. Do you wanna describe some of that work? Because it's really interesting stuff.

Speaker 7:

Yes. So, speaking of letting your engineers build tools, that's a policy I firmly subscribe to as well. So, for those of you who aren't familiar with glscopeclient, it's an open source application that is essentially a user interface and front end for oscilloscopes, and it's now starting to branch out to more kinds of instruments, like multimeters, network analyzers, and so on. But for the most part, oscilloscopes. And it consists of an object-oriented library, libscopehal, that provides an API to oscilloscopes, and then glscopeclient itself is the UI in front of it.

Speaker 7:

So I can connect to a Rigol. I can connect to a LeCroy. I can connect to a Siglent and have the same user interface in front of it, the same C++ API in front of it. I can load waveform data from a file coming from any of these scopes. And so, that's how Eric and company were sending me the data from the WaveMaster: they were taking the files off of the scope, exporting them to LeCroy's waveform format, sending them over to me, and then I would load the .trc file and process that waveform in exactly the same user interface as if I was sitting in front of his scope.

Speaker 7:

That is so

Speaker 1:

So, alright. Let me just say this back to you, because I think it's pretty amazing. You have implemented effectively the front end on this thing, so you can take data from our scope and pull it up on your own virtual scope and manipulate it. Is that a fair

Speaker 4:

summary?

Speaker 7:

I can acquire data from any source node. So it's built around a filter graph architecture, if you're used to GNU Radio or any kind of, like, audio processing pipelines and stuff like that. So the source node can be a physical instrument, or it can be a block that generates a simulated waveform, for example. So, a very common use case is to create a synthetic step waveform and then import channel response data as S-parameters, and then you can apply channel emulation to that in order to see what either the forward time-domain transmission response or the reflected time-domain reflectometer response would look like given channel data from, say, a Touchstone file.

Speaker 7:

But you could also do the same thing with, say, live data coming off of the VNA. The system just cares that the input to this filter block is a magnitude channel and an angle channel, and it doesn't care whether that channel came from a file, or from a VNA, or whether you generated it from some arbitrary simulation. It's just data flowing into blocks and flowing out of blocks.

Speaker 1:

That is really cool. And I gather... are these formats always open? I mean, it feels like there would be some temptation for bad vendor behavior in here that you must have sidestepped.

Speaker 7:

So vendors in general have been incentivized to provide APIs for equipment because of production test automation and things like that. You know, so many people buy scopes to tool up production lines in factories and test some product, and then you just need to go trigger on the same event over and over again. You know, once every 5 seconds, the new thing comes down the assembly line, goes onto a bunch of pogo pins, you capture a bunch of waveforms, you make sure it meets spec, it comes off, you do the next one. You have to be able to script that. The problem is the previous state of the art had been using ASCII text commands that were all instrument specific in order to be able to

Speaker 4:

see the

Speaker 1:

right. So

Speaker 7:

libscopehal does away with all of that. It is still using that same API under the hood to connect to the instrument, but it presents a unified, vendor-agnostic C++ API to you as the end user. So you can write code against an abstract scope that has 4 channels and a sample rate of at least 20 gigasamples per second and a bandwidth of at least 2 gigahertz and supports a positive edge trigger. You can write your test case against that minimum set of requirements, and then any oscilloscope object that you are given will work with it, and you don't care what it is.
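
The pattern of coding against a minimum set of capabilities rather than a specific instrument can be sketched as follows. This is an illustration only: the real library is C++, and every class, method, and parameter name below is hypothetical and not taken from its API. The numeric thresholds are the ones quoted in the discussion.

```python
from abc import ABC, abstractmethod


class Oscilloscope(ABC):
    """Hypothetical vendor-agnostic scope interface (not the real library's API)."""

    @abstractmethod
    def channel_count(self) -> int: ...

    @abstractmethod
    def max_sample_rate_hz(self) -> float: ...

    @abstractmethod
    def bandwidth_hz(self) -> float: ...

    @abstractmethod
    def supports_rising_edge_trigger(self) -> bool: ...


class FakeScope(Oscilloscope):
    """Stand-in driver; a real one would talk to the instrument over its own protocol."""

    def __init__(self, channels, sample_rate_hz, bandwidth_hz, edge_trigger=True):
        self._channels = channels
        self._rate = sample_rate_hz
        self._bandwidth = bandwidth_hz
        self._edge = edge_trigger

    def channel_count(self):
        return self._channels

    def max_sample_rate_hz(self):
        return self._rate

    def bandwidth_hz(self):
        return self._bandwidth

    def supports_rising_edge_trigger(self):
        return self._edge


def meets_requirements(scope, min_channels=4, min_rate_hz=20e9, min_bandwidth_hz=2e9):
    """A test case states what it needs; any scope that satisfies it is acceptable."""
    return (
        scope.channel_count() >= min_channels
        and scope.max_sample_rate_hz() >= min_rate_hz
        and scope.bandwidth_hz() >= min_bandwidth_hz
        and scope.supports_rising_edge_trigger()
    )


big_scope = FakeScope(channels=4, sample_rate_hz=40e9, bandwidth_hz=16e9)
small_scope = FakeScope(channels=2, sample_rate_hz=1e9, bandwidth_hz=100e6)
```

The test code only ever touches the abstract interface, so swapping the instrument means swapping the driver object and nothing else.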

Speaker 1:

That's very cool. That is very cool. Yes. So but you can give it to me once.

Speaker 4:

So if

Speaker 7:

you can connect to several scopes, and it's got a feature that allows you to probe a common test point with 1 channel from scope A and 1 channel from scope B, and then look at the delta between them. It'll calibrate out the phase shift in your trigger cable. So now you can view, say, 8 channels, 4 from one scope and 4 from another, all in 1 user interface.

Speaker 2:

And then, sir, could you describe some

Speaker 1:

of the data that Arian and Eric got to you? Because I I was really I I I loved what

Speaker 7:

you were doing with it,

Speaker 1:

but I didn't understand it. Good.

Speaker 7:

So, again, we're talking about the filter graph. The entire architecture of the application is built around this. You take a series of data sources, you apply a transformation to them, and you visualize the output of that transformation or use it as the input to other blocks, and you don't necessarily have to be seeing plots of every intermediate result, though you can if you want. And so, in this case, the main dataset that I was looking at a lot was a QSGMII link, so quad serial gigabit media-independent interface. It is 4 lanes of up to gigabit that are time-shared on a single physical bus.

Speaker 7:

So the individual lanes are running at 1.25 gigabits per second, and it's pretty much just round robin: send one 10-bit symbol from lane 0, then send one symbol from lane 1, send one symbol from lane 2, and they just all take turns. And then there's a few little tweaks to the encoding because, you know, you need to know which of those 4 lanes corresponds to which physical port and so on. So they change the idle character in lane 0 in order to let you know this is lane 0, so you can keep them in sync.

Speaker 1:

And, Matt, just because folks may be confused about where this particular network was coming from: that was using this high-speed scope that we used for the high-speed backplane, but that's on the management network. Right, Matt?

Speaker 2:

Yes. This is actually taking a couple steps down in speed. So this is a 5 gig link, versus the 28 gig link that's the main backplane. This is basically between all of the servers, all of the service processors.

Speaker 7:

Yeah. So what's actually going on is there are four 100 megabit links, I believe, at the far end, but they are multiplied up to 1 gigabit because the link actually has a minimum data rate it can run at. And so the way that SGMII works is: if you have data going at 1 gig, you send each byte in sequence. If you have data going at 100 meg, you send each byte 10 times. If you have data going at 10 meg, you send each byte 100 times.
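
The byte-repetition scheme described here is simple enough to sketch directly. This is a minimal illustration of the repetition only, operating on raw payload bytes; it ignores line coding, framing, and auto-negotiation, and the function names are hypothetical.

```python
# Repetition factor per MAC speed, so the serial link always runs at the 1 gig rate.
REPEAT = {1000: 1, 100: 10, 10: 100}


def sgmii_replicate(payload: bytes, speed_mbps: int) -> bytes:
    """Send each byte REPEAT[speed] times in a row."""
    factor = REPEAT[speed_mbps]
    return bytes(b for b in payload for _ in range(factor))


def sgmii_collapse(stream: bytes, speed_mbps: int) -> bytes:
    """Receiver side: keep one byte out of every REPEAT[speed]."""
    return stream[:: REPEAT[speed_mbps]]


frame = b"\x55\xd5\x01\x02"
on_wire = sgmii_replicate(frame, 100)   # 4 bytes become 40 on the link
recovered = sgmii_collapse(on_wire, 100)
```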

Speaker 7:

And so you always end up having the same data rate over the link, just because some of this hardware doesn't like running too slow. Anyway, so once you have these streams of bytes coming from each of these four lanes, again, you just interleave them, send them 4 times as fast, and the bytes take turns and so on. Anyway, so the nice thing is that the signaling within each of these lanes is again SGMII, which is essentially regular gigabit Ethernet, 1000BASE-KX or 1000BASE-SX and so on, same line code, other than the support for 10/100. And so we had an existing filter block in glscopeclient that recovers the clock from the link, and another one that takes the analog waveform and thresholds it to give you a digital waveform. Then you feed the recovered clock and the digital data into another filter block that decodes 8b10b, and that gives you a sequence of control characters or data bytes.

Speaker 7:

And then you can feed that into the new block that I wrote to work with this data, which just takes in a stream of 8b10b symbols and outputs 4 additional streams of 8b10b symbols at one quarter of the rate. So it determines which lane is lane 0, because that one has the K28.1 instead of K28.5 as the idle character. And then, once it's recovered the sync and it knows which lane is lane 0, it just takes all the incoming data and, again, round-robins it. Okay.

Speaker 7:

You're lane 0. You go out to this port. You're lane 1. You go out to this port. And it creates 4 streams of data at one quarter of the rate.
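
The round-robin interleave and the lane-0 marker trick can be sketched as below. This is a simplified model, not the actual filter block: already-decoded 8b10b symbols are represented as plain strings such as `"K28.5"`, idle is reduced to a single comma character, and the function names are hypothetical. The only behavior taken from the discussion is that lane 0 carries K28.1 where the other lanes carry K28.5, and that symbols are dealt out one per lane in turn.

```python
def interleave(lanes):
    """Mux 4 equal-length symbol streams round robin, marking lane 0's idles as K28.1."""
    out = []
    for group in zip(*lanes):
        for lane, symbol in enumerate(group):
            if lane == 0 and symbol == "K28.5":
                symbol = "K28.1"
            out.append(symbol)
    return out


def deinterleave(stream):
    """Find lane 0 from the first K28.1, then deal symbols out to 4 lanes in turn."""
    phase = next(i % 4 for i, s in enumerate(stream) if s == "K28.1")
    lanes = [[], [], [], []]
    # Symbols before `phase` belong to a partial group and are dropped.
    for i, symbol in enumerate(stream[phase:]):
        lane = i % 4
        if lane == 0 and symbol == "K28.1":
            symbol = "K28.5"   # restore the ordinary idle character
        lanes[lane].append(symbol)
    return lanes


ports = [
    ["K28.5", "D1.0", "K28.5"],
    ["K28.5", "D2.0", "D2.1"],
    ["D3.0", "K28.5", "D3.1"],
    ["D4.0", "D4.1", "K28.5"],
]
bus = interleave(ports)
captured = bus[2:]                 # a capture that starts mid-group
recovered = deinterleave(captured)
```

Because the marker only ever appears on lane 0, its position modulo 4 is enough to recover the lane alignment even when the capture starts at an arbitrary symbol.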

Speaker 7:

Then you can take those data streams and feed them to individual protocol decoders, for SGMII, for example. And that will then decode up to Ethernet frames. You can then apply another decode on the end of that that decodes, say, IPv4 headers, or you can add a sink node that outputs to a PCAP file and look at it in Wireshark.

Speaker 4:

So you can cascade all these different all these different filter blocks independently in any order you want depending on what kind of application you have.

Speaker 7:

Exactly. But they are strongly typed. And this does confuse some new users, and it's something that we'd like to work on as far as making the GUI kind of infer type conversions a little bit. So for example, you wouldn't be able to apply an RS-232 protocol decode to an analog signal, because RS-232 is a digital protocol that expects a digital input. And so you first have to apply a threshold filter to convert your analog NRZ signal into a digital signal.

Speaker 7:

And once you have a waveform of type bool rather than of type float, you can then apply the RS-232 decode to that and so on. But yes. So as long as your input is legal for the data type that the filter block expects, you can cascade them arbitrarily.
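
A typed filter chain of this kind can be sketched as follows. This is a toy model with hypothetical class names, not the application's real C++ classes: each filter declares the waveform kind it accepts and refuses anything else. The `BitPacker` stage is only a stand-in for a protocol decoder (it packs 8 bits into a byte, MSB first) and does not implement RS-232 framing.

```python
from dataclasses import dataclass


@dataclass
class Waveform:
    kind: str       # "analog" (float samples), "digital" (bool samples), or "bytes"
    samples: list


class Filter:
    input_kind = None

    def __call__(self, waveform):
        # The type check: a filter only accepts the waveform kind it declares.
        if waveform.kind != self.input_kind:
            raise TypeError(
                f"{type(self).__name__} expects {self.input_kind}, got {waveform.kind}"
            )
        return self.process(waveform)


class Threshold(Filter):
    input_kind = "analog"

    def __init__(self, level):
        self.level = level

    def process(self, waveform):
        return Waveform("digital", [s > self.level for s in waveform.samples])


class BitPacker(Filter):
    """Stand-in for a digital protocol decoder: packs every 8 bits into one byte."""
    input_kind = "digital"

    def process(self, waveform):
        bits = waveform.samples
        out = []
        for i in range(0, len(bits) - 7, 8):
            value = 0
            for bit in bits[i:i + 8]:
                value = (value << 1) | int(bit)
            out.append(value)
        return Waveform("bytes", out)


analog = Waveform("analog", [0.1, 3.2, 0.0, 3.3, 0.2, 3.1, 0.1, 3.3])
decoded = BitPacker()(Threshold(1.65)(analog))   # threshold first, then decode

try:
    BitPacker()(analog)                          # skipping the threshold is rejected
    rejected = False
except TypeError:
    rejected = True
```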

Speaker 4:

You can basically make an arbitrary decoder for any kind of data you want, based on the data that you have. Like, okay, I have some arbitrary 8b10b thing. So I'm gonna put a CDR block in there, and then I'm gonna put in, you know, thresholding, all those kinds of blocks in series, and I basically have a pretty good start on a, you know, custom protocol decode. Exactly. And it depends

Speaker 7:

on... the underlying libraries are all open source. There is a plug-in model. So if you make your own decoder that you don't want to release, it is completely possible to make a binary blob protocol decode that fits into this API. Now right now, since this is still kind of a work in progress, there's no binary stability. So releasing a blob would be ill advised, because 5 commits later it probably won't run anymore.

Speaker 7:

But architecturally, it is possible. Anyway, so you can write your decode as a plug-in or compile it into the main code base, you know, submit a pull request for it upstream and so on. So let's just say, hypothetically, you could make a decode for, say, gigabit Ethernet, which we have already, but say you're writing a new one. You could write a decode that would take 8b10b objects and spit out Ethernet frame objects, which is what we did here, actually. So I'd written the decode for SGMII, and then since that outputs Ethernet frame objects, I could take the output of that block and feed it into, say, the IPv4 decode that I already had. Yeah.

Speaker 7:

Because as long as the data type matches, the next stage doesn't care that this is a filter I just wrote. It just sees Ethernet frames, and it's like, oh, Ethernet frames. I know

Speaker 1:

what to do with those. Yeah. That is really cool. And I think Matt has got a great blog entry from... Matt, when was that? 2 weeks ago?

Speaker 1:

A week ago? I can't even remember all the time just pulling together.

Speaker 2:

Yeah. It looks like it was 2

Speaker 1:

weeks ago or something like that. But a great blog entry on going from the scope to actual Wireshark output. Yeah. Larry, I'm not sure if you realize this, but we're not actually just building 1 switch. We're building, well, at least 2, arguably 3.

Speaker 1:

We've got... our main switch has got a lower speed switch on it for the management network. So we were actually doing a bring-up of 2 switches, not just one. So it's been exciting times around here. So, Eric, I wanna get back to one of the things you were saying about, you know, you get this whole thing up, and it looks pretty glorious. And, Larry, in particular, a question for you.

Speaker 1:

How often do you kinda go back to, like, okay, we've simulated this, let's go to the actual built artifact, and how did it work out?

Speaker 9:

Well, that's interesting. I mean, well, maybe my colleague, Robert, who's also on the line, could tell you a few more war stories out of the trenches of doing these things himself at Microsoft. But, yeah. Usually, as I'm hearing on this call, you keep working at it until you make it

Speaker 6:

work. Yeah.

Speaker 9:

You know, there's a lot of that that happens and and things come up. Right? Oh, I didn't terminate a particular, transmission line. I need to add a different resistor. You know, and the vendor didn't tell me to do that.

Speaker 9:

You know, there's a lot of physics going on, and they're not gonna cover all those cases. And that's why simulation comes around and is useful. Going back and doing the sort of postmortem, I suppose, is what you're asking. I think that, you know, a lot of our long-time customers that have been doing this, they've come to rely on the software, and you get good at it if you're doing it all the time. And then, you know, you get that faith in what the software can do and what you're doing.

Speaker 9:

The software will tell you exactly the answer, precisely, for what you sent to it. But whether or not that matches the network that you're designing in reality is another story.

Speaker 4:

If you send it a pile of crap, it'll give you an answer based on a pile of crap.

Speaker 9:

And it'll be right.

Speaker 4:

Right. Exactly. It'll be right.

Speaker 5:

But you spend it And that's why a lot of people spend a lot

Speaker 1:

Yeah. Go ahead, Robert.

Speaker 8:

I was just gonna say that's why

Speaker 5:

a lot of people do spend a lot of time doing simulation-measurement correlation, because a lot of times, you might need to tune some of the parameters of your simulation model to get it to match what you're seeing in reality. Right. You know, you might get a certain dielectric constant from a vendor, but that might not actually be the value you need to use in the simulation to get everything to nicely align.

Speaker 4:

Yeah. Or you're getting weave effect, which you didn't necessarily model because that requires a whole hell of a lot more simulation time. Yeah. And even being like,

Speaker 5:

if you have surface roughness, how do you account for that? Are you accounting for too much of it or not enough?

Speaker 8:

Yeah. I was just saying, different factors. Surface roughness is, like, one of the most poorly specified things from the vendor, and it really requires everybody to characterize it for themselves, pretty much

Speaker 4:

as far

Speaker 8:

as I know.

Speaker 2:

Okay. So surface roughness. Can you I mean, this feels like it we've been in

Speaker 4:

the grasp of my copper optics.

Speaker 8:

Yeah. The issue is that, for one, there's a thing called skin effect, which is that the higher and higher the frequency you're transmitting across copper, the closer and closer to the surface the electrons skate. And at the frequencies we're talking about, the skin depth is microns. Microns deep on the surface of the copper. So that means the resistive loss that you're calculating across the entire transmission line depends on what that surface looks like.
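
For a sense of scale, the standard skin-depth formula for a good conductor is δ = sqrt(ρ / (π f μ)). The sketch below evaluates it for smooth copper using the usual textbook resistivity. The 14 GHz figure is an assumption, chosen here as roughly the Nyquist frequency of a link running near 28 gigabaud with NRZ signaling; it is not a number given in the discussion.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space, H/m
RHO_COPPER = 1.68e-8    # resistivity of copper at room temperature, ohm*m


def skin_depth_m(freq_hz, resistivity=RHO_COPPER, mu_r=1.0):
    """Skin depth in meters: sqrt(rho / (pi * f * mu_0 * mu_r))."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0 * mu_r))


depth_1ghz_um = skin_depth_m(1e9) * 1e6     # about 2 micrometers
depth_14ghz_um = skin_depth_m(14e9) * 1e6   # about half a micrometer
```

At these depths the current is flowing in a layer comparable to, or thinner than, the roughness features deliberately put on copper foil, which is why the surface profile shows up directly in the resistive loss.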

Speaker 8:

It's not a nice, polished, perfect piece of copper. It's actually intentionally rough on one side in order to, improve adhesion to the epoxy layer.

Speaker 1:

And so man. Right.

Speaker 8:

So they make it rough on purpose on one side. They polish it on the other. So that is also why you want to create your stackup in a way that more closely biases toward the polished side, so that most of your losses are sitting there. It's all these details that you gotta dig into and understand in how you're constructing the whole shebang. You know?

Speaker 4:

Yeah. And, like, your idealized notion of a trace being a perfect rectangle that sits above and below an infinite ground plane is also, of course, bogus. Oh, yeah.

Speaker 8:

Your trace Trapezoidal effects yeah. Exactly.

Speaker 4:

Or if it's on the surface, it's a trapezoid with maybe an inverted trapezoid. I'm not glad.

Speaker 8:

Plating. Exactly. Yeah. There's

Speaker 4:

a lot. And then you get, you know, your skin effect, and then you get the fact that we use ENIG. Right? And the nickel, it's actually pretty crappy for RF.

Speaker 8:

And it's magnetic, so it has... Yeah. More problems, which is why we don't use microstrip or external layers. Right. On any of these high-speed traces.

Speaker 7:

Nickel has a really shallow skin depth compared to copper. And so not only is... well, it's not even pure nickel. It's a nickel-phosphorus alloy. They have to add phosphorus to it so it'll plate better, and I think it affects the mechanical properties a little bit too. And so you've got your nickel-phosphorus plating on top of your copper.

Speaker 7:

So now it's got a shallower skin depth, so your signal is traveling in less material than it would if it was pure copper, and the resistivity is about 4 times higher than copper.

Speaker 8:

Which is why we do microstrip only. Oh, sorry. Stripline only. Sorry. Other way around.

Speaker 1:

So, alright. Where do we use the kind of phosphorus-doped nickel? Wait. Where would that be?

Speaker 4:

Everywhere you see gold on our external layers. That layer of gold is angstroms thick, and what's underneath, in between the gold and the copper, is actually a nickel-phosphorus alloy, like Andrew is saying.

Speaker 9:

Damn it.

Speaker 4:

It's ENIG, electroless nickel immersion gold. So, like, electroless nickel is a type of plating method for putting nickel onto copper. And then immersion gold is basically this gold flash plating that makes the solderability a lot better, but there's not so much gold on there that it's expensive, and you don't get gold embrittlement of your solder joints. Too much gold is also bad.

Speaker 1:

Computers are really complicated. How does any of this stuff ever work? It's just amazing to me. I You know, there there are all

Speaker 2:

these, like, memes going around about, like, we don't have enough electrical engineers.

Speaker 3:

It's like, well I mean, yeah.

Speaker 1:

We don't have enough

Speaker 4:

think that you can choose.

Speaker 2:

Who can keep this their heads?

Speaker 1:

Yeah. It's just remarkable. And I think it's interesting that, I mean, surface roughness, Tom, is, like, one area that is particularly tough to navigate, and it sounds like it has a real effect.

Speaker 8:

And in fact, somewhere around, like, the 10 gig days, the resistive losses became on par with the dielectric losses, just based on... and, again, this kinda gets into what is your impedance. For instance, in order to create an impedance of, let's say, 50 ohms, your geometry has to be a certain width and a certain distance from the dielectric. Right? And so in general, this is all yet another complexity: you're trying to fit your PCB into a certain overall thickness, and you have to get so many layers into that. So then, you know, all of these things cascade to say, in order to route the board and make it work, you have to do this geometry.

Speaker 8:

And then that geometry, you go and look at the losses of it. And it's like, generally speaking, about fifty-fifty is what I found on the boards I was designing. And so resistance is

Speaker 4:

huge What

Speaker 7:

do you say?

Speaker 4:

Fifty-fifty? What do

Speaker 8:

you mean? So the resistance... So there's 2 types of primary loss in most PCBs. We'll just talk about, like, resistive loss, i.e., literally RLC, so the R part. And then your dielectric has losses due to, well, there's lots of fun physics there. But, nevertheless, your dielectric loss constant is what sort of dictates that primarily. So we can pay for less loss in the dielectric by buying better and better dielectrics that have lower and lower Df, or dissipation factor.

Speaker 8:

And so we do that. We pay for a really good, very low loss material, but you can't get that good on copper. So resistive losses are still dominating in most cases. You can improve that by making your trace wider, because literally you add more copper. But that is yet another problem, because if you make the trace wide, that means you have to make the dielectric thicker, which means you have only so many layers you can fit in the overall thickness of the board you're trying

Speaker 4:

to achieve. So all of

Speaker 8:

these things. So resistance is bad.

Speaker 1:

Right. That yeah.

Speaker 4:

And if you make your copper too smooth, your board falls apart.

Speaker 1:

Right. Exactly. I mean, clearly, there's a reason for the roughness. But for the adhesion... So your board will literally delaminate if you make this thing... Yeah.

Speaker 7:

That's fine. You you know,

Speaker 1:

I've said this before. I'm sure I'll say it again. I feel like the PCB is missing a definitive history. I mean, there's so much stuff. And I think I'm always learning something that is extremely important, like the adhesion and the surface roughness, which I'd never heard of and is yet clearly a very important factor and trade-off.

Speaker 1:

I think I just found this in the

Speaker 8:

Eric mentioned fiber weave effect. That's another fun thing, you know, because the fiberglass is literally woven and then cured in a layer of epoxy. And so you get these little micro-dips of dielectric constant across the board depending on which axis you're on. And so one of the mitigations of that is to rotate all of your artwork by, you know, ideally, it would be 45 degrees and then 22 degrees, but that costs a lot. So 11 degrees is sort of state of the art, where everyone, you know, manages that.

Speaker 8:

So you're trying to route it some odd angle to this orthogonal weave, which has little dips and valleys, and then, you know, you have more

Speaker 4:

fun things to deal with.

Speaker 1:

Is that something that we are able to capture in simulation? Do we have to... Yeah.

Speaker 8:

Yeah. Well, by simulation it's difficult, but by measurement, we can capture it. And that's one of the bits of work we will do in order to determine... like, there's lots of different ways of mitigating it, but the impact is that if you had a differential pair and you're routing along the board, you think they're well phase matched, but lo and behold, one of them might be sitting, you know, under, like, a slightly higher dielectric constant than the other one. It will go slightly slower, and they will then be off by a picosecond or 2 picoseconds or whatever. And, you know, at a 40 picosecond eye, this stuff really matters.
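
The size of this skew can be estimated from the propagation delay of each leg, t = L * sqrt(Dk) / c, since a signal travels more slowly through a higher dielectric constant. The sketch below uses made-up numbers purely for illustration: a 25 mm run where one leg of the pair sees an effective Dk of 3.45 and the other 3.40. Neither value comes from the discussion or from any datasheet.

```python
import math

C = 299_792_458.0   # speed of light in vacuum, m/s


def delay_ps(length_m, dk):
    """Propagation delay in picoseconds through a medium with dielectric constant dk."""
    return length_m * math.sqrt(dk) / C * 1e12


def weave_skew_ps(length_m, dk_p, dk_n):
    """Skew between the two legs of a differential pair that see different Dk."""
    return abs(delay_ps(length_m, dk_p) - delay_ps(length_m, dk_n))


# Hypothetical example: 25 mm of routing, legs seeing Dk 3.45 and 3.40.
skew = weave_skew_ps(0.025, 3.45, 3.40)   # a little over 1 ps
```

Even this small Dk difference over roughly an inch of trace produces about a picosecond of skew, which is the scale being discussed against a 40 picosecond eye.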

Speaker 4:

You know?

Speaker 9:

Right. Man.

Speaker 5:

And and right me to add to that.

Speaker 3:

Noise in a 150 minute. We're trying to beat 300 femtoseconds of jitter on clocks. And so the one picosecond of skew that we get between P and N, that's a lot.

Speaker 8:

It is?

Speaker 3:

You just undid all the work that you did in the Totally.

Speaker 1:

Yeah. And then, Robert, sorry. You were you were gonna say?

Speaker 5:

Oh, no problem. I was just gonna say, I've actually seen people, define small sections of of a design and analyze the fiber weave in simulation. They'll actually draw in the fiber weave as well as the epoxy to see how different angles of rotating are gonna affect, you know, how much loss you might have or or any, problems you might have from that fiber weave effect.

Speaker 1:

Yeah. That is really cool. Did they

Speaker 8:

ever write any papers on that? That'd be fun to watch.

Speaker 5:

Go to read. I I believe so. I don't remember off the top of my head, but, I have seen it done before.

Speaker 1:

That's really neat. And, I mean, the upshot is, you know, this has been from piece of paper all through simulation all the way through. I mean, this is the advantage of doing this thing kind of from first principles, and simulation intensive, and so on. We've got a backplane that seems to be doing pretty well. I mean, Ari, it must feel very satisfying to have taken this thing from its initial conception to, as you say, a measurement 2 years in the making.

Speaker 3:

Yes. Absolutely, very satisfying. And, also, like I said in the tweet, a little unique, because we know the 2 systems that will be connected. Whereas if you are building a switch that you sell in the market just as a standalone device, now you need to live within the specification that the IEEE standard dictates to you. So you're using, you know, some of these ballpark figures that they put in standards.

Speaker 3:

And then, you know, if everyone adheres to them, then you're most likely gonna end up with a system that works. But it might mean that, for example, with a particular switch and NIC configuration at these higher speeds, the DAC cable that you used to use that is 2 meters might not actually be working anymore. So now the one and a half meter DAC cable might be the longest you can get. But in our case, because we control both sides of this link, and we very precisely control what the link itself looks like because we're sourcing all of these individual pieces, and we're carefully selecting them and matching them and making sure that, you know, we wring out all the little bits of performance that we can get, we're able to build a backplane that is actually fairly complex. There's multiple connectors in there, and a pretty long piece of copper to go from one board to the next.

Speaker 3:

But we're able to measure that and build some confidence that by the time we're done, this thing will actually work, because we've seen the worst case. We understand which cable lengths we'll be observing. We can do some checks on the quality of these cables, and these are built to very, very tight tolerances. And then we can measure that and we can characterize that. And then, you know, we build a little bit of buffer and a little bit of margin in there.

Speaker 3:

But it basically gives us a very good overall picture of how the system will be used, like, the context in which the system will be operating, which is pretty cool.

Speaker 1:

It is really cool. And unquestionably, this is a hard path, to go build all of this from first principles. And so, Eric, real talk. When this thing was not working, when we were not able to drive this to 100 gig, to what degree were you kind of vomiting in a trash can or wondering, like, have we screwed something major up? Or did you have confidence we were gonna nail it?

Speaker 4:

Yeah. There's a reason why I travel with lots of.

Speaker 1:

It's not very different.

Speaker 4:

I got a lot of indigestion from this, and I didn't realize how much this was weighing on me until we got it working. And I slept really, really well. I'm like, oh, yeah. Okay. But the satisfaction that you get after working on this for so long, and not even nearly as long as Arjen and you and everybody.

Speaker 4:

But in the time that I've been working on it, seeing it come up, and getting it connected with our longest cable and having basically zero FEC corrections. Because as I said before, you kind of expect FEC correction. Right. The...

Speaker 1:

This would be forward error correction, yeah.

Speaker 4:

Like, you assume there's gonna be bit errors, and, you know, we have FEC enabled to correct for those. And basically there were none. And you

Speaker 7:

just That is

Speaker 1:

really great. That sounds amazing. Yeah.

Speaker 4:

This warm fuzzy feeling. Like, it's Christmas morning or something. That's nice.

Speaker 1:

Well, that, I think, is a great note to end on. You know, this has been a huge team effort. I mean, I think a part of what I love about this problem is that every single link in the chain needs to work, and anything can actually introduce, you know, insertion loss that you don't want or what have you. And I think it was really fun to watch all this come together. And, Larry, Robert, thank you for joining us as well.

Speaker 1:

It was, obviously, fun to put your software to work. And, hopefully, you've enjoyed getting with the team that's actually using it in anger. And,

Speaker 9:

No. We appreciate it.

Speaker 5:

Absolutely. Thank you for having us.

Speaker 1:

Yeah. You bet. And thank you, everyone. This has been a lot of fun. Andrew, thank you too for your work on glscopeclient.

Speaker 1:

I mean, the open source work is really, really important, and we're very excited to see it.

Speaker 4:

And the roads that you're hiking.

Speaker 1:

Oh, and the, absolutely. Yes. So, bringing open source software to a domain that really needs it. Alright. Well, Arjen, thank you very much for kicking us off with this tweet.

Speaker 1:

It was a measurement 2 years in the making. Adam, I assume I can speak for both of us when I say it's been extraordinarily educational, as always.

Speaker 3:

Absolutely.

Speaker 1:

It is amazing that anything works at all. And now, can we please, those lunkheads like us in software: it is our responsibility to get our software to run correctly on this unbelievable fabric. So let's try not to be such fools.

Speaker 3:

Yeah. And don't waste any bits, please.

Speaker 1:

Don't waste any bits. So we we got a call.

Speaker 3:

Together over here. Right?

Speaker 1:

Exactly. We gotta clean it up. Alright. Thanks, everyone. Next time, it's gonna be Kate talking about supply chain.

Speaker 1:

That's gonna be a great discussion. That's gonna be next week. So really looking forward to that one. And, Robert, Larry, thanks again. Thank you, everyone, and see you next time.
