Virtualizing Time

Speaker 1:

Whenever we have a snappy day in the Bay Area, I'm reminded of one of my favorite 30 Rock lines. It's a total throwaway line when Liz Lemon's rival is gonna move out to the Bay Area, and she, in a way of disparaging the Bay Area, says: well, enjoy carrying a white sweater around for the rest of your life. And I'm just like, definitely. It was, like, the Bay Area living

Speaker 2:

burn. It's a good burn.

Speaker 1:

It's a good burn. Sick burn. Sick Bay Area burn. Anyway, Jordan, it's great to have you here. In preparation for this — you know, we did an episode on time, but not on high-resolution time.

Speaker 1:

This is different. But I started getting this budding paranoia that I would repeat some of the same anecdotes. Is there a name for this phobia? Is it just fear of dementia? Adam, is there a specific name for this?

Speaker 3:

I don't

Speaker 1:

know what it is.

Speaker 3:

No. But I mean, I think that you and I suffer from it acutely, in part also because we both have wives who are like: nope, heard it. Nope, heard it.

Speaker 3:

Like, I don't

Speaker 1:

I definitely did

Speaker 3:

this one before. Exactly. Right.

Speaker 1:

Oh, I know. So I went back, just to make sure I had my anecdotal bases covered, and I listened to our time episode at 3x, which was an interesting kind of experience. I kinda felt like — you know when you've got the cat with the laser pointer? And Jordan, you're a new owner of a cat, so I'm sure you'll appreciate this. You get the cat with a laser pointer, and you kind of drag the laser pointer around, and you're amusing yourself. And finally you get sick of it, and you just start scribbling the laser pointer on the wall and watch them lose their minds. That's kinda how I felt listening to us at 3x.

Speaker 1:

But then actually listening to us at 2x felt very intelligible. So —

Speaker 3:

Yeah. Good. Well, I think it's a good announcement: this can be 0% post-consumer product or whatever. All new anecdotes.

Speaker 1:

This is all new anecdotes. All the fresh stuff. And part of the reason I was concerned about that is because this is also kinda going back into my own wayback machine, because one of the first things I worked on as a young engineer was time — in particular, using high-resolution time as a basis for timekeeping, and all of the problems with that. So, Jordan, maybe we want to start there: maybe you could describe what the TSC, the timestamp counter, is, and what it's used for in the operating system, and then maybe we can get from there into some of the challenges we had in terms of virtualizing it.

Speaker 2:

Yeah, sure. The TSC, as you said, is the timestamp counter on x86. I think of it as just kind of the basis for monotonic time in the system. So anytime you want to measure a duration or an interval, we would probably use something that is based on the TSC.

Speaker 2:

It is also an MSR, which means that it's writable as well, which adds some additional wrinkles to thinking about virtualizing and migrating VMs. The simplest example is hrtime, high-resolution time, which is nanoseconds since boot. That's computed from the TSC in terms of the number of ticks — usually thought of as clock cycles — and then a frequency.
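
To make the ticks-and-frequency relationship concrete, here's a rough sketch of that conversion — hypothetical names, not the actual illumos implementation:

```rust
// Convert a raw TSC tick count to nanoseconds, given a calibrated
// frequency in Hz. Widening to u128 keeps the intermediate product
// from overflowing for any plausible tick count.
fn ticks_to_ns(ticks: u64, freq_hz: u64) -> u64 {
    ((ticks as u128 * 1_000_000_000) / freq_hz as u128) as u64
}

// e.g. 3 billion ticks at 1 GHz is 3 seconds:
// assert_eq!(ticks_to_ns(3_000_000_000, 1_000_000_000), 3_000_000_000);
```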

Speaker 1:

Right. So let's expand on that a bit, because you've got the frequency at which the TSC is moving. And this has changed with various parts over the years, but it is more or less a base CPU frequency.

Speaker 2:

That's right.

Speaker 1:

Back in the day, one of the things we discovered, at least when I first did this tick-based time work on UltraSPARC, is that these parts were actually not at 167 megahertz — they were actually at 166.996 megahertz, or they would vary — and that difference could be really meaningful. You would lose the ability to keep time if you didn't account for it. You need to know the frequency at which this thing is ticking, because you wanna turn it into something that is meaningful to humans, ultimately: fractions of a second, nanoseconds. But the frequency on x86 — because x86 has had variable frequency for so long, you've got a defined base frequency for the TSC. Is that right?

Speaker 2:

Yeah. In terms of the promises made by the architecture, I'm less familiar with that, but I do know that it's not necessarily exactly the CPU frequency. One of the things that illumos does on boot is calibrate the frequency of the TSC by measuring it against a different time source — the PIT, I think. And we use that as the frequency for the system forever. There are certainly things that can introduce jitter into that, but it seems to be good enough these days.

Speaker 1:

Yeah. The PIT, the programmable interval timer, is the 8254, which, of course, hasn't existed as a discrete part in forever. It's effectively an emulated part now. But alright. So we've got the challenge, and you mentioned that it's an MSR.

Speaker 1:

Do you wanna elaborate a little on what an MSR is and what that means for the architecture?

Speaker 2:

Yeah. My understanding of it: MSR is model-specific register. So it's a register programmed by, I guess, the operating system — it's something that is certainly in a higher-privilege context. But the TSC actually is writable, I believe, from... shit.

Speaker 2:

I don't know off the top of my head. But, importantly, the guest can write to it, whether that's —

Speaker 1:

Right.

Speaker 2:

User space or the kernel.

Speaker 1:

Right, the guest operating system. Alright. So what is the challenge in terms of virtualizing this thing?

Speaker 1:

Because we are developing our own hypervisor, Propolis, which is bhyve-based. And we need to tell the lie of the TSC, for which we got some hardware support, certainly. So what's involved in virtualizing it? And then, as kind of an entree into the thorny problem that you found.

Speaker 2:

Yeah. So I think it's good to review what the expectations are of a system and its TSC. I mentioned to think of hrtime as nanoseconds since boot. So the expectation is that the TSC starts at 0 on boot — which is not actually always true due to errata, but at least it starts at 0 following processor reset — and then it increments forever unless it's written to or the processor is reset.

Speaker 2:

So from a guest operating system perspective, that's what you would expect. Right? From the time that the guest is booted, having no real knowledge that it's on virtual hardware, its TSC should behave that way. Another thing that may not be super obvious in that paradigm is that the guest frequency also should not change. So when we're talking about live migration — moving guests between machines — as I mentioned earlier, the TSC frequency is calibrated on boot, at least on illumos.

Speaker 2:

I think that's true of other systems as well. So even if two machines have the same CPU, they will not necessarily have the exact same CPU frequency. They're the same hardware, they're probably gonna be pretty similar, but they're not going to be precisely the same. And so that is another challenge of moving between machines.

Speaker 2:

I can get into, like, the details of how it's virtualized next.

Speaker 1:

I mean, I think one of the things that is frustrating about this problem is that in order to be good enough, it really has to be quite good. In order for NTP to be able to rely on it, you need to be within 64 parts per million, which is, like, very sloppy from a time perspective, but the approximations are often not good enough. It's a tough problem in that regard: you can have these things that appear to be the same frequency, but their minor differences in frequency are, in fact, enough to introduce error into guests. So it's a problem.

Speaker 2:

Yeah. So in terms of virtualizing it, there are sort of two different knobs in the hardware, provided by both Intel VMX and AMD SVM, for virtualizing the TSC. And by hardware, I mean that when you read the TSC from the guest perspective, generally you will not take a VM exit into the hypervisor. The first one is the TSC offset. So imagine a simple case of a guest running on a machine: no migration, same frequency between the guest and the host.

Speaker 2:

When you think about what the guest TSC should be, you can simply take the host's TSC from when the guest was booted. So if it's booted, I don't know, two days into the host's lifetime, you can negate that and add it to the host TSC to get the guest TSC. At the time that it boots, that value plus its own negation is 0, and then it'll increment along with the host. That component you add to the host TSC is the offset. So that's the first thing that we can virtualize.

Speaker 2:

And until I started working on this problem for live migration, that's how bhyve did things: simply storing that boot TSC, negating it, and writing that into the VMCS when the guest is run.
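
A minimal sketch of the offset-only scheme being described, with hypothetical names (the real code lives in the kernel; the hardware performs the addition on every guest read):

```rust
// Choose the offset so that the guest reads 0 at the moment it boots:
// guest_tsc = host_tsc + offset.
fn boot_offset(host_tsc_at_guest_boot: u64) -> i64 {
    (host_tsc_at_guest_boot as i64).wrapping_neg()
}

// What the hardware effectively computes on a guest TSC read.
fn guest_tsc(host_tsc_now: u64, offset: i64) -> u64 {
    (host_tsc_now as i64).wrapping_add(offset) as u64
}
```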

Speaker 1:

And just to define things for folks a little bit: the VMCS — do you want to describe that a little? And maybe you also wanna describe what a VM exit is, what a guest exit is, and why we wanna avoid them.

Speaker 2:

Yeah. So the pretty basic operation of running a guest — I'm gonna talk about AMD because that's what I've been looking at more lately — is the VMRUN instruction. With that instruction, you provide a page of data that has a bunch of stuff related to running the guest: all the processor state, and then these things called control bits. And the TSC offset is one of those.

Speaker 2:

And so that way, when the hardware enters guest context, it has all the state it needs to construct the guest view of the world. I forgot what you were —

Speaker 1:

Well, I'm just always kind of blown away by how much silicon support there is for this. Whenever you are running any instance anywhere in the cloud, you are in one of these hardware contexts that system software has programmed and prepared, and then the hardware is doing this. And there's this work of minimizing guest exits — cases where the hardware doesn't support something the guest is doing, so it needs to kick out into the operating system.

Speaker 1:

And we wanna avoid those. We wanna allow the guest to operate inside this hardware-virtualized context as much as possible. So —

Speaker 2:

Yeah, sorry. And RDTSC is an instruction on x86. And that one is generally not emulated, so we don't take the exit in that case. And if you think about how often that instruction is called, it would be, like —

Speaker 3:

Yeah.

Speaker 1:

Completely. Yeah. Absolutely. Yeah.

Speaker 1:

But so, historically, prior to you doing this work, what bhyve had done was a reasonable thing: I've got the offset, I'm gonna program the VMCS, and off the guest goes.

Speaker 2:

That's right.

Speaker 1:

That's not good enough for us.

Speaker 2:

Yes, it's not good enough for migration. This area of the problem is where I started to get into the other thing you wanted to talk about, I think, which was the little simulator program I wrote. Working out the math here from first principles was something I found a little difficult, in part because I don't think anyone had sat down and written down the formulas I needed before I did for this process. If you imagine just taking a guest and its TSC offset and moving it to another machine, that doesn't make any sense in the context of a different machine, because the TSC is a monotonic counter.

Speaker 2:

It only makes sense in the context of a single environment. Additionally, the way that we are virtualizing this is through a time relative to the guest booting on a host, and that relationship also has no bearing on a different system. So you definitely need some more information to start to figure out how to virtualize it in a new environment.

Speaker 1:

So what was some of the new information we needed? In terms of, like, what is the actual math of having to move this thing?

Speaker 2:

Yeah. The way I think of it — I called it effective time in my head — is basically: we have this relative point in time, where the guest is relative to the host. We move to a new host. We need to find that new relative point in time.

Speaker 2:

One thing you can take is the new host's current TSC for when the guest is now running on this new host. But you also need to know how long the guest has been running. Right? Because, as I mentioned, the new host and its TSC are not sufficient to describe how long the guest has been running. So the interface I ended up landing on is taking a snapshot of host A's TSC — we actually use hrtime, but it's a representation of the TSC — and the guest TSC, because that is fully calculable.

Speaker 2:

And then we ship that over to the target host, and using those two pieces of data we can reconstruct a new offset. It's not quite as simple as that, because there are also frequency differences involved. Does that make sense so far?
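
Ignoring frequency differences for a moment (they come up next), a rough sketch of the reconstruction being described, with hypothetical names:

```rust
// State captured on the source host when the guest is paused.
struct TscSnapshot {
    guest_tsc: u64, // the guest's TSC at pause -- fully calculable
}

// On the target, choose the offset so the guest resumes where it
// left off: guest_tsc = target_host_tsc + offset at the resume point.
fn offset_on_target(snap: &TscSnapshot, target_host_tsc: u64) -> i64 {
    (snap.guest_tsc as i64).wrapping_sub(target_host_tsc as i64)
}
```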

Speaker 1:

Yeah, so far. I was actually gonna drop in a link — would it make sense to drop in a link to your commit, the illumos commit? It's got a lot of block comments that can help orient folks. I'll drop that into the chat.

Speaker 1:

But

Speaker 2:

Yeah. It's hard to talk about this because there is a lot of math, and having some formulas in front of you is helpful.

Speaker 1:

So, okay. You kinda bundle this up — and do you wanna actually describe the live migration problem a little bit, and why that's important?

Speaker 2:

Yeah. For time specifically?

Speaker 1:

Just in general. I mean, why would we migrate a VM? That's, like, a giant pain in the ass.

Speaker 2:

That it is. But it has a lot of benefits. Right? From the perspective of someone running a set of infrastructure, it is very useful to be able to do things like update servers or host operating system software, whatever it is, without inflicting downtime on customer instances. So migration is a useful tool to have for all sorts of things in terms of managing infrastructure. In terms of the mechanics of how it works —

Speaker 2:

It's both very simple and very complicated. It's sort of helpful, I think, to decompose what a guest is into different pieces. We have its CPUs — vCPUs. We have a bunch of emulated devices, which is done by both Propolis and bhyve. And then there are some other nebulous things, like CPU state, or, in the case of the TSC, this sort of weird one that doesn't really cleanly map to devices or CPU.

Speaker 2:

And then we also have memory. And so the live migration process itself finds a way to pause all of those different things and take a snapshot of some set of state that is useful to reconstruct on the other side. So very simple, but also there are a lot of details in there that can get very gnarly, such as the TSC.

Speaker 3:

And, Jordan, what are some of the constraints on migration? I mean, presumably, we want the guest not really to notice that this happened. What are some areas in which the guest can be sensitive to noticing that this happened and freaking out?

Speaker 2:

I mean, just time jumping forward is one of them for sure. If you were to imagine not doing anything with the TSC — or rather, with the state that we stored prior to this work — then what time the VM thinks it is in is completely unknown, and that can cause all sorts of problems.

Speaker 1:

I mean, it's bad when time jumps forward, and it's very, very, very bad when time jumps backwards. That monotonic clock really can't go backwards.

Speaker 2:

Yeah. And to be clear, time does jump forward a little bit, because in this work we do account for the passage of time during migration, which I can also talk about. But yeah, that would be very bad. Obviously you don't want a situation where state is modified on the source after it's already been sent to the target.

Speaker 2:

That's why there's a pause step in all of these different components — whether it's memory or CPU or devices — so that you can have a clean state of the world. But beyond that, at some point you have to stop the world, send everything over, and start again.

Speaker 1:

Yeah. It's so important to have this. I mean, Adam, you know — we did not have this, and not having it is just brutal, when you can't move things. Because it just snowballs.

Speaker 1:

Right? Like, you can't ever take something down for maintenance, obviously, because there's no such thing as arranging downtime for people. You know what I mean? Yeah. But

Speaker 1:

We can't take it down. And then there's and then you end up with these kind of I when when people look at kinda utilization across a data center and see very low utilization, is part of the reason for it because you end up with these islands that are kind of unprovisionable that have, you know, a small number of resources that can't be deprovisioned because they there is no capacity to be assuming you don't have vMotion or you don't have enough technology that allows you to live migrate. It just it just snowballs. It's such a mess. So that's it as as difficult the problem is live migration is, we viewed it as a real constraint on the problem, not wanting to in the spirit of of fighting the last war, we just did not wanna have all the problems we had to join.

Speaker 1:

So, Jordan, as you laid out: the TSC is one of these kind of oddballs that you gotta go deal with. And if you don't deal with it, the guest will be upset with

Speaker 2:

you. Right. And I definitely saw that in my initial experiments. One thing that I observed about working on this project, just as a meta-observation, was that I did a lot of prototyping, because I found the problem space very complicated and confusing and mathy, and it was helpful to just build out small pieces at a time. So I started — since we already had most of the migration work done — by just migrating a guest and seeing what happened. And sometimes things would work fine.

Speaker 2:

Sometimes it would just go off into space and never come back, or I would see errors from, like, gettimeofday().

Speaker 1:

And so this was when you were walking up to the problem. Like: I have not solved this problem yet; I just wanna get a flavor for what happens if we don't solve it.

Speaker 2:

Yeah, that's right. And I didn't expect it to work, but it was interesting how often it did just happen to work. Maybe that's because the uptimes of the two hosts were similar, and just some implementation details in the way we were virtualizing it at the time.

Speaker 1:

I understand.

Speaker 2:

But definitely things could go very wrong, and that would be very difficult to debug.

Speaker 1:

And then you mentioned doing little prototypes. Do you wanna elaborate on those? I can drop in a link here, because I did love this little simulator you wrote.

Speaker 2:

Yeah. That one came maybe after some exploration a bit. I still haven't talked about frequency — how we handle frequency.

Speaker 1:

Yeah. Go to that first.

Speaker 2:

Well, the reason I mentioned it was that I started the problem by only thinking about how to correct the offset. As I mentioned, you can compute a new one, assuming the frequency is identical, using a snapshot of state from when the guest is migrated: what its TSC is and what the host's is. But then you add in frequency, and things get kind of weird. So I was trying to work out for myself what the math there is — how you construct the state in a correct way. And I ended up writing a Rust program to help me prove out that I had done it correctly, basically.

Speaker 2:

The answer is not very complicated: basically, anytime you're adding together values, you want them to be in the same frequency. Right? But with migration on the scene — when a guest boots, we give it the frequency of the host it's running on, which makes sense for the reasons I mentioned.

Speaker 2:

But when it migrates, it might be running on a host with a different frequency. So the next knob available in hardware virtualization is the frequency multiplier, which is a ratio of the guest frequency to the host frequency. When the guest reads the TSC, the hardware will multiply the host value by the frequency multiplier and then add the offset.
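
Conceptually, a guest TSC read with both knobs in play looks something like this sketch (the hardware does this in fixed point; shown here with widened integers and hypothetical names):

```rust
// guest_tsc = (host_tsc * guest_freq / host_freq) + offset
fn virtualized_tsc(host_tsc: u64, guest_freq_hz: u64, host_freq_hz: u64, offset: i64) -> u64 {
    // u128 intermediate so the multiply can't overflow
    let scaled = (host_tsc as u128 * guest_freq_hz as u128 / host_freq_hz as u128) as u64;
    (scaled as i64).wrapping_add(offset) as u64
}
```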

Speaker 1:

And this is really where the white lies turn into a real conspiracy. The hardware is really helping you. It's like: no, I get it — let's lie to this guest.

Speaker 1:

I'm gonna help you really lie to this guest. You wanna run this guest on a slower CPU? I gotcha. I'm gonna help you out.

Speaker 2:

Yeah. I mean, it is a straight-up lie, but it's very convenient for what I needed to do. Also, the Intel manual versus the AMD manual: they describe this process pretty differently. So this was another area where I was just staring and doing math in my notebook. But ultimately, they both represent the multiplier with a fixed-point number.

Speaker 2:

An example of a fixed-point number is dollars and cents, where you have two digits reserved after the decimal point that represent cents. And so in binary, there's some number of bits that are reserved for the fractional component of the ratio, and then some amount for the integer component. But the way that Intel describes frequency ratios is very strange to me.
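
As a sketch of the fixed-point idea: the ratio is stored shifted left by the number of fractional bits. (32 fractional bits here is AMD-like; treat the exact field widths as an assumption, since the two vendors differ.)

```rust
const FRAC_BITS: u32 = 32; // assumed fractional width

// Encode a guest/host frequency ratio as binary fixed point. The
// u128 intermediate keeps the shift from overflowing -- the same
// trick the kernel math discussed later needs.
fn encode_ratio(guest_freq_hz: u64, host_freq_hz: u64) -> u64 {
    (((guest_freq_hz as u128) << FRAC_BITS) / host_freq_hz as u128) as u64
}

// A 1:1 ratio encodes as 1 << 32; a guest at half the host's
// frequency encodes as 1 << 31.
```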

Speaker 2:

So, it seems like this

Speaker 3:

was all built and designed for this use case precisely.

Speaker 2:

Yeah. I assume so. The language talks specifically about the shifts that you do, but doesn't really motivate why you're shifting and multiplying.

Speaker 2:

That has to do

Speaker 3:

What's the range of ratios here? I'm sort of surprised that there's a potentially large integer component, but maybe I'm under-imagining how different various CPUs could be in terms of the rate at which

Speaker 1:

the TSC is moving.

Speaker 2:

Yeah, that's a good question. One thing I thought about a lot, and maybe over-rotated on, was what reasonable limits should be around what ratios are allowed. In practice, the focus of the work I was doing was for Oxide, so we're using the same hardware.

Speaker 2:

But this is in illumos and should be general-purpose. I forget the exact number I picked — I think 15x was the max — and I basically went back and looked through a bunch of different CPUs over time

Speaker 3:

to kind of see.

Speaker 2:

That is, like, a whopper of a lie, though.

Speaker 1:

A whopper. You think you're on a CPU that's 15x faster than what you're actually running on? Like, I hope you are not —

Speaker 2:

Yeah.

Speaker 1:

Because if you're like, wait a minute — I'm looking at the passage of time versus the number of instructions that I'm executing — it just feels like you're gonna get caught out at 15x. I don't know.

Speaker 2:

It's not something I expect to happen. This was just me picking a limit at which to return an error in the kernel.

Speaker 3:

Yeah. Jordan, not to put you too much on the spot in terms of numbers, but do you have any empirical data? As you say, in our lab it's sort of all the same CPU, migrating from the same flavor to the same flavor. But do you have a sense of what range of ratios you saw, even among these parts that came in the same box?

Speaker 2:

Yeah. I actually, at one point, did a survey of a bunch of Gimlets in a rack, and they were within 14 parts per million of each other in frequency. So definitely not a big difference at all.

Speaker 1:

Oh, yeah. I'm actually amazed at that. Yeesh — I'm amazed it's that much. I mean, it's good that 14 PPM is well within, like, the 64 PPM that is the outer limit for NTP, but that's still a lot, man.

Speaker 1:

That's a lot

Speaker 2:

of — Yeah. That was, like, one survey months ago. I basically did it as a way to quickly confirm that it all looked pretty similar.

Speaker 1:

Yeah. Alright.

Speaker 3:

I thought

Speaker 2:

the NTP threshold was a bit bigger, though, like, 500 PPM.

Speaker 1:

Oh, interesting. Okay. Yeah. Maybe — well, this is just like my

Speaker 2:

Not too bad.

Speaker 1:

My fear of repeating anecdotes — I'm sure I got this wrong. But there is variance. I mean, it's minor, but it is absolutely observable. And the other problem about time is that it actually continues forever.

Speaker 1:

So these small differences will actually add up to big differences in the future, to the point where a guest will lose track of time, or will realize that it needs to make an adjustment. So it is actually important that, even though the differences are small, we compensate for them correctly.

Speaker 2:

Yeah. So we can talk a little bit about that next if you'd like.

Speaker 1:

Sure.

Speaker 2:

Because that's a really hard problem — and the solution we landed on. Specifically, I'm talking about when you migrate between machines: from the point at which the guest's world stops to the point where it starts again, there's some amount of time there. Figuring out how to compute that is pretty difficult, because, again, we're dealing with monotonic time, which has no relationship between machine A and machine B. So: in the Oxide rack, we're running NTP on all these machines.

Speaker 2:

So as part of the interface snapshot, I also added a wall-clock snapshot time, which is not a perfect way to measure a delta between machines, because wall-clock time, unlike monotonic time, can go backwards. But we assume, in a pretty load-bearing way, that NTP is running. So then, using the wall-clock time, we can determine how long migration took — or rather, how long it took between these two readings of the TSC — and move the guest TSC forward accordingly. But —
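
A sketch of that adjustment, with hypothetical names: NTP-disciplined wall-clock snapshots on both hosts estimate the pause, which is then converted into guest ticks.

```rust
use std::time::SystemTime;

// Estimate how many guest ticks elapsed while the guest was paused,
// from wall-clock snapshots taken on the source and target hosts.
fn paused_ticks(src_wall: SystemTime, dst_wall: SystemTime, guest_freq_hz: u64) -> u64 {
    // Wall clocks can step backwards; treat that case as zero rather
    // than panicking.
    let paused = dst_wall.duration_since(src_wall).unwrap_or_default();
    (paused.as_nanos() * guest_freq_hz as u128 / 1_000_000_000) as u64
}
// The guest then resumes at (guest TSC at pause) + paused_ticks(...).
```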

Speaker 1:

the guest will see that. And is that transparency that you wanna offer to the guest just kind of a courtesy, or does it end up being a correctness issue? Because if the guest is using this to track wall time, then it'd get completely confused.

Speaker 2:

Yeah. I mean, it would be a pretty weird state if real time moved forward, I don't know, 10 seconds or whatever, and the TSC didn't move at all.

Speaker 1:

Right. Sure. Yeah.

Speaker 2:

So that is the solution we landed on, but it's imperfect, because it really relies upon NTP. But without NTP, you'd otherwise end up implementing something like NTP.

Speaker 1:

Right. So you make an estimate of what our kind of blackout time is — the time that we've been under general anesthesia — and then presumably you add that to the offset on the destination. And that needs to be, like, close to the last thing you do before you actually run this thing. Right?

Speaker 2:

Yeah. So this actually gets a little tricky — it's not tricky, actually. This was a situation in this project where I found that my mental model of how things should work and the way the implementation should look were actually kind of different. And the result was that the implementation looked simpler.

Speaker 2:

My mental model was very much: we measure this time between pausing the world and restarting the world, and then you add that difference. But it doesn't actually matter when you do that measurement, if that makes sense, because the guest TSC is fully calculable. So as long as you have a snapshot on the source, from when the guest stops —

Speaker 2:

Then on the target, it doesn't matter when you read or write that data; as long as you do them one after the other, it'll work out. The pause-to-restart measurement is definitely my mental model for how I think about it, but doing it strictly that way introduces some weird complexity into the protocol that makes it a lot harder to reason about.

Speaker 1:

Yeah. Interesting. So the implementation actually ended up being a bit simpler, then.

Speaker 2:

Yep. There's just a step where we read that data, send it over, and write it back out, and it doesn't have to be at the beginning and the end or anything like that. But it's not intuitive. A lot of this stuff I find very unintuitive; I have to stare at it for a while.

Speaker 1:

Yeah. And so what were some of the issues that you had when you were developing this? At one point you're like, I need to go write this simulator — which I love. Adam, I know that Josh did that when we were working on the storage subsystem. I know it's been done a bunch of times in our careers.

Speaker 1:

You're just like: alright, I need to go write a simplified version of this outside of the system, where I can just iterate on the core principles really quickly. And, Jordan, have you used that technique before, and was it valuable here?

Speaker 2:

Yeah. I use that technique a lot in terms of implementing a small piece of something to verify it works on its own. I have a directory on my dev machine called play that's just filled with small C programs or Rust programs or whatever — trying stuff out.

Speaker 2:

This was definitely a little more involved than what I normally do, but it proved to be really useful for a bunch of reasons. One of them was that some of the math here ended up needing to be done in assembly, because it had to live in the kernel, so it wasn't going to be in Rust — and also because some of the intermediate values are 128 bits, which you can do in Rust but not in the kernel. So I wrote this simulator first and had all this math written in Rust, with cases to handle overflow, and it helped me test out the edges of where those different ratio limits should be. But then, when it came time to actually write the assembly code, I was able to just plug it into this Rust program and test it independently, before I even tried it in my kernel change, which was really cool.

Speaker 2:

And I ended up writing a bunch of tests to run against it in that little simulator as well, which was mostly just for my own sanity. But, yeah, it was definitely valuable for many reasons.

Speaker 1:

Yeah, that's really cool. And then you've got something small in front of you that you can reason about, so when you do have an issue, you can really understand it. So you're going from 64-bit to 128-bit to do this math, and there are edge conditions you need to deal with.

Speaker 1:

Right? I'm sure it was much easier to deal with that at user level, in a user program.

Speaker 2:

Yeah. And also, again, starting very prototype-heavy — doing Rust first — helped me really think about where all the edge cases are. And then, when it came time to do it in a more unsafe environment, I felt like I had a good understanding of where things could trip.

Speaker 1:

It's very cool. It's kinda funny that we're using Rust to prototype aspects of the system that have to be done in assembly or in C. That's where we are. I mean, it definitely makes sense how we got here, but that's where we are.

Speaker 1:

So did you end up finding bugs with the simulator? Did it prove to be useful?

Speaker 2:

Oh, definitely, when I was writing the assembly. The math itself isn't that complicated, so I don't think I found any bugs with that necessarily. But when I was testing my changes on real machines, it was really, really nice to be able to walk up to the system, grab things off of a VM with MDB, and then pass them into this calculator I'd written. Another thing — I was reviewing the code for the simulator before this, and I remembered that Clap has a feature, it looks like it's called maybe_hex, that lets you pass in either decimal or hex. That was so useful, because I tend to think in decimal more, but when I was getting stuff from MDB, it was always in hex.

Speaker 2:

It was very nice to be able to throw both

Speaker 1:

in there. Clap has this? I've been using parse-int for this. I didn't know Clap had a built-in

Speaker 2:

for this. Yeah. That's how I discovered it. I got really tired of

Speaker 3:

Parse int.

Speaker 2:

Which one I wanted.

Speaker 1:

What's this — clap-num, maybe_hex? How long have you existed? Oh, sorry. I guess it's from

Speaker 2:

clap-num. Yeah. Not Clap itself, but still.
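
For reference, this is roughly how the feature being discussed wires up, assuming clap v4's derive API plus the clap-num crate:

```rust
// Cargo.toml (assumed): clap = { version = "4", features = ["derive"] }
//                       clap-num = "1"
use clap::Parser;
use clap_num::maybe_hex;

#[derive(Parser)]
struct Args {
    /// Accepts either decimal ("12345") or hex ("0x3039").
    #[arg(value_parser = maybe_hex::<u64>)]
    tsc: u64,
}

fn main() {
    let args = Args::parse();
    println!("tsc = {} ({:#x})", args.tsc, args.tsc);
}
```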

Speaker 1:

And have you used Clap before? I really like Clap. I'm just so glad — I like Clap. Yeah. Adam?

Speaker 1:

Adam?

Speaker 3:

I'm really into Clap. Also, Clap makes use of auto-ref specialization, which is one of my favorite crazy macro-y Rust hacks.

Speaker 1:

Go on.

Speaker 3:

What do

Speaker 1:

we — what's auto-ref specialization?

Speaker 3:

So have you ever seen an error from Clap that's like: I don't know how to convert this thing to a string. I tried ref the thing. I tried double-ref the thing. I tried triple-ref the thing. I tried quadruple-ref the thing.

Speaker 3:

And you're like, jeez, Clap — give up already. That's not right. Auto-ref specialization is a way of sort of narrowly implementing specialization.

Speaker 3:

That is to say, deciding which trait impl to choose, and you can do this based on prefixing a number of ampersands. If you have, like, ampersand ampersand ampersand, the compiler will try it with one fewer and then two fewer and so forth. So it's a way of narrowly choosing whether, in Clap for example, it uses FromStr or one of its built-in wacky kind of string-parsing things.
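
A minimal sketch of the trick (not Clap's actual internals): the specialized impl is on the value type, the fallback on a reference, and method resolution prefers the candidate requiring fewer auto-refs.

```rust
use std::fmt::Display;

// Specialized: picked when the value is exactly a String.
trait ViaString {
    fn describe(&self) -> String;
}
impl ViaString for String {
    fn describe(&self) -> String {
        format!("specialized: {self}")
    }
}

// Fallback: picked for any other Display type, via one extra auto-ref.
trait ViaDisplay {
    fn describe(&self) -> String;
}
impl<T: Display> ViaDisplay for &T {
    fn describe(&self) -> String {
        format!("fallback: {self}")
    }
}

fn main() {
    let s = String::from("hi");
    let n = 42;
    println!("{}", (&s).describe()); // "specialized: hi"
    println!("{}", (&n).describe()); // "fallback: 42"
}
```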

Speaker 1:

Yeah, I did not notice that at all. I mean, Clap just makes it super easy to churn out these programs that have kinda reasonable behavior — with only some things that drive me nuts.

Speaker 3:

It's great. And I'm with you. Jordan, it's not surprising at all that you turned to it here, because I'd rather use Rust and Clap than anything else for a tool that I run more than three times, I guess.

Speaker 1:

You know, one thing I will say about Clap that I really like is that they've been really good about pulling in things that obviously work. It used to be that StructOpt was this kinda separate crate — and, Jordan, what you're using here as Clap directives were actually StructOpt directives. And it was good that Clap said: you know what, that's useful functionality.

Speaker 1:

We need to pull that in. And it's good. Although — Clap, if you're listening: I'd like to be able to use the -h option for something that's not help. There, I've said it.

Speaker 1:

Okay.

Speaker 3:

Here we are.

Speaker 1:

I'm done.

Speaker 3:

I actually learned a new Clap feature just today. It's, like, defer: basically, rather than having to populate all of these commands a priori, it gives you an opportunity to only populate subcommands or whatever when someone has invoked that subcommand. I just feel like they're always adding —

Speaker 1:

Oh, I would go to ClapCon.

Speaker 3:

Yeah. For sure.

Speaker 1:

No. I just feel like

Speaker 3:

I I

Speaker 1:

feel like, in the hallway track of ClapCon, I would learn a lot. Like this maybe_hex thing — I feel there are a lot of little doodads. Like, I use argument groups a lot.

Speaker 3:

maybe_hex. Yeah.

Speaker 1:

But are

Speaker 3:

these the Yeah.

Speaker 1:

Where you say, like, these kinds of commands are all in the same group, and then you need to have one of these, or you can have multiples of these. I love the ability to specify that you can have a single value here, or multiple values — this can take multiple values — and set the delimiter. It's really — I like the bells and whistles. It's fancy, and I like it.

Speaker 1:

So — yeah. Jordan, had you done Clap-based stuff before, or was this —

Speaker 2:

Oh, yeah, I've used it before. I definitely stretched the features a bit more on this. One of the things I really wanted to be able to do was pass in, on a single command line, a specification for a guest moving between multiple machines. So, like: start with this guest frequency and this initial TSC, then this host has this frequency and this TSC, and then at time T, migrate to this one.

Speaker 2:

And so that that proved to be a bit more difficult, but I landed on something I was happy with.

Speaker 1:

That's very cool. And then, if you do see a problem in the wild — you were mentioning pulling actual data from an actual machine with the debugger, with MDB, and then being able to feed that into your simulator and know exactly what it's actually gonna go do.

Speaker 2:

Yeah. I think I found — or debugged — at least one thing with it. If I remember correctly, I did not actually set the frequency multiplier on the system, and that resulted in some pretty confusing behavior.

Speaker 3:

I mean, it just had whatever garbage was there. So it was running at some random multiple?

Speaker 2:

I think the reset value is one. So I think it was —

Speaker 3:

Oh, okay.

Speaker 2:

That was fine. But the data I was seeing on the structure didn't make any sense. And it was very useful. At some point, I thought about actually writing something to pull stuff from MDB automatically and dump it in there, but I never quite needed that.

Speaker 2:

But it's definitely doable, you know, with a command-line tool.

Speaker 1:

That's great. So you're able to get confidence in the assembly — you know the assembly for this thing works. And then, once you had the math figured out and the simulator figured out, what was involved in getting it all the way integrated? What were some of the latent TSC problems?

Speaker 2:

Yeah. I mean, it pretty much worked. The interface and, like I said, the math are not that complicated once you work them all out. But it was helpful for verifying that the data I was seeing was correct, and for ad hoc experiments. A test I ran all the time was running RDTSC.

Speaker 2:

I wrote a program to execute RDTSC directly and ran that binary in a loop every second, looking at the output on the console, migrating the guest, and then seeing where it picked up — making sure that those deltas looked about right. It's not gonna match precisely, but it should be off by about a couple seconds or whatever, and the values should continue to increment at that same frequency. That was fun. I did a lot of that, back and forth between multiple machines, and it was very useful for those quick calculations.
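
A tiny sketch of that kind of checker (x86_64 only; `_rdtsc` is a core intrinsic): read the TSC once a second and print the deltas, so you can eyeball where the count picks up after a migration.

```rust
use std::{thread, time::Duration};

fn main() {
    let mut prev: u64 = 0;
    loop {
        // SAFETY: RDTSC has no memory-safety preconditions.
        let now = unsafe { core::arch::x86_64::_rdtsc() };
        println!("tsc = {now}, delta = {}", now.wrapping_sub(prev));
        prev = now;
        thread::sleep(Duration::from_secs(1));
    }
}
```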

Speaker 1:

And so right now, our multiple sleds vary in frequency only by a couple parts per million. But we know that in the future — when we have, for example, Genoa-based sleds — there may be a much more significant delta in frequency, and we'll be able to effectively accommodate that with all of this.

Speaker 2:

Yeah. Because it's all in the hardware. Also, anything that is hardware-virtualized, I think, can basically also be software-virtualized — it's possible to turn off this feature for RDTSC, say. So we do emulate the RDTSC instruction — or, because it's an MSR, also RDMSR of the TSC MSR. And all of that does this same math too, which is why it ended up needing to be in the kernel, since most of this is done in the hardware.

Speaker 1:

Yeah. And can you speak a little bit to the bhyve testing? Because a lot of this was actual testing apparatus in bhyve to test this. Do you wanna talk about that a little bit?

Speaker 2:

Yeah. I think that's really cool. I demoed this internally — it was not something I wrote, but I used it a lot for testing. Basically, in our bhyve test framework, we have the ability to spin up a guest.

Speaker 2:

And it's a super simple guest: we take the text of it and just smash a little test program in there. So it's literally maybe a hundred instructions or something at most, depending on what it is. And so —

Speaker 1:

I don't

Speaker 2:

A unikernel. Oh, wow. So —

Speaker 1:

It's alright.

Speaker 2:

It's alright. So I wrote some tests that do pretty basic stuff. One that I thought was kind of fun was testing the frequency control — basically, changing the frequency through the — I guess, let me back up. The structure of this test is: there's a guest file of assembly or C that's doing what the guest is doing.

Speaker 2:

So it's probably executing some instruction or whatever. And then there's an actual test that has simple interfaces to call into that: through I/O ports, the test can send data to the guest or read data from the guest. So the frequency-control test would basically read the TSC and then wait some number of ticks on the test side. Every few seconds, the test would see what the guest thought its TSC was — the guest actually running in a real guest context — and then do some calculations to see whether that's within an acceptable range.

Speaker 2:

So we could do things like: even though the system has whatever frequency it has, we can change the guest frequency through this interface added for migration, and observe that the guest now was seeing a different TSC frequency.
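
A rough sketch of the kind of acceptance check being described, with hypothetical names: compare the tick delta the guest observed against what the programmed guest frequency predicts, within some tolerance.

```rust
// Did the guest's observed tick count, over a measured interval,
// match the frequency we programmed, to within a relative tolerance?
fn within_tolerance(observed_ticks: u64, elapsed_ns: u64, guest_freq_hz: u64, tol: f64) -> bool {
    let expected = elapsed_ns as f64 * guest_freq_hz as f64 / 1e9;
    let err = (observed_ticks as f64 - expected).abs() / expected;
    err <= tol
}

// e.g. ~1s at a virtualized 1 GHz should see ~1e9 ticks:
// assert!(within_tolerance(1_000_100_000, 1_000_000_000, 1_000_000_000, 0.01));
```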

Speaker 1:

That's pretty cool. So this is actually having the guest report to, effectively, the hypervisor: here's what I am seeing in terms of the passage of time. And then we can check that. Yeah —

Speaker 1:

That's really neat.

Speaker 2:

Yeah. It's super cool. And it made me feel a lot more confident, obviously, that it was working.

Speaker 1:

Right. Because we are obviously setting up these structures correctly, we believe, but it would be nice not to just rely on that — to know that the guest is seeing the correct passage of time. And this is where, if we had incorrectly set up the VMCS, or if the hardware itself were broken, we would see it in this test, I assume.

Speaker 2:

Right. And also, I had test systems, but they were the same kind of CPU hardware. So even if the frequency's different, it's a lot harder to assess whether other things are incorrect if they're that close in frequency. But if you change it to 2x, then that should definitely be noticeable.

Speaker 1:

So did your kinda early experiences — where it did, like, work okay some of the time before you'd done any of this work — change your disposition in terms of testing this? Like: it seeming to work is really not special; we've really gotta test the crap out of this to know that it's actually correct.

Speaker 2:

Yeah. I still have other types of tests I wanna do. It's just very hard to verify, but I think I did everything I could from a basic testing perspective. Something we'd be able to do over time is more like stress testing, or even guests running for a really long time. Definitely moving between more interesting hardware would be cool.

Speaker 2:

Yeah. It's something where it's very easy to see that the math looks right, but these small variances that can accumulate over time, I think, are harder to reason about. That's why I'm interested in doing longer-running tests or stress tests.

Speaker 1:

Yeah. Totally. And did you do this for both Intel and AMD? Because we obviously wanna support both for the upstream work.

Speaker 2:

So I actually only did it for AMD, because I didn't have access to an Intel cluster, but it is written such that it's not going to break on an Intel machine. You can use these interfaces on Intel; the only thing you can't use is the frequency control. That was the part that I did not go do.

Speaker 3:

Got it.

Speaker 2:

So you could, in theory, migrate between two Intel machines, but they would have to have the same frequency. We could change that to be more lax if that's something that would be useful, but it'd probably be better to just do that work.

Speaker 1:

Okay. But you've done the work where someone could go do that and kinda plug it in if they cared about —

Speaker 2:

Yeah.

Speaker 1:

And I guess this would be someone who's running — I guess it wouldn't have to be Propolis; it could be something else that's doing live migration, because bhyve is just the in-kernel piece. I mean, maybe it's worth distinguishing the difference between bhyve and Propolis.

Speaker 2:

Yeah. So bhyve is the in-kernel VMM portion that's upstream in illumos, and Propolis is the user-space component. There's a lot of emulation that happens at both layers: some devices are done in Propolis, and there's some stuff around device interrupts and common shared things you might need between devices that is in bhyve.

Speaker 2:

The interface added for migration is at the bhyve level. So if someone wanted to do a different user space, or use Propolis, it's all at the user-space boundary. So it's available.

Speaker 1:

And one of the problems we're trying to solve with Propolis — I don't know if folks have been into QEMU. Have you been in the QEMU source, Adam?

Speaker 3:

No. Never. Oh, god.

Speaker 1:

It's really — it's fair. I mean, C is, like, okay, fine — C has got some safety problems. But you go into QEMU

Speaker 3:

and you're

Speaker 1:

like — I mean, this is just — in part, it means that it's pretty easy to actually make QEMU croak by doing things that are out of bounds with respect to a device, for example. Which is, on the one hand, not something you would do on actual hardware, because it's gonna have ill effects on actual hardware — it's just kind of disincentivized — but on QEMU, we really wanted something that was much better. And bhyve's user space had a lot of the same problems. So that's part of the reason why, when we started the company — Propolis, by the way, is bee glue.

Speaker 2:

It's important to mention. Yeah. It

Speaker 1:

is important to mention. I think it's a great name. And we looked at Firecracker — Firecracker's fine, but it definitely doesn't have the same objective that we had around running big, full-featured VMs — running Windows, or Linux, or BSD — that can run for a long time and can live migrate. Firecracker was designed for much smaller kinds of things.

Speaker 2:

Earlier, you asked about the testing pattern where things kinda work sometimes, so it's hard to tell that it's fixed. And I was thinking about how a lot of the debugging I did for this was actually not around the TSC at all. It was all for other problems with migration, because that was still something that was leading-edge at the time. So a lot of what I was debugging were serial console issues, or issues with the way we migrated certain devices. And that was all very valuable, but it's kind of funny to me that most of the debugging I did was for other things. That's just life at the bleeding edge, I think.

Speaker 1:

Yeah, life at the bleeding edge. Describe that a little bit, because it is actually really hard to migrate a bunch of stuff. Like, migrating a serial port or a serial console is actually really hard.

Speaker 3:

Mhmm.

Speaker 1:

And why would you bother — why migrate the serial console? Why is that important?

Speaker 2:

I mean, as engineers, I think a lot of us love the serial console as an out-of-band mechanism. Right? There's a lot that goes into migrating the console. The console itself is backed by a couple of devices, like the emulated UART device. But then also we have all this control-plane work upstack around all of this that provides the actual feature to users to provision instances and use the console.

Speaker 2:

And all of that has its own state that also gets dealt with in migration. So it's actually a pretty complex stack in the end, but it is very nice how much effort, particularly from lif, has gone into making migration of the console work so well. And it made it a lot easier for me to test this, actually, because I didn't have to set up networking on the guest. I could just hop on the console — it was super simple — run whatever test I was gonna run, and migrate while those were running.

Speaker 2:

Yeah. I mean

Speaker 1:

I gotta say, I love our collective emphasis on the serial console. It really is incredible: for a machine that has no serial port — and there is no serial port on a Gimlet — we love the serial console. Serial consoles for guests are really valuable. Because the other thing is, you can then plumb that through into the actual web console.

Speaker 1:

And David Crespo and his team have done such a tight job on that stuff. I, for one, support our emphasis on the serial console, because it's really essential if you're actually looking after a VM and trying to understand, in particular: why does networking not work? If you don't have a serial console, that is a VM that's lost at sea. So it actually is really, really important that the serial console work all of the time and be really robust. And that means —

Speaker 2:

Development. Right? Like, for a while, networking wasn't super stable across migration, and it would have been hard for me to run those tests if I didn't have the console. It's really load-bearing. I was actually also thinking about how a lot of the ways that we as a team worked on migration are good patterns to emulate in terms of engineering over a long period of time.

Speaker 2:

For a while, we had this variable in the kernel, basically, that was like: allow writing state to these kinds of interfaces that are used for migration. And by default, that was off, because it was still, like I said, bleeding-edge work, and we were finding issues and didn't want to break upstream users who might somehow accidentally use that interface. That really enabled us to continue merging upstream and iterating quickly on all these different pieces of migration that were required. So, again, looking back on this project, a lot of what I observe is how much that prior work enabled development of this, and how we continue to do that as a team, which is pretty cool.

Speaker 1:

Yeah. Definitely. So, to elaborate on that a little bit: we had effectively a flag — and when you say upstream, you're talking about upstream illumos, you must be. Yeah.

Speaker 1:

And so we really want to live upstream as much as possible, and upstream as much of our work as possible. Propolis is obviously all open. We've seen the disadvantages of unintentional divergence when you don't really aggressively upstream. So that's been really important for us.

Speaker 2:

I mean, if I think about us all having to work on a project branch for as long as migration work has been ongoing, the diff would be unbelievable — so difficult to manage.

Speaker 1:

Oh, man. Yeah. Adam, you remember when DTrace, SMF, ZFS, and FireEngine were all targeting the same release of the operating system, and it got real rocky. We went in first. Right?

Speaker 1:

I'm not just making that up, either.

Speaker 3:

I think that's right. And not by accident, because we were kinda forced to merge first. And other folks had actually preemptively merged, because they wanted to use DTrace in their own development.

Speaker 1:

That was great, actually. I guess we were not really suffering from that problem, because we went first. But it is tough when you've got a big body of work that's living downstream. And, Jordan, finding iterative paths that you can use to get stuff upstream is really important, because you also don't wanna have this problem where the last 10% is actually 90% of the work. You really need to polish bits as you go and get them complete upstream. That's always a challenge for something that's a big build, a kind of multiyear project.

Speaker 2:

And it enables new testing too, anyway.

Speaker 1:

Wait. Wait. What do you mean?

Speaker 2:

Oh, just that, like, if it's upstream, we're collectively testing it more, because we're all using those interfaces. I don't

Speaker 1:

know. Yeah. And I also feel that the other advantage of upstreaming this stuff is it really forces us to explain everything that we've done to the world. And this work that you did had a lot of really great block comments in there (of course, we would have done that even if we weren't upstreaming it), very verbose comments that we obviously all love, explaining what the problem is and the solution.

Speaker 2:

Yeah. For sure.

Speaker 1:

It was great work. Well, Jordan, I think this is one of those problems that is deceptively complicated. You go in thinking, how complicated can it be? And then: oh my god.

Speaker 1:

This actually is pretty involved. I mean, doesn't this problem fall into that category for you?

Speaker 2:

I think so. I think the hardest thing for me is that a lot of it is not intuitive. I tried to write things down in a very, you know, authoritative way. Right? But coming into it, I couldn't find a lot of writing about this problem, even though surely other people have done it. Maybe it's proprietary.

Speaker 2:

And so, coming to it for the first time and working through everything from first principles, going off of existing documentation and the manuals or whatever, a lot of things just didn't feel intuitive to me. So writing it all down in block comments was very important. Like, before this call, I reviewed my own block comments, because it's been a little while and I'd probably forgotten all these details.

Speaker 1:

Oh, yeah. You and me both. I was reading my block comment in cyclic.c that I wrote back in 1999. And it's like, alright.

Speaker 1:

Yeah. Exactly. Right. I know. Stone.

Speaker 1:

Yeah. I know that you were definitely alive in 1999, but certainly my own children were not. So it's turning into a while ago. But it's very nice to have... actually, Jordan, I gotta ask you this, because you're someone who writes very well and fluidly: when you are looking at an old block comment of yours, do you remember where you were when you wrote it?

Speaker 2:

Like, physically?

Speaker 1:

Yeah. Am I the only one that does this? Am I being weird?

Speaker 2:

Sometimes.

Speaker 1:

Am I doing the weird thing right now?

Speaker 2:

It depends. Yeah.

Speaker 1:

Are you just saying that for my benefit? Am I being really weird?

Speaker 2:

I only work in basically two locations, so it's pretty easy.

Speaker 1:

I remember... and I don't know, maybe it's because that's the period when I wrote a lot of stuff, but I can remember writing so much code, and especially comments, on Caltrain, Adam.

Speaker 3:

Right. Yeah. I think actually from that kind of era, I have stronger memories: like being on a plane to Shanghai, or this part of DTrace I wrote when a house 2 doors down caught fire and I couldn't get back to sleep. But maybe less so now. It's become more homogeneous.

Speaker 1:

We're not gonna just drive past that one, are we? We're just gonna pretend that we

Speaker 3:

actually remember that fire. It was, like, 2 blocks from you too.

Speaker 1:

It was a huge fire. I remember that fire. My first thought on that fire was not, like, time to write some code. I was just, like... you wrote code during that fire?

Speaker 3:

No. No. No. Not during the fire. I think, like, after it was in hand, it was, like, 4 in the morning, and I was, like, okay.

Speaker 3:

Now what?

Speaker 1:

Okay. A different result for me. So just for context, you and I lived only blocks apart, and that fire was, like, right between us.

Speaker 3:

That's right. That's right.

Speaker 1:

And that is still, like, the biggest house fire I feel I've ever seen. I woke up to ash coming through our apartment.

Speaker 3:

Yeah. 2nd closest for me now. The closest was one where I came back home and I was like, okay, let's make sure we know where the dog's leashes are and everyone's shoes are, because it was only a few doors down from us. But,

Speaker 1:

that one was kitty-corner across the street and one house down, and it was... Yeah.

Speaker 3:

There you go.

Speaker 1:

That house was engulfed in flames. And the houses are all on top of one another in San Francisco. And, man, people can talk about the decay of San Francisco all they want; that city knows what to do when it's burning to the ground. There is something deep in the DNA of that city.

Speaker 1:

It's like: here is what we do. I still remember it really viscerally, because we had ash coming through the apartment. I could hear the fire crackling. I could feel the heat. And it was, you know, a 3 or 4 alarm fire, so they brought the cavalry to that thing.

Speaker 1:

And watching the firefighters... I had never been so close to something that felt so much like fear, you know? Watching the firefighters put on their gear to go into the building, I'm like, man, that is crazy. And I've always felt that, in whatever domain you're in, the courage to jump into the blaze is the real hallmark of the professional: jumping into the blaze and not being afraid of it. And for us in software, Jordan,

Speaker 1:

it's like not being afraid of jumping into the problem or starting the prototype or what have you. I feel there's less bodily harm involved, in general.

Speaker 3:

That's how you feel. I just

Speaker 1:

can't believe that after that fire, you're like, okay. Well, time to write some code. I'm like, time

Speaker 3:

to, like, begin to, like,

Speaker 1:

begin to fight the fire.

Speaker 3:

Jump into the blaze. I was like, these guys look at me. Just napping in my back.

Speaker 1:

Oh, well played. It's time to jump into dtrace_subr.c. I'm gonna put on my oxygen tank and go into the DTrace source.

Speaker 3:

Yeah. There you go.

Speaker 1:

Well, Jordan, I still don't know exactly how we got to house fires from this, metaphorical or otherwise, but this has been great, and it's really great work. It's exciting to see, and you've been doing a bunch of work on the Propolis side. So you've got more work in this department coming up, presumably.

Speaker 2:

Yeah. I've been jumping around the stack a little bit lately, but I've definitely got lots more on my mind about Propolis.

Speaker 1:

So, did I mispronounce it? Calvary? Cavalry? Oh, no. We're doing this again. I'm sorry.

Speaker 1:

I see it in the chat. You know, I didn't even think about it when I was pronouncing it. How do you pronounce it? Calvary? Adam?

Speaker 1:

Cavalry?

Speaker 3:

I just avoid it. I don't know. It's definitely one of those ones.

Speaker 2:

Put it wrong.

Speaker 3:

Our marine in the chat says cavalry.

Speaker 1:

Cavalry. Dan, did I pronounce it correctly or incorrectly when I initially pronounced it? That's actually all I need to know, just that high-order bit. Why am I even asking this question?

Speaker 1:

I don't wanna know. I do pronounce the... oh, yeah. Exactly. There you go. Dan, well played.

Speaker 1:

And there's the block comment, Adam, from, yeah, 1999. The go-go days of 1999. I remember where I was when I wrote that thing: I was at a cafe in Palo Alto.

Speaker 1:

Alright. Well, this has been a lot of fun. And, Jordan, again, great going on this work, and a lot of fun other work besides, in terms of all the testing you did with the serial console and the rest. It's been a good little whirlwind tour of what we've done for VM migration.

Speaker 2:

Oh, thank you.

Speaker 1:

You bet. And then, Adam, next week, I think we are gonna have our colleague Greg Colombo on to talk about TLA+ and formal methods and some of the work that he's done in that department.

Speaker 2:

So hypervisor special.

Speaker 1:

Another hypervisor special. It's hypervisor month here on Oxide and Friends.

Speaker 2:

I'm into it.

Speaker 1:

Exactly. This is like Shark Week for Oxide and Friends: we do hypervisor month. And... is Tom Lyon here? Tom, I hope that's you in the chat. I gotta tell you, Adam, we were listening to a lot of the back catalog in preparation for when we did our On the Metal episode on Oxide and Friends.

Speaker 3:

Trust her. Yeah.

Speaker 1:

I'm like, we gotta get Tom in here. So, Tom, I'm glad you're here. I definitely missed you, and I loved your line about the funeral for Opteron being a goldfish funeral. Still one of my all-time favorite lines. So on that note: hypervisor month continues next week.

Speaker 1:

Join us. Alright. See you next week, everybody. Take care.
