Helios

Speaker 1:

Because right? Yeah, your mech. I what's going on?

Speaker 2:

Is this a Gen X thing?

Speaker 1:

What's this? What is this?

Speaker 2:

Like, just to get that out of

Speaker 1:

the way?

Speaker 3:

Rolling Stones? No.

Speaker 2:

Oh, Michael Jagger.

Speaker 3:

I'm excuse me, Michael. My

Speaker 1:

that's a are we you actually making a Mick Jagger reference? That's so weird.

Speaker 2:

Mick, that just happened.

Speaker 1:

That's not even a Gen X reference. That's not even

Speaker 2:

Isn't that a boomer thing? Totally.

Speaker 1:

A boomer thing.

Speaker 3:

Look. I'm trying to appeal across generations here. I feel like we've been too Zoomer recently.

Speaker 2:

Okay. Okay. We need a demographic to skew older.

Speaker 1:

That's Look.

Speaker 3:

I'm looking at the stats.

Speaker 1:

Listen, I'm just just running the numbers, and

Speaker 2:

Too many years.

Speaker 1:

Today's been brought to you by Propecia. We can just have, like, pharmaceutical ads, I guess, is propitious.

Speaker 4:

Brian, didn't you read the comments today? This is not the podcast we put ads on. It's the other podcast we

Speaker 1:

put ads on.

Speaker 3:

I do.

Speaker 1:

It's true. I know. I know. It's the other podcast. But I sort of want to know if this is just a commitment to the bit, Adam, or if you were actually referring to me as Mick.

Speaker 1:

Is that is this where we are? A 100%.

Speaker 3:

Not only was I referring to you as Mick, I have referred to you as Mick as Mick. Oh my god. In this context

Speaker 2:

many times.

Speaker 1:

As they say, there is always a chat that includes everyone except for yourself, and in which you are talked about. And clearly, in this chat, I am known as Mick.

Speaker 3:

It's mostly a chat between me and nobody else.

Speaker 2:

So who's this with?

Speaker 1:

Well, I mean, there's some I'm Keith Richards.

Speaker 3:

Jeez. Do I have to draw a picture?

Speaker 1:

Alright. Alright. Alright. Come on. Why do why

Speaker 3:

do you think I've been wearing all that shit on my head?

Speaker 2:

Walking around with your arms in front of you a little bit. Yeah. I mean, if you're

Speaker 1:

Keith Richards, I do kinda want you to draw a picture. Can you get, like, Midjourney or DALL-E on you as Keith Richards?

Speaker 2:

No. You as Keith Richards drawing you as Keith Richards. I think

Speaker 1:

Oh, there we go. Very very very meta. Alright. Well, look. Yeah.

Speaker 1:

I guess Mick is here. I still feel extremely uncomfortable, but here we go. Welcome, everybody. Hey. Is my audio okay?

Speaker 1:

Is this sound It's fine.

Speaker 3:

It's fine. It's kinda good.

Speaker 4:

Professional podcast quality audio.

Speaker 2:

The the pressure

Speaker 1:

that goes heavy. I know

Speaker 2:

you did the you did the voice slight intonation that happens at the start of the episode with the hello thing. So I think it's all Hello. It's all we

Speaker 1:

Is that is that what we're talking about?

Speaker 2:

That's the yep. That's the one.

Speaker 1:

Is this retribution for me pointing out that you when you say good meeting, you are indicating that it's time for everyone to leave? Is that what this is about? A little bit. It's fine.

Speaker 2:

Good good good podcast. Good podcast. Alright. Well, that's it, folks.

Speaker 1:

Alright. Well, we are here. Actually so before we get into Helios, I am actually reading a, so, Adam, I I read High Noon, which Oh, nice.

Speaker 3:

You can

Speaker 1:

talk about for. Right? And then we we talked you talked about the books in the box episode, you gave or regifted to me for my birthday.

Speaker 3:

Re gifted. Yes.

Speaker 1:

What's perspective as long as you're comfortable with that. And then that book refers to another book about the early history of Sun called Sunburst. Apparently, you cannot write a book about Sun without making a sun pun. And I don't know if you saw this one. They only made, like, one reference to it in High Noon.

Speaker 1:

And this is a book written in 1990, and it's really quite good. It's interesting. I mean, it covers those early years, and it's fascinating. You know, it kind of decried the lack of an early history of Sun. And this one's got a lot of good stuff in it.

Speaker 1:

And in particular, there is a lot of good early Unix history, and the Unix wars, and why it was such a big deal that Sun was all in on Unix, obviously. And this is before the rise of Windows as a server alternative. So the Unix wars are strictly the proprietary Unixes all fighting among themselves, and then kind of fighting VMS. But it felt very timely, because we're talking about the operating system that comes with the computer, and it's a big deal. And it was a big deal then, and they talk about the AT&T deal a lot, I mean, the very ill-conceived AT&T deal, which is really where SunOS 4.x is thrown into a particle accelerator and collided with SVR4 to generate what becomes Solaris 2.x, which is pretty interesting.

Speaker 1:

Which they call System V Release 5.0, is what Bill Joy calls it in this book.

Speaker 3:

Nice. Well, my birthday is coming up in 6 months or so if you need to

Speaker 1:

regift it.

Speaker 5:

Thank you.

Speaker 1:

Yeah. That's that's, that's an excellent suggestion. I think I might I might actually have to regift it. I might have to to, regift it. Well and it's this is one of these, it's a library copy.

Speaker 1:

How do you feel when you're reading these library copies? Do you feel that you are an accessory to a crime, or do you feel you are engaged in, like, upcycling of our precious literature?

Speaker 3:

I think there's a little upcycling. And for these kinds of books, actually, I feel like there is a sad, like, there's a person in the back office of the library who has thought, you know, this High Noon book, or Sunburst, like, people are gonna love this thing.

Speaker 1:

People are gonna love this one's gonna go.

Speaker 2:

This one's

Speaker 1:

gonna go. This one's gonna go. Right.

Speaker 2:

Yeah.

Speaker 1:

We're gonna order 2 copies, like, and Yeah.

Speaker 3:

And you look at, like, the stamps in the back of the book, and it's never been checked out. Nobody has ever read this thing. And, you know, it's either me or the short leg of somebody's coffee table. So, yeah.

Speaker 1:

I was just gonna say it's, like, not even really fit for a monitor stand, unfortunately. Like, the the just the geometry is wrong for a monitor stand. These poor books, and they end up they end up here. Alright. Well, yeah, that's a good idea.

Speaker 1:

That is it's a good good re gifting idea from, from Jack from the library. So we opened up Helios over the weekend, something we'd been actually meaning to do for quite some time. Josh, were you surprised?

Speaker 2:

Years, I suppose, really.

Speaker 1:

Years and

Speaker 2:

where it was. Think about it.

Speaker 1:

Were you at all surprised when this became the top story on Hacker News today? And by the way, what did

Speaker 2:

you see? I really surprised. Found out about that afterwards, like, later.

Speaker 1:

Yeah. That's for the past.

Speaker 2:

There were already a 100 comments. Yeah. I I I said this earlier, Brian, but the thing I wanted to get

Speaker 4:

off my chest is I've been responding to people who've been asking on places like Hacker News about open sourcing it and being like, yeah, I also think it should be open sourced. I should, like, ping the appropriate people and see what we can do to get that done. And then I have not done any of that over the last couple months. And then today, it was just open sourced.

Speaker 2:

So I

Speaker 4:

was like, oh my god. I didn't even do anything. But also, like, everything's saying that, but I had nothing to do with this whatsoever. I did 0 work.

Speaker 3:

Don't worry.

Speaker 1:

That's not because it's good news for people.

Speaker 2:

People have other people other than you have remembered to ask me, and then I have also not done anything about it. So I'm like,

Speaker 1:

it's Well, and this is what it is.

Speaker 2:

Dragging on a bit.

Speaker 1:

Yeah. It's dragging on a bit. And and for good reason. Right? I mean, the just to be clear, like, there's nothing this was not like hand wringing over the top secret proprietary stuff, in Helios.

Speaker 1:

And maybe to kind of explain why it was just a little more work than you might expect to get this thing open, purely of the quotidian kind of work. Right? I mean, it was just a hammer swing to get this thing open, just because you gotta make sure that it can build when you're not in our organization and so on. Josh, do you wanna describe just what Helios is? And in particular, I think it's important to clarify the relationship between Helios and illumos, and between a distribution and an operating system.

Speaker 1:

I thought that's a I thought that's a lightning rod, but what is Helios, Josh?

Speaker 2:

Jesus. Alright. I'm not gonna touch the operating system question.

Speaker 1:

Okay. I think we will I mean, you can touch it for you? I will touch it.

Speaker 2:

I I'll go first.

Speaker 1:

Yeah. You can.

Speaker 2:

Then you can

Speaker 1:

On second thought.

Speaker 2:

Then you can bring in the the supervisor bit decision. Yeah. There you go. Yeah. Alright.

Speaker 2:

Yeah. So it's a distribution of illumos, like Ubuntu is a distribution of Linux, I guess, in a sense. And really, I mean, I think a distribution is just a whole bunch of software from a lot of different places, integrated in some way that they can be used together with orders of magnitude less work by the person who's consuming the distribution than by the people that put it together to begin with. So, like, you know, you'll have install media that might let you install it on a physical desktop or something.

Speaker 2:

You might have disk images that work in VMs in the cloud or something, and you didn't have to put those together because someone has made a distribution, like, a collection of source and binary software that all sort of works together already. I think that's probably really the most concrete part of it.

Speaker 1:

I think it's a very good definition.

Speaker 2:

Yeah. I mean, that's nothing new. People are doing that all the time. Like, the BSDs are a distribution. Yes.

Speaker 2:

You know, all the different Linux distributions. Windows is a distribution of, you know, itself, I suppose. But yeah. So we took illumos, which delivers a kernel and a C library and hundreds of other libraries, and a lot of cool Unix system utilities like grep. It's pretty similar to FreeBSD or NetBSD or OpenBSD in that regard, in that there's a lot of kernel and userland stuff, and the userland stuff is both programs and libraries, and they're all kind of meant to work together.

Speaker 1:

And I do think that this is a really important distinction, and this is what I meant. Maybe this is the voltage that I intended to touch. But I think if you're coming strictly from kind of a Linux perspective, a distribution is a massive engineering undertaking, because Linux itself is just a kernel, and you actually have a whole bunch of other decisions you need to make. There's a lot

Speaker 2:

of owl left to draw.

Speaker 1:

There's a lot of owl left

Speaker 2:

to draw it with that horse picture. Right? With the horse Totally. The horse is different at both ends or whatever. Yeah.

Speaker 1:

A lot of libowls that must be integrated. Well, because you gotta figure out, like, you know, which libc are you gonna use. And then you need to do risk management around that. So you need to be like, okay, so now we're gonna use glibc, which would be kind of common, but people do use musl.

Speaker 1:

It's like, okay. Now, do we get flip patches on that? How are we gonna do risk management? Gonna kinda test this thing as a unit. And it's like, you're not

Speaker 2:

How much harder do you want dynamic linking to be than it needs to be? Like, I think that's really the decision with the libcs over there a lot of the time.

Speaker 1:

Right. And importantly, because illumos, like the BSDs, does include system libraries, does include commands, does include a whole bunch of stuff that is designed and is kinda risk managed as a unit, this is less bonkers than one might think, to

Speaker 2:

You know, they only need, like, 10 or 20 additional packages to really make a relatively complete base system. And that's, you know, sometimes only because people expect things like bash, right, as a shell. So, like, bash is one of the things that you would need to get a build together. And then a handful of things like, you know, libxml2 and a TLS library and a couple of other bits and pieces that we don't deliver out of the core operating system.

Speaker 2:

We depend on external packages, you know, for a handful of things. And, like, if you need a JDK or something, or a C compiler or whatever, the runtime for the C compiler comes from the C compiler, that is, the GCC-specific parts of the runtime or whatever. Like, all layered on top.

Speaker 1:

Right. And so I was trying to remember kind of the history of this thing. I mean, and we can kinda get into it, and I know, Steve, a lot of folks on the Hacker News thread were definitely asking not questions about the mechanics of the distribution, but more asking broader questions about illumos and what we've done there. And we can kinda talk about that. But, Josh, do you wanna talk about just the mechanics of Helios, and kinda when we started with that and how you kind of iterated? Because it's a tough cold start problem to actually build something that's bootable here.

Speaker 2:

Yeah. I started with I started looking at some of the bits of OpenIndiana originally. I spent a month or 2 looking at that just because they had some aspects of their build system made a lot of sense to me, and I feel like bits of it are, like, similar to some of the stuff that Oracle is still pushing out open source wise, which is interesting. But in

Speaker 1:

the end. OpenIndiana being another distribution, Josh, of illumos. A Debian-like, this was the original. Yeah. I think it was, right? I mean, wasn't that the intent?

Speaker 1:

Right? The intent this was originally Certainly. By the late Ian Murdock, when he was at Sun.

Speaker 2:

It was definitely meant to, well, no. So, like, I mean, OpenSolaris, right, was like that. And then OpenIndiana is just, like, the gate closed, so people were trying to do something else. And it has many echoes, I think, of the original, but the name, I think, is an homage more than necessarily a direct continuance of all that stuff. Interesting.

Speaker 2:

Yeah. Interesting. But there are many things that are similar about it, certainly. And, but it was and is really, I think, like, the people that there aren't that many people that work on it, and they do as much as they can, but there's so much software in it. And there always has been yeah.

Speaker 2:

Like, they have build recipes for everything, you know, many more things than we would care to ship. And I think that they struggle a little bit to get around to bumping versions on things and security responses and stuff. And so I tried to make that work, and it didn't work out that well. So we ended up instead looking at OmniOS. This is another distribution of illumos, and they have an LTS release that gets supported for, I think, 3 or 4 years or something.

Speaker 2:

You can see on their website they have one of those Gantt diagrams that overlaps the LTS release schedule and stuff. But it is a much smaller body of software, which is kind of a thing that I think Theo, particularly at OmniTI, had set in motion, right, which was like, you know, we're just not going to have that many packages. And the OmniOS community has more packages now than they did then, but it's not, like, hundreds of times more. So it's still not that much software to look after, and they do an extremely good job of security response, like timely responses to CVEs and getting, you know, new versions of OpenSSH or whatever it is. So we rebased Helios in August of 2020, I guess at this point, 4 years ago-ish.

Speaker 1:

The the hot days of the pandemic, Aaron.

Speaker 2:

Yeah. It was a rough time. I basically took OmniOS, which would have been the LTS 38 release at the time, and started just hitting it with a hammer until it was the shape that we wanted it to be.

Speaker 1:

Until it stopped screaming.

Speaker 2:

Yes. Right. I mean, I did basically a complete build of many of the central chunks of just OmniOS, as the OmniOS people would build it.

Speaker 1:

Yeah.

Speaker 2:

And I took those binaries and packages and doctored them significantly to have different versions and dependency structures a little bit here and there, and then turn that into something that would boot. And then I built on that in situ.

Speaker 1:

Right. Because there's a real bootstrapping problem, in that you really need to have a build machine that's kinda prescribed. And so you kinda have to hand build a VM that can actually build this thing, and then you can actually,

Speaker 2:

I think it's easy to do that.

Speaker 1:

Yeah. The helios-engvm repo, if I recall correctly. Right? I mean, we made that open. Yeah.

Speaker 1:

Yeah.

Speaker 2:

I think so. That's probably true. Yeah. Maybe.

Speaker 1:

And then we got going with actually getting this thing where we could. And, actually, another thing I don't wanna lose on this is that, kind of in parallel around this time, you and Patrick in particular are doing a lot of work to get rustup support on illumos, which we knew was gonna be obviously important for us.

Speaker 2:

The crate safari.

Speaker 1:

The crate safari.

Speaker 2:

So we went around the countryside patching things.

Speaker 1:

I just have an image of you in a pith helmet on your crate safari. Is that, Ed? A little

Speaker 2:

I mean, sometimes

Speaker 1:

when you

Speaker 2:

go into someone's open source repository, you do feel a little bit like you're shooting at things with a weapon. Something that Kind of the opposite. Yeah.

Speaker 4:

It's kind of the opposite, though. Right? Because you're bringing things to life. You're not killing them. So you're you're sort of Yeah.

Speaker 2:

That's positive. That's a positive message. I like that.

Speaker 1:

That's right. This is gonna we will live forever in this zoo. I'm I'm I'm really apprehending it.

Speaker 2:

Right. Sometimes it feels like getting patches into crates, though, is a little bit more like reverse osmosis, and it's not, when that was the long pole,

Speaker 1:

I think, if I recall correctly, the long pole was making sure that the crates that

Speaker 2:

A lot of letter-writing campaigns. Yes.

Speaker 3:

A lot of letter

Speaker 2:

Please, just merge my patch. Thank you. Hope this patch finds you well.

Speaker 1:

That's what I'm trying to do. I know. And it could be it was a challenge. Right? Because we're obviously a smaller community, and people like, who the hell are you?

Speaker 1:

And Yeah. But we try to do all of the work for them. And then

Speaker 2:

Yes. As much as possible. Right. It's just all you really do need to do is push the little green button maybe, and then make a release is the other thing that then they have to, like, get both of those done. Lots of people were very, very friendly.

Speaker 2:

It's just that some people were extremely busy or or honestly, like, were the maintainer of a popular crate and then disappeared from the face of the earth one way or another. Like Right. Regardless of why or how, just uncontactable, like, for years since then. And it's been a challenge.

Speaker 3:

Were there some that were effectively abandoned, or that you couldn't hunt down?

Speaker 2:

Yeah. Like the time

Speaker 5:

fs2. No, the fs2 crate is still, that person just disappeared. So it was wander around the countryside and replace usage of that crate.

Speaker 1:

Oh man. And, you know, it kinda it's kind of interesting because I I I think and, Josh, just to your point, just so we don't get, like I mean, there were challenges, obviously, and that's what you focus on because everything else is pretty smooth. I mean, I felt like relative to some other, at least, language communities we've seen, that people were broadly pretty receptive. And

Speaker 2:

Oh, yeah. I don't think I've ever had to fight anyone to get anything in. Right? Like, I I don't think anyone's ever been like, what fucking operating system is that that you're talking about?

Speaker 1:

Get I

Speaker 2:

would, like, get out. Like

Speaker 1:

Right.

Speaker 2:

Certainly no one has created a second-class tier of support to put us in specifically, or anything like that. None of that has occurred.

Speaker 1:

And it's and I do think that, like, if you get a project, whether it's in Rust or something else, and you got someone who is coming in from a small system and has done the work to support said small system and is not looking for anything from you to support said small system, I do think it's kind of revealing of, of one's character about how you treat that system. And, yeah, we we it's not we've seen there has been heavy treatment in the universe, probably not from the Rust ecosystem, which has been great. So I thought that that was that was uplifting. Maybe I'm

Speaker 2:

Yeah. Well, I think we did the right thing too by getting the upstream tool chain work

Speaker 1:

done. Yeah. Yeah. Right.

Speaker 2:

Rather than just attempting to carry patches on it ourselves, we got the Rust project, who have for a long time now been building the binary compiler that we use, rustc and cargo and so on, as, I mean, I don't know, second-class, I guess, binaries or whatever, because they don't run them. They just build them in a Docker container. They cross compile or whatever. But then that's what we use. We use those official binaries. Which means that when people look at the platform metrics or whatever, they see that we're not, like, the second coming of BeOS or something.

Speaker 2:

Like

Speaker 1:

Right. I love that Q, first of all.

Speaker 4:

The platform matrix is actually, I plan on writing a blog post about this recently because I think it's interesting that the Rust project gives so many guarantees. You see tier 2 support and you're like, oh, that must not be that great, but it's actually like far better support than most other platforms.

Speaker 2:

Right.

Speaker 4:

Anyway, don't wanna derail that, but that's also an interesting No. No. No. I'm sure they're getting smaller targets.

Speaker 1:

Like that's

Speaker 2:

You got to see it too. Are you gonna keep climbing? And I'm like, no. It seems pretty good. It's well defined.

Speaker 2:

Like, the you know? Yeah.

Speaker 1:

Well, and I think that, and, Steve, we've talked about this before, I had never developed software for Windows before implementing in Rust, and, you know, things broadly worked. And it's a real testament to the abstractions that Rust has, which is not every system. Like, there are plenty of other systems where you might say in a comment which platform some software is for. Go, looking at you. Any, like, nonspecific complaint for the last 10 minutes has been about Go, by the

Speaker 2:

way, just to just for a reveal on that. I just don't know what you're talking about. I refuse to be drawn into any discussions about that.

Speaker 1:

Maybe only for me then. Fine. Some of

Speaker 2:

us have to support our friends in the Go community who have been Our our good actually have been very good to us in the last

Speaker 1:

few years.

Speaker 2:

So that's, very nothing but warm thank yous to them.

Speaker 1:

Please disregard my colleague and his, I am so sorry, Go community for my my editor. So getting rustup working was a big deal. Getting going on the crate safari. Because we knew, I mean, obviously, we're gonna use Rust all over the place inside of Oxide, but we also knew we were gonna use it just for, like, the mechanics of building the image. Right?

Speaker 2:

Yes. That was definitely a whole thing. Each distribution of illumos, like I suspect most distributions of Linux and so on, had their own sort of strung-together process for taking packages from the packaging system and laying them down, you know, in a disk image somehow and making it bootable in a particular configuration. And I wanted something that did those things without being chiefly composed of bash scripts and stuff. Right.

Speaker 2:

I wanted something a little more declarative.

Speaker 1:

Correct. But I

Speaker 2:

also didn't wanna use, like, Packer because I just didn't. And, also, I feel like the thrust of things like Packer is like, well, I'm gonna boot something in a machine and install it interactively a lot of the time, and then seal the thing up at the end. Whereas I wanted to build disk images that had never been booted.

Speaker 1:

And may never boot, depending on the quality of the software put in them. I mean

Speaker 2:

Certainly. Right. Some images may never quite get off the ground, but the ones that do, it will be the first time that they're executing. Like, they are pristine, essentially. I think that that's important, because I think with any process where you'd, like, take a base image, boot it up, do a bunch of stuff in it while it's running, and then take a snapshot, you then have to have a process for cleaning out the identity of the machine that it decided on, SSH key stuff, like all kinds of things, which kind of sucks.

Speaker 2:

So I wanted just definitely wanted like a hermetic, offline build thing. So I spent

Speaker 1:

a bunch of

Speaker 2:

time on that, and that mechanism has grown to support both, or more than both, I guess: the ISO install media that we have, which boots a small ramdisk to do the install, and it also produces preinstalled disk images for use in virtual machines. Because that's really,

Speaker 1:

like, a yeah.

Speaker 2:

And then also the RAM disk images that we use in the product. So
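The contrast drawn here, between booting a base image and snapshotting it versus assembling an image offline from a declarative description, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template format, the `mkdir` and `write` operation names, and the `build_image_tree` function are all hypothetical and are not taken from the Helios image tooling. The point it shows is that the file tree is produced purely from the step list, so nothing in it has ever run and there is no machine identity to scrub afterwards.

```python
import tempfile
from pathlib import Path

# Hypothetical declarative template: a list of steps rather than a script.
# The operation names and paths are invented for this sketch.
TEMPLATE = [
    {"op": "mkdir", "path": "etc"},
    {"op": "mkdir", "path": "var/log"},
    {"op": "write", "path": "etc/release", "text": "example image\n"},
]


def build_image_tree(template, root):
    """Apply each step to a staging directory; nothing is ever booted."""
    root = Path(root)
    for step in template:
        target = root / step["path"]
        if step["op"] == "mkdir":
            target.mkdir(parents=True, exist_ok=True)
        elif step["op"] == "write":
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(step["text"])
        else:
            raise ValueError(f"unknown op: {step['op']}")
    # Report what ended up in the staging tree, relative to its root.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


with tempfile.TemporaryDirectory() as staging:
    contents = build_image_tree(TEMPLATE, staging)
    print(contents)
```

A real builder would go on to pack the staging tree into a disk image, ISO, or ramdisk; that packing step is left out because the conversation does not describe how it is done.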

Speaker 1:

So do you wanna describe a little bit about the the kind of the model we have in the product for actually booting this? Because it is, it's different than other systems and Yeah. It put constraints on Helios, obviously, for sure.

Speaker 2:

Yes. So we ditched all of the UEFI firmware that you would normally expect to find in a server in 2024. So there's no there's no like additional Josh,

Speaker 1:

what are the vulnerabilities? Can I get to can I have the vulnerabilities at least?

Speaker 2:

I don't know if we have time to, but, so we ditched all of that stuff. Right. So the AMD CPU turns on, it goes into the PSP, the smaller CPU inside the big CPU that's responsible for turning on the big CPU, which does the DRAM training and then vectors the CPU towards our code that lives in NOR flash. Right? It's NOR flash.

Speaker 2:

It's the little chip, the 32 meg thing. Yep. And then we load a small image out of that, which would ordinarily contain the BIOS or the UEFI firmware in a server. And that less than 32 megabytes of binary stuff has to get the rest of the computer started. And so, you know, if you think about how a Linux distribution usually boots, I think these days, most of the time it's a kernel and an initial RAM disk image, like a little blob of kernel modules and configuration and stuff. The bootloader will load those 2 things into memory and jump into them.

Speaker 2:

We have a similar construct, ultimately. We have the copy of the kernel and, effectively, a cpio archive, basically a RAM disk of sorts with a very small subset of things required to get the whole system bootstrapped, all smooshed together into that 32 meg or less image with some compression and stuff. And then a very small bootloader, phbl, which is written in Rust, sits on the front and provides the very first instructions, and locates and jumps into the illumos kernel, basically the unix file. And then the program text in that binary locates the cpio archive in RAM, and it's able to get, like, the disk driver and the PCI subsystem drivers and the ZFS kernel module and a bunch of other stuff loaded.

Speaker 2:

And then we switch on, you know, PCIe and locate the NVMe device. It's an M.2 form factor device that sits inside the system and contains the RAM disk image. And we didn't want to go backwards and forwards on initialization a whole bunch. So we do have to put a fair amount of the operating system into the flash RAM thing. Yeah.
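The ordering described in this stretch of the conversation is essentially a dependency chain, and a small Python sketch can make the consequence explicit: anything that has to run before the NVMe device is readable cannot itself be stored on that device, so it has to fit in flash. The step names below paraphrase the discussion and are illustrative labels only, not identifiers from Helios, illumos, or the bootloader. The sketch uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must complete before it.
# Step names paraphrase the discussion; they are illustrative labels,
# not identifiers from Helios, illumos, or the bootloader.
BOOT_DEPENDENCIES = {
    "psp_dram_training": set(),
    "bootloader_from_flash": {"psp_dram_training"},
    "kernel_entry": {"bootloader_from_flash"},
    "load_modules_from_boot_archive": {"kernel_entry"},
    "pcie_init": {"load_modules_from_boot_archive"},
    "locate_nvme_device": {"pcie_init"},
    "read_ramdisk_image": {"locate_nvme_device"},
}


def boot_order(deps):
    """Return one valid ordering in which every step follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())


def prerequisites(deps, step):
    """Return every step that must already have run before `step` can start."""
    seen = set()
    pending = list(deps[step])
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(deps[current])
    return seen


if __name__ == "__main__":
    for step in boot_order(BOOT_DEPENDENCIES):
        print(step)
    # Everything listed here has to come from flash, because the NVMe
    # device cannot be read until these steps are done.
    print(sorted(prerequisites(BOOT_DEPENDENCIES, "read_ramdisk_image")))
```

Because the chain is linear, there is exactly one valid order, which is a compact way of stating the "no going backwards" constraint: there is no point in the sequence where a second loader could re-run an earlier step.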

Speaker 2:

So PCIe stuff particularly is, like, not reversible, I think. Right? So

Speaker 1:

Well and I think we and that was just a big kind of principle that we had is that we wanted and and other systems don't do it this way. And Right. For reasons that are that are not invalid. Yeah.

Speaker 2:

I mean, part of it's Conway's law. Right? Like, the system vendor provides firmware up to an interface contract point, right, which is like what UEFI is. And then you, the operating system vendor, provide a UEFI-compliant application that the firmware is willing to load into RAM and kick off. And there's a whole bunch of stuff then in that contract, both explicit and sometimes implied, that initialization has to have been done in a particular way beforehand.

Speaker 2:

And some of it is not always that great or that crisp or that reliably done. And so we do all of that stuff ourselves in one body of software. So it's rather than, like, a firmware bootloader thing made by group a and an operating system made by group b. We just have group a, and they made the operating system kernel do the things that the firmware would have to have otherwise done in the in the old model. That's right.

Speaker 2:

Yeah. One thing.

Speaker 1:

Yeah. And then and then importantly, like, we are not seeking to so in a traditional UEFI world, you've got this operating system that runs before anything does. This this platform layer that makes available UEFI as an abstraction. Right. This platform initialization layer.

Speaker 1:

And then it often boots a bootloader that does some things to go find what you wanna load, and then it pretends that the system is freshly reset and boots an actual operating system. And again, it's understandable, given Conway's law and other things, why you'd wanna have a bootloader in there, but we actually know what we wanna boot. We wanna boot Helios. So we actually don't want any of that stuff.

Speaker 1:

We wanna go straight from the PSP. We wanna boot the operating system and boot it all the way up and not have to go backwards at all. And so, the challenge of that: there's a question in the chat of, like, wait a minute, could you actually take absolutely minimal read-only support for ZFS and load the kernel? It's like,

Speaker 4:

yes, we can.

Speaker 1:

You could.

Speaker 2:

Right. You could certainly go and implement a read-only ZFS. Like, out in illumos itself on a PC, the bootloader has a second implementation of ZFS in it, a small read-only one that is able to locate the kernel and the boot archive out of a ZFS pool on the disk that the bootloader is on, using the firmware's disk drivers, you know, like BIOS or UEFI, to read blocks out of the disk. And it has just enough code in it to parse the disk structures without needing to write to the disk. It's all read-only, and that's how that works. But that requires that you can get to the disk, which is one of the real chicken-and-egg problems we had. Right?

Speaker 2:

In the system we're able to build, the disk is a PCI device, an NVMe device, and in order to get to the disk to get the large quantity of data that you would need for the pool, the RAM disk image, you would need to turn on the PCI stuff. You'd need to, you know, do any attestation you were hoping to do of firmware blobs that are in the PCI path. Like, there's all kinds of work that you need to do that often you can't undo and let go of and have the operating system then do again, because for some of these registers, you set them or whatever, and the only way to get them back to the power-on state is by power cycling the thing, which obviously is somewhat counterproductive when you're trying to get it to turn on. Which is different from the PC world, where the firmware makes promises to the bootloader, like: look, I'm going to do enough PCI that you can see all the PCI devices, and they're going to be turned on. They're going to be left in some state.

Speaker 2:

You know, bus mastering has probably been turned on, all kinds of things. Maybe I've configured the IOMMU, or maybe I've turned it off or something. Like, you know, good luck, here it is. You can assume some of those things have been done correctly. You can go and redo bits and pieces sometimes, but, like, if you are hoping to attest the firmware of all the PCI devices before you allow them to talk to main memory, that's not necessarily a thing that you can do.

Speaker 2:

Like, there'll be a gap, right, between when the firmware maybe turns things on and when your operating system gets in and is able to shut it off. Right. It's not great. So to close all those gaps, we decided to do initialization just once. And because some bits are not reversible, we have to do them just once, from the NOR flash, and then eventually we get to the disk.

Speaker 2:

But effectively, the files that go in the cpio archive in the NOR flash are just a cache, basically, or a copy of a small subset of files that are also in the pool, so that we can load them before we can get to the pool. That's a good way to think about it.

Speaker 1:

And I mean, because we are, like, space constrained there. Right? It's 32 megs, which is a lot for Hubris, but not as much for illumos. Right.

Speaker 2:

Although, honestly, like, we're only using about 8 meg.

Speaker 1:

I know. I noticed that. Andy's done a great job, I think, because Andy has been slimming that down.

Speaker 2:

It was already 8 meg. Like... Oh, it's alright.

Speaker 1:

I'm sorry.

Speaker 2:

It's alright. I did that bit. Absolutely. And, you know, I think Dan wrote the compressor or whatever, right? Like the pinprick thing.

Speaker 2:

But it's like a cpio archive that we then compress with some kind of deflate thing that Foible is willing to unpack. So it's, like, bigger before that, but it is compressed. It's only about 8 and a half megabytes, I think. And I don't expect it to grow a whole lot. Because, like, you know, consider OPTE, which is a relatively big kernel module,

Speaker 1:

Right. OPTE being the Oxide Packet Transformation

Speaker 2:

Engine that we use as our software networking thing.

Speaker 1:

Right. Software-defined networking,

Speaker 2:

right? So, like, imagine if we added another feature like that, but it wasn't in the boot path. It was a kernel module, but we didn't need it to get to the disk, so we wouldn't put it in the cpio archive. So the ROM wouldn't get bigger just because we had another kernel module.

Speaker 1:

Right. And so this is a very kinda custom thing that we've got, where we are kind of bifurcating the kernel modules: this kernel module needs to be in the NOR flash, and this other one is actually... so, I mean, in particular, the

Speaker 2:

It's a pretty specific allow list of modules that we determined were required before the root file system was mounted, in order to get it mounted.
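
The allow-list idea described here can be sketched roughly as follows. This is a minimal illustration, not the actual Helios build tooling: the module names, sizes, and function names are all hypothetical, and the only figure taken from the conversation is the 32 meg flash budget.

```python
FLASH_BUDGET_BYTES = 32 * 1024 * 1024  # the 32 meg image limit from the discussion

# Hypothetical allow-list; these module names are illustrative only.
BOOT_ALLOW_LIST = frozenset({"nvme", "pcieb", "zfs"})


def select_boot_modules(installed, allow_list=BOOT_ALLOW_LIST):
    """Split installed modules (name -> size in bytes) into two dicts:
    those that go into the boot archive and those that stay on disk only."""
    in_archive = {n: s for n, s in installed.items() if n in allow_list}
    disk_only = {n: s for n, s in installed.items() if n not in allow_list}
    return in_archive, disk_only


def fits_in_flash(in_archive, budget=FLASH_BUDGET_BYTES):
    """True if the (uncompressed) archive contents fit within the flash budget."""
    return sum(in_archive.values()) <= budget
```

The point of the design is that adding a new kernel module that is not needed to mount root leaves the archive selection unchanged, so the flash image does not grow.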

Speaker 1:

Right. And then we also don't actually execute user-level processes from there, if I recall correctly.

Speaker 2:

That's true. We don't create any processes until way after root is mounted. So, like, this is all in-kernel stuff.

Speaker 1:

Which, I mean, I thought that was a great discovery, that we could actually do all of this inside of those constraints. And

Speaker 2:

Yeah. Because that's part of why it's so small. Because we did have a swing at putting a small userland in there, and it was bigger, certainly, I think, than the 8 meg in the end. Because, obviously, it's all the kernel stuff that you need plus some additional user space programs and libraries. And a lot of our programs and libraries are sort of built with the expectation that they're all gonna be installed and the disk is not that tight.

Speaker 2:

So, like, also, one of my goals with our startup process is certainly not to change it at all, necessarily, if we could avoid it,

Speaker 1:

Right.

Speaker 2:

From the way that it works on a PC, like, there are things to chuck out that are definitely counterproductive, but there are so many parts of it that are just inconsequential. Like, it doesn't matter that they work the way that they do, even if it's not the way that people would have necessarily picked if they'd done it from scratch. It's already been done and it works, so if we can be different where it makes sense and then get to the point where we're not different, then I think, just from a maintenance perspective, it's better for everybody. And so on a PC, when you reboot, we recreate this cpio archive, right? It's a cache of things: we just grab all the kernel modules that you happen to have installed, on the assumption that some of them will be needed to boot, and that thing can be 40, 50, 60 megs, right, which is fine when it's on the disk. It just wouldn't fit in the NOR flash, so we had to pare it down.

Speaker 2:

But we produce at build time the archive that you would normally produce on reboot after a software update where you've added some kernel modules or updated something, and it is otherwise extremely similar.

Speaker 1:

Yeah. Interesting. And then, on the artifacts coming out, because you also wanna allow Helios to run on non-Oxide hardware. Right?

Speaker 2:

Yes. I thought that it was very important that we kept providing good ways to do that. Because as much as I like the hardware that we have built, people are gonna wanna do things on random POS desktops. So, like, you know, we wanna be able to run it on commodity servers that are colocated in the data center. You know, if we had established a point of presence somewhere far away and we don't want to put a whole Oxide rack there, for instance, we might just put 2 servers somewhere.

Speaker 2:

Right? And at least right now, we don't have a product that fits that bill, so we have to do that on commodity stuff. And I think that that would be true for quite some time.

Speaker 1:

I think it's always gonna be true. Right? At least because we also run Helios on the manufacturing stations.

Speaker 2:

We do. illumos on the desktop, 2024.

Speaker 1:

You betcha. You heard it here first. There you go. Give Hacker News something to really get stuck in their craw.

Speaker 2:

There you

Speaker 1:

go. I know we talked about this in our manufacturing episode, Adam, and I think you used the image of that, right, with Josh... Yep.

Speaker 3:

Yeah. Josh's station. Yeah.

Speaker 1:

But so that's running on, I mean, ironically, the kind of commodity machines we're trying to replace

Speaker 2:

the shittiest Dell.

Speaker 1:

The shittiest Dell. That's a lot

Speaker 2:

from 2,000 one of our core values is thriftiness?

Speaker 3:

Shitty hardware. Oh, yeah. Thriftiness.

Speaker 2:

Right. Right. Which rhymes with shitty hardware, I guess. Right. So, also, because we were trying to get this stood up in the pandemic.

Speaker 2:

Right? So we were in the middle of, I mean, we were having Oxide hardware supply challenges. Right? But, like, separate from those supply chain challenges, I couldn't buy, like, desktops. Well, there was... In

Speaker 1:

the in

Speaker 2:

the form factor and price range that I wanted to buy. So I found that there was this place in Arizona that would do off-lease, like, Dell OptiPlex, I don't know, 7020, 9020, something like that. Like, from about 2014 to '15, about that era, for, like, $100 shipped.

Speaker 1:

And I believe you And

Speaker 2:

they had 100 of them.

Speaker 1:

That some have up-rev firmware and some have down-rev firmware. And I believe you discovered that the hard way, if I recall correctly.

Speaker 2:

Yeah. There were some differences in the way the USB ports were enumerated, which is a bit sad. But they're all the same rev

Speaker 1:

now. Right. Right. Yeah. I mean, there was some down

Speaker 2:

rev for a while and then back to up rev, because we fixed the driver, and yeah.

Speaker 1:

I mean, because you're getting Helios to run on these commodity servers, commodity machines, and they're also, like, machines that you bought from some aggregator in Arizona for $100, so they're like

Speaker 3:

Not stolen. I wanna emphasize that.

Speaker 2:

Yeah. From an air conditioned rubbish bin, basically. I mean Right.

Speaker 1:

Not stolen because, like, it wouldn't be worth anyone's time to steal them. No. So, like, you would literally steal anything else around them.

Speaker 2:

Honestly, like, if you were a business and you had 1,000 of these things on a pallet, it's actually, at some point, expensive to dispose of that many of them.

Speaker 3:

I see.

Speaker 2:

Right. No.

Speaker 5:

It's like we'd be stealing

Speaker 2:

them would be a benefit from a tax perspective. Like, yeah.

Speaker 3:

Josh, do I remember correctly that we were a little bit constrained on the 4 by 3 monitor that you had selected? Or am I making that up?

Speaker 2:

constrained in what sense?

Speaker 3:

Like we couldn't, we couldn't obtain them.

Speaker 2:

Oh, no. We could. That was fine.

Speaker 3:

Okay.

Speaker 2:

It was like, no one makes them anymore, I guess. I think the constraint you're thinking about is that I ran out of them because we used them all. Because I bought off-lease Dell, like, UltraSharp. I think they're like 1907FP, 1908FP panels, also from the same era. They're all, like, contemporary with one another. To create

Speaker 1:

the future of hardware, we must consume the past. We must consume your e-waste and actually... Honestly,

Speaker 2:

like, the first monitor that I bought with my own money was one of those. So it's like... So

Speaker 3:

you bought it for the nostalgia factor, to

Speaker 2:

to do something? No. I'm, like, sitting in front of this thing. It reminds me of, you know, being not as old as I am now.

Speaker 1:

Wait. Is this like Citizen Kane for TTYs? What is this? This is like your

Speaker 2:

I don't get that reference, but, you know, yeah.

Speaker 1:

So, yeah. But in particular, though, you were having to deal with so much BIOS and BMC and UEFI pain.

Speaker 2:

When you say bias, you mean

Speaker 1:

BIOS? Yes.

Speaker 2:

Bias. He's

Speaker 3:

he's that responding to his unconscious bias.

Speaker 2:

No. Good. Bias. Very good.

Speaker 1:

I am prepared to buy us pain. But there were just so many times where you're just, like, in this excruciating pain. I'm like, you know, Josh, you know what we need to do? We need to start a computer company. It's the Yeah.

Speaker 1:

Yes. I know. I'm trying to get

Speaker 2:

I'm looking at it.

Speaker 1:

I'm working at it right now.

Speaker 2:

Yeah. Like, we have

Speaker 1:

to, like, try to keep

Speaker 2:

the whole committed to it. Future. Yes. Yeah. Exactly.

Speaker 1:

But it is extremely handy that the same distribution mechanism can create a package, effectively, that can be dropped onto a Gimlet, our compute sled. Then we can also put that on a server. We can put that on our manufacturing stations. I mean, it's... Another thing

Speaker 2:

that I was keen to get done too is that I wanna run the same binaries on all of those things as well. Yes, I'm not particularly interested in rebuilding the software for every different shape. So we actually have a central packaging repository that contains the current set of OS packages, you know, and all the adjunct stuff that we layer on top, like shells and tmux and git and all the other crap that you need to make a usable Unix computer. And then our image process is, you know, a list of packages from that core repository, generally speaking, plus some Oxide-specific applications like our control plane for the production image. But for, you know, workstations and manufacturing stations, there are different sets of packages overlaid, and then a finalization step where we maybe remove a bunch of things that, though they are in the package, are not necessary and make the image too big or something like that. So there's an additive packaging step and then a... the other word.

Speaker 2:

Subtractive. Subtractive, that's the one. Subtractive, like a trimming step where we remove things, and then a final customization thing where we pre-bake some specific config files into the image that are different from the ones that you find in the base OS repository. But the same, you know, the same kernel and libc, the same copy of grep or whatever sits on the production system and, you know, in virtual machines in AWS and on people's workstation desktops that they use to develop the software and, you know, whatever. Because then, you know, you can take a core file or whatever from any of those things and you're definitely reproducing the same issue in different contexts.

Speaker 2:

I think it's better to have fewer different specific binaries than more of them floating around, generally.
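
The three image-building steps described here (additive, subtractive, customization) can be sketched as below. This is a toy model under stated assumptions: the repository is just a dictionary of package name to files, and `build_image` and every package and path name are hypothetical, not the real Helios image tooling.

```python
def build_image(repo, package_names, trim_paths, overrides):
    """Assemble an image as a dict of path -> contents.

    repo maps package name -> {path: contents}; all names are illustrative.
    """
    image = {}
    # Additive step: lay down every file from each selected package.
    for name in package_names:
        image.update(repo[name])
    # Subtractive step: trim files that are packaged but not wanted here.
    for path in trim_paths:
        image.pop(path, None)
    # Customization step: pre-bake image-specific config files.
    image.update(overrides)
    return image
```

Because every image draws from the same repository, a file that survives trimming is byte-for-byte the same in the production image and the workstation image; only the package list, trim list, and overrides differ.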

Speaker 1:

Totally. Well, and especially as, you know, one is developing software that is kind of interacting with the surround. It's important that the thing that you're actually testing on is the thing that is actually gonna be deployed. Right.

Speaker 2:

And our Solaris heritage comes from a time when the operating system was proprietary. So they too wanted everybody to be using the same binaries for things. But then, in order to make them flexible, it's very modular and configurable in many cases, whereas sometimes with software that didn't start out proprietary, I feel like it's, well, you want it to be different? You should, you know, enable this #ifdef or something and build it again, or build it against a different set of libraries or whatever. Whereas we have a lot of dynamic linking and modules and plugins and configurable stuff.

Speaker 2:

So it's been pretty easy to make that happen in reality. Been good.

Speaker 1:

There's a question in the chat about packaging: what does the package repository run? What is the actual substrate you use there? How do you actually deliver it with

Speaker 2:

The packaging system is called IPS, the Image Packaging System, which was an artifact of the OpenSolaris era at Sun, and I believe they also use it to deliver Oracle Solaris 11 today. But it is a pretty late entrant to the packaging world, because if you think about the other stuff, like RPM and yum, and dpkg and apt, they had all been around for quite a long time when IPS rolled out in, you know, 2005-ish, 2006-ish, somewhere around there, probably.

Speaker 1:

Yeah. Somewhere around there.

Speaker 2:

And it would yeah.

Speaker 1:

And I think it's this kind of funny consequence, because at that time, with OpenSolaris and Solaris before that, right, it did include everything. There's so much software that was already included. Yes.

Speaker 1:

You could actually have a usable system without kind of solving the packaging problem. And one of the kind of interesting consequences of Linux only defining a kernel is that actually everyone really needed to solve the packaging problem, and you got some very good kinda competition among these different packaging ecosystems. And indisputably, packaging was very far ahead on Linux-based systems compared with any other kind of system circa the early 2000s. So IPS is coming... I mean, the Sun packaging before that is actually SVR4 packaging. SVR4 packaging.

Speaker 2:

No. It's, yeah, like, IPS is totally just divorced from the SVR4 stuff that came before it. Right. Yeah.

Speaker 1:

Yeah. Yeah. Right. The the but in terms of, like packages.

Speaker 3:

Yeah. The Solaris packages, like before IPS were total trash, such trash. I remember in particular the, the bug, I guess, that we fixed where

Speaker 1:

you

Speaker 3:

couldn't rm -rf / was a consequence of, like, if you ran the packaging script and didn't specify certain things, it would start chewing away on all of your data, trying to delete everything. Like, these were

Speaker 2:

not... Yeah. So the thing that you're talking about, right, is a part of the problem with the classic approach to packaging that IPS did away with, to mixed reactions, I guess. I am hugely in favor of it, but I understand that it represents a change to the way people have to think about packaging. But the SVR4 packaging, and I assume dpkg and RPM and stuff, allow you to lay down a bunch of files, but then, like, maybe you need to do something after that. Like... Right.

Speaker 2:

You know, run some program to add a user or a group or something, or reconfigure something, or, you know, change something other than just a flat file that you happen to be delivering, in order to make the thing usable. And so you would have these post-install scripts ultimately delivered in the package, but they had to be able to run in several different contexts. One of which was on the live system; another one of which is not on the live system, in, you know, assembling an alternate root or whatever, like we would do during an image creation, right? Like, I actually want to create a system over here in a directory. Please don't touch the running system. But you've got root in the post-install script. Good luck.

Speaker 2:

Like, preventing them from doing the wrong thing, and definitely removing files, altering shit in, you know, slash instead of slash a or wherever the image happened to be mounted, was very common with post-install scripts. So IPS is like, no, I will not allow them in the house. Instead, we will provide a number of more, like, declarative actions. So, like, oh, you need a user account with this name and this user ID? We'll do that for you.

Speaker 2:

Just tell us what it is. Right. And groups and profiles and role-based access control, and driver definitions was another classic one. Right? If you, you know, had a package with a driver and you needed to go and run add_drv and a bunch of other stuff, like, no.

Speaker 2:

We will do that for you. Give us the metadata. And, you know, there are, like, I don't know, 5 or 10 or 15 things like that. Basically, they did a survey, I think, of all of the common post-install scripts they could find and made declarative versions of all the things that they were doing that were reasonable, and then banned the idea of post-install scripts altogether. If you need to have behavior that occurs, you need to deliver a service that runs on the system.

Speaker 2:

And then that service is just like any other service. Right? It runs in exactly one context, which is once the system is booted, at the appropriate point in startup, with the appropriate set of privileges and stuff. So I feel like that's the core value prop of IPS really, that and its strong interaction with boot archives and stuff. Pretty good.

Speaker 2:

Like, not boot archives: boot environments, like the snapshot stuff. If you need to update the kernel or something, then we will make a clone, effectively, of the current dataset the system is booted from, and alter the clone, and then you reboot into it, rather than doctor the files in place and hope you don't get interrupted halfway through and the system is unbootable or whatever. And that's been pretty good. I feel like updates are a lot... You can just tell someone to update and not expect that their computer will break, necessarily.
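
The contrast between post-install scripts and declarative actions can be sketched like this. It is a toy model, not the IPS manifest format or its implementation: `apply_actions`, the dictionary layout, and the passwd line it writes are all assumptions made for illustration. The idea it shows is that the packaging tool, not package-supplied code, performs each change, and always relative to an explicit image root.

```python
import os
import tempfile


def apply_actions(image_root, actions):
    """Apply declarative actions strictly inside image_root.

    Each action is plain data; there is no hook for arbitrary scripts.
    Note: this sketch does not defend against '..' in paths.
    """
    for action in actions:
        kind = action["type"]
        if kind == "file":
            dest = os.path.join(image_root, action["path"].lstrip("/"))
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with open(dest, "w") as f:
                f.write(action["contents"])
        elif kind == "user":
            # The tool edits the image's own passwd file, never the live /etc.
            passwd = os.path.join(image_root, "etc", "passwd")
            os.makedirs(os.path.dirname(passwd), exist_ok=True)
            with open(passwd, "a") as f:
                f.write(f"{action['name']}:x:{action['uid']}:{action['gid']}::/:/bin/false\n")
        else:
            # Anything that is not a known declarative action is refused.
            raise ValueError(f"unsupported action type: {kind}")
```

A usage example might build into a scratch directory with `tempfile.TemporaryDirectory()` and pass a list such as `[{"type": "user", "name": "svc", "uid": 1001, "gid": 1001}]`; the running system's files are never touched because every path is joined onto the image root.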

Speaker 1:

It's been pretty good. It has been pretty good. I gotta say, I mean, maybe this is a little bit ridiculous, but that work got started, Adam, when you and I headed off to Fishworks. Yeah. So we were at Sun, but we were kind of building an appliance, doing our own thing.

Speaker 1:

And then it was at Joyent where I was using pkgsrc, and there are a lot of advantages of pkgsrc too, but I had not really used IPS before Helios, Josh. And, you know, we had a big flag day where build systems all had to be upgraded. And that's the kind of thing where you're like, okay. Oh, god. You know, more or less, I'll be lucky if this only nukes my day.

Speaker 1:

This is gonna be

Speaker 2:

I guess we're spending March doing that then. Right.

Speaker 1:

March is update-the-build-server month. It was super smooth. It was great. I feel like

Speaker 2:

we got that done in a couple of days. And, really, like, the hardest problem was remembering to do the handful of things that people don't normally update or use at all.

Speaker 1:

Yeah. I mean, it was very tight. When I had to update that stuff, it was just great. I mean, it was really a great user experience. And I think

Speaker 2:

critically also, like, we don't use it on the product system. Yes. We use this on workstations and manufacturing line systems, things that are treated as classic immutable, sorry, classic mutable, install-to-disk Unix systems. And then for the product, we use IPS as a build-time step just to assemble the RAM disk, and then the RAM disk is like a sealed entity. Every time you boot it, it's the same, which has other good properties for the actual Oxide sled as a sort of appliance.

Speaker 1:

Yeah. And just in general, because I know there's definitely a point of confusion in the Hacker News comments today: Helios itself is not user visible in the Oxide rack. It is an implementation detail of the rack.

Speaker 2:

And it's really not meant for users, to be honest, even though we have open sourced it, because we want people to be able to replicate what we've done. You know, we want to give people the open source promise of, like, you buy this bunch of hardware and you get upset at Oxide in the future? You can still have the software. It's out there, go support it yourself. I mean, I feel like that's the core promise of open source, ultimately, from a business perspective. But, oh, what was I saying?

Speaker 1:

We I we but we yeah. We're not in fact, we we've opened it. But, yeah, I mean, we've Alright. But it's not it's not

Speaker 2:

I wouldn't yeah. I wouldn't think of it as being we're not super interested in making it a good UNIX distribution for people to install on computers and use for web servers and stuff like

Speaker 1:

industrial manufacturing station. If you're a manufacturing station, get in touch.

Speaker 2:

Yes. If you

Speaker 1:

You do support manufacturing stations and build servers. We do.

Speaker 2:

They have

Speaker 3:

a strong I mean, if you have a 4 by 3 monitor, I can't emphasize this enough. Yeah.

Speaker 2:

If you actually will work on I know. I know. I know. I know. You know the giant fucking clock thing that sits in the corner is also the Is that That's right.

Speaker 1:

I did not know that.

Speaker 2:

That's illumos on the wall.

Speaker 3:

The most expensive bedside clock at

Speaker 2:

Yeah. It's, it's one of those someone at some point had one of those Dell monitors. That's like

Speaker 3:

8 feet wide.

Speaker 2:

Five units wide instead of 3 units wide or whatever. And like it sat on the desk for a long time and then someone mounted it on the wall and it was never clear to me what it was on the wall for. But in the end, it's like, actually, it's a really good place to put a clock. We don't have a clock. I can never tell what time it is in the office.

Speaker 2:

Yes.

Speaker 1:

Yeah. We're gonna have to, there's

Speaker 2:

a little RAM-disk-booted... it's a Dell Wyse 3040 thin client terminal that the SD card doesn't work in, but I boot it from a RAM disk on a USB stick or something, and then it just runs a Rust program that poops the time all over the frame buffer.

Speaker 1:

That is great. We have our Helios- and Rust-powered clock in the office. Yeah. The world's weirdest clock. And, like, only thrifty for us because the monitor was otherwise not gonna be used.

Speaker 1:

But It

Speaker 2:

was already on the wall, so it's like It's

Speaker 1:

already in the wall. Right.

Speaker 2:

It'll take you more time to get it down than to to put a clock

Speaker 1:

on it.

Speaker 2:

So that's yeah. The

Speaker 1:

clock is is very useful. I can say. It's a very it's a very large digital clock, which has been I noticed I did not If

Speaker 2:

you look at Helios and you'd like to use it for something in production, it is, in many respects, extremely similar to OmniOS LTS, the latest LTS release. So, like, OmniOS LTS is probably what I would recommend. Certainly we use OmniOS LTS to deliver like web facing things. You know, because their package set is slightly larger and sometimes it's good to not have to look after the distribution they use for critical things.

Speaker 1:

Right. Yes. So Yeah.

Speaker 2:

If you're busy looking after a distribution that people are using for critical things, like the... Right.

Speaker 1:

Right. And so, I mean, leveraging IPS, leveraging OmniOS for sure, yeah, leveraging illumos and illumos upstream. Just on the illumos point, we can go into more depth if people are curious. But one thing I did wanna call out is, in addition to us really focusing on making sure Rust worked there,

Speaker 1:

I think the other thing that we learned the hard way from the SmartOS days, which is another downstream distro, effectively, of illumos: we tried to upstream things, but we weren't necessarily fanatical about it. And boy, if you're not fanatical about it, it really accumulates quickly.

Speaker 2:

Yeah. And it's like it's like a snowball. Right? Like It is. If you're not melting it, it just keeps getting bigger and like, it accretes.

Speaker 1:

It accretes. And the second you get, like, one significant thing, and we had a couple of these, Patrick, I think a couple of these kinda come to mind. We had a couple of things where we had decided, like, we're not gonna upstream this, for a variety of reasons that were not, like, necessarily good.

Speaker 1:

It was more like this is just not ready to be upstreamed or

Speaker 2:

this is just not sufficiently general, or... It's

Speaker 1:

not sufficiently general. It's really kinda specific to us. And if these things are, like, creating a lot of diff traffic, it just becomes really, really easy for the stuff to accrete. And so one thing that we've really tried to be strict about is getting stuff upstream. A lot of stuff goes upstream, like, just flat out first.

Speaker 1:

Yeah. And we we are, so we I know

Speaker 2:

Patrick, you guys push almost all of the bhyve delta straight up, right? I mean, I feel

Speaker 5:

like Yeah, all all of that goes straight in. There's it never touches St. Louis first.

Speaker 2:

Right? So we actually pull it back into our fork when we merge everything that's in upstream each, you know, day or week or however often we do that.

Speaker 1:

Well, and I also think, which is also important, that in part because we are going upstream first, we stay really, really current. And, the existing

Speaker 2:

much easier.

Speaker 1:

Oh my god. Well, but, Adam, you and I lived this at Fishworks, where at the time we were making an appliance. I mean, not wholly dissimilar to what we're doing at Oxide, making an appliance based on Solaris, but not doing it, you know, as well in many different dimensions. And there was much more churn in the operating system. And, Adam, do you remember syncing up?

Speaker 1:

I mean, it was brutal. It was absolutely brutal. And we got broken all the time, and we would get broken really, really badly by upstream in ways that we can particular,

Speaker 2:

did you merge?

Speaker 1:

The what this was

Speaker 3:

like frequently as possible? I don't know.

Speaker 1:

Well, no. This was like this terrible balancing act because, you know, of course. Right? It's like it's like very painful, so I don't want like, how often do I wanna go to the dentist? Like, I don't know.

Speaker 1:

Not every day. You know? And so but then you so of course. Great. It's like you are deferring that pain.

Speaker 1:

And then when you did go to sync up, I remember we had another piece of software that was very much developed at cross purposes to ours, and it just deleted all of our software. In particular, more concretely, it deleted all of our SMF manifests. So it was just like, you have no services. And you're like... But

Speaker 2:

Why did it delete services?

Speaker 1:

Because it didn't recognize them. It's like, I don't know who this is. I'm deleting it. And we're like, but hi. It's like we're a user of the system.

Speaker 1:

And I just feel like there were so many of those where it would just take... I mean, the fact that I can still remember build 135. Build 135 was, you know... okay. I know I'm not in a safe space. I just remember the day. It is something that, like, really chips away at your soul because it feels like there's just this giant leak at the bottom of the bucket. You know what I mean? Maybe this is just me.

Speaker 1:

It feels very despairing to to be broken by upstream. Because you're like

Speaker 2:

Like, part of it's like you're being broken by things that you don't really have particularly intimate knowledge of. And it's like, I just this was a thing that worked yesterday. Hey. That's exactly what what was the value

Speaker 3:

you're making in your mind? Exactly.

Speaker 2:

And it's like it's awesome. The value was like, well, they were trying to, you know, make it better or do something, you know, but but, like but that's not how it feels on the receiving end.

Speaker 1:

It feels.

Speaker 3:

Like yeah.

Speaker 1:

Well, and I also feel that when you have a large number of people developing a body of software, you know, people think that they're adding value when, bluntly, they're not always. Or the worst thing is, like, no, no, this is phase one of the project, where we just break everything. The gloriousness happens in phase 5. Which, of course, you never get to phase 5 because you've already foundered before then. So it's like, well, this is great.

Speaker 1:

We just have, like but that was why we got for a reason.

Speaker 2:

That's a balancing act that we've faced with illumos all the time. Right? People are like, couldn't you just make it easier for me to, like, thrust my change into the repository? Like, I just wanna put it in there so that everybody can have it and they can see how it is. It's like, right.

Speaker 2:

But if they see how it is and it's not good, that's gonna make it really sad. So, like but, like, we wanna encourage people to have, you know, room to explore. So it's, like, it's a hard problem because it is the operating system, and sometimes it's the file system. Right? It's, like, the most crucial thing at the bottom of the stack that just, like, can't be broken really or everything's, like, screwed up.

Speaker 2:

And so it's hard sometimes you're like you gotta tell people, you know, like I want to take the thing that you've made. But it's not quite ready yet. We need to test it more or something, or you need a little more review, or we need to, you know, make it slightly more robust or slightly more backwards compatible or something so that people don't notice when it goes in. Because that's ultimately what you want with operating system evolution, I think. You want it to do everything it did yesterday and to do marvelous new things in the future, but you don't want anyone to notice the transition between those.

Speaker 1:

Well, and you also really want people to be able to run the latest all the time. And you want people to be fearless about running the latest because they know that, like, with latest, my life only gets better. And, you know, this has very deep roots for us, because this goes back to some very dark days for Sun, when this was not necessarily the case. And it's what our former colleague, Jeff Bonwick, coined as the quality death spiral. Right.

Speaker 1:

When quality begins to slip, as a result people are like, I'm not gonna run the latest because it burned me last time. It's like, well, now there are many fewer people running latest. And that long tail of really tough problems is gonna get less exposure. It's gonna get exposure later. And now the quality actually begins to degrade.

Speaker 1:

And this idea of, like, FCS quality all the time (FCS, you may also hear it as first customer ship): FCS quality all the time is the way we avoid the quality death spiral. So very, very important to us.

Speaker 2:

I think the version of that, like, the problem that we have in the open source world, is having a number of downstreams. Right? Like, we have illumos up at the top, and then we have our, you know, our Helios downstream, and then there are a number of storage vendors that don't talk that much about what they do, but they have downstream forks of the operating system, and they're often really not current at all. And, you know, like, they experience this pain all the time. Right?

Speaker 2:

It's like, you've gotta merge upstream changes into your downstream every day. Really, like, as soon as they happen, ideally, or close to that. Right? Like, you know, weekly, whatever it is, but certainly not every quarter. That's too infrequent, because for starters you're not going to notice when someone does something that doesn't work for you. If you want to notice as soon as possible after someone does something that they thought was good and everybody thought was good and doesn't work for you, you need to be able to sing out, like, ASAP, because for starters we have to back it out.

Speaker 2:

We only want to revert things that just went in. So, like, if it goes in and you don't notice it's completely broken for you for 6 or 7 months, it's really hard to justify reverting it at that point. So, well, then we have to fix it, but now you've got to debug it because you're the only person for whom it's not working.

Speaker 2:

And, you know, also your appliance is proprietary or whatever. So my advice to everybody, anyone doing anything like this, is: merge upstream changes as soon as possible, as they arrive. And then that also makes it easier for you to take the changes that you have made to the operating system downstream and send them upstream, because the delta between the thing you're working on and the thing up there is small. And, like, we have a whole tree of hardware support in stlouis, the uts/oxide stuff, the Oxide machine-specific bits of the kernel, which we intend to upstream.

Speaker 2:

Like, we're not upstreaming it now because it's still in flux, but, like, it's in the public repository. Like we don't have a secret repository where we work on it and like we occasionally expose it to public. We just push to GitHub, basically, you know, everybody can see things as they're being worked on and we are developing it in such a way that when it goes upstream, we don't have to fight with anybody because it's like in its own directory tree alongside

Speaker 1:

directory. Right.

Speaker 2:

We're not gonna have to, like, convince people to stop doing ACPI altogether in order to ourselves not use ACPI, for instance, things like that. So it's all aimed it's all aimed at being one code base that everybody can use and that we don't have to maintain patches and delta and stuff because that really sucks. Do we

Speaker 3:

Yep. Did we already explain what stlouis is?

Speaker 1:

Yeah. I was gonna say, Adam, thank you. Probably not. I think we we we we spoke past a little bit. Yeah.

Speaker 1:

Do you wanna explain stlouis, Josh?

Speaker 2:

I you know, I should say to this question. It's an arch. It's a gabbler. Is what it says in the That's

Speaker 1:

a new arch.

Speaker 2:

In the README. Which stands for... which is really "arch."

Speaker 1:

Yes. Right. It's a architecture. It's a it's a new architecture. So Right.

Speaker 1:

Historically, there you have the the ISA, the instruction set architecture, and then you've got the machine architecture. And these are and for And

Speaker 2:

then sometimes you have platform specific bits as well, but

Speaker 1:

That's right.

Speaker 2:

PCs don't tend to have any of those differences.

Speaker 1:

And for x86, there was effectively one machine architecture: i86pc.

Speaker 2:

Right.

Speaker 1:

And damn, I mean, it it says it right really

Speaker 2:

1 yeah.

Speaker 1:

It is a PC. I mean, the personal computer architecture is effectively what that was enshrining. And there's a bunch of stuff that we just don't need at Oxide, and the way we did that is by having a new oxide architecture in this stlouis branch of illumos. So... Right.

Speaker 2:

Keith, one of our engineers, took a copy of the i86pc stuff, I believe, and put it alongside and called it oxide and then started deleting things. Yes. Yeah.

Speaker 1:

And just started ripping things out. And so we've got, in that way, we you can have multiple architectures kind of side by side. So, we can

Speaker 2:

Lots of common code though, like PCI is PCI pretty much wherever you come from, you know, things like that.

Speaker 1:

We wanna leverage as much common code as we can. So that's been very important. I would also say that part of what I actually love about this, Josh, is that it has really forced us, not that we weren't doing this before, but it's very important that we're able to contribute to upstream illumos. So there's been a lot of effort on making that easy. And the docs on this, because I think Brian Horsman, in the chat, dropped the links earlier, and I would drop them again. But the docs for contributing to illumos are really good.

Speaker 1:

And, Some

Speaker 2:

of that stuff was, like, between the tail end of the Yeah. Samsung trauma.

Speaker 1:

Right. And

Speaker 2:

that's... Like, there were 3 or 4 months between then and when I started here at Oxide, and I did a fair amount of, like, HTML work on the front page things, trying to make it look like we'd done anything on it

Speaker 1:

in the

Speaker 2:

last decade. And there are links on illumos.org. The front page has links to, you know, lots of different pieces of documentation that we have; there's lots of chunks sort of here and there. And Robert spent a lot of time on the developer's guide stuff. And I did some of that with Brian, Horstman Allen, and a few other people, I think.

Speaker 2:

We did the illumos.org/docs stuff, where we talk about, like, the project structure and, you know, who's responsible for things and how to contribute and the Gerrit guide and a bunch of other stuff. So, you know, if any people want to contribute and they find that something is missing or was hard to find, we're always happy to hear feedback to try and make it better. It's certainly our intention to make it as discoverable as possible.

Speaker 1:

And there's a lot of good stuff there. And, yeah, to your point, that's an area where we're always looking for feedback if you do find something was difficult. But definitely there are pointers to the repo there, and the docs repo and so on. And I think it's actually been really useful just to kind of force us to do that because I think there are plenty of people who've had their first illumos commit while at Oxide. So it's a good way of, you know, making sure that this is a process that people can engage with easily, and that it's not too arduous and so on.

Speaker 1:

So that's been, really terrific work.

Speaker 2:

That's definitely our goal: to get as many people as possible sort of comfortable enough with it that when they eventually find something that's broken, they can fix it on their own or with, you know, some help, but they don't need to file a bug and have someone else fix it, basically. Which is, I think, something that we did at Joyent, and, you know, the sort of general thrust of it is that it's a shared responsibility, all of the software, and maybe you don't know that much about all of the bits that you don't work in, but certainly you should feel empowered to become, you know, as experienced as you would like in something that's maybe outside of your regular swim lane and that no one else has time to fix or whatever. Like, it's really all good for everybody, I think.

Speaker 1:

So I actually need to open up a PR to more prominently link what I feel is, like, the hidden jewel of the illumos.org site, Josh. I just dropped it in the chat, but the books link is extremely valuable.

Speaker 2:

Yes. The books are good.

Speaker 1:

Books are good. So these are full books on various aspects of the system. Adam, it's got, obviously, the Dynamic Tracing Guide that we did for DTrace, but also, if you're new to the modular debugger, to MDB, there's actually a really good book on it, with a lot of good chapters on debugging memory corruption and so on. And then on writing device drivers, there's a lot of good stuff in there. So if you're new to illumos or you are illumos-curious, that's a good source to check out.

Speaker 2:

The source for those books is, like, all available. We got those from Sun and then have transformed and restyled and updated them over the years, because they're under a particular open source license. So

Speaker 1:

pretty great.

Speaker 5:

That link is important because those books don't come up in search results, which is extremely frustrating. If you're looking for a DTrace thing that we have added, for example, you find the Solaris docs, where they may not have added that thing.

Speaker 1:

So Yeah.

Speaker 2:

If you type illumos into any... this is some kind of Google thing that's occurring where, like, it's decided that illumos and Solaris are probably synonyms, I guess, or something. And that, like, there are, you know, that the

Speaker 1:

PageRank, right?

Speaker 2:

No. I I I cast my sabot into the machine, Brian.

Speaker 1:

No. No. No. Go ask ChatGPT illumos-based questions. It's

Speaker 2:

me, but I'm not gonna log into it anymore.

Speaker 1:

You you're it's gonna give you such complicated feelings because it is it's like good. It's amazing.

Speaker 2:

I tried to get it to write a particular limerick that I was looking for.

Speaker 1:

Try GPT-4, but also

Speaker 2:

It was unwilling.

Speaker 1:

And then I would also say try perplexity.ai for your illumos-related searches. That's what I gotta say.

Speaker 2:

Okay. Also, though, if you do use Google, if you type illumos at the front of the query, before the thing that you would otherwise wish to be searching for, it tends to

Speaker 1:

And a little bit better.

Speaker 2:

Better results. So, like, there's always the site:illumos.org thing, definitely better results, because it will constrain it to
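
The two search tricks mentioned here can be sketched in a few lines of Python. This is only an illustration: the helper name `illumos_query` is made up for this note, and the URL form is the ordinary public Google search URL, not anything illumos-specific.

```python
from typing import Optional
from urllib.parse import urlencode


def illumos_query(terms: str, site: Optional[str] = None) -> str:
    """Build a Google search URL using one of the two tricks from the conversation.

    With `site` set, the query is restricted to that site (site:illumos.org).
    Otherwise the word "illumos" is put at the front of the query.
    """
    query = f"site:{site} {terms}" if site else f"illumos {terms}"
    return "https://www.google.com/search?" + urlencode({"q": query})


# Prefix form, then the stricter site-restricted form.
print(illumos_query("dtrace aggregations"))
print(illumos_query("dtrace aggregations", site="illumos.org"))
```

The site-restricted form is the one that should surface the books, since it rules out the Solaris documentation entirely.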

Speaker 1:

Okay. So if you place. If you go to Perplex I know. It sound like I'm, like, an investor or just maybe a shill. A free shill.

Speaker 2:

When did that happen?

Speaker 1:

When did I become a free shill for perplexity.ai? Well, you know, it was

Speaker 2:

Literally a word that I've never heard you say until

Speaker 3:

this 5

Speaker 1:

Okay. That must have gone. Gosh. Yeah.

Speaker 2:

I missed that one. Look.

Speaker 1:

Like, I'm a cheap date. Like, you buy me a ham sandwich. You got me for life. And, you know, it was good.

Speaker 2:

And an illumos ham sandwich.

Speaker 1:

And an illumos ham sandwich. So if you go to perplexity.ai and search for illumos DTrace, the first hit that you get, I mean, it's pretty wild. Like, it gives you a good answer. And these things are sourced, and the first source, Patrick, is the actual book.

Speaker 2:

That's good.

Speaker 1:

It's it's good. And you and this is the kind of thing where it's like and the great thing about that is, like, because it clearly it recognizes that as authoritative. And if you recognize that as authoritative, that's great at some level because there's a lot of good stuff in there. So, I think I've had my I've taken you up on your on your challenge that I surely can't make every single episode. It's a tie in.

Speaker 1:

Yeah. Exactly. Watch

Speaker 2:

me. Haste hastening as you are. Yeah.

Speaker 1:

Yeah. Exactly. I know. I'm so sorry, everyone. But and so, Josh, in terms of the experiences with this, I mean, I think it's been just as a because I feel like I'm, like, broadly a user of this.

Speaker 1:

Yeah. I actually needed to, I I needed to make an extension, where we are having to change some tunables on our system.

Speaker 2:

Right.

Speaker 1:

That we we know we we got some tunables. Oh my god. This is where it's like some to I a tunable that was introduced, a bug that was arguably introduced in 1991,

Speaker 2:

So Are you are you sitting comfortably sort of bug? Yeah.

Speaker 3:

Right. You gotta talk about some of the some of the hilarious assumptions about, like, what a large amount of memory was.

Speaker 2:

Just Oh

Speaker 1:

my god. Yeah. That was just, like, wild. The the and this is I mean, I think it's a great, you know, strength of the system that it has. The deep roots, I generally think, are a strength, but boy, it can be a parody of itself.

Speaker 1:

And, in particular, we were trying to understand why we were seeing much worse performance, much worse I/O performance, in the rack than we were on the bench, and Matt Keeter and Alan and Josh, you did just a lot of debugging, a lot of DTrace, to figure out what's going on. Ultimately figuring out, like, wait a minute: the ARC has been slammed to its minimum size on this. The ARC is down to, like, it wants to be a gigabyte on a system that has a terabyte of memory. Like, what the hell happened?

Speaker 1:

And what happened is that, you know, we've seen this a couple of times in the history of the system, but it feels like a good idea to make tunables scale with physical memory.

Speaker 2:

Threshold values, critically, like

Speaker 1:

threshold values,

Speaker 2:

making a decision based on whether some observable property like the amount of free memory that's left is more or less than a threshold. And if it's less, we're gonna panic and do something like potentially detrimental to try and save the system. If it's more, we're gonna do nothing.

Speaker 1:

We're gonna do nothing. Right? So you get these kind of these threshold values. And the the the the problem is that you you almost want, like, anytime someone's gonna index anything off of a physical attribute, the amount of memory, the number of cores, you know, what have you, the number of PCIe lanes, you you kinda wanna, like, appear, like, an apparition from the future at their side and say, what is the most ridiculous number for this right now? And they would say, like, god, I don't know.

Speaker 1:

Let's see. The year is 1991. What is the most ridiculous amount of DRAM I can have in the system? Like, I like, 64 megabytes? It's like, okay.

Speaker 1:

And if you're gonna take a fraction of physical and you're gonna indicate that as a threshold under which you're gonna get extremely concerned, you should put a maximum on that threshold at whatever you feel that bonkers amount of physical memory is. And they're like Right.
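
The advice here, capping a threshold that is derived from physical memory, can be sketched as follows. This is a minimal illustration, not illumos code: the function name, the one-eighth default fraction (the figure that comes up later in this conversation), and the 64 MiB "bonkers" cap are all assumptions chosen to match the story being told.

```python
from typing import Optional

MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

# Hypothetical cap: the most ridiculous memory size someone might name in 1991.
BONKERS_PHYSMEM = 64 * MIB


def min_free_threshold(physmem: int,
                       fraction: float = 1 / 8,
                       cap_physmem: Optional[int] = None) -> int:
    """Free-memory threshold (bytes) below which the system gets concerned.

    Without a cap, the threshold grows linearly with physical memory forever.
    With a cap, memory beyond `cap_physmem` no longer raises the threshold.
    """
    basis = physmem if cap_physmem is None else min(physmem, cap_physmem)
    return int(basis * fraction)


uncapped = min_free_threshold(1 * TIB)
capped = min_free_threshold(1 * TIB, cap_physmem=BONKERS_PHYSMEM)
print(f"uncapped: {uncapped // GIB} GiB, capped: {capped // MIB} MiB")
```

On a small machine the two versions agree; they only diverge once physical memory exceeds whatever the author considered absurd at the time.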

Speaker 2:

Whatever your prediction is, anchor like Whatever prediction. Pin it

Speaker 1:

to that. Who are you, and what is that garb that you're wearing? And you say, I come from the year 2024, when we have a terabyte of memory on all these compute sleds. And... Oh,

Speaker 3:

and in the meantime, folks in chat are asking for a terabyte and a half. So it's, like, not even a half.

Speaker 1:

Yeah. But that's not even a word.

Speaker 2:

Yeah. Exactly. These computers had 14 megabytes. Right? Right.

Speaker 2:

So

Speaker 1:

And as a result, because then we, like, appropriately, and Patrick did some great work on the reservoir, we want to reserve that memory for, like, a guest memory. We don't want that for the the operating system kernel. And so we take 80% of that memory. It was, like, nope. Can't touch it.

Speaker 1:

We're gonna, like, back

Speaker 2:

to where 800 gigabytes, Like,

Speaker 1:

800 gigabytes.

Speaker 2:

Like, it's gone. Yeah.

Speaker 1:

And the and in particular, this this ancient ancient tunable, which had said, like, listen. I think that, like, an if you've consumed 7 eighths of your physical memory, if you only have an eighth of memory left

Speaker 2:

That's not much.

Speaker 1:

That's not much. And it's like, ah, yeah, that's like 125 gigabytes, actually. It actually is a lot, as it turns out. And so the behavior of a system like this ends up being really odd, because... and this is where you get to, what's the This Old House equivalent for a software system, Adam? I feel like this is what we're... I think it's like Bob Vila, and we're kinda like, alright.

Speaker 1:

We're gonna go into this VM system, like, oh, the whole wall comes down. Like, oh my god. Okay. Yeah. There's, like, bugs in here.

Speaker 1:

Yeah.

Speaker 3:

This is all asbestos. Yeah.

Speaker 2:

It's full of bees.

Speaker 1:

Right. It's all full of bees and dead bees. And in particular, most things in the system don't react to this, but one thing that does react to it is the ARC, the adaptive replacement cache, and it's kinda looking around being like, look, I don't actually need this memory. It's a cache. So I wanna be super sensitive to these various elements of the system.

Speaker 1:

I wanna know how much free memory we have, and it looks at this threshold, swapfs_minfree. And if the amount of free memory dips below swapfs_minfree, it's like, oh my god. Okay. Wait a minute. I can... I

Speaker 2:

will make shots more?

Speaker 1:

I I was shooting hostages. Like, don't worry. I can shoot some hostages. And, and so

Speaker 2:

Meanwhile, many gigabytes of physical free memory just sitting off in the corner just sitting there.
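
The arithmetic behind this is worth writing down. The sketch below uses the round numbers from the conversation (a terabyte of physical memory, 80% of it reserved for guests, a threshold of one eighth of physical memory); the variable names and the `cache_should_shrink` helper are illustrative assumptions, not the kernel's actual accounting.

```python
GB = 10 ** 9

physmem = 1000 * GB                # "a terabyte of memory" per compute sled
reservoir = physmem * 80 // 100    # 80% set aside for guest memory
threshold = physmem // 8           # one eighth of *physical* memory

visible = physmem - reservoir      # what is left for everything else
headroom = visible - threshold     # usable before the threshold trips


def cache_should_shrink(free_bytes: int, min_free: int) -> bool:
    """Hypothetical check: the cache backs off once free memory dips below min_free."""
    return free_bytes < min_free


print(f"reservoir: {reservoir // GB} GB")
print(f"left over: {visible // GB} GB")
print(f"threshold: {threshold // GB} GB")
print(f"headroom:  {headroom // GB} GB")
print("shrink with 120 GB free?", cache_should_shrink(120 * GB, threshold))
```

Because the threshold is computed from all of physical memory while the reservoir has removed most of that memory from consideration, well over half of what remains sits inside the "panic" zone: the cache starts giving memory back while more than a hundred gigabytes are still free.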

Speaker 3:

Yeah, I do love this. I love this code where there's a there's basically an if statement that's like, if the system has more than 16 megabytes of memory, is that you okay, then please continue. Just how long has it been since, you know, that other path has been executed?

Speaker 2:

We've had so many things like this, though. I would observe 2 things. 1 is that like, it shouldn't have been a straight line at a minimum. Right? Like, it's like Right.

Speaker 2:

Because basically all of this thought was done in the 25 minutes that you happened to live in what appeared to be the linear region of the graph, right, that you're trying to express as the scaling function. And it should have tapered off like a smooth log thing or something. Right? Like, pretty soon after that, but we never did that part. And 2, it's not even measuring the right thing in a lot of cases.
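
One way to picture the "smooth log thing" is a scaling function that is linear up to some knee and logarithmic past it. The sketch below is only an illustration of that shape: the 64 MiB knee, the one-eighth fraction, and both function names are assumptions, and nothing like this is claimed to exist in illumos.

```python
import math

MIB = 1024 ** 2
TIB = 1024 ** 4


def linear_threshold(physmem: float, fraction: float = 1 / 8) -> float:
    """The straight line: threshold is a fixed fraction of physical memory."""
    return physmem * fraction


def tapered_threshold(physmem: float,
                      fraction: float = 1 / 8,
                      knee: float = 64 * MIB) -> float:
    """Linear up to `knee`, then logarithmic growth beyond it.

    The two pieces meet at the knee with the same value and the same slope,
    so the curve has no jump or kink there.
    """
    if physmem <= knee:
        return physmem * fraction
    return knee * fraction * (1 + math.log(physmem / knee))


for size in (16 * MIB, 64 * MIB, 1024 * MIB, TIB):
    print(f"{size / MIB:>10.0f} MiB physmem: "
          f"linear {linear_threshold(size) / MIB:>9.1f} MiB, "
          f"tapered {tapered_threshold(size) / MIB:>6.1f} MiB")
```

In the region the original authors could actually observe, the two functions are identical. They only part ways at sizes nobody in that era had a machine to test.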

Speaker 2:

Like yeah. It's like, yes. You like, because it's like a complex result, I think, but, like, particularly with the page out stuff, the the amount of physical memory is not relevant in many cases. It's actually the rate of page out that is possible.

Speaker 3:

Like But other than being wrong in concept and implementation, are there other problems?

Speaker 2:

No. Other than that, Mrs. Lincoln, it's fine. But, I mean, it's like you're worried about the system being overwhelmed by things being paged out, on one hand. Right? Because that absorbs system resources.

Speaker 2:

But but on the other hand, if that if we don't do that page out, aren't you going to run out of memory? Just do it. Like and on the other hand, the thresholds, it's like, it depends on the, like, the the the derivative, right, of the value that we're otherwise looking at. And it feels like it is at best, like, a proxy measurement that we that was cheap, so we just did that. And it was not that complicated to think about.

Speaker 2:

So we instead of, like, looking at, like the the real problem is, like, what's the what's scan? Right?

Speaker 1:

How quickly can you evict

Speaker 2:

pages to make room for the memory

Speaker 1:

you're gonna need later is really the

Speaker 2:

thing that sits at the, the core of a lot of these thresholds, I feel like. And it's just like the numbers, like, are nothing to do with that, which is unfortunate.
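
The point being made is that the interesting quantity is a rate, not a size. A rough sketch of a rate-based check follows; it is a toy model under stated assumptions (constant demand and eviction rates over a short horizon, a made-up function name and parameters), not a description of how the illumos pageout or ARC code works.

```python
GB = 10 ** 9


def needs_reclaim(free_bytes: float,
                  demand_rate: float,
                  evict_rate: float,
                  horizon_s: float = 1.0) -> bool:
    """Decide whether to start reclaiming memory now.

    demand_rate: bytes/second the system is expected to allocate.
    evict_rate:  bytes/second that can be freed by evicting pages.
    Reclaim is needed when free memory plus what eviction can deliver
    within the horizon would not cover the expected demand.
    """
    projected_demand = demand_rate * horizon_s
    reclaimable = evict_rate * horizon_s
    return free_bytes + reclaimable < projected_demand


# Plenty free and eviction outpaces demand: nothing to do.
print(needs_reclaim(75 * GB, demand_rate=1 * GB, evict_rate=2 * GB))
# Little free and demand outpaces eviction: act now.
print(needs_reclaim(1 * GB, demand_rate=4 * GB, evict_rate=1 * GB))
```

Note that total physical memory does not appear anywhere in this check, which is the contrast with a threshold defined as a fraction of physical memory.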

Speaker 1:

No. And it's like this also just like dates back to an, you know, an earlier time. This, you know, we talk about the kind of hexagonizing the system as kind of fetid immaculate and grimy, and these are all the grimy bits of the system. Like, there is the stuff that, like, works well enough until it doesn't. And you think that that on this particular issue, and there we've linked it to it in the chat, but the, on on this particular issue, it's like the system is, like, totally fine until you kind of dip below this this somewhat magical number.

Speaker 1:

Right. And then it just loses its mind.

Speaker 2:

Right. And because it's a bang bang control thing. Right?

Speaker 1:

It's a bang bang control thing.

Speaker 2:

It's like under the thing, we're gonna only shoot hostages. There will be no memory handed out to anybody. Go home. The bank is closed. Like Yeah.
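
"Bang-bang" here means a controller with only two outputs. The sketch below contrasts that with a graduated response, using round numbers in the spirit of the conversation (a cache minimum of about a gigabyte, a 125 GB threshold). Both functions and their names are illustrative assumptions rather than the ARC's real sizing logic.

```python
GB = 10 ** 9


def bang_bang_target(current: int, minimum: int, free: int, threshold: int) -> int:
    """Two-state control: below the threshold, drop straight to the minimum."""
    return minimum if free < threshold else current


def proportional_target(current: int, minimum: int, free: int, threshold: int) -> int:
    """Graduated control: give back only the shortfall, never going below the minimum."""
    if free >= threshold:
        return current
    shortfall = threshold - free
    return max(minimum, current - shortfall)


cache, floor, limit = 100 * GB, 1 * GB, 125 * GB
free_now = 120 * GB  # just 5 GB under the threshold

print("bang-bang:   ", bang_bang_target(cache, floor, free_now, limit) // GB, "GB")
print("proportional:", proportional_target(cache, floor, free_now, limit) // GB, "GB")
```

Being 5 GB short costs the two-state controller 99 GB of cache, which is the "loses its mind" behavior described above; the graduated version gives up only what is missing.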

Speaker 2:

It was not good.

Speaker 1:

It it was not good. And then we we

Speaker 2:

Also, like, in that era, it's not just, like, 16 megabytes. Right? It's also the 20 megahertz uni processor system or something as well. Right? So it's like not just not that much memory, also, like, incredibly slow.

Speaker 2:

So it's not like you can even do a more complicated scheme necessarily than a few integers calculations here and there. So it's like it is a struggle. I understand.

Speaker 1:

Yeah. And and it was definitely interesting to go back to, like, where is the origin of this? Because there's a comment talking about, you know, the ability to to boot on 16 meg systems. You're like, okay. This is a long, long time ago.

Speaker 1:

It does indeed date back to 1991. And I'm sure we've got some listeners here for whom that is older than they are. I think we certainly have some colleagues at Oxide for whom this is older than they are. So this has been around for a long time.

Speaker 2:

And we have the bug. Right? One of the original bugs from the old system with the tiny RAM, like, the text of the Sun bug in the archive, which is definitely like... What's the thing that we'd linked to? The open source history bug DB thing with the

Speaker 1:

Oh, yeah. Yes, exactly.

Speaker 2:

Ancient ancient evidence of the systems. Yeah. Yeah.

Speaker 1:

No. It

Speaker 2:

is definitely like that and the arc history is like definitely motivates a lot of like things like getting the Helios repository open and trying to do as much of this stuff in the open as possible with the Illumos stuff and things like that. Like, because you think about 25 years from now. Yes. There's going to be so much more evidence for what was going on for, like, historians and No. People Engineers alike.

Speaker 1:

Podcast. People are gonna be like, you know, don't worry, we talked about it in the podcast. You can figure out the broken thing we did. But partly, I got on to this thing in particular, swapfs_minfree, because we need not set that variable ourselves.

Speaker 1:

We could just tune it properly upstream. There is another value, the size of the dbuf cache, that we actually did wanna tune differently for us than upstream. So we need to go, you know, have our own little /etc/system, and there's a way of doing that with directories and so on. So, Josh, this is the first time... I've been using Helios a lot as a user of Helios and building a lot of images.

Speaker 1:

And I do I love by the way, I it is it's it's chatty in a delightful way. I I have to I wanna tell you that the, which video? It's just when you're building the image. It's like it does not it doesn't disappear in silence. It's it's definitely, like, telling you what it's doing, and it's like, you

Speaker 2:

know It keeps you in the loop. Yeah. It keeps you

Speaker 1:

in the loop. It's very nice.

Speaker 2:

It tells you which file step 74 came from as well. It's It's it's very important. Yeah.

Speaker 1:

Yeah. It it it's it's very, accent. It's very very delightfully chatty. And so but this was kind of my first time extending it and using your declarative mechanism and

Speaker 2:

okay. Was it like to

Speaker 1:

say, like, you know, a file that I'm gonna have here in Helios and I'm gonna deliver it over in another location. It was just all it

Speaker 2:

was great. It was really In this location with these permissions and yeah. That's yeah.
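
The idea of a declarative step ("this file, delivered to that location, with these permissions") can be sketched in a few lines. To be clear, this is not the Helios image builder or its template format: the `EnsureFile` type, the `apply_steps` function, and the directory layout are hypothetical stand-ins that only show the shape of the approach, including the chatty per-step output.

```python
import os
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable


@dataclass(frozen=True)
class EnsureFile:
    """Hypothetical declarative step: deliver one file into the image."""
    src: str    # path relative to the template directory
    dest: str   # absolute path inside the image
    mode: int   # permission bits, e.g. 0o644


def apply_steps(steps: Iterable[EnsureFile], template_dir: Path, image_root: Path) -> None:
    """Copy each declared file into the image root and report every step."""
    for number, step in enumerate(steps, start=1):
        target = image_root / step.dest.lstrip("/")
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(template_dir / step.src, target)
        os.chmod(target, step.mode)
        print(f"step {number}: {step.src} -> {step.dest} (mode {step.mode:o})")
```

The caller only lists what should exist; the loop decides how to make it so, and it says what it is doing at every step, which is the "keeps you in the loop" quality described above.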

Speaker 1:

That was it it was really, really nice. It was fun to fun to get into, and, I mean, I think it it's you just done a a great job. I think the developer experience has been really, because it had been really I was gonna note, it's been a ton of of thankless work. I know you've done a ton of work, Andy's done a ton of work, and bunch of other folks obviously as well. But, Yeah.

Speaker 1:

I think that I mean, from your perspective, I mean, are you aside from the the the surprise of it being the top story on Hacker News today, I mean, it it it feels pretty vindicating of the approach. I mean, are there things that kinda stand out to you as as being particularly vindicating or or things that we were wrong about that we had to kinda change gears on?

Speaker 2:

So I think we did some important derisking stuff really early. Critically, we got CockroachDB and ClickHouse ported to the OS. They had only previously run on, like, Linux and maybe the Mac and maybe FreeBSD in the Cockroach case. I think ClickHouse is probably more cross-platform, but I did early ports of those just so that we didn't have to, like... well, we're gonna, you know, evaluate doing everything that we did, but we'll have to run these bodies of software in, like, you know, a VM or something on some other operating system. Like, we're able to just run them all natively.

Speaker 2:

I think that was super important. And all the work we did on getting Rust to work really well. And and and, like, I mean, we talk about Rust a lot, but we also we also have done a lot of work to get Go working well and other bits and pieces and tool chains and,

Speaker 1:

you know Yeah. And see, Dave Pacheco's odyssey of a bug episode. Good lord. Adam, I feel that now that you're numbering the episodes, I wanna do, like, the over under, but I I I've got, like, I I don't know, 30, 25? No.

Speaker 1:

Because you already

Speaker 3:

talked about around 30. Well, I think, yeah, we had 30 earlier. Yeah. I don't know. I'll go I'll go dig it up.

Speaker 2:

I was reading back through the RFD that we worked on for quite some time, like, trying to decide what to do with respect to operating systems and hypervisors in the first year. And I mean, there's a lot of technical comparisons and, stuff in there, which are not that probably just I mean, anyone could make a different inference, I think, from any set of facts, obviously. But I think that the the thing I put in the conclusion, the point that I'd made, I think I have a copy over here, like the particularly when selecting an operating system, that there are thousands of dimensions, right, on on which these choices can be evaluated. And each dimension is a new question that you can ask. But because we can only pick one, the options are effectively mutually exclusive.

Speaker 2:

All of the thousands of questions ultimately have to go the same way. So, like, we're not picking illumos because it's the best at everything.

Speaker 1:

That's right.

Speaker 2:

Just the same way that, like, if we had picked Linux, it would not be the best at everything or FreeBSD would not be the best at everything. It, you know, it meets a number of our needs on a certain set of axes, and we felt, I think, at the time that we could fill in the rest, basically. And I do think that we've done a pretty good job for our goals specifically, which might not ever be the same as anybody else's goals when they're making these decisions, but like I I feel good. I don't feel like we've made a decision and we're having to like make embarrassed, like, look at your shoes whenever anyone asks a question about it instead of in the future trying to, you know, pretend that we didn't make a mistake or something. I think we've done I think we've done well.

Speaker 2:

So I I do feel good about

Speaker 1:

where we're at. I'm glad you mentioned the RFD. It came up a bit in the Hacker News discussion today. And Yeah. Steve, you and I were that were were talking about this.

Speaker 1:

I that, I because I think we wanna get RFD 26 out there, actually. I was going through today. I'm like, we gotta open this one up and

Speaker 2:

Today was me keeping

Speaker 4:

on being, like, oh, this RFD is closed and this RFD is closed. We have to finish off our open source policy RFD and then, I think, publishing that too is, like, also now on my list.

Speaker 1:

We not is is our open source policy RFD is not okay. We've that one we should be able to do.

Speaker 4:

To, like, accepted or whatever. Like I, I wrote most of it and then it was clear everybody was gonna just do that. And so I forgot to like move it into the to do it publicly.

Speaker 2:

I will say that there's something in the RFD that I wrote at the time was I really didn't feel like Linux was gonna push too hard into the, like, putting Rust in the kernel. And I I will say that in what has now been an intervening number of years that they have actually done much better than I expected on that front. Like, they're actually giving it a spin and and, you know, it's it's it's impressive to see that. So that, like, that's one thing we had in the RFD where I feel like the predicted reality has not really met.

Speaker 1:

I would With their expectations. Yeah. Yeah. And I think that

Speaker 2:

But, otherwise, I think it's probably pretty good.

Speaker 1:

Like You do. It is good. I think that it's, and I think that we we will put it out there because I think we want, fortunately, Augustus and Ben and Dave Cross, a bunch of folks, inside of Oxide, have done a great job allowing us to share RFDs at a much smaller granularity. So it's really easy for us to make that make a couple of these key ones public. Yeah.

Speaker 1:

And Steve Steve, we should go through certainly make the open source policy public, but we should make RFD 26 public. And I because I think that it's it's helpful for people to see, our thinking. I think that there was a there was a little bit where, definitely on Hacker News, it was like, well, of course, you guys are doing this because, like, Brian told you to do it. And

Speaker 2:

Right. And then I think Which is just ridiculous.

Speaker 1:

Like It's ridiculous. I think Rave Don't tell me what to do. What do you think you are? My boss? Come on, man.

Speaker 2:

Honestly, Ray did

Speaker 4:

a good job as well. Yeah. Yeah.

Speaker 2:

I remember Ray

Speaker 1:

did a good job basically being like, we kinda don't listen to him. I I mean There is that.

Speaker 2:

But even setting that aside, I mean, I remember when we started, right, we'd just been through the wringer at Samsung where they blamed us and our dumbass operating system for everything that was wrong with the world. Do you I mean, there was a slide in the deck or something like,

Speaker 3:

you know,

Speaker 2:

we had we were engineers are taught to feel shame when they do as badly as you have done or whatever. It's like, okay. Good good meeting. Good meeting. Right?

Speaker 2:

But, like, you know, it it, I think we were all a little gun shy coming out of that experience. No one was feeling in a particularly, like, advocate sort of mood. Evangelical sort of, like, let me tell you the good news about the illumos operating system that is the cause of all problems. Like, it took a while to feel confident again, certainly, after that

Speaker 3:

experience. And

Speaker 1:

it's kinda funny you mentioned that because I actually feel that, like, I felt that especially strongly. I think we had made a bunch of decisions, like, very implicitly in the past, and we wanted to be very explicit about a bunch of these big decisions. Because I also felt honestly the same way about about, about the database decision because we kind of, like, made the we kinda defaulted into a a lot of Postgres without really surveying

Speaker 2:

It's not the worst default that you To be clear,

Speaker 3:

this is at Joyent.

Speaker 1:

I just wanted to clarify. And not at Oxide. Yeah.

Speaker 2:

So we definitely did not wanna make the mistake of assuming that that was the best decision, which is why we ended up without Postgres. But yeah.

Speaker 1:

Well, and we just wanted to, like,

Speaker 2:

take things

Speaker 1:

take things through their paces and really explore. And I think we did a you know, on a bunch of these different axes on on Cockroach and the and the database, I mean, Dave's got a terrific RFD, both on the rubric there and what we explored. Ben Naecker's got a great RFD on what we explored for and how we landed on the ClickHouse decision. And I think, Josh, your RFD 26 was very great on how we landed on illumos. And it was not like

Speaker 2:

Not just illumos though. Like, we also we made that decision along 2 pretty different axes. Right? We also were considering the hypervisor at the same time because it was gonna be really hard to pick an operating system without also picking a hypervisor, like, by default ultimately. So, like, we we were really looking at 2 pretty different things and trying to rationalize both of them at the same time, which was part of why it was complicated and took a long time, I think, to It was.

Speaker 2:

And I think decision.

Speaker 1:

No. And I think that, you know, we didn't call it out as much in that RFD, if I recall correctly, although I wanna go pull it up. But the, the kind of the other, you know, the thing that is really important when you're especially when you are looking at an open source project that you're gonna be, it's gonna be really core to what you're doing. You wanna understand, and this actually goes for any component that you're gonna integrate. It's like, what are the values that I have?

Speaker 1:

What are the values that it has or its community has and or its sponsor has? And how much do those, overlap? And I think that we, you know, I think that we feel like the values that we have and the values that Rust has, for example, have been really great. You know, it's been a really good overlap, and we unlike unlike in other lives in previous worlds, we felt like we're kind of, like, butting heads with people because we disagree on the importance of certain things. Like, we haven't had that happen, and I feel like that's been true for, for illumos.

Speaker 1:

I think it's been true for for certainly for bhyve and obviously for Propolis where we've kinda gone our own way, but I I just feel like a lot of these things that we that's been part of our rubric, and, that's I I think that that that's been that's right. You know? And,

Speaker 2:

I particularly like the the focusing act of writing down the positives that we in the because certainly, like, in the Hacker News comments, there's a familiar refrain, right? So you're just picking the thing that you like, right? It's like, you know, actually through therapy in a number of years, I am picking the thing that I like, but also I like it because it's good. Like, that's the Right. Like, I, it's not like I like it because I've picked a sporting team.

Speaker 2:

Like, I picked a tool that I think is good at certain things. And the reason I enjoy using it is because it, like, it helped like, the there are many technical properties of the tool that help me do the work that I'm trying to do. So, yes, I do like it, but but there are also a number of, like, complex technical reasons that that that's true and we should explain those. And we, we owe it to, to people to at least explain why they don't have to agree, but like we should explain our thinking. I think that's the, Yeah.

Speaker 4:

I think the sports team analogy is definitely accurate. And I've used it myself, when thinking about like, why this is sometimes a touchy question. It it sort of feels like there's an implication that we are not sober technologists evaluating tools and choosing the right one, but instead are rooting for sports teams, and that's kind of, like, professionally a little insulting. And so it's easy to get a little defensive about this, because of things like that, or that, like, we only know things about illumos and not about other operating systems. Right?

Speaker 2:

Like Right.

Speaker 4:

Laura in particular was, like, a very big part of of talking about this decision, and she has an incredible depth of Linux experience. So by suggesting that, like, oh, we only know and like this one thing and picked it is, like, also sort of low key insulting the people who, you know, have a Yeah. Great breadth of knowledge with many of those people at Oxide. So I think that's at least why it, like, gets my hackles going a little teasing.

Speaker 2:

It's also kind of insulting, you know, like, on some level. Like, we supported guest operating systems at Joyent. Like, we we were running many different Linuxes and and Windows, and we had I mean, for god's sake, there was a company that didn't end up buying anything from Joyent, but it really seemed like they were gonna buy something.

Speaker 1:

Yes. And Yes.

Speaker 2:

I enthusiastically Oh, it.

Speaker 1:

Hold on, everybody. Get out your bingo card. Because

Speaker 2:

I I They have an old old system in some, let's say point of sales sort of area, I think, that that was running on SCO OpenServer. And they were running that under, I assume, VMware or whatever. And they were like, because they were, I assume, using us to, get a good VMware deal or whatever, like had no intention of buying anything, but they but they gave us really quite a lot of technical detail about what they were not gonna buy from us, and we foolishly put our backs into it, so we we supported for a good week there, like, SCO OpenServer under the Joyent SmartOS thing.

Speaker 1:

You, I think, frenetically did the work to get it going,

Speaker 2:

wait a second. It worked. It definitely worked. I remember, when people talk about the glory of the UNIX past, I would like to direct them to SCO OpenServer in which one has to relink the kernel to change certain aspects of the IP address configuration of the machine. So, you know, it's not all great in the past, actually.

Speaker 2:

Certainly, there was a lot of, like, you know, at some point we stopped panicking and started returning EIO, things like that. Like, let's not let's not think too too well of the the the ID number before.

Speaker 1:

The the the Wikipedia page for OpenServer. It's like OpenServer, source model, closed source.

Speaker 2:

Source model, none.

Speaker 1:

So it's all not it's like, what okay. So when we say open server, what do we actually what is the open? An open server

Speaker 2:

There was that period of Earth's history where OpenVMS and OpenServer and, Oracle's open open systems.

Speaker 1:

No. This is I

Speaker 2:

remember. Yeah.

Speaker 1:

No. This is this early history of Sun is is about the open system. Open system being like, no. Because I told you, like, I've told you what the API is, it's open. That's right.

Speaker 1:

It's like

Speaker 3:

There's like a book and everything.

Speaker 1:

And and there is some truth

Speaker 2:

to that, actually. Like, I mean, you think about, like, there is actually some value in Sure. Even if you're gonna deliver a proprietary component, like it having a documented API that other people can interact with and interoperate with. And like, because at the time, recall, Microsoft didn't even have that. Like, it took acts of parliament in foreign countries to get them to talk about SMB.

Speaker 2:

Right? I mean, basically. So anyway, I don't yeah. It just Adam, I I I Degrees of openness.

Speaker 1:

I see you denigrating Haiku in the chat. I love Haiku as a guest. I I I have so many great memories of Haiku. You know what Haiku's got a really good kernel debugger, actually. If you need to do bring up of a of a new hypervisor, like, you could do worse than Haiku.

Speaker 1:

Haiku is which because you kinda need a a something with a good kernel debugger because

Speaker 2:

I I like things like haiku. Because even though I'm not really interested in using it, right, I enjoy the it's kind of like a all manner brothers kind of thing. It's like you too are a minority operating system that 18 other people are interested in and somehow have survived 10 additional years beyond. Like, all the while people telling you, like, it's really sad that that operating system died. It's like, I'm still here.

Speaker 2:

What are you, like I'm still here. Want me

Speaker 1:

to hear you right now. Like, can you

Speaker 2:

Am I invisible? Like Right. I know. God. You know, it's like yeah.

Speaker 2:

Yeah.

Speaker 1:

I know. I think you always find us having a I I and I think it it has kinda given us a reverence for all systems great and small. Yes. Because, there's a lot of great stuff that's out there that is, we're trying but as you say, I mean, just kinda the we had to make a big decision, and it's and we made a big decision around around Cockroach and ClickHouse and Rust and all these other things too. And when you're making a

Speaker 2:

big decision, you know, you're Building your own computers and stuff and not having any BIOS and

Speaker 1:

Alright. And, you know, we're talking about PMI. The right. No IPMI, no ACPI, no UEFI, no 4 letter acronyms whatsoever, actually. No BIOS.

Speaker 2:

Except for all the other ones that we've invented. Okay.

Speaker 1:

Fair enough. Yeah.

Speaker 2:

It's that. Yes. UART is a 4 letter acronym, unfortunately.

Speaker 1:

You know, how can you come up with these so quickly? This is like the, you know, it's you got, like, a crossword brain that can go to this quickly. Yes. Alright. Fine.

Speaker 1:

We have many 4 letter acronyms.

Speaker 2:

None with a consortium.

Speaker 1:

That's true. But, Josh, this has been this is great work. It's been as a as a user of Helios, it's been, I've been really excited to get this out there and to have, go have this conversation. And I, again, I was surprised that there was, but I but I shouldn't have been because

Speaker 2:

People were waiting for it. I feel like, you know. And it's been leaking out the side. Like, you like, if you you know, the package repository was available, obviously, because it's like, again, none of it's proprietary. It's just that we were more embarrassed about the mess and the source base than anything else.

Speaker 2:

Right? But the, yeah, like the, I personally am super excited for a future in which people who are like 22 or something, right, in something in this magical future go on eBay and they can buy like a Gimlet that fell off the back of something and they can take this software and they can put this software on it because like it's open source and they're allowed to do that. Because I mean, when I when I was when I was young, I had, like, a bunch of, like, old Sun gear in my garage. Right? And, I think that it's like you've made it as a computer company when there is a bunch of old hardware that you made in the past that still works because you've made it well, but it's really cheap because it's old and there's a bunch of it floating around.

Speaker 2:

So like that's I'm excited for that that point in time.

Speaker 1:

So somewhere somewhere out there, there's there's there's there's an infant who is currently crying to be changed. But before you know it, that'll be a 22 year old hipster taking a gimlet and getting it.

Speaker 2:

Hipsters Finding a power supply that works and

Speaker 1:

Exactly. It'd be, like, 2043, and they are, playing this podcast at at at 3 x trying to find the actual bit where they get, like, I need a bit where they tell me how to get the archive working.

Speaker 2:

Isn't that isn't that a glorious thing, though, to think about? Like, that that's yeah, I don't know. I'm excited for that. I'm excited. That's that's how you know you've arrived.

Speaker 2:

You're like

Speaker 1:

That that's right. When you're when you're, retro hip.

Speaker 2:

It's true.

Speaker 1:

Yeah. Right.

Speaker 2:

Ebay hip. Ebay hip.

Speaker 1:

Well, it's been, Josh, again, this has been awesome. And, Steve, thanks for joining me at the Hacker News comments. You've definitely you've been Yeah.

Speaker 2:

We're in the trenches together. Once again, just just when you think I'm done with

Speaker 4:

my tours of duty, there's another tour showing up.

Speaker 2:

When I saw that there were 132 comments or whatever, I wasn't quite expecting that, like, 45 of them would have been from Steve, but he's doing a lot more over there. Oh, it he's

Speaker 1:

really, really good stuff. And, Patrick, thank you as well for for joining us. This has been a lot of fun, Adam. This has been it's been great. So thanks for, and and I I know you look forward to your regifted birthday present.

Speaker 1:

Yeah. Unbursed. Yes. Scent of Sun Microsystems. That's right.

Speaker 1:

It's The

Speaker 3:

page is nice and nice and tight for

Speaker 1:

me. Exactly. I'll just cover over this library marking so you think I bought it new.

Speaker 5:

Good meeting.

Speaker 1:

Good meeting. Absolutely. Alright. Well, we, I think the I know a bunch of people have asked, so look for, I think we wanna do a deep dive on Crucible coming up soon. So, the stay tuned for Our storage

Speaker 3:

service, typically.

Speaker 1:

Our storage service, with Alan Hansen and crew. So, look for that in a in a future episode. We also wanna get to I think we've got a bunch of, Patrick, we're gonna have you come back and talk Propolis. I think we got a bunch of things we want to talk about. So stay tuned.
