More Tales from the Bringup Lab
Members of the Oxide hardware team talk about their recent bringup struggles and triumphs with the server sled (Gimlet) and rack switch (Sidecar)
Oxide and Friends Twitter Space: April 18th, 2022
More Tales from the Bringup Lab
We've been holding a Twitter Space weekly on Mondays at 5p for about an hour. Even though it's not (yet?) a feature of Twitter Spaces, we have been recording them all; here is the recording for our Twitter Space for April 18th, 2022.
In addition to Bryan Cantrill and Adam Leventhal, we were joined by members of the Oxide team: Arjen Roodselaar, Nathaneal Huffman, Robert Mustacchi, Aaron Hartwig, Steve Tuck, Matt Keeter, Eric Aasen, Rick Altherr, and RFK.
Some of the topics we hit on, in the order that we hit them:
- @2:25 Overview of upcoming themes related to the bringup lab
- @4:28 Defining the different terms and code-names of the hardware in development at oxide
- @4:40 Gimlet, the compute node
- @5:10 Sidecar, a board based on a switching ASIC from Intel
- @7:24 Arjen's twitter thread with details related to the bringup and Eric's description of the challenges in designing the PDN (Power Delivery Network) ATT
- @15:34 The load-slammer, an electronic load to simulate the power draw of an ASIC / BGA-part LS
- @19:06 Bouncing supply cables on load steps
- @22:27 FPGA that controls everything on the Sidecar board
- @24:05 TOML's unstable table order made the team pop a couple ICs off the board searching for bugs
- @31:41 Brown-out in the hotel during first bringup session from a blown bus duct
- @33:45 Debugging ground bounce issues while testing the PDN with the load-slammer (phantom over/undershoot)
- @40:15 Hardware team pranks the management during a meeting with a potential investor
- @43:20 Chonky heat sink that weighs 8 pounds / moment arm crisis
- @48:19 First time powering up, checking temperature with thermal camera, learning about "puppy dog warm"
- @52:12 Matt talks about the second, "lesser" network switch on the Sidecar board
- @57:28 Secret 8051 cores, slew-rate woes: impedance missmatch on SPI traces that manifested in unreliable communication in full bandwidth mode of the SPI/GPIO driver
- @1:03:19 PLL config issues and Matt's verbose config tool to fix them
- @1:04:26 Load-bearing dongles
- @1:06:37 Debugging PCIe link, Arjen's Frankenstein PCIe analyzer/exerciser
- @1:22:36 Gimlet, stumbling blocks found in January
- @1:30:08 Arjen's big breakthrough on the Sidecar, shouting at the T6
- @1:32:08 Cursed pull-downs, Rick's remote hardware debugging support by incrementally breaking his T6 boards to find issue with the DUT
- @1:36:24 T6 finally comes out of reset, "we're gonna live! we're gonna live!"
- @1:41:06 Rick reworks gnarly footprint error, on multiple ICs, to verify design for Rev. B - dead bug style.
- @1:53:12 Sidecar progress continuation, cable oupsi, off-by-one error
- @1:59:42 Dedicated support by IC vendor with very understanding wives
- @2:01:20 Summary and parting thoughts
If we got something wrong or missed something, please file a PR! Our next Twitter space will likely be on Monday at 5p Pacific Time; stay tuned to our Twitter feeds for details. We'd love to have you join us, as we always love to hear from new speakers!