Post Syndicated from Matthew Garrett original https://mjg59.dreamwidth.org/50924.html
The X210 is a strange machine. A set of Chinese enthusiasts developed a series of motherboards that slot into old Thinkpad chassis, providing significantly more up to date hardware. The X210 has a Kabylake CPU, supports up to 32GB of RAM, has an NVMe-capable M.2 slot and has eDP support – and it fits into an X200 or X201 chassis, which means it also comes with a classic Thinkpad keyboard. We ordered some from a Facebook page (a process that involved wiring a large chunk of money to a Chinese bank which wasn’t at all stressful), and a couple of weeks later they arrived. Once I’d put mine together I had a quad-core i7-8550U with 16GB of RAM, a 512GB NVMe drive and a 1920×1200 display. I’d transplanted over the drive from my XPS13, so I was running stock Fedora for most of this development process.
The other fun thing about it is that none of the firmware flashing protection is enabled, including Intel Boot Guard. This means running a custom firmware image is possible, and what would a ridiculous custom Thinkpad be without ridiculous custom firmware? A shadow of its potential, that’s what. So, I read the Coreboot motherboard porting guide and set to.
My life was made a great deal easier by the existence of a port for the Purism Librem 13v2. This is a Skylake system, and Skylake and Kabylake are very similar platforms. So, the first job was to just copy that into a new directory and start from there. The first step was to update the Inteltool utility so it understood the chipset – this commit shows what was necessary there. It’s mostly just adding new PCI IDs, but it also needed some adjustment to account for the GPIO allocation being different on mobile parts when compared to desktop ones. One thing that bit me – Inteltool relies on being able to mmap() arbitrary bits of physical address space, and the kernel doesn’t allow that if CONFIG_STRICT_DEVMEM is enabled. I had to disable that first.
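For anyone wanting to check the CONFIG_STRICT_DEVMEM setting before running Inteltool, here is a minimal sketch. It is not part of the original porting work. It assumes you have the kernel configuration as text; where that text lives (for example a file under /boot, or /proc/config.gz on some kernels) varies by distribution and is not covered by the post. The function name is made up for illustration.

```python
def strict_devmem_enabled(config_text: str) -> bool:
    """Return True if the kernel config text sets CONFIG_STRICT_DEVMEM=y.

    A disabled option normally appears as the comment
    '# CONFIG_STRICT_DEVMEM is not set', which this does not match.
    """
    for line in config_text.splitlines():
        if line.strip() == "CONFIG_STRICT_DEVMEM=y":
            return True
    return False


if __name__ == "__main__":
    # Hypothetical config fragment, for illustration only.
    sample = "CONFIG_DEVMEM=y\n# CONFIG_STRICT_DEVMEM is not set\n"
    if strict_devmem_enabled(sample):
        print("STRICT_DEVMEM is on: mmap() of arbitrary physical memory will be refused")
    else:
        print("STRICT_DEVMEM is off")
```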
The GPIO pins got dropped into gpio.h. I ended up just pushing the raw values into there rather than parsing them back into more semantically meaningful definitions, partly because I don’t understand what these things do that well and largely because I’m lazy. Once that was done, on to the next step.
High Definition Audio devices (or HDA) have a standard interface, but the codecs attached to the HDA device vary – both in terms of their own configuration, and in terms of dealing with how the board designer may have laid things out. Thankfully the existing configuration could be copied from /sys/class/sound/card0/hwC0D0/init_pin_configs and then hda_verb.h could be updated.
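To make the pin configuration step a little more concrete, here is a rough sketch of how lines from init_pin_configs could be turned into verb words. It assumes each line is a pin node ID followed by a 32-bit default configuration, both in hex, and that the table wants the four standard HDA "set configuration default" verbs (0x71c to 0x71f), one per byte. The function names and the sample value are made up for illustration, and the exact layout expected in hda_verb.h should be checked against the tree.

```python
from typing import List, Tuple


def parse_init_pin_configs(text: str) -> List[Tuple[int, int]]:
    """Parse lines like '0x12 0x90a60130' into (pin_nid, config) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            pairs.append((int(fields[0], 16), int(fields[1], 16)))
    return pairs


def pin_config_verbs(codec: int, pin: int, config: int) -> List[int]:
    """Build the four 32-bit verbs that set a pin's default configuration.

    Each verb is: codec address (bits 31:28), node ID (bits 27:20),
    verb ID 0x71c + n (bits 19:8), and byte n of the config (bits 7:0).
    """
    verbs = []
    for n in range(4):
        byte = (config >> (8 * n)) & 0xFF
        verbs.append((codec << 28) | (pin << 20) | ((0x71C + n) << 8) | byte)
    return verbs


if __name__ == "__main__":
    # Hypothetical dump contents, not taken from the X210.
    sample = "0x12 0x90a60130\n"
    for pin, config in parse_init_pin_configs(sample):
        for verb in pin_config_verbs(0, pin, config):
            print(f"{verb:#010x}")
```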
One more piece of hardware-specific configuration is the Video BIOS Table, or VBT. This contains information used by the graphics drivers (firmware or OS-level) to configure the display correctly, and again is somewhat system-specific. This can be grabbed from /sys/kernel/debug/dri/0/i915_vbt.
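A quick sanity check on the dumped file can save a confusing debugging session later. The sketch below assumes the dump begins with the ASCII signature "$VBT", which is what the VBT header normally starts with; the function name and the sample bytes are hypothetical, and this checks nothing beyond the signature.

```python
def looks_like_vbt(blob: bytes) -> bool:
    """Return True if the blob starts with the '$VBT' header signature."""
    return blob[:4] == b"$VBT"


if __name__ == "__main__":
    # Stand-in data for illustration; a real dump would be read from the
    # debugfs path mentioned above (which needs root and mounted debugfs).
    fake_dump = b"$VBT EXAMPLE" + bytes(16)
    print("plausible VBT" if looks_like_vbt(fake_dump) else "not a VBT")
```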
A lot of the remaining platform-specific configuration has been split out into board-specific config files, and these also needed updating. Most stuff was the same, but I confirmed the GPE and genx_dec register values by using Inteltool to dump them from the vendor system and copy them over. lspci -t gave me the bus topology and told me which PCIe root ports were in use, and lsusb -t gave me port numbers for USB. That let me update the root port and USB tables.
The final code update required was to tell the OS how to communicate with the embedded controller. Various ACPI functions are actually handled by this autonomous device, but it’s still necessary for the OS to know how to obtain information from it. This involves writing some ACPI code, but that’s largely a matter of cutting and pasting from the vendor firmware – the EC layout depends on the EC firmware rather than the system firmware, and we weren’t planning on changing the EC firmware in any way. Using ifdtool told me that the vendor firmware image wasn’t using the EC region of the flash, so my assumption was that the EC had its own firmware stored somewhere else. I was ready to flash.
The first attempt involved isis’ machine, using their Beaglebone Black as a flashing device – the lack of protection in the firmware meant we ought to be able to get away with using flashrom directly on the host SPI controller, but using an external flasher meant we stood a better chance of being able to recover if something went wrong. We flashed, plugged in the power and… nothing. Literally. The power LED didn’t turn on. The machine was very, very dead.
Things like managing battery charging and status indicators are up to the EC, and the complete absence of anything going on here meant that the EC wasn’t running. The most likely reason for that was that the system flash did contain the EC’s firmware even though the descriptor said it didn’t, and now the system was very unhappy. Worse, the flash wouldn’t speak to us any more – the power supply from the Beaglebone to the flash chip was sufficient to power up the EC, and the EC was then holding onto the SPI bus desperately trying to read its firmware. Bother. This was made rather more embarrassing because isis had explicitly raised concern about flashing an image that didn’t contain any EC firmware, and now I’d killed their laptop.
After some digging I was able to find EC firmware for a related 51NB system, and looking at that gave me a bunch of strings that seemed reasonably identifiable. Looking at the original vendor ROM showed very similar code located at offset 0x00200000 into the image, so I added a small tool to inject the EC firmware (basing it on an existing tool that does something similar for the EC in some HP laptops). I now had an image that I was reasonably confident would get further, but we couldn’t flash it. Next step seemed like it was going to involve desoldering the flash from the board, which is a colossal pain. Time to sleep on the problem.
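The injection step boils down to overlaying one blob on another at a fixed offset. The sketch below is not the tool described above, just an illustration of the idea under simple assumptions: the EC firmware is copied byte for byte to offset 0x00200000, nothing else in the image needs fixing up, and the function name is made up.

```python
EC_OFFSET = 0x00200000  # offset where the vendor ROM held the EC code


def inject_ec(image: bytes, ec_fw: bytes, offset: int = EC_OFFSET) -> bytes:
    """Return a copy of image with ec_fw written at the given offset."""
    if offset < 0 or offset + len(ec_fw) > len(image):
        raise ValueError("EC firmware does not fit inside the image")
    out = bytearray(image)
    out[offset:offset + len(ec_fw)] = ec_fw
    return bytes(out)


if __name__ == "__main__":
    # Tiny stand-in buffers so the example runs without real ROM files.
    image = bytes([0xFF]) * 64
    ec_fw = b"ECFW"
    patched = inject_ec(image, ec_fw, offset=16)
    print(patched[12:24])
```

With real files you would read both blobs in binary mode, call inject_ec() with the default offset, and write the result to a new file rather than overwriting the original image.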
The next morning we were able to borrow a Dediprog SPI flasher. These are much faster than doing SPI over GPIO lines, and also support running the flash at different voltages. At 3.5V the behaviour was the same as we’d seen the previous night – nothing. According to the datasheet, the flash required at least 2.7V to run, but flashrom listed 1.8V as the next lower voltage so we tried. And, amazingly, it worked – not reliably, but sufficiently. Our hypothesis is that the chip is marginally able to run at that voltage, but that the EC isn’t – we were no longer powering the EC up, so could communicate with the flash. After a couple of attempts we were able to write enough that we had EC firmware on there, at which point we could shift back to flashing at 3.5V because the EC was leaving the flash alone.
So, we flashed again. And, amazingly, we ended up staring at a UEFI shell prompt. USB wasn’t working, and nor was the onboard keyboard, but we had graphics and were executing actual firmware code. I was able to get USB working fairly quickly – it turns out that Linux numbers USB ports from 1 and the FSP numbers them from 0, and fixing that up gave us working USB. We were able to boot Linux! Except there were a whole bunch of errors complaining about EC timeouts, and also we only had half the RAM we should.
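The USB fix is a plain off-by-one, sketched below. The port numbers in the example are hypothetical rather than the X210's real ones, and the helper name is made up; the only thing taken from the text is that Linux counts ports from 1 while the FSP counts from 0.

```python
from typing import Iterable, List


def fsp_port_indices(linux_ports: Iterable[int]) -> List[int]:
    """Convert 1-based Linux USB port numbers to 0-based FSP indices."""
    indices = []
    for port in linux_ports:
        if port < 1:
            raise ValueError(f"Linux port numbers start at 1, got {port}")
        indices.append(port - 1)
    return indices


if __name__ == "__main__":
    # Hypothetical ports as reported by 'lsusb -t'.
    print(fsp_port_indices([1, 3, 7]))
```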
After some discussion on the Coreboot IRC channel, we figured out the RAM issue – the Librem13 only has one DIMM slot. The FSP expects to be given a set of i2c addresses to probe, one for each DIMM socket. It is then able to read back the DIMM configuration and configure the memory controller appropriately. Running i2cdetect against the system SMBus gave us a range of devices, including one at 0x50 and one at 0x52. The detected DIMM was at 0x50, which made 0x52 seem like a reasonable bet – and grepping the tree showed that several other systems used 0x52 as the address for their second socket. Adding that to the list of addresses and passing it to the FSP gave us all our RAM.
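As a small illustration of narrowing down the i2cdetect results: SPD EEPROMs conventionally sit at 7-bit addresses 0x50 to 0x57, so anything outside that range can be ignored when looking for DIMM sockets. The sketch below assumes you have already collected the responding addresses as integers. The function name and the extra addresses in the example are hypothetical, and the address format the FSP expects is not covered here.

```python
from typing import Iterable, List

# Conventional 7-bit SMBus address range for DIMM SPD EEPROMs.
SPD_FIRST = 0x50
SPD_LAST = 0x57


def spd_candidates(detected: Iterable[int]) -> List[int]:
    """Return the detected addresses that fall in the SPD EEPROM range."""
    return sorted(a for a in detected if SPD_FIRST <= a <= SPD_LAST)


if __name__ == "__main__":
    # 0x50 and 0x52 are from the text; the other addresses are made up.
    detected = [0x08, 0x44, 0x50, 0x52]
    print([hex(a) for a in spd_candidates(detected)])
```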
So, now we just had to deal with the EC. One thing we noticed was that if we flashed the vendor firmware, ran it, flashed Coreboot and then rebooted without cutting the power, the EC worked. This strongly suggested that there was some setup code happening in the vendor firmware that configured the EC appropriately, and if we duplicated that it would probably work. Unfortunately, figuring out what that code was was difficult. I ended up dumping the PCI device configuration for the vendor firmware and for Coreboot in case that would give us any clues, but the only thing that seemed relevant at all was that the LPC controller was configured to pass io ports 0x4e and 0x4f to the LPC bus with the vendor firmware, but not with Coreboot. Unfortunately the EC was supposed to be listening on 0x62 and 0x66, so this wasn’t the problem.
I ended up solving this by using UEFITool to extract all the code from the vendor firmware, and then disassembled every object and grepped them for port io. x86 systems have two separate io buses – memory and port IO. Port IO is well suited to simple devices that don’t need a lot of bandwidth, and the EC is definitely one of these – there’s no way to talk to it other than using port IO, so any configuration was almost certainly happening that way. I found a whole bunch of stuff that touched the EC, but was clearly depending on it already having been enabled. I found a wide range of cases where port IO was being used for early PCI configuration. And, finally, I found some code that reconfigured the LPC bridge to route 0x4e and 0x4f to the LPC bus (explaining the configuration change I’d seen earlier), and then wrote a bunch of values to those addresses. I mimicked those, and suddenly the EC started responding.
It turns out that the writes that made this work weren’t terribly magic. PCs used to have a SuperIO chip that provided most of the legacy port functionality, including the floppy drive controller and parallel and serial ports. Individual components (called logical devices, or LDNs) could be enabled and disabled using a sequence of writes that was fairly consistent between vendors. Someone on the Coreboot IRC channel recognised that the writes that enabled the EC were simply using that protocol to enable a series of LDNs, which apparently correspond to things like “Working EC” and “Working keyboard”. And with that, we were done.
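For readers who haven’t met the SuperIO protocol, here is a rough sketch of what such a sequence of writes looks like. It only builds a list of (port, value) pairs and does not touch hardware. It assumes the common convention of an index port at 0x4e and a data port at 0x4f, with register 0x07 selecting the logical device and register 0x30 activating it. The LDN numbers in the example are made up, and any vendor-specific unlock sequence the real EC might need is left out because the post doesn’t give it.

```python
from typing import List, Tuple

INDEX_PORT = 0x4E      # write the register number here
DATA_PORT = 0x4F       # then write the register value here
REG_LDN_SELECT = 0x07  # conventional "select logical device" register
REG_ACTIVATE = 0x30    # conventional "activate" register


def enable_ldn_writes(ldn: int) -> List[Tuple[int, int]]:
    """Return the port writes that select a logical device and enable it."""
    return [
        (INDEX_PORT, REG_LDN_SELECT),
        (DATA_PORT, ldn),
        (INDEX_PORT, REG_ACTIVATE),
        (DATA_PORT, 0x01),
    ]


if __name__ == "__main__":
    # Hypothetical LDN numbers, for illustration only.
    for ldn in (0x05, 0x06):
        for port, value in enable_ldn_writes(ldn):
            print(f"outb({value:#04x}, {port:#04x})")
```

In firmware each pair would become a single port write; keeping the sequence as data makes it easy to compare against what the vendor code does.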
Coreboot doesn’t currently have ACPI support for the latest Intel graphics chipsets, so right now my image doesn’t have working backlight control. But other than that, everything seems to work (although there’s probably a bunch of power management optimisation to do). I started this process knowing almost nothing about Coreboot, but thanks to the help of people on IRC I was able to get things working in about two days of work and now have firmware that’s about as custom as my laptop.
 Why not Libreboot? Because modern Intel SoCs haven’t had their memory initialisation code reverse engineered, so the only way to boot them is to use the proprietary Intel Firmware Support Package.
 Card 0, device 0
 After a few false starts – it turns out that the initial memory training can take a surprisingly long time, and we kept giving up before that had happened
 Spread over 5 or so days of real time