Thursday, March 19, 2009

How big is HelenOS? How small can it get?

After some discussion with Jiri, I thought it would be useful to know how big the HelenOS kernel is. I've done several measurements and obtained interesting and surprising results. For each architecture, I built the system using the default configuration and a configuration with all options disabled. Both of these builds used the -O3 optimization level. Then I changed -O3 to -Os and rebuilt the minimal configuration. On some platforms, -Os could not be used right away due to missing parts of the softint library, so I used -O2 instead. The HelenOS build system does not strip the kernel.raw binary, so I did that manually and wrote down the final size obtained. These are my results:


What is immediately clear from the chart is that there is something wrong with amd64 as it gives a kernel which is about three times as big as the other kernels. We'll clearly need to investigate this.

All 32-bit kernels and all 64-bit kernels except the amd64 kernel are comparable in size. The ia32 kernel is the smallest one, although it would be interesting to know how small the arm32 kernel would be with the -Os optimization. When the minimal configuration is optimized for size and stripped, the ia32 binary is only 116 KiB. When I alter the kernel Makefile, I can shave off an additional 4 KiB by not using frame pointers (-fomit-frame-pointer). Even though this is not supported right now, a further 27 KiB can be shaken off by not including the .bss section in the kernel image. This will give us a quite respectable kernel size of just 84 KiB! This 84 KiB kernel would be completely headless (almost no drivers compiled in) and trimmed down (e.g. without SMP support), but it would do its job well. I believe further improvements are still possible, but continued effort will most likely start to yield diminishing returns soon.

Friday, March 13, 2009

Interfacing with the user in HelenOS

For about a month, Martin, Jiri and I have been torturing the HelenOS sources in a distributed attempt to improve the subsystems responsible for processing the user's input and output. Things are still not perfect, but before these changes, this part of HelenOS was a complete disaster. Here are a few examples of how badly designed this part of HelenOS was:
  • There was no layering between the hardware drivers and the code which interpreted characters received using these drivers - everything would happen inside the driver itself. With this setup it was not possible to support different hardware configurations in a clean and generic way. For instance, the ns16550 driver assumed a Sun keyboard attached to the ns16550 serial port. This was, of course, something which complicated the use of the driver for plain serial communication between two computers interconnected with a serial cable.
  • Neither the kernel nor the userspace drivers supported more than one instance of each character device (e.g. i8042, ns16550, z8530).
  • Because only one instance was supported, each driver inferred its role as stdin or stdout without asking.
  • The way interrupts were sent to userspace device drivers required a little brother kernel driver for the same device, which would accept the interrupt and send it to the userspace server.
  • Both kernel and userspace drivers were rather platform specific, either using memory mapped accesses to the device's registers or separate I/O instructions.
  • There was also some duplication of code, both on the physical device level (e.g. duplicate ns16550 driver) and also on the character interpretation level (e.g. several occurrences of code which processed serial line input).
Jiri was the first to do something about it. He focused on the userspace kbd server and introduced layering into it. In his design (see the picture below), there are port drivers that control the physical devices such as the i8042 or z8530. In combination with my later changes, the port drivers use generic PIO operations to interact directly with the device. Each port driver typically registers an interrupt handler for the device's interrupt and provides the interrupt top-half pseudocode. Upon an interrupt, the port driver reads data from the device and pushes it to the next layer, which in this case is the controller layer. The controller layer assumes PC keyboard data on input. If the input data does not come from a PC keyboard but, for example, from a serial line, the controller layer driver transforms the stream of ASCII characters into PC keyboard scan codes using a scan code simulator. This layer also converts other forms of input, such as that coming from the Sun keyboard, to the set of PC keyboard scan codes. Thus, the PC keyboard scan codes form a common representation of data coming from the lower layers. In order to support multiple keyboard layouts and to distinguish the physical location of the pressed key from its label, the common representation is referred to as key codes rather than scan codes. Key codes are pushed further to the next layer, called the event layer. The event layer keeps track of the status of the Shift, Ctrl, Alt and the Lock keys and performs the layout translation. So far, two layouts are supported:
  • US QWERTY
  • US Dvorak
After the layout translation is done, the key event is sent via IPC to the only consumer of kbd - the console server.
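The upper layers described above can be sketched in a few lines of C. This is only an illustration under my own assumptions: the type and function names (keycode_t, layout_t, event_layer_push, ctl_serial_push), the three-key layouts and the single Shift modifier are hypothetical and are not taken from the kbd server sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key codes: they name physical key positions, not labels. */
typedef enum { KC_NONE, KC_Q, KC_W, KC_E, KC_LSHIFT, KC_COUNT } keycode_t;
typedef enum { KEY_PRESS, KEY_RELEASE } keytype_t;

/* A layout maps a key position to the character it produces. */
typedef struct {
    char plain[KC_COUNT];
    char shifted[KC_COUNT];
} layout_t;

static const layout_t layout_us_qwerty = {
    .plain   = { [KC_Q] = 'q', [KC_W] = 'w', [KC_E] = 'e' },
    .shifted = { [KC_Q] = 'Q', [KC_W] = 'W', [KC_E] = 'E' },
};

/* On US Dvorak the same three positions produce ' , . */
static const layout_t layout_us_dvorak = {
    .plain   = { [KC_Q] = '\'', [KC_W] = ',', [KC_E] = '.' },
    .shifted = { [KC_Q] = '"',  [KC_W] = '<', [KC_E] = '>' },
};

/* Event layer state: the active layout and the modifier status. */
typedef struct {
    const layout_t *layout;
    bool shift;
} event_layer_t;

/* Event layer: track Shift and translate a key code through the layout.
 * Returns the resulting character, or 0 if the event produces none. */
static char event_layer_push(event_layer_t *ev, keytype_t type, keycode_t kc)
{
    assert(kc < KC_COUNT);
    if (kc == KC_LSHIFT) {
        ev->shift = (type == KEY_PRESS);
        return 0;
    }
    if (type != KEY_PRESS)
        return 0;
    return ev->shift ? ev->layout->shifted[kc] : ev->layout->plain[kc];
}

/* Controller layer for a serial line: simulate the key events that a
 * PC keyboard would have generated for the ASCII character c. */
static char ctl_serial_push(event_layer_t *ev, char c)
{
    bool upper = (c >= 'A' && c <= 'Z');
    char lower = upper ? (char) (c - 'A' + 'a') : c;
    keycode_t kc = KC_NONE;

    switch (lower) {
    case 'q': kc = KC_Q; break;
    case 'w': kc = KC_W; break;
    case 'e': kc = KC_E; break;
    default: return 0;  /* not covered by this tiny sketch */
    }

    if (upper)
        event_layer_push(ev, KEY_PRESS, KC_LSHIFT);
    char out = event_layer_push(ev, KEY_PRESS, kc);
    event_layer_push(ev, KEY_RELEASE, kc);
    if (upper)
        event_layer_push(ev, KEY_RELEASE, KC_LSHIFT);
    return out;
}
```

Because key codes name positions rather than labels, switching the layout pointer from layout_us_qwerty to layout_us_dvorak is enough to change what the same key press produces; nothing below the event layer needs to know about it.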


Encouraged and inspired by Jiri's success, I wanted to fix the layering and all the other above-mentioned problems in the kernel too. I started by converting the kernel drivers to the PIO (Programmed I/O) interface. The PIO functions abstract away the implementation of the machine's I/O space and allow the drivers to be written in a generic way, regardless of whether the device is in a separate I/O space or is memory mapped. Once I had PIO for all platforms, I started to convert all character device drivers to it. That was probably the easiest part. In parallel, I began to slowly move away from the one-instance-per-driver model and to free the kernel drivers from the duty of notifying their userspace counterparts about interrupts via IPC. That was probably the hardest part, as it required me to:
  • extend the interrupt top-half pseudocode to support independent userspace drivers, and
  • rewrite the way interrupts are dispatched in the kernel, and
  • fix the userspace drivers to use the new pseudocode.
Extending the pseudocode was rather fun. I took the chance and fixed the existing pseudocode to only make use of PIO when accessing the device and added operations for testing bits, conditional execution of blocks of pseudocode commands and accepting the interrupt.
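To give an idea of what such top-half pseudocode can look like, here is a minimal interpreter sketch in C. It is an illustration under my own assumptions, not the kernel's actual definitions: the command names, the irq_cmd_t fields, the four scratch registers and the memory-mapped-only pio_read_8 are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PIO read. Here it is a plain memory-mapped access; on a
 * machine with a separate I/O space it would wrap an I/O instruction. */
static uint8_t pio_read_8(const volatile uint8_t *reg)
{
    return *reg;
}

typedef enum {
    CMD_PIO_READ_8,  /* scratch[dst] = device register at addr */
    CMD_BTEST,       /* scratch[dst] = scratch[src] & value */
    CMD_PREDICATE,   /* if scratch[src] == 0, skip the next 'value' commands */
    CMD_ACCEPT,      /* the interrupt belongs to this driver */
    CMD_DECLINE      /* the interrupt does not belong to this driver */
} cmd_type_t;

typedef struct {
    cmd_type_t type;
    const volatile uint8_t *addr;
    uint8_t value;
    size_t src;
    size_t dst;
} irq_cmd_t;

#define SCRATCH_REGS 4

typedef enum { IRQ_DECLINE, IRQ_ACCEPT } irq_result_t;

/* Run the pseudocode program; values read from the device stay in the
 * scratch registers so they can be handed over to the driver. */
static irq_result_t irq_code_run(const irq_cmd_t *code, size_t count,
    uint8_t scratch[SCRATCH_REGS])
{
    for (size_t i = 0; i < count; i++) {
        const irq_cmd_t *c = &code[i];
        assert(c->src < SCRATCH_REGS && c->dst < SCRATCH_REGS);

        switch (c->type) {
        case CMD_PIO_READ_8:
            scratch[c->dst] = pio_read_8(c->addr);
            break;
        case CMD_BTEST:
            scratch[c->dst] = scratch[c->src] & c->value;
            break;
        case CMD_PREDICATE:
            if (scratch[c->src] == 0)
                i += c->value;  /* conditional execution of a block */
            break;
        case CMD_ACCEPT:
            return IRQ_ACCEPT;
        case CMD_DECLINE:
            return IRQ_DECLINE;
        }
    }
    return IRQ_DECLINE;  /* falling off the end claims nothing */
}
```

A typical program for a keyboard-like device would read the status register, test the "data ready" bit, and, guarded by a predicate, read the data register and accept the interrupt; if the bit is clear, the predicate skips the block and the program declines.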
Rewriting interrupt dispatching looked like a real brain-teaser to me because the kernel interrupt structures are never deallocated, while the userspace interrupt structures need to be allocated and deallocated dynamically on demand. It was also interesting from the synchronization point of view, as I didn't want to deallocate a userspace interrupt structure while the interrupt is in progress. In the end, I came up with a solution with separate hash tables for userspace and kernel interrupts. My change broke some things, such as klog and kconsole notifications as well as switching between kernel and userspace drivers. The latter used to work thanks to the grab and release methods in each kernel driver. But since the kernel driver became a distinct entity and these functions were removed, the driver toggling had to be solved in another way. This is where Martin got involved and implemented a clever fix:
  • if the silent variable is true, search the userspace hash table first and the kernel hash table second
  • if the silent variable is false, search the kernel hash table first and the userspace hash table second
This gives a userspace driver a chance to process the interrupt if the kernel console is inactive, but if there is no userspace driver to claim the interrupt, the kernel can still react to it, and vice versa.
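The lookup order can be sketched as follows. This is a simplified illustration with assumed names (irq_t, irq_table_t, irq_dispatch); in particular, each "hash table" is reduced to a small array indexed by the interrupt number, with no collision handling and no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IRQ_TABLE_SIZE 16

/* Hypothetical interrupt structure. */
typedef struct {
    bool registered;    /* is there a handler for this interrupt? */
    const char *owner;  /* for illustration only */
} irq_t;

/* Stand-in for a hash table keyed by the interrupt number. */
typedef struct {
    irq_t slots[IRQ_TABLE_SIZE];
} irq_table_t;

static irq_t *irq_table_find(irq_table_t *table, unsigned int inr)
{
    irq_t *irq = &table->slots[inr % IRQ_TABLE_SIZE];
    return irq->registered ? irq : NULL;
}

/* Pick the handler for interrupt inr. 'silent' is true while the kernel
 * console is inactive, so userspace gets the first chance to claim it. */
static irq_t *irq_dispatch(irq_table_t *kernel, irq_table_t *uspace,
    bool silent, unsigned int inr)
{
    assert(kernel != NULL && uspace != NULL);

    irq_table_t *first = silent ? uspace : kernel;
    irq_table_t *second = silent ? kernel : uspace;

    irq_t *irq = irq_table_find(first, inr);
    if (irq == NULL)
        irq = irq_table_find(second, inr);  /* fall back to the other side */
    return irq;
}
```

Flipping a single flag is then enough to toggle between the kernel and userspace drivers, which is what the removed grab and release methods used to do in each driver separately.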

Having freed the kernel drivers from the burden of interrupt notifications for their userspace brothers, I could finally proceed to fix the other problems and layer the kernel input subsystem into several components. At the lowest level, there are serial controller drivers, much like the port drivers in the case of the userspace kbd server. Each driver connects either to a keyboard input module or to a serial line input module. Unlike the userspace server, the kernel input modules convert raw data from the serial controller drivers to a stream of ASCII characters and feed it to the connected component, which is most likely the kernel console.

After my change, Martin noticed that the data structure which had been used to pass characters between various components - chardev_t - is bidirectional in nature, even though it is mostly used in one direction only. After some 30 commits, the whole kernel and all drivers and input modules were converted to use indev_t for input devices and outdev_t for output devices instead.
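The idea behind the split can be illustrated with a small sketch. The structures below are my own simplified stand-ins, not the kernel's indev_t and outdev_t: the names carry a _sketch suffix on purpose, and the ring buffer and the single write callback are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INDEV_BUFLEN 8

/* Input-only device: lower layers push characters, a consumer pops them. */
typedef struct {
    char buffer[INDEV_BUFLEN];
    size_t head;   /* index of the oldest buffered character */
    size_t count;  /* number of buffered characters */
} indev_sketch_t;

/* Output-only device: the only operation is writing a character. */
typedef struct outdev_sketch {
    void (*write)(struct outdev_sketch *dev, char ch);
    void *data;    /* driver-private state */
} outdev_sketch_t;

static bool indev_push(indev_sketch_t *dev, char ch)
{
    if (dev->count == INDEV_BUFLEN)
        return false;  /* buffer full, character dropped */
    dev->buffer[(dev->head + dev->count) % INDEV_BUFLEN] = ch;
    dev->count++;
    return true;
}

static bool indev_pop(indev_sketch_t *dev, char *ch)
{
    assert(ch != NULL);
    if (dev->count == 0)
        return false;  /* nothing buffered */
    *ch = dev->buffer[dev->head];
    dev->head = (dev->head + 1) % INDEV_BUFLEN;
    dev->count--;
    return true;
}

static void outdev_putc(outdev_sketch_t *dev, char ch)
{
    dev->write(dev, ch);
}

/* Example output driver that captures characters into a string. */
typedef struct {
    char text[16];
    size_t len;
} capture_t;

static void capture_write(outdev_sketch_t *dev, char ch)
{
    capture_t *cap = dev->data;
    if (cap->len + 1 < sizeof(cap->text)) {
        cap->text[cap->len++] = ch;
        cap->text[cap->len] = '\0';
    }
}
```

With two separate types, a driver that can only produce characters never has to provide a dummy write operation, and a driver that can only consume them carries no input buffer.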

On the userspace side, the kbd server is still a monolithic piece of software with limited hardware support and with its flexibility fixed at compile time. We are considering changing this towards a more modular scheme in which there would be one running port driver task per device instance. That would allow us to move the configuration from compile time to runtime.