Monday, September 29, 2008

New subject at Charles University

It looks like I'll be teaching this course at Charles University in the summer semester. The course is basically an extension of the mini CDA course that I and my two colleagues delivered at MFF UK in April. It is also an extension to the internal Sun x86 CDA training, that I am going to deliver with my companions this October.

In addition to the mini version, the full-fledged university course will cover 32-bit x86 architecture too and compared to both the mini and the internal courses, we'll also venture into CDA on SPARC. We don't want people to suspect us that we are using this university subject to market Solaris based on its superior observability features, so the course shall teach people how to tackle crashes on Linux and Windows as well, even though at least in the Windows case we plan to invite an expert in that area.

Wednesday, September 3, 2008

Keeping the balance right

In order to balance my previous blog, I decided to sum up some recent achievements on the HelenOS front. These things are well known to people who follow our mailing list, but I suspect that the majority of people out there are not even subscribed.

I scratched two of my long time kernel itches: the infamous "Sleep not implemented" panic and the kernel's inability to allocate memory from a low-memory region.

The "Sleep not implemented" panic occurred every time when the kernel invoked frame_alloc() in its blocking mode under very low memory conditions. That happened when the (virtual) machine had less memory than what the system required. After my modification, the offending thread would be simply put to sleep instead of panicing the kernel. When there is enough memory, the thread can complete the allocation request. The catch is that when all of the threads just allocate and none of them frees memory, all the threads will be put to sleep and none will run. You still have the possibility to break into kconsole and kill some of the tasks so that its allocated memory is freed and others can run. You are out of luck when the one which went to sleep is the kconsole thread itself (that could happen when running the kernel tests which allocate lots of memory). The situation can be further improved when we have support for paging pages out and back in. Btw, this improvement finally found good use for condition variables.

As mentioned above, the kernel can now theoretically allocate memory from the lowest 4G (contrary to allocating from all memory available). This is necessary, for example, for allocation of secondary processors' GDT on amd64. The fix works in the following way. When a memory zone is created, it is associated with flags that describe it. The frame allocator was slightly altered to only consider zones matching flags passed along with the request, if some were specified by the caller. What remains to be done is to change the way how memory zones are created in the architecture specific initialization code. In particular, it is necessary to avoid merging zones with incompatible flags.

My biggest achievement, however, is the read-only support for FAT16 as the root file system. Together with Tim Post's addition of a simple command line interface, Martin Decky's boot-from-ramdisk infrastructure and Jiri Svoboda's task loader, the system is now much more usable than before. The biggest differentiator is that you can now interactively walk the file system (either FAT or TMPFS) and spawn as many new tasks (located on the file system) as you want. Due to ramdisk size issues, the FAT filesystem is now usable only on ia32 and amd64, but this is likely to change soon.

Tuesday, September 2, 2008

My failure to implement &&

I spent half of the past weekend implementing support for the logical AND operator for the Lestes compiler. I should say right at the beginning that I still haven't succeeded, but I decided to share the details of my struggle with you, the kind reader.

The state of the current && and || support is that the parser recognizes these constructs and generates semantic structures for them. The problem is that the part of the compiler which is supposed to generate the respective pseudo code is currently a mere assert() saying that these operators are not yet implemented.

The particular problem of implementing these two is their short-circuit evaluation. When you have, for example, A && B, you first need to evaluate A and if it evaluates to true, only then you can start evaluating B. Without this special feature of C/C++, I'd be able to implement && and || in half a minute, using the already existing template for binary operands. With short-circuit evaluation in place, I need to insert a couple of extra pseudo instructions in the right order:

  1. psp:
  2. psp(A):
  3. L = evaluate(A)
  4. nsp(A):
  5. PI_BF(L, msp2) /* skip evaluation of B */
  6. psp(B)
  7. R = evaulate(B)
  8. nsp(B)
  9. DEST = PI_LAND(L, R) /* logical AND */
  10. msp1:
  11. PI_BA(nsp)
  12. msp2:
  13. PI_MOV(L, DEST)
  14. nsp:
In the pseudo code snippet above, psp stands for previous sequence point and nsp stands for next sequence point. You can see that the sequence points figure as destinations for branch instructions PI_BF and PI_BA and that each expression is surrounded by the psp-nsp pair. nsp(A) and psp(B) are essential in that I must place the PI_BF instruction somewhere between the two, enforcing thus the order of evaluation to start with subexpression A. I added two extra sequence points msp1 and msp2 to help me to enforce order on the pseudo instructions in steps 9 through 14.

While I am quite successful in generating code for the end of the sequence, I can't figure out how to place the PI_BF pseudo instruction without generating an infinite loop. My guess is that somewhere in the Lestes code, something enforces the opposite order of evaluation of A and B. That would, of course, made the topological sort algorithm spin forever. I've tried to place the instruction between psp and nsp of the entire expression. That got me rid of the infinit loop, but I didn't get the right order.

I also can't figure out how to express the above 14 steps in the SSA form so that the DEST register is assigned to only once and there exists a single pseudo instruction which can be denoted as the origin of the register value.

Well, I guess I need to keep on trying. Every failure like this teaches me something new from Lestes and maybe it's worth the trouble just because of that.