Note: This is an experimental planet, if you have a feed you would like to add or you would like your feed to be removed please contact me.

Planet 9

January 19, 2012

research!rsc

Regular Expression Article #4

In January 2007 I posted an article on my web site titled “Regular Expression Matching Can Be Simple And Fast.” I intended this to be the first of three; the second would explain how to do submatching using automata, and the third would explain how to make a really fast DFA. These were inspired by my work on Google Code Search.

Today, the fourth article in my three-part series is available, accompanied by source code (as usual). This one describes how Code Search worked.

by rsc (noreply@blogger.com) at January 19, 2012 05:24 PM

January 01, 2012

Command Center

Esmerelda's Imagination

An actress acquaintance of mine—let's call her Esmerelda—once said, "I can't imagine being anything except an actress." To which the retort was given, "You can't be much of an actress then, can you?"

I was reminded of this exchange when someone said to me about Go, "I can't imagine programming in a language that doesn't have generics." My retort, unspoken this time, was, "You can't be much of a programmer, then, can you?"

This is not an essay about generics (which are a fine thing and may arrive in Go one day, or may not) but about imagination, or at least what passes for imagination among computer programmers: complaint. A friend observed that the definitive modern pastime is to complain on line. For the complainers, it's fun, for the recipients of the complaint it can be dispiriting. As a recipient, I am pushing back—by complaining, of course.

Not so long ago, a programmer was someone who programs, but that seems to be the last thing programmers do nowadays. Today, the definition of a programmer is someone who complains unless the problem being solved has already been solved and whose solution can be expressed in a single line of code. (From the point of view of a language designer, this reduces to a corollary of language success: every program must be reducible to single line of code or your language sucks. The lessons of APL have been lost.)

A different, more liberal definition might be that a programmer is someone who approaches every problem exactly the same way and complains about the tools if the approach is unsuccessful.

For the programmer population, the modern pastime demands that if one is required to program, or at least to think while programming, one blogs/tweets/rants instead. I have seen people write thousands of words of on-line vituperation that problem X requires a few extra keystrokes than it might otherwise, missing the irony that had they spent those words on programming, they could have solved the problem many times over with the saved keystrokes. But, of course, that would be programming.

Two years ago Go went public. This year, Dart was announced. Both came from Google but from different teams with different goals; they have little in common. Yet I was struck by a property of the criticisms of Dart in the first few days: by doing a global substitution of "Go" for "Dart", many of the early complaints about Go would have fit right into the stream of Dart invective. It was unnecessary to try Go or Dart before commenting publicly on them; in fact, it was important not to (for one thing, trying them would require programming). The criticisms were loud and vociferous but irrelevant because they weren't about the languages at all. They were just a standard reaction to something new, empty of meaning, the result of a modern programmer's need to complain about everything different. Complaints are infinitely recyclable. ("I can't imagine programming in a language without XXX.") After all, they have a low quality standard: they need not be checked by a compiler.

A while after Go launched, the criticisms changed tenor somewhat. Some people had actually tried it, but there were still many complainers, including the one quoted above. The problem now was that imagination had failed: Go is a language for writing Go programs, not Java programs or Haskell programs or any other language's programs. You need to think a different way to write good Go programs. But that takes time and effort, more than most will invest. So the usual story is to translate one program from another language into Go and see how it turns out. But translation misses idiom. A first attempt to write, for example, some Java construct in Go will likely fail, while a different Go-specific approach might succeed and illuminate. After 10 years of Java programming and 10 minutes of Go programming, any comparison of the language's capabilities is unlikely to generate insight, yet here come the results, because that's a modern programmer's job.

It's not all bad, of course. Two years on, Go has lots of people who've spent the time to learn how it's meant to be used, and for many willing to invest such time the results have been worthwhile. It takes time and imagination and programming to learn how to use any language well, but it can be time well spent. The growing Go community has generated lots of great software and has given me hope, hope that there may still be actual programmers out there.

However, I still see far too much ill-informed commentary about Go on the web, so for my own protection I will start 2012 with a resolution:

I resolve to recognize that a complaint reveals more about the complainer than the complained-about. Authority is won not by rants but by experience and insight, which require practice and imagination. And maybe some programming.

by rob (noreply@blogger.com) at January 01, 2012 07:17 PM

October 22, 2011

newsham

Mu, Recursive types, functor composition...

I wanted to goof off a little bit with decomposing types using Mu, functor composition and deconstructors. This was inspired by some blog posts:http://mainisusuallyafunction.blogspot.com/2010/12/type-level-fix-and-generic-folds.html and http://blog.plover.com/prog/springschool95-2.html
If this all looks a bit like mental masturbation, it probably is, but it was quite enjoyable.  If you're not into Haskell and functional programming you probably want to skip this.


{-# OPTIONS_GHC -XTypeOperators #-}
{- Recursive types in haskell. -}
module Main where
import Prelude hiding ( Maybe, Just, Nothing, maybe, sum, succ, map )

---- Pairs
data Pair a b = Pair a b

pair :: (a -> b -> c) -> Pair a b -> c
pair f (Pair a b) = f a b

instance Functor (Pair a) where
fmap f (Pair a b) = Pair a (f b)


---- Optional type
data Maybe a = Nothing | Just a

maybe :: c -> (a -> c) -> Maybe a -> c
maybe f _ Nothing = f
maybe _ g (Just x) = g x

instance Functor Maybe where
fmap f = maybe Nothing (Just . f)


---- Recursive types
newtype Mu a = In { mu :: (a (Mu a)) }

cata :: Functor f => (f a -> a) -> Mu f -> a
cata f = f . fmap (cata f) . mu


---- recursive peano numbers
--type Peano = Mu Maybe

zero = In $ Nothing
succ n = In $ Just n

mkNat 0 = In $ Nothing
mkNat n | n == 0 = zero
| n > 0 = succ $ mkNat (n - 1)
| n < 0 = error "bad natural"

unNat = cata $ maybe 0 (+ 1)

add x = cata $ maybe x succ
mul x = cata $ maybe zero $ add x


---- Composition of functors
newtype (g :. f) a = O { unO :: g (f a) }

instance (Functor f, Functor g) => Functor (g :. f) where
fmap f (O x) = O . (fmap . fmap) f $ x


---- proto recursive lists
type ListX a = Maybe :. Pair a

listx :: c -> (a -> b -> c) -> ListX a b -> c
listx f g = maybe f (pair g) . unO


---- recursive lists (ie. of ints)
--type ListI = Mu (ListX Int)

nil = In . O $ Nothing
cons x xs = In . O . Just $ Pair x xs

mkList [] = nil
mkList (x:xs) = cons x $ mkList xs

unList = cata $ listx [] (:)

sum = cata $ listx 0 (+)
map f = cata $ listx [] (\a b -> f a : b)


---- test it out...
main = do
let p m x = putStrLn (m ++ ": " ++ show x)
--
p "nat" $ unNat $ mkNat 5
p "add" $ unNat $ add (mkNat 5) (mkNat 3)
p "mul" $ unNat $ mul (mkNat 5) (mkNat 3)
--
p "list" $ unList $ mkList [1,2,3,4,5]
p "sum" $ sum $ mkList [1,2,3,4,5]
p "map" $ map (*2) $ mkList [1,2,3,4,5]

by newsham (noreply@blogger.com) at October 22, 2011 09:39 AM

October 06, 2011

newsham

Code Safety and Correctness II

In a previous post I talked about code safety and correctness for a small example function, memset. That article showed an implementation using imperative programming and we reasoned about it using Hoare Logic.  In this article we'll take a look at reasoning about safety and correctness when using functional programming. We'll continue to use notation introduced last time, so you may want to go back and skim the article if you haven't read it yet.

Halfway There

We'll start by looking at a C implementation of memcpy written in a functional style:

void memcpy(char *p, char *q, int n)
{
if(n > 0) {
p[0] = q[0];
memcpy(p+1, q+1, n-1);
}
}

This is very non-idiomatic C code and for practical reasons we'd probably never write the function this way. If translated directly to assembly this code will execute much more slowly than the version we looked at last time. However, with certain optimizations it can actually execute just as quickly.

What makes this code different than our last implementation?  The major difference is there are almost no state changes to track during execution. The parameters p, q and n that are passed in retain their value throughout the function. There is no difference between p at the end and p_arg (as we called it last time) at the start.  In pure functional programming there is no difference between assignment and equality; since values never change, once a value is assigned to a variable, that value is equal to that value throughout its scope. This greatly eases the analysis.

We're not all the way there yet, though. This implementation of memcpy is still imperative as there is a state change when p[0] is set to q[0]. As a result each call to memcpy causes a state change and in fact that was the whole point. Memcpy is a primitive for changing the state of mutable arrays and is inherently imperative.

Our analysis will still make use of Hoare Logic. We start with the same precondition as we used last time:


// alloc(p[0..n]), init(q[0..n]), n >= 0
1: void memcpy(char *p, char *q, int n)
2: {
// alloc(p[0..n]), init(q[0..n]), n >= 0
3: if(n > 0) {
// alloc(p[0..n]), init(q[0..n]), n > 0
4: p[0] = q[0]; // safe
5: memcpy(p+1, q+1, n-1);
6: } else {
7: }
8: }

By propagating the precondition we can immediately see that the read of q[0] and the write of p[0] are safe. We would like to show that after the function completes all of p[0..n] equals q[0..n]. Let's assume this and see if we can verify it:

// alloc(p[0..n]), init(q[0..n]), n >= 0
1: void memcpy(char *p, char *q, int n)
2: {
// alloc(p[0..n]), init(q[0..n]), n >= 0
3: if(n > 0) {
// alloc(p[0..n]), init(q[0..n]), n > 0
4: p[0] = q[0];
// p[0] = q[0], init(p[0]),
// alloc(p[0..n]), init(q[0..n]), n > 0
5: memcpy(p+1, q+1, n-1);
6: } else {
// n = 0, p[0..n] = q[0..n]
7: }
// p[0..n] = q[0..n]?
8: }

The desired postcondition is trivially true in the else clause when n is zero (since [0..n] is empty). The only step remaining is to show that the desired postcondition is true at the end of the then clause. At line 5 we've met the precondition for calling memcpy. This is made more obvious if we rewrite the precondition to line 5. Therefore we can use our assumed postcondition after the call at line 5. This is induction.

// alloc(p[0..n]), init(q[0..n]), n >= 0
1: void memcpy(char *p, char *q, int n)
2: {
// alloc(p[0..n]), init(q[0..n]), n >= 0
3: if(n > 0) {
// alloc(p[0..n]), init(q[0..n]), n > 0
4: p[0] = q[0];
// p[0] = q[0], init(p[0]),
// alloc((p+1)[0..n-1]), init((q+1)[0..n-1]),
// (n-1) >= 0
5: memcpy(p+1, q+1, n-1);
// p[0] = q[0], init(p[0]),
// p[1..n] = q[1..n], init(p[1..n])
6: } else {
// n = 0, init(p[0..n]), p[0..n] = q[0..n]
7: }
// init(p[0..n]), p[0..n] = q[0..n]
8: }

We have shown that our assumption is true. We can see that there can be no integer overflows of p or q except when n is zero by our earlier definitions of alloc and init. We can also see that there can be no integer underflows of n. We have shown the function to be both safe and correct.

Notice that this proof is quite a bit simpler than the proof we constructed last time.  Since p, q and n do not change throughout, the proof can be summarized as:


// alloc(p[0..n]), init(q[0..n]), n >= 0
1: void memcpy(char *p, char *q, int n)
2: {
// alloc(p[0..n]), init(q[0..n]), n >= 0
3: if(n > 0) {
4: p[0] = q[0]; // safe since n > 0
5: memcpy(p+1, q+1, n-1); // precond met
// p[0] = q[0] and p[1..n] = q[1..n]
6: }
// init(p[0..n]), p[0..n] = q[0..n]
7: }


We should point out that this proof has the same shortcomings pointed out last time. Most importantly, p and q can be aliased together in which case q can be altered and the correctness argument invalidated.

What's Next?

By using a functional style we were able to greatly improve the ease of reasoning about our code. Unfortunately the code we wrote doesn't run very efficiently in C. The C language is not ideally suited for writing in a functional style as it does not have some optimizations needed to run recursive code efficiently. It is also missing some features that would promote the use of recursion instead of iteration. To fully explore the functional programming style we'd be better suited working in another language.

Although we were able to simplify our proof by using some functional techniques, we still came against some of the same limitations. These problems arose because our program is still imperative and still mutates data (although in a more limited way than our previous program). We can take this further and write programs in a purely functional style by using immutable data types.  This brings many advantages but also a few downsides. In the end there will always be some imperative actions going on in our program (they still have to execute on imperative CPUs!) but we can greatly limit the amount of imperative code we have to reason about.

by newsham (noreply@blogger.com) at October 06, 2011 08:54 PM

Safety, Correctness and Strncpy - a quick note

Yesterday I put up an article on code safety and correctness. I want to point out a quick observation about strncpy today. Strncpy and other similar functions have been proposed as a safe alternative to functions like strcpy. The advantage that strncpy has over strcpy is that if you pass in the buffer bounds of your destination buffer, it will guarantee to never write beyond the end of your destination buffer. In other words, it is easier to use strncpy in a safe manner.

When it comes time to analyze a function like strncpy we face a problem.  There are two ways in which strncpy can be used -- the first is as a safer variant of strcpy. The second is as a truncating variant of strcpy. These are really two different functions and ideally would be served by two different API functions. When analyzing a program that uses strncpy we have to determine which of these two cases was intended. Why? The preconditions for safe use of strncpy are the same in both situations, however the preconditions to guarantee correct behavior are different in each situation! When strncpy is used as a replacement for strcpy, it will only behave correctly when the source string fits in the destination buffer. When strncpy is used to truncate a string, this precondition is not necessary.

Why is this important?  It becomes a lot harder to determine if uses of strncpy are correct. Consider a static code analyzer that encounters a strncpy call. It may not have the context required to determine if a call that may truncate its result is correct or not since it may not be able to infer the intent of the programmer. So while strncpy may have been handy to improve the safety of programs, it may actually make it harder to ensure a programs correctness.



Edit: By request here are two small examples.  This first example shows a use of strncpy to copy a substring:


// find the last dot, copy everything up to the dot.
p = strrchr(name, '.');
if(p && idx < sizeof(namebuf) - 1) {
int idx = p - name;
strncpy(namebuf, name, idx);
namebuf[idx] = 0;
}

This second example shows strncpy being used as a safe replacement for strcpy:

strncpy(namebuf, name, sizeof namebuf-1);
namebuf[sizeof namebuf - 1] = 0;

by newsham (noreply@blogger.com) at October 06, 2011 07:38 PM

October 03, 2011

newsham

Code Safety and Correctness

In this article I am going to go over some of the concepts used when analyzing C code for safety and correctness.  This analysis is based on some formal theory but the concepts should already be familiar to most programmers.  All of our analysis will be informal.

The memcpy function

We'll discuss the safety and correctness of the memcpy function implemented with the following C code:

1: void memcpy(char *p, char *q, int n)
2: {
3: while(n--)
4: *p++ = *q++;
5: }

To be really concrete we would have to define exactly what the C language means and be mindful of some of the cases where the language may leave behavior undefined.  To simplify matters we'll assume that the code is translated in a direct manner (ie. without optimization) on an "ILP32" machine; that is, a machineon which integers and pointers are both 32-bit values storedas 2's complement numbers. In our fictitious dialect of Cthere are no undefined behaviors. For example, if you add oneto (char)127 you'll get the well defined (char)-128 accordingto the "normal" behavior you'd expect with a CPU implementing2's complement numbers.

Now let's consider some fairly straightforward questions:Is this function "safe"? Is it "correct"? We can't answer these questions without defining what safety and correctness mean.

What operations are not safe? In this small piece of code thereare several things that can go wrong. We're reading from memorythrough a pointer. The pointer could be invalid. It could bevalid and point to some data that we didn't intend to read.It could point to a memory cell that has never been initializedwith valid data. We're also writing to memory through a pointer.The pointer could point to invalid memory or to memory we didnot intend to write. We're doing arithmetic on integers andpointers. We might expect these values to operate as integerswhen in fact they are represented as finite 32-bit values.

Some notation

We can introduce some notation to define what we mean bysafe operations.We'll define overflow(e) to mean that the expression e has a different result when evaluated with a finiterepresentation than the result if all values are treatedas ideal integers. In our analysis we'll repeated beusing genuine integers, not the finite representationsprovided by C. We'll treat the use of any values after an overflow occurs as an unsafe operation. (Note: if the programmer wishes to rely on an the behavior of finite arithmetic she can explicitly use modular arithmetic or mask off select bits, in which case the expression would evaluate to the same value when using finite or arbitrary sized integers).

When considering buffers we would like to discuss a fewproperties of the memory cells occupied by the buffer.We'll use the notation [n..m] to denote the range of integers(again, using genuine integers, notfinite 32-bit representations) from n to m-1. For example[3..6] will represent the numbers 3, 4 and 5 but not 6.The ranges [6..6] and [7..2] are both empty.

We'll use the notation alloc(p[n]) to denote that thememory storage at (p+n) has been allocated to the p buffer.We'll use the notation alloc(p[n..m]) to represent thatalloc(p[x]) holds for all x in [n..m] and we'll use similarnotation later for other properties over a range.For example, alloc(p[3..6]) means that p[3], p[4] and p[5]are all allocated to p.We'll also asume that values are allocated in such a waythat elements never wrap-around the address space.Said differently, alloc(p[n]) implies that overflow(p+n) is false.We'll treat any writes to p[n] to be unsafe unless alloc(p[n]) istrue.

We'll use the notation init(p[n]) to denote that thememory cell at p+n is allocated and has been filled with meaningful data.We'll treat any reads of p[n] to be unsafe unless init(p[n]) is true.

Given these definitions its quite clear that we cannot guaranteethe safety of the memcpy function for all possiblearguments (consider memcpy(NULL, "test", 1000)!). The only way we could declare this function safe isif we imposed additional restrictions on the function. A reasonable set of restrictions are: n >= 0, alloc(p[0..n]) andinit(q[0..n]). These restrictions say that the bytes fromthe source buffer have been initialized and the bytes of the destinationbuffer have been allocated. Lets analyze the program to see ifwe can convince ourselves that memcpy is safe with these restrictions.

Safety

To analyze the function we're going to informally useHoare Logic. In Hoare Logic we list what is known before each statement is executed (the precondition), the statement to be executed, and what can be inferred after the statement is executed (the postcondition).  It will be a lot easier to write down our analysis if we expand our definition of memcpy with a single side effect per line:

1:  void memcpy(char *p, char *q, int n)
2: {
3: while(1) {
4: int oldn = n;
5: n--;
6: if(!oldn)
7: break;
8: char c = *q;
9: *p = c;
10: p++;
11: q++;
12: }
13: }

To start our analysis, we write down our restriction on the arguments as a precondition to the function.  This will also be the precondition to the while statement, which is the first statement in the function:

    { n >= 0, alloc(p[0..n]), init(q[0..n]) }
1: void memcpy(char *p, char *q, int n)
2: {
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
3: while(1) {
4: int oldn = n;
5: n--;
6: if(!oldn)
7: break;
8: char c = *q;
9: *p = c;
10: p++;
11: q++;
12: }
13: }

The precondition to the first statement inside the while loop must agree with the precondition of the while statement itself as well as the postcondition of any statement that loops back to the start of the while loop.

To analyze the loop we usually look for some property thatremains true throughout the loop. Such properties arecalled loop invariants. In this case it appears that theprogram increments p and q and decrements n in step so thatp+n and q+n remain invariant (except for brief periods between the update of n, p and q). It also looks like theprecondition of the loop might remain true throughout theloop. We know that the property holds during the firstiteration, but it must also hold when control loops aroundfor us to state it as a precondition to line 4. Let's assumethis is the case for now and verify that it is indeed the case (marking our assumption with a question mark):

{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
1: void memcpy(char *p, char *q, int n)
2: {
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
3: while(1) {
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }?
4: int oldn = n;
5: n--;
6: if(!oldn)
7: break;
8: char c = *q;
9: *p = c;
10: p++;
11: q++;
12: }
13: }

Filling in the postcondition after introducing the oldn variable is pretty simple.  The postcondition after decrementing n is a little more interesting; for example since alloc(p[0..n]) was true before decrementing n, alloc([0..n+1]) must be true after n is replaced with n-1:
 
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }?
4: int oldn = n;
{ oldn >= 0, n >= 0,
alloc(p[0..n]), init(q[0..n]) }?
5: n--;
{ oldn >= 0, n >= -1,
alloc(p[0..n+1]), init(q[0..n+1]) }?
6: if(!oldn)
{ oldn == 0, n = -1,
alloc(p[0..n+1]), init(q[0..n+1]) }?
7: break;

We note that there is no overflow(n-1) at line 5 since n >= 0. The break is the only exit point for the loop so the postconditionof the whole loop will be the precondition from the break statement.Note that since n is fixed we can rewrite the precondition to line 7 as:n = -1, alloc(p[0..0]), init(q[0..0]) and since the range[0..0] is empty we can discard the alloc() and init() properties entirely.

Continuing our analysis, we note that after the if statementfails n cannot be -1:
 
7: break;
{ oldn != 0, n > -1,
alloc(p[0..n+1]), init(q[0..n+1]) }?
8: char c = *q;
{ oldn != 0, n > -1,
alloc(p[0..n+1]), init(q[0..n+1]) }?
9: *p = c;
{ oldn != 0, n > -1,
alloc(p[0..n+1]), init(q[0..n+1]) }?
10: p++;
{ oldn != 0, n > -1,
alloc(p[-1..n]), init(q[0..n+1]) }?
11: q++;
{ oldn != 0, n > -1,
alloc(p[-1..n]), init(q[-1..n]) }?

At line 8 we arrive at an access of q[0]. We can rule thisaccess as safe since by init(q[0..n+1]) with n > -1.  At line 9 a value is stored into q[0] which is safe by alloc(p[0..n+1])with n > -1.At lines 10 and 11 the increments of p and q will not overflowif n > 0 by alloc(p[-1..n]) and init(p[-1..n]).However, if n = 0 we cannot rule out an overflow!Any further access through p and q after this point could be invalid.If there were other accesses through p and q later in the programwe might have to worry about this, but as we've alreadyshown that the pointer accesses in lines 8 and 9 are valid,we do not need to be concerned.
Finally after lines 10 and 11 we reach the endof the loop with a postcondition: n > -1, alloc(p[-1..n]),init(q[-1..n]). We can rewrite this as: n >= 0, alloc(p[0..n]),init(q[0..n]) by recognizing that [0..n] is a subset of [-1..n].This provides a precondition for the next iteration of the whileloop and confirms our earlier assumption.

Our analysis is complete and we have:

{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
1: void memcpy(char *p, char *q, int n)
2: {
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
3: while(1) {
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
4: int oldn = n;
{ oldn >= 0, n >= 0,
alloc(p[0..n]), init(q[0..n]) }
5: n--; // no overflow(n-1) by n >=0
{ oldn >= 0, n >= -1,
alloc(p[0..n+1]), init(q[0..n+1]) }
6: if(!oldn)
{ oldn == 0, n == -1 }
7: break;
{ oldn != 0, n > -1,
alloc(p[0..n+1]), init(q[0..n+1]) }
8: char c = *q; // read q[0] valid by
// init(q[0..n+1]) and n >= 0
{ oldn != 0, n > -1,
alloc(p[0..n+1]), init(q[0..n+1]) }
9: *p = c; // write p[0] valid by 
// alloc(p[0..n+1]) and n >= 0
{ oldn != 0, n > -1,
alloc(p[0..n+1]), init(q[0..n+1]) }
10: p++; // can overflow(p+1) of n == 0!
{ oldn != 0, n > -1,
alloc(p[-1..n]), init(q[0..n+1]) }
11: q++; // can overflow(q+1) if n == 0!
{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
12: }
{ n == -1 }
13: }

One important safety property we did not yet discuss is termination; if we call memcpy will it ever return?  Its not hard to convince ourselves that it will.  If we start with n >= 0 and the while loop runs until n = 0 and decrements n once per iteration then eventually n will become zero and the loop will terminate.

Correctness

We've shown that given appropriate restrictions, memcpy can be shown to be safe. This does not mean that the functionis correct. To determine if it is correct we first have to comeup with a definition of what correct behavior is.A reasonable choice is to say that after memcpy(d, s, m) is calledall of d[x] equal s[x] for x in [0..n]. Let's see if we can derive this as a postcondition of memcpy using Hoare Logic.

In our function the value of the arguments p, q and n are altered throughout the function. This is problematic since we want to say something about the original values of these arguments in the postcondition. To deal with this, we'll introduce the notation p_arg to denote the original value of the p argument, with similar notation for q and n.  Next we'll define d = p - p_arg which is how far p has moved past p_arg (and how far q has moved past q_arg) while executing memcpy. This will allow us to relate the final value of p (and q) with the initial value. We note that d + n remains invariant with the value n_arg.

With these observations in mind we note that at any point in the execution, p[-d..0] has been set to values from q[-d..0].  The details of the analysis are:

{ n >= 0, alloc(p[0..n]), init(q[0..n]) }
1: void memcpy(char *p, char *q, int n)
2: {
{ d = 0, d+n = n_arg, n >= 0,
init(p[0..0]), p[0..0] = q[0..0] }
3: while(1) {
{ d+n=n_arg, n >= 0,
init(p[-d..0]), p[-d..0] = q[-d..0] }
4: int oldn = n;
{ d+n=n_arg, oldn >= 0, n >= 0,
init(p[-d..0]), p[-d..0] = q[-d..0] }
5: n--;
{ d+n+1 = n_arg, oldn >= 0, n >= -1,
init(p[-d..0]), p[-d..0] = q[-d..0] }
6: if(!oldn)
{ oldn = 0, n = -1, d=n_arg,
init(p[-d..0]), p[-d..0] = q[-d..0] }
7: break;
{ d+n+1 = n_arg, oldn != 0, n > -1,
init(p[-d..0]), p[-d..0] = q[-d..0] }
8: char c = *q;
{ d+n+1 = n_arg, n >= 0,
init(p[-d..0]), p[-d..0] = q[-d..0] }
9: *p = c; // now p[0] = q[0]
{ d+n+1 = n_arg, n >= 0,
init(p[-d..1], p[-d..1] = q[-d..1] }
10: p++; // now d is one larger
{ d+n = n_arg, n >=0,
init(p[-d..0], p[-d..0] = q[-d+1..1] }
11: q++;
{ d+n = n_arg, n >= 0,
init(p[-d..0]), p[-d..0] = q[-d..0] }
12: }
{ n = -1, init(p[-n_arg..0],
p[-n_arg..0] = q[-n_arg..0] }
13: }

(Note: to simplify the analysis slightly and limit the amount of information we write down in the pre- and postconditions we skipped over the analysis that d is (almost always) q - q_arg.  Finally at the end of the procedure we can use our definition of d to rewrite our postcondition as init(p_arg[0..n_arg]), p_arg[0..n_arg] = q_arg[0..n_arg] proving the correctness of the memcpy function.

Can't we simplify?

Our proof was pretty long and involved for such a small function. While it points out the sorts of things we should be thinking about while writing or analyzing a program, we don't always have to go through all of the busy work we walked through in our proof. After getting the hang of manipulating preconditions and postconditions we can skip over many of the steps when we don't need a detailed formal analysis.  For example, the following procedure contains the salient points that can be used to convince yourself or someone else of the safety and correctness of memcpy:

// precond: alloc(p[0..n]), init(q[0..n]), n >= 0
1: void memcpy(char *p, char *q, int n)
2: {
// invariant: alloc(p[-d..n']), init(q[-d..n']),
// p[-d..0] = q[-d..0], n' >= 0
// where n' is n before the postdecrement
// and d = p-p_arg, d = q-q_arg.
3: while(n--) {
// n' > 0
4: *p++ = *q++;
5: }
// d = n_arg, p[-d..0] = q[-d..]
// so p_arg[0..n_arg] = q_arg[0..n]
6: }

What's a proof worth?

We've shown that memcpy is both safe and correct if called correctly.  You might be surprised to learn that there are some conditions where calling memcpy with "correct" arguments can lead to incorrect or unsafe behavior! What exactly is our proof worth?

Consider what happens if you call memcpy with two buffers that overlap.  Let's say we have a buffer p that holds "TESTING 1 2 3" and q that points to the fourth byte of p. What happens if we call memcpy(p, q, 8)? After the call p will contain "ING 1 2 1 2 3". In this case nothing "bad" happened, but p[0..8] is most definitely not equal to q[0..8]! During execution each p[x] was set to each q[x] for x in [0..8] but some of the values of q were altered in the process. Somehow in our analysis we had assumed that the values of q[0..8] would remain constant and they did not. We've encountered something called aliasing. Some of the values of p[x] were aliased to some values of q which resulted in q being altered outside of our analysis.  Another precondition that must hold for our definition of correctness to hold must be that p and q are not aliased to each other.

Are there other ways in which q could have been altered while memcpy executes?  Definitely!  If our program is multi-threaded then another thread could have altered parts of p or q or even worse the state used by memcpy during execution! Even more extremely, our program could be altered during execution by someone using a debugger, or using other OS mechanisms to manipulate the program.  Such manipulation could clearly violate our analysis. There could also be bugs in the compiler that translated our C program to assembly, the operating system that runs the program or in the implementation of the CPU. Hardware failures could result in the program crashing entirely or failing more subtly with the corruption of the buffers or internal state of memcpy.  In short, there are still a wide number of things that can go wrong to violate the assumptions of our proof!  Its important that we try to understand the limitations of our analysis.


Try your own

The strcpy function is very similar to the memcpy program.  Try your hand at analyzing it.  Hint: define Z(p) to be the index of the first index x in the initialized range of p that contains a NIL character. Note that strcpy(s, t) is almost the same as memcpy(s, t, Z(t)+1).

Consider what happens if the arguments to strcpy are aliased. Why is the behavior so different than in the memcpy case?


What Next?

We did some analysis of imperative code using Hoare Logic in this article.  Most programmers already think about the kinds of issues we encountered when they write or analyze programs.  Having notation and tools like Hoare Logic allow us to communicate these ideas efficiently to others or write them down and manipulate them in small, less-error-prone steps so that we can work on more difficult problems than we can easily solve in our head. Writing down our work can also provide another benefit -- it slows us down and focuses our attention on important safety issues. This is especially helpful when using dangerous constructs (like pointer manipulation) in safety critical code (such as inside an operating system kernel).

Unfortunately there is no room for our analysis in our C programs themselves.  We can state restrictions in comments or in documentation and we might even be able to add some asserts to check some of the properties, but unfortunately the C language does not understand or validate any of our analysis.  Ideally all preconditions would be provided as additional arguments to a function and the postcondition would be part of the return value or return type of a function or operation. This would require we work in a language that supported representing and manipulating proofs and would probably burden the programmer with filling in the details of the analysis. However, without the compiler checking over our work, we are bound to make mistakes.

Analysis of imperative programs can be a little awkward as the program state mutates.  At times we want to discuss the particular value of a variable in a pre- or postcondition after it has been mutated (as we did with p_arg for example). Another style of programming, functional programming, can be manipulated mathematically more naturally.  Next time we'll go through a small example in a (mostly) functional programming language.

by newsham (noreply@blogger.com) at October 03, 2011 08:40 PM

September 24, 2011

newsham

Simple instrumenation in Qemu

I recently had a desire to log the dynamic call statements executed by a program.  There are a lot of approaches that can be taken and I chose to instrument Qemu in this instance. In this article I show how I added a very simple tracing function to Qemu.

Qemu is a free and open source system emulator. It works by performing dynamic translation of the code it is running from the instruction set being execute (in my case i386) to whatever instruction set your computer runs natively.  The translation is performed "Just In Time" on individual blocks of code whenever new code is encountered that hasn't been translated yet. In this case we're going to treat Qemu as if it was a computer with a customizable CPU -- one we could easily repurpose to perform something a real cpu doesn't normally do. It turns out that customizing Qemu can be quite simple.

Qemu already has a logging facility that can be used to show what code is being translated, what intermediate operations are emitted, what target instructions are emitted, what blocks are being executed, etc.  You can use these options from the command line by specifying "-D logfile -d list,of,flags" or from the qemu console with the commands "logfile filename" and "log list,of,flags". In the following figure you can see a list of the flag options:


Our first step is to add a new flag called "extras" which we will use to turn on and off any new tracing features we add:


Next we take a look at the current translation of dynamic calls and the code that generates it, so we can figure out where to add our patch. This is as simple as running with "-d in_asm,op" and looking at the resulting log file:


Here we see an instruction we're interested in at 0xf332d.  The code is translated into an intermediate format called TCG.  This format is documented in qemu/tcg/README. Here we see that the address of ebx (which is a variable representing the virtual EBX register) is loaded into the tmp2 register and then dereferenced to load it's value into tmp0.  The remaining instructions emulate pushing the program counter onto the stack before updating the new program counter which is dereferenced off of the "env" variable.  Below we see the code in qemu/target-i386/translate.c that generated this code:


You can find the details of the call instruction in the IA-32 Intel Architecture Software Developer's Manual Volume 2: Instruction Set Reference. The code right before the switch statement handles loading of the target address into the temp0 register for several different variants of the opcode.  The code in the switch statement then generates the code that simulates the stack push of the return address and the setting of the new program counter.

What we would like to do is to call a logging function every time an indirect call is executed.  This function would log the current program counter and the target address.  This is fairly straightforward since the target address has already been computed into the temp0 address.  First we define a new helper function in helper.h. This will define a "gen_helper_tracecall" function which will generate a call to our helper.  All that is left then is to call our helper at the top of the switch statement with the target value from temp0 and the current program counter, and define our helper function.  The helper function simply writes to the log if the "extras" logging flag has been set:


Thats it!  Now if we run qemu with the "-D calls.txt -d extras" flags, we get a log of all of the dynamic call targets in the "calls.txt" file.  Or we could run qemu, switch to the console and turn on tracing with "log extras" and turn it off again with "log none".

So that was pretty simple and it runs fairly quickly.  This is a really powerful technique and we can get a lot of mileage out of it, but it would be good to have more control.  There are a lot of things we can do going forward to make this more powerful.  One thing that comes to mind is adding (operating system specific) hooks that track what process or thread is currently running to allow filtering of events by process and thread. It would also be useful to keep an internal log of unique targets and only emit targets that haven't yet been logged.

by newsham (noreply@blogger.com) at September 24, 2011 11:49 PM

September 18, 2011

Command Center

User experience

[We open in a well-lit corporate conference room. A meeting has been running for a while. Lots has been accomplished but time is running out.]

[The door opens and a tall, tow-headed twenty-something guy in glasses walks in, carrying a Mac Air and a folder.]

Manager:
Oh, here he is. This is Richard. I asked him to join us today. Glad he could make it. He's got some great user experience ideas.

Richard:
Call me Dick.

Manager:
Dick's done a lot of seminal UX work for us.

Engineer:
Hey, aren't you the guy who's arguing we shouldn't have search in e-books?

Dick:
Absolutely. It's a lousy idea.

Engineer:
What?

Dick:
Books are the best UI ever created. They've been perfected over more than 500 years of development. We shouldn't mess with success.

Product manager:
Well, this is a new age. We should be allowed to ...

Dick:
Books have never had search. If we add search, we'll just confuse the user.

Product manager:
Oh, you're right. We don't want to do that.

Engineer:
But e-books aren't physical books. They're not words on paper. They're just bits, information.

Dick:
Our users don't know that.

Engineer:
Yes they do! They don't want simple books, they want the possibilities that electronic books can bring. Do you know about information theory? Have you even heard of Claude Shannon?

Dick:
Isn't he the chef at that new biodynamic tofu restaurant in North Beach?

Engineer:
Uhh, yeah, OK. But look, you're treating books as a metaphor for your user interface. That's as lame as using a trash can to throw away files and folders. We can do so much more!

Dick:
You misunderstand. Our goal is to make computers easier to use, not to make them more useful.

Product manager:
Wow, that's good.

Engineer:
Wow.

Manager:
Let's get back on track. Dick, you had some suggestions for us?

Dick:
Yeah. I was thinking about the work we did with the Notes iPhone app. Using a font that looked like a felt marker was a big help for users.

Engineer:
Seriously?

Dick:
Yes, it made users feel more comfortable about keeping notes on their phone. Having a font that looks like handwriting helps them forget there's a computer underneath.

Engineer:
I see....

Dick:
Yes, so... I was thinking for the Address Book app for Lion, we should change the look to be like a...

Manager:
Can you show us?

Dick:
Yeah, sure. I have a mock-up here.
[Opens laptop, turns it to face the room.]

Product manager:
An address book! That's fantastic. Look at the detail! Leather, seams at the corners, a visible spine. This is awesome!

Engineer:
It's just a book. It's a throwback. What are you doing? Why does it need to look like a physical address book?

Dick:
Because it is an address book!

Engineer:
No it's not, it's an app!

Dick:
It's a book.

Engineer:
You've made it one. This time it's not even a metaphor - it's literally a book. You're giving up on the possibility of doing more.

Dick:
As I said, users don't care about functionality. They want comfort and familiarity. An Address Book app that looks like an address book will be welcome. Soothing.

Engineer:
If they want a paper address book, they can buy one.

Dick:
Why would they do that if they have one on their desktop?

Engineer:
Can they at least change the appearance? Is there a setting somewhere?

Dick:
Oh, no. We know better than the user - otherwise why are we here? Settings are just confusing.

Engineer:
I ... I really don't understand what's going on.

Manager:
That's OK, you don't have to, but I'd like to give you the action item to build it. End of the quarter OK?

Engineer:
Uhhh, sure.

Manager.
Dick, do you have the requirements doc there?

Dick:
Right here.
[Pushes the folder across the desk.]

Engineer:
Can't you just mail it to me?

Dick:
It's right there.

Engineer:
I know, but... OK.

Manager:
That's a great start, Dick. What else do you have?

Dick:
Well, actually, maybe this is the time to announce that I'm moving on. Today is my last day here.

Manager, Product manager:
[Unison] Oh no!

Dick:
Yeah, sorry about that. I've had an amazing time here changing the world but it's tiem for me to seek new challenges.

Manager:
Do you have something in mind?

Dick:
Yes, I'm moving north. Microsoft has asked me to head a group there. They've got some amazing new ideas around paper clips.

FADE

by rob (noreply@blogger.com) at September 18, 2011 10:27 PM

NineTimes - Inferno and Plan 9 News

Inferno for Android phones!

John Floren announced on 9fans that himself and some of the folks at Sandia National Labs have inferno running on a few android devices.

We would like to announce the availability of Inferno for Android phones. Because our slogan is "If it ain't broke, break it", we decided to replace the Java stack on Android phones with Inferno. We've dubbed it the Hellaphone--it was originally Hellphone, to keep with the Inferno theme, but then we realized we're in Northern California and the change was obvious.

The Hellaphone runs Inferno directly on top of the basic Linux layer provided by Android. We do not even allow the Java system to start. Instead, emu draws directly to the Linux framebuffer (thanks, Andrey, for the initial code!) and treats the touchscreen like a one-button mouse. Because the Java environment doesn't start, it only takes about 10 seconds to go from power off to a fully-booted Inferno environment.

As of today, we have Inferno running on the Nexus S and the Nook Color. It should also run on the Android emulator, but we haven't tested that in a long time. The cell radio is supported, at least on the Nexus S (the only actual phone we've had), so you can make phone calls, send texts, and use the data network.

A video John made showing what's working so far.

Great job guys!

by www-data at September 18, 2011 02:16 PM

6th International Workshop on Plan 9

This years iwp9 is going to be held at Universidad Rey Juan Carlos in Fuenlabrada, Madrid, Spain. The workshop is taking place October 20 — 21, 2011. The registration deadline (which is subject to change) is on October 10th.

More information at iwp9.org.

by www-data at September 18, 2011 01:59 PM

August 23, 2011

Command Center

Regular expressions in lexing and parsing

Comments extracted from a code review. I've been asked to disseminate them more widely.

I should say something about regular expressions in lexing and
parsing. Regular expressions are hard to write, hard to write well,
and can be expensive relative to other technologies. (Even when they
are implemented correctly in N*M time, they have significant
overheads, especially if they must capture the output.)

Lexers, on the other hand, are fairly easy to write correctly (if not
as compactly), and very easy to test. Consider finding alphanumeric
identifiers. It's not too hard to write the regexp (something like
"[a-ZA-Z_][a-ZA-Z_0-9]*"), but really not much harder to write as a
simple loop. The performance of the loop, though, will be much higher
and will involve much less code under the covers. A regular expression
library is a big thing. Using one to parse identifiers is like using a
Ferrari to go to the store for milk.

And when we want to adjust our lexer to admit other character types,
such as Unicode identifiers, and handle normalization, and so on, the
hand-written loop can cope easily but the regexp approach will break
down.

A similar argument applies to parsing. Using regular expressions to
explore the parse state to find the way forward is expensive,
overkill, and error-prone. Standard lexing and parsing techniques are
so easy to write, so general, and so adaptable there's no reason to
use regular expressions. They also result in much faster, safer, and
compact implementations.

Another way to look at it is that lexers and parsing are matching
statically-defined patterns, but regular expressions' strength is that
they provide a way to express patterns dynamically. They're great in
text editors and search tools, but when you know at compile time what
all the things are you're looking for, regular expressions bring far
more generality and flexibility than you need.

Finally, on the point about writing well. Regular expressions are, in
my experience, widely misunderstood and abused. When I do code reviews
involving regular expressions, I fix up a far higher fraction of the
regular expressions in the code than I do regular statements. This is
a sign of misuse: most programmers (no finger pointing here, just
observing a generality) simply don't know what they are or how to use
them correctly.

Encouraging regular expressions as a panacea for all text processing
problems is not only lazy and poor engineering, it also reinforces
their use by people who shouldn't be using them at all.

So don't write lexers and parsers with regular expressions as the
starting point. Your code will be faster, cleaner, and much easier to
understand and to maintain.

by rob (noreply@blogger.com) at August 23, 2011 02:38 AM

July 26, 2011

Software Magpie

My latest squeeze

My last audio CD player failed, and my CDs were taking up too much space. I decided finally to go properly digital with my music. I was anxious to control the software, at least to some extent, and to ensure I could use a good range of formats, not just MP3.

I built a mini-ATX box to run Vortexbox, to provide my digital jukebox. Vortexbox is based on a Fedora Linux distribution, preconfigured on installation to act immediately as a jukebox, without having to mess with it further. Being Linux-based, and with source available, should allow arbitrary tinkering.

To feed my Quad 34/306 hi-fi system, I decided to try the Logitech Squeezebox Touch. Until all the bits arrived for the Vortexbox, I have been playing BBC Radio 3 on Internet radio, which works well, even though the Squeezebox is currently using a WiFi connection. (I'm just setting up the Vortexbox now.)

I was pleased to find that I can ssh in to the Touch. On doing so for the first time, I read:
root@squeeze1's password:

This network device is for authorized use only. Unauthorized or improper use
of this system may result in you hearing very bad music. If you do not consent
to these terms, LOG OFF IMMEDIATELY.

Ha, only joking. Now you have logged in feel free to change your root password
using the 'passwd' command. You can safely modify any of the files on this
system. A factory reset (press and hold add on power on) will remove all your
modifications and revert to the installed firmware.

Enjoy!
I found that mildly amusing, but it also emphasises a big change in attitude for consumer devices: this one is clearly my device, and I can change it as I like, not just in trivial ways.

by forsyth (noreply@blogger.com) at July 26, 2011 10:49 AM

July 17, 2011

NineTimes - Inferno and Plan 9 News

Plan 9 from the People's Front of cat-v.org (9front)

Plan 9 has been forked to start a development out of the Bell‐Labs (or whatever they are called these days…). This true community‐approach allows further development of Plan 9, even if the shrinking resources at Bell‐Labs for Plan 9 are vanishing.

The homepage and the code can be both found at Google code. You can boot 9front from the regulary built live cd or build the binaries in your existing Plan 9 installation. Installation instructions and further information can be obtained at the 9front wiki.

Everyone is invited to join the development of 9front. Discussions about 9front are held in #cat‐v on irc.freenode.net.

by www-data at July 17, 2011 03:03 PM

July 03, 2011

catena

Credo example: from lex and c source to cygwin executable, across directories

This article walks through building a trivial example of lex source
code into C, and then compiling the generated C source code into
object files and an executable in a different directory. The point is
to demonstrate using dodep to generate and customize build scripts
and list dependencies, and credo to execute the build process.

Lines which start with ; are shell commands; lines without are output.
Key commands are displayed in red text.

Starting state

We start with the files lorem, src/dcr.l, src/header.c, src/lex.h,
and an empty obj directory. Each line of lorem ends with a
carriage return (not usually printed), which we want to strip.
header.c is an ordinary C source file to compile and link.
lex.h gives us a local header to look for.

Generate source code

; cd src
; echo flex > dcr.c.lex.env

We first go into the src directory, and set the lex shell variable to flex,
to specify which lexer to use. To demonstrate an unusual feature,
that would be valid across ports to different operating systems,¹
we’ll capture it with this command.
¹ Inferno and Plan 9 store variables by name in the directory /env,
different for each process. This gives it a definite advantage in
capturing dependencies on variables, since we can just list the name
of the shell-variable file as a dependency.

src/dcr.c.lex.env:1: flex

This setting applies (is transformed into a shell variable) when credo
processes the target dcr.c, if dcr.c depends on /env/lex.

; dodep (c lex l c) dcr.c
credo dcr.c

We ask dodep to generate a shell script and list of dependencies with
this command. The content of these files comes from a library,
indexed by the four parameters given before the name of the file we
want to build. In this case, we want to look in the “c” namespace,
at the command “lex”, to translate an “l” file to a “c” file.

The generated dependency list includes the name of the lex source file,
and references to the variables lex and lflags.

src/dcr.c.dep:1: 0 dcr.l
src/dcr.c.dep:2: 0 /env/lex
src/dcr.c.dep:3: 0 /env/lflags

Credo passes the name of its target, the C source file, to the shell script,
which prints the name of the C source file if it was able to create it.

src/dcr.c.do:1: #!/dis/sh
src/dcr.c.do:2: c = $1
src/dcr.c.do:3: (sum l) = `{grep '\.l$' $c^.dep}
src/dcr.c.do:4:
src/dcr.c.do:5: if {no $lex} {lex = lex}
src/dcr.c.do:6:
src/dcr.c.do:7: if {
src/dcr.c.do:8: flag x +
src/dcr.c.do:9: os -T $lex $lflags $l
src/dcr.c.do:10: mv lex.yy.c $c
src/dcr.c.do:11: } {
src/dcr.c.do:12: echo $c
src/dcr.c.do:13: }

The hosted-Inferno command os calls host-OS commands. Under Cygwin,
it calls Cygwin or Windows commands.

Dodep finally prints a credo command to build the requested target.

; credo dcr.c
credo dcr.l
credo dcr.c
os -T flex dcr.l
mv lex.yy.c dcr.c
dcr.c

Credo builds all its non-/env dependencies in parallel. Once these
are complete, it determines whether the checksum of each file on which
it depends is different than the checksum stored for that file the
last time credo built this target.

This credo command creates a build-avoidance marker file dcr.l.redont,
which tells further runs of credo that the file dcr.l does not need
any processing, so exit early. It calls the host’s flex program,
on the path set when the Inferno environment started, to create the
target file dcr.c.

It updates the check sums in the file dcr.c.dep, to reflect content at
the time of compilation. This means that each target has its own view
of its dependencies, and any change to any file on which it depends,
or a change to the list of files on which it depends, prompts the
target to rebuild.

src/dcr.c.dep:1: 79f9925952d9cfb5a58270c0e2b67691 dcr.l
src/dcr.c.dep:2: 897a779351421523cadbafccdce22efe /env/lex
src/dcr.c.dep:3: 0 /env/lflags

It creates these checksum files to detect any changes to the target’s
dependencies or build script.

src/dcr.c.dep.sum:1: c33129421847b7d897d4e244caf1b8c0 dcr.c.dep

src/dcr.c.do.sum:1: 71d94d651922bd5ff5b543bba52f47d0 dcr.c.do

It also checksums the target file itself. If credo finds, the next
time it runs, that the current checksum of the target does not match,
then it assumes that the target was changed by hand, and will not
overwrite it.

src/dcr.c.sum:1: 2fed3667f390fd80993e090a7eb92c75 dcr.c

Compile objects and executable

We now have all the source, so we go to the object-file and executable
directory, and ask dodep to provide do and dep files to build dcr.exe.
From the “c”-language toolset, we use “cc” to compile an “o”bject file
into an “exe”cutable.

; cd ../obj
; dodep (c cc o exe) dcr.exe
credo dcr.exe

This time the dodep command created two *.do files, dcr.exe.dep.do and
dcr.exe.do, instead of a do and dep file for the target dcr.exe.

obj/dcr.exe.dep.do:1: #!/dis/sh
obj/dcr.exe.dep.do:2: dep = $1
obj/dcr.exe.dep.do:3: exe = `{echo $dep | sed 's,\.dep,,'}
obj/dcr.exe.dep.do:4: (stem ext o) = `{crext o $exe}
obj/dcr.exe.dep.do:5:
obj/dcr.exe.dep.do:6: adddep $exe /env/ars /env/cc /env/cflags /env/ldflags /env/objs
obj/dcr.exe.dep.do:7: adddep $exe $o
obj/dcr.exe.dep.do:8: adddep $exe `{/lib/do/c/coneed $o}

dcr.exe.dep.do creates the dependency list dcr.exe.dep for the target
dcr.exe by adding a set of credo-specific (ars and objs) and standard
(cc, cflags, and ldflags) shell variables; the primary object file,
named for the executable; and any other object files which the primary
one needs. The tool coneed does this last bit of analysis by compiling
the primary object file, looking for unresolved symbols in available
source code, compiling the source code to make sure, and printing the
set of matching object files.

obj/dcr.exe.do:1: #!/dis/sh
obj/dcr.exe.do:2: exe = $1
obj/dcr.exe.do:3: if {no $cc} {cc = cc}
obj/dcr.exe.do:4: if {
obj/dcr.exe.do:5: flag x +
obj/dcr.exe.do:6: os -T $cc $cflags -o $exe $objs $ars $ldflags
obj/dcr.exe.do:7: } {
obj/dcr.exe.do:8: echo os -T ./$exe
obj/dcr.exe.do:9: }

To gather enough information to compile the executable, credo relays
information from dependencies back to calling targets through *.relay
files, which are shell scripts that set the environment for calling targets.

Finally, to start the build process, we locate the source code,
record which compilation tools to use, and again call credo.

; srcdir = ../src/
; echo cpp-3 > default.cpp.env
; echo -I../src/ > default.cppflags.env
; echo gcc-3 > default.cc.env
; credo dcr.exe
credo dcr.exe.dep
os -T gcc-3 -I../src/ -c ../src/dcr.c
os -T gcc-3 -I../src/ -c ../src/header.c
credo dcr.o.dep
credo dcr.o
credo header.o.dep
credo header.o
credo dcr.exe
os -T gcc-3 -o dcr.exe dcr.o header.o
os -T ./dcr.exe

Credo first runs dcr.exe.dep.do to find and store the dependencies of
dcr.exe in dcr.exe.dep. dcr.exe.dep is an implicit dependency of dcr.exe,
so credo runs its *.do script automatically.

obj/dcr.exe.dep:1: 0 /env/ars
obj/dcr.exe.dep:2: 131eb9ab2f6a65b34e0158de1b321e3c /env/cc
obj/dcr.exe.dep:3: 0 /env/cflags
obj/dcr.exe.dep:4: 0 /env/ldflags
obj/dcr.exe.dep:5: 84a1a3c306e006dd723bebe9df29ee6c /env/objs
obj/dcr.exe.dep:6: 3f0d90c1c7483ad1805080cf9e48d050 dcr.o
obj/dcr.exe.dep:7: 1db618b791dc87ac0bd5504f69434273 header.o

Coneed uses $srcdir to find dcr.c, and the setting of cppflags in
default.cppflags.env to find lex.h. Once coneed compiles dcr.o,
it finds the unresolved symbol for the header function, finds a
definition of header() in header.c, and compiles header.o to verify
that it defines the symbol. Coneed finds no other unresolved symbols
supplied by other source code in $srcdir (c.f. printf), so it stops.
Coneed uses “dodep (c cc c o)” to compile the source code in $srcdir
into object files in the current directory (obj). This generates
*.o.dep.do and *.o.do for each source file: those for dcr are shown,
the ones for header are identical.

obj/dcr.o.dep.do:1: #!/dis/sh
obj/dcr.o.dep.do:2: dep = $1
obj/dcr.o.dep.do:3: o = `{echo $dep | sed 's,\.dep,,'}
obj/dcr.o.dep.do:4: (stem ext c) = `{crext c $srcdir^$o}
obj/dcr.o.dep.do:5:
obj/dcr.o.dep.do:6: adddep $o /env/cc /env/cflags /env/cppflags
obj/dcr.o.dep.do:7: adddep $o $c
obj/dcr.o.dep.do:8: adddep $o `{/lib/do/c/findh $c}

obj/dcr.o.do:1: #!/dis/sh
obj/dcr.o.do:2: o = $1
obj/dcr.o.do:3: (sum c) = `{grep '\.c$' $o^.dep}
obj/dcr.o.do:4:
obj/dcr.o.do:5: if {no $cc} {cc = cc}
obj/dcr.o.do:6:
obj/dcr.o.do:7: if {
obj/dcr.o.do:8: flag x +
obj/dcr.o.do:9: os -T $cc $cflags $cppflags -c $c
obj/dcr.o.do:10: } {
obj/dcr.o.do:11: echo 'objs = $objs '^$o > $o^.relay
obj/dcr.o.do:12: }

Before credo runs dcr.o.do it runs dcr.o.dep.do to generate dcr.o.dep
(header.o likewise). To find the paths to C header files it calls findh,
which gathers header-file #includes from the C source files,
and searches for them in local and system directories using cppflags
and the search list printed by $cpp -v.

obj/dcr.o.dep:1: 131eb9ab2f6a65b34e0158de1b321e3c /env/cc
obj/dcr.o.dep:2: 0 /env/cflags
obj/dcr.o.dep:3: 9134215aef7b4657beb5c4bb7a20d4a1 /env/cppflags
obj/dcr.o.dep:4: 2fed3667f390fd80993e090a7eb92c75 ../src/dcr.c
obj/dcr.o.dep:5: feea1fa232f248baa7a7d07743ee86c4 ../src/lex.h
obj/dcr.o.dep:6: fb584676de41ee148c938983b2338f5b /n/C/cygwin/usr/include/stdio.h
obj/dcr.o.dep:7: f6409b1008743b1866d4ad8e53f925cc /n/C/cygwin/usr/include/string.h
obj/dcr.o.dep:8: 468b1dd86fef03b044dceea020579940 /n/C/cygwin/usr/include/errno.h
obj/dcr.o.dep:9: 4e6678324ba6b69666eba8376069c950 /n/C/cygwin/usr/include/stdlib.h
obj/dcr.o.dep:10: c7575313e03e7c18f8c84a5e13c01118 /n/C/cygwin/usr/include/inttypes.h
obj/dcr.o.dep:11: a8fd5fa102b8f74d1b96c6c345f0e22d /n/C/cygwin/usr/include/unistd.h

obj/header.o.dep:1: 131eb9ab2f6a65b34e0158de1b321e3c /env/cc
obj/header.o.dep:2: 0 /env/cflags
obj/header.o.dep:3: 9134215aef7b4657beb5c4bb7a20d4a1 /env/cppflags
obj/header.o.dep:4: ecd87752a211e69078e6bc37afbb561b ../src/header.c
obj/header.o.dep:5: feea1fa232f248baa7a7d07743ee86c4 ../src/lex.h
obj/header.o.dep:6: fb584676de41ee148c938983b2338f5b /n/C/cygwin/usr/include/stdio.h

At this point credo has the dependencies for dcr.exe, so it works
through them, calling the *.relay files—generated by the object files’
do scripts—to gather object names in $objs.

obj/dcr.o.relay:1: objs = $objs dcr.o

obj/header.o.relay:1: objs = $objs header.o

There’s not much to do since coneed already compiled the object files,
so it links them, and prints an os command to run the executable,
since Inferno can’t directly run a Cygwin executable.

Clean up

To clean up, we remove generated files in the src and obj directories.

; rm -f *.redoing *.redont *.renew *.reold *.sum

This removes all the temporary state files created by credo.
Once this is done credo will rebuild from scratch, and reconstruct
each target’s view of its dependencies.

; rm -f `{lsdo | sed 's,^c?redo ,,'}

This removes all the targets created by do scripts. The lsdo command
(called by credo with no targets) prints credo commands for all the
credo targets in the current directory. For example:

; cd src; credo
credo dcr.c

; cd obj; credo
credo dcr.exe
credo dcr.exe.dep
credo dcr.o
credo dcr.o.dep
credo header.o
credo header.o.dep

Once the targets are removed, credo will unconditionally generate them.

; rm -f *.do *.dep *.relay

This removes all the instructions credo uses to build files.
These may usually be regenerated by dodep, adddep (which adds
to the given target’s dependency list the given files), and
rmdep (which removes the given files).

A single library script is provided to remove the state, target,
and dodep files.

; rm -f *.env

This removes default and per-target environment settings.

Once we remove these sets of files, the directories contain only the
files present in their starting state.


by catena at July 03, 2011 07:51 PM

July 01, 2011

research!rsc

Floating Point to Decimal Conversion is Easy

<style> pre { margin-left: 4em; } p { line-height: 175%; } </style>

Floating point to decimal conversions have a reputation for being difficult. At heart, they're really very simple and straightforward. To prove it, I'll explain a working implementation. It only formats positive numbers, but expanding it to negative numbers, zero, infinities and NaNs would be very easy.

An IEEE 64-bit binary floating point number is an integer v in the range [252, 253) times a power of two: f = v × 2e. Constraining the fractional part of the unpacked float64 to the range [252, 253) makes the representation unique. We could have used any range that spans a multiplicative factor of two, but that range is the first one in which all the values are integers.

In Go, math.Frexp unpacks a float64 into f = fr × 2exp where fr is in the range [½, 1). (C's frexp does too.) Converting to our integer representation is easy:

fr, exp := math.Frexp(f)
v := int64(fr * (1<<53))
e := exp - 53

To convert the integer to decimal, we'll use strconv.Itoa64 out of laziness; you know how to write the direct code.

buf := make([]byte, 1000)
n := copy(buf, strconv.Itoa64(v))

The allocation of buf reserves space for 1000 digits. In general 1/2e requires about 0.7e non-zero decimal digits to write in full. For a float64, the smallest positive number is 1/21074, so 1000 digits is plenty. The second line sets n to the number of bytes copied in from the string representation of the (integer) v. Throughout, n will be the number of digits in buf. Note that we're working with ASCII decimal digits '0' to '9', not bytes 0-9.

Now we've got the decimal for v stored in buf. Since f = v × 2e, all that remains is to multiply or divide buf by 2 the appropriate number of times (e or -e times).

Here's the loop to handle positive e:

for ; e > 0; e-- {
    δ := 0
    if buf[0] >= '5' {
        δ = 1
    }
    x := byte(0)
    for i := n-1; i >= 0; i-- {
        x += 2*(buf[i] - '0')
        x, buf[i+δ] = x/10, x%10 + '0'
    }
    if δ == 1 {
        buf[0] = '1'
        n++
    }
}
dp := n

Each iteration of the inner loop overwrites buf with twice buf. To start, the code determines whether there will be a new digit (δ = 1), which happens when the leading digit is at least 5. Then it runs up the number from right to left, just as you learned in grade school, multiplying each digit by two and using x to carry the result. The digit buf[i] moves into buf[i+δ]. At the end, if the code needs to insert an extra digit, it does. Run e times.

After the loop finishes, we record in dp the current location of the decimal point. Since buf is still an integer (it started as an integer and we've only doubled things), the decimal point is just past all the digits.

Of course, e might have started out negative, in which case we've done nothing and still need to halve buf e times:

for ; e < 0; e++ {
    if buf[n-1]%2 != 0 {
        buf[n] = '0'
        n++
    }
    δ, x := 0, byte(0)
    if buf[0] < '2' {
        δ, x = 1, buf[0] - '0'
        n--
        dp--
    }
    for i := 0; i < n; i++ {
        x = x*10 + buf[i+δ] - '0'
        buf[i], x = x/2 + '0', x%2
    }
}

Dividing by two needs an extra digit if the last digit is odd, to store the final half; in that case we add a 0 to buf so that we'll have room to store a completely precise answer.

After adding the 0, we can set up for the division itself. If the first digit is less than 2, it's going to become a zero; in the interest of avoiding leading zeros we use an initial partial value x and move digits up (δ = 1) during the division. The multiplication ran from right to left copying from buf[i] to buf[i+δ]. The division runs left to right copying from buf[i+δ] to buf[i].

Now buf[0:n] has all the non-zero digits in the exact decimal representation of our f, and dp records where to put the decimal point. We could stop now, but we might as well implement correct rounding while we're here.

The interesting case is when we have more digits than requested (n > ndigit). To make the decision, just like in grade school, we look at the first digit being removed. If it's less than 5, we round down by truncating. If it's greater than 5, we round up by incrementing what's left after truncation. If it's equal to 5, we have to look at the rest of the digits being dropped. If any of the rest of the digits are nonzero, then rounding up is more accurate. Otherwise we're right on the line. In grade school I was taught to handle this case by rounding up: 0.5 rounds to 1. That's a simple rule to teach, but breaking this tie by rounding to the nearest even digit has better numerical properties, because it rounds up and down equally often, and it is the usual rule employed in real calculations.

This all results in the thunderclap rounding condition:

buf[prec] > '5' ||
buf[prec] == '5' && (nonzero(buf[prec+1:n]) || buf[prec-1]%2 == 1)

You can see why children are taught buf[prec] >= '5' instead.

Anyway, if we have to increment after the truncation, we're still working in decimal, so we have to handle carrying incremented 9s ourselves:

i := prec-1
for i >= 0 && buf[i] == '9' {
    buf[i] = '0'
    i--
}
if i >= 0 {
    buf[i]++
} else {
    buf[0] = '1'
    dp++
}

The loop handles the 9s. The if increments what's left, or, if we turned the whole string into zeros, it simulates inserting a 1 at the beginning by changing the first digit to a 1 and moving the decimal place.

Having done that and set n = prec, we can piece together the actual number:

return fmt.Sprintf("%c.%se%+d", buf[0], buf[1:n], dp-1)

That prints the first digit, then a decimal point, then the rest of the digits, and finally the N suffix. The exponent must adjust the number by dp−1 because we are printing all but the first digit after the decimal point.

That's all there is to it. To print f = v × 2e you write v in decimal, multiply or divide the decimal by 2 the right number of times, and print what you're left holding.

The Plan 9 C library and the Go library both use variants of the approach above. On my laptop, the code above does 100,000 conversions per second, which is plenty for most uses, and the code is easy to understand.

Why are some converters more complicated than this? Because you can make them a little faster. On my laptop, glibc's sprintf(buf, "%.50e", M_PI) is about 15 times faster than an equivalent print using the code above, because the implementation of sprintf uses much more sophisticated mathematics to speed the conversion.

If you have a stomach for pages of equations, there are many interesting papers that discuss how to do this conversion and its inverse more quickly:

Just remember: The conversion is easy. The optimizations are hard.

Code

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js" type="text/javascript"></script> <script> function playground() { var code = encodeURIComponent($('
').html(document.getElementById('code').innerHTML).text()) document.location = "http://golang.org/doc/play/#" + code } </script>

The full program is available in this Gist or you can try it using the Go playground.

by rsc (noreply@blogger.com) at July 01, 2011 03:03 PM

June 18, 2011

NineTimes - Inferno and Plan 9 News

9front propaganda

The Plan 9 community always was known to support intelligent jokes. 9front is taking this a bit further. Stanley Lieber (No, it's not him!) has created some artwork for 9front. Thanks to him!

by www-data at June 18, 2011 03:00 PM

May 19, 2011

OS Inferno

Inferno для Windows Mobile

Здесь можно найти сборку и пару скриншотов.

by j1m (noreply@blogger.com) at May 19, 2011 12:38 PM

May 18, 2011

research!rsc

Minimal Boolean Formulas

<style type="text/css"> p { line-height: 150%; } blockquote { text-align: left; } pre.alg { font-family: sans-serif; font-size: 100%; margin-left: 60px; } td, th { padding-left; 5px; padding-right: 5px; vertical-align: top; } #times td { text-align: right; } table { padding-top: 1em; padding-bottom: 1em; } #find td { text-align: center; } </style>

28. That's the minimum number of AND or OR operators you need in order to write any Boolean function of five variables. Alex Healy and I computed that in April 2010. Until then, I believe no one had ever known that little fact. This post describes how we computed it and how we almost got scooped by Knuth's Volume 4A which considers the problem for AND, OR, and XOR.

A Naive Brute Force Approach

Any Boolean function of two variables can be written with at most 3 AND or OR operators: the parity function on two variables X XOR Y is (X AND Y') OR (X' AND Y), where X' denotes “not X.” We can shorten the notation by writing AND and OR like multiplication and addition: X XOR Y = X*Y' + X'*Y.

For three variables, parity is also a hardest function, requiring 9 operators: X XOR Y XOR Z = (X*Z'+X'*Z+Y')*(X*Z+X'*Z'+Y).

For four variables, parity is still a hardest function, requiring 15 operators: W XOR X XOR Y XOR Z = (X*Z'+X'*Z+W'*Y+W*Y')*(X*Z+X'*Z'+W*Y+W'*Y').

The sequence so far prompts a few questions. Is parity always a hardest function? Does the minimum number of operators alternate between 2n−1 and 2n+1?

I computed these results in January 2001 after hearing the problem from Neil Sloane, who suggested it as a variant of a similar problem first studied by Claude Shannon.

The program I wrote to compute a(4) computes the minimum number of operators for every Boolean function of n variables in order to find the largest minimum over all functions. There are 24 = 16 settings of four variables, and each function can pick its own value for each setting, so there are 216 different functions. To make matters worse, you build new functions by taking pairs of old functions and joining them with AND or OR. 216 different functions means 216·216 = 232 pairs of functions.

The program I wrote was a mangling of the Floyd-Warshall all-pairs shortest paths algorithm. That algorithm is:

// Floyd-Warshall all pairs shortest path
func compute():
    for each node i
        for each node j
            dist[i][j] = direct distance, or ∞
    
    for each node k
        for each node i
            for each node j
                d = dist[i][k] + dist[k][j]
                if d < dist[i][j]
                    dist[i][j] = d
    return

The algorithm begins with the distance table dist[i][j] set to an actual distance if i is connected to j and infinity otherwise. Then each round updates the table to account for paths going through the node k: if it's shorter to go from i to k to j, it saves that shorter distance in the table. The nodes are numbered from 0 to n, so the variables i, j, k are simply integers. Because there are only n nodes, we know we'll be done after the outer loop finishes.

The program I wrote to find minimum Boolean formula sizes is an adaptation, substituting formula sizes for distance.

// Algorithm 1
func compute()
    for each function f
        size[f] = ∞
    
    for each single variable function f = v
        size[f] = 0
    
    loop
        changed = false
        for each function f
            for each function g
                d = size[f] + 1 + size[g]
                if d < size[f OR g]
                    size[f OR g] = d
                    changed = true
                if d < size[f AND g]
                    size[f AND g] = d
                    changed = true
        if not changed
            return

Algorithm 1 runs the same kind of iterative update loop as the Floyd-Warshall algorithm, but it isn't as obvious when you can stop, because you don't know the maximum formula size beforehand. So it runs until a round doesn't find any new functions to make, iterating until it finds a fixed point.

The pseudocode above glosses over some details, such as the fact that the per-function loops can iterate over a queue of functions known to have finite size, so that each loop omits the functions that aren't yet known. That's only a constant factor improvement, but it's a useful one.

Another important detail missing above is the representation of functions. The most convenient representation is a binary truth table. For example, if we are computing the complexity of two-variable functions, there are four possible inputs, which we can number as follows.

X Y Value
false false 002 = 0
false true 012 = 1
true false 102 = 2
true true 112 = 3

The functions are then the 4-bit numbers giving the value of the function for each input. For example, function 13 = 11012 is true for all inputs except X=false Y=true. Three-variable functions correspond to 3-bit inputs generating 8-bit truth tables, and so on.

This representation has two key advantages. The first is that the numbering is dense, so that you can implement a map keyed by function using a simple array. The second is that the operations “f AND g” and “f OR g” can be implemented using bitwise operators: the truth table for “f AND g” is the bitwise AND of the truth tables for f and g.

That program worked well enough in 2001 to compute the minimum number of operators necessary to write any 1-, 2-, 3-, and 4-variable Boolean function. Each round takes asymptotically O(22n·22n) = O(22n+1) time, and the number of rounds needed is O(the final answer). The answer for n=4 is 15, so the computation required on the order of 15·225 = 15·232 iterations of the innermost loop. That was plausible on the computer I was using at the time, but the answer for n=5, likely around 30, would need 30·264 iterations to compute, which seemed well out of reach. At the time, it seemed plausible that parity was always a hardest function and that the minimum size would continue to alternate between 2n−1 and 2n+1. It's a nice pattern.

Exploiting Symmetry

Five years later, though, Alex Healy and I got to talking about this sequence, and Alex shot down both conjectures using results from the theory of circuit complexity. (Theorists!) Neil Sloane added this note to the entry for the sequence in his Online Encyclopedia of Integer Sequences:

%E A056287 Russ Cox conjectures that X1 XOR ... XOR Xn is always a worst f and that a(5) = 33 and a(6) = 63. But (Jan 27 2006) Alex Healy points out that this conjecture is definitely false for large n. So what is a(5)?

Indeed. What is a(5)? No one knew, and it wasn't obvious how to find out.

In January 2010, Alex and I started looking into ways to speed up the computation for a(5). 30·264 is too many iterations but maybe we could find ways to cut that number.

In general, if we can identify a class of functions f whose members are guaranteed to have the same complexity, then we can save just one representative of the class as long as we recreate the entire class in the loop body. What used to be:

for each function f
    for each function g
        visit f AND g
        visit f OR g

can be rewritten as

for each canonical function f
    for each canonical function g
        for each ff equivalent to f
            for each gg equivalent to g
                visit ff AND gg
                visit ff OR gg

That doesn't look like an improvement: it's doing all the same work. But it can open the door to new optimizations depending on the equivalences chosen. For example, the functions “f” and “¬f” are guaranteed to have the same complexity, by DeMorgan's laws. If we keep just one of those two on the lists that “for each function” iterates over, we can unroll the inner two loops, producing:

for each canonical function f
    for each canonical function g
        visit f OR g
        visit f AND g
        visit ¬f OR g
        visit ¬f AND g
        visit f OR ¬g
        visit f AND ¬g
        visit ¬f OR ¬g
        visit ¬f AND ¬g

That's still not an improvement, but it's no worse. Each of the two loops considers half as many functions but the inner iteration is four times longer. Now we can notice that half of tests aren't worth doing: “f AND g” is the negation of “¬f OR ¬g,” and so on, so only half of them are necessary.

Let's suppose that when choosing between “f” and “¬f” we keep the one that is false when presented with all true inputs. (This has the nice property that f ^ (int32(f) >> 31) is the truth table for the canonical form of f.) Then we can tell which combinations above will produce canonical functions when f and g are already canonical:

for each canonical function f
    for each canonical function g
        visit f OR g
        visit f AND g
        visit ¬f AND g
        visit f AND ¬g

That's a factor of two improvement over the original loop.

Another observation is that permuting the inputs to a function doesn't change its complexity: “f(V, W, X, Y, Z)” and “f(Z, Y, X, W, V)” will have the same minimum size. For complex functions, each of the 5! = 120 permutations will produce a different truth table. A factor of 120 reduction in storage is good but again we have the problem of expanding the class in the iteration. This time, there's a different trick for reducing the work in the innermost iteration. Since we only need to produce one member of the equivalence class, it doesn't make sense to permute the inputs to both f and g. Instead, permuting just the inputs to f while fixing g is guaranteed to hit at least one member of each class that permuting both f and g would. So we gain the factor of 120 twice in the loops and lose it once in the iteration, for a net savings of 120. (In some ways, this is the same trick we did with “f” vs “¬f.”)

A final observation is that negating any of the inputs to the function doesn't change its complexity, because X and X' have the same complexity. The same argument we used for permutations applies here, for another constant factor of 25 = 32.

The code stores a single function for each equivalence class and then recomputes the equivalent functions for f, but not g.

for each canonical function f
    for each function ff equivalent to f
        for each canonical function g
            visit ff OR g
            visit ff AND g
            visit ¬ff AND g
            visit ff AND ¬g

In all, we just got a savings of 2·120·32 = 7680, cutting the total number of iterations from 30·264 = 5×1020 to 7×1016. If you figure we can do around 109 iterations per second, that's still 800 days of CPU time.

The full algorithm at this point is:

// Algorithm 2
func compute():
    for each function f
        size[f] = ∞
    
    for each single variable function f = v
        size[f] = 0
    
    loop
        changed = false
        for each canonical function f
            for each function ff equivalent to f
                for each canonical function g
                    d = size[ff] + 1 + size[g]
                    changed |= visit(d, ff OR g)
                    changed |= visit(d, ff AND g)
                    changed |= visit(d, ff AND ¬g)
                    changed |= visit(d, ¬ff AND g)
        if not changed
            return

func visit(d, fg):
    if size[fg] != ∞
        return false
    
    record fg as canonical

    for each function ffgg equivalent to fg
        size[ffgg] = d
    return true

The helper function “visit” must set the size not only of its argument fg but also all equivalent functions under permutation or inversion of the inputs, so that future tests will see that they have been computed.

Methodical Exploration

There's one final improvement we can make. The approach of looping until things stop changing considers each function pair multiple times as their sizes go down. Instead, we can consider functions in order of complexity, so that the main loop builds first all the functions of minimum complexity 1, then all the functions of minimum complexity 2, and so on. If we do that, we'll consider each function pair at most once. We can stop when all functions are accounted for.

Applying this idea to Algorithm 1 (before canonicalization) yields:

// Algorithm 3
func compute()
    for each function f
        size[f] = ∞
    
    for each single variable function f = v
        size[f] = 0
    
    for k = 1 to ∞
        for each function f
            for each function g of size k − size(f) − 1
                if size[f AND g] == ∞
                    size[f AND g] = k
                    nsize++
                if size[f OR g] == ∞
                    size[f OR g] = k
                    nsize++
        if nsize == 22n
            return

Applying the idea to Algorithm 2 (after canonicalization) yields:

// Algorithm 4
func compute():
    for each function f
        size[f] = ∞
    
    for each single variable function f = v
        size[f] = 0
    
    for k = 1 to ∞
        for each canonical function f
            for each function ff equivalent to f
                for each canonical function g of size k − size(f) − 1
                    visit(k, ff OR g)
                    visit(k, ff AND g)
                    visit(k, ff AND ¬g)
                    visit(k, ¬ff AND g)
        if nvisited == 22n
            return

func visit(d, fg):
    if size[fg] != ∞
        return
    
    record fg as canonical

    for each function ffgg equivalent to fg
        if size[ffgg] != ∞
            size[ffgg] = d
            nvisited += 2  // counts ffgg and ¬ffgg
    return

The original loop in Algorithms 1 and 2 considered each pair f, g in every iteration of the loop after they were computed. The new loop in Algorithms 3 and 4 considers each pair f, g only once, when k = size(f) + size(g) + 1. This removes the leading factor of 30 (the number of times we expected the first loop to run) from our estimation of the run time. Now the expected number of iterations is around 264/7680 = 2.4×1015. If we can do 109 iterations per second, that's only 28 days of CPU time, which I can deliver if you can wait a month.

Our estimate does not include the fact that not all function pairs need to be considered. For example, if the maximum size is 30, then the functions of size 14 need never be paired against the functions of size 16, because any result would have size 14+1+16 = 31. So even 2.4×1015 is an overestimate, but it's in the right ballpark. (With hindsight I can report that only 1.7×1014 pairs need to be considered but also that our estimate of 109 iterations per second was optimistic. The actual calculation ran for 20 days, an average of about 108 iterations per second.)

Endgame: Directed Search

A month is still a long time to wait, and we can do better. Near the end (after k is bigger than, say, 22), we are exploring the fairly large space of function pairs in hopes of finding a fairly small number of remaining functions. At that point it makes sense to change from the bottom-up “bang things together and see what we make” to the top-down “try to make this one of these specific functions.” That is, the core of the current search is:

for each canonical function f
    for each function ff equivalent to f
        for each canonical function g of size k − size(f) − 1
            visit(k, ff OR g)
            visit(k, ff AND g)
            visit(k, ff AND ¬g)
            visit(k, ¬ff AND g)

We can change it to:

for each missing function fg
    for each canonical function g
        for all possible f such that one of these holds
                * fg = f OR g
                * fg = f AND g
                * fg = ¬f AND g
                * fg = f AND ¬g
            if size[f] == k − size(g) − 1
                visit(k, fg)
                next fg

By the time we're at the end, exploring all the possible f to make the missing functions—a directed search—is much less work than the brute force of exploring all combinations.

As an example, suppose we are looking for f such that fg = f OR g. The equation is only possible to satisfy if fg OR g == fg. That is, if g has any extraneous 1 bits, no f will work, so we can move on. Otherwise, the remaining condition is that f AND ¬g == fg AND ¬g. That is, for the bit positions where g is 0, f must match fg. The other bits of f (the bits where g has 1s) can take any value. We can enumerate the possible f values by recursively trying all possible values for the “don't care” bits.

func find(x, any, xsize):
    if size(x) == xsize
        return x
    while any != 0
        bit = any AND −any  // rightmost 1 bit in any
        any = any AND ¬bit
        if f = find(x OR bit, any, xsize) succeeds
            return f
    return failure

It doesn't matter which 1 bit we choose for the recursion, but finding the rightmost 1 bit is cheap: it is isolated by the (admittedly surprising) expression “any AND −any.”

Given find, the loop above can try these four cases:

Formula Condition Base x “Any” bits
fg = f OR g fg OR g == fg fg AND ¬g g
fg = f OR ¬g fg OR ¬g == fg fg AND g ¬g
¬fg = f OR g ¬fg OR g == fg ¬fg AND ¬g g
¬fg = f OR ¬g ¬fg OR ¬g == ¬fg ¬fg AND g ¬g

Rewriting the Boolean expressions to use only the four OR forms means that we only need to write the “adding bits” version of find.

The final algorithm is:

// Algorithm 5
func compute():
    for each function f
        size[f] = ∞
    
    for each single variable function f = v
        size[f] = 0
    
    // Generate functions.
    for k = 1 to max_generate
        for each canonical function f
            for each function ff equivalent to f
                for each canonical function g of size k − size(f) − 1
                    visit(k, ff OR g)
                    visit(k, ff AND g)
                    visit(k, ff AND ¬g)
                    visit(k, ¬ff AND g)

    // Search for functions.
    for k = max_generate+1 to ∞
        for each missing function fg
            for each canonical function g
                fsize = k − size(g) − 1
                if fg OR g == fg
                    if f = find(fg AND ¬g, g, fsize) succeeds
                        visit(k, fg)
                        next fg
                if fg OR ¬g == fg
                    if f = find(fg AND g, ¬g, fsize) succeeds
                        visit(k, fg)
                        next fg
                if ¬fg OR g == ¬fg
                    if f = find(¬fg AND ¬g, g, fsize) succeeds
                        visit(k, fg)
                        next fg
                if ¬fg OR ¬g == ¬fg
                    if f = find(¬fg AND g, ¬g, fsize) succeeds
                        visit(k, fg)
                        next fg
        if nvisited == 22n
            return

func visit(d, fg):
    if size[fg] != ∞
        return
    
    record fg as canonical

    for each function ffgg equivalent to fg
        if size[ffgg] != ∞
            size[ffgg] = d
            nvisited += 2  // counts ffgg and ¬ffgg
    return

func find(x, any, xsize):
    if size(x) == xsize
        return x
    while any != 0
        bit = any AND −any  // rightmost 1 bit in any
        any = any AND ¬bit
        if f = find(x OR bit, any, xsize) succeeds
            return f
    return failure

To get a sense of the speedup here, and to check my work, I ran the program using both algorithms on a 2.53 GHz Intel Core 2 Duo E7200.

————— # of Functions ————————— Time ————
Size Canonical All All, Cumulative Generate Search
0 1 10 10
1 2 82 92 < 0.1 seconds 3.4 minutes
2 2 640 732 < 0.1 seconds 7.2 minutes
3 7 4420 5152 < 0.1 seconds 12.3 minutes
4 19 25276 29696 < 0.1 seconds 30.1 minutes
5 44 117440 147136 < 0.1 seconds 1.3 hours
6 142 515040 662176 < 0.1 seconds 3.5 hours
7 436 1999608 2661784 0.2 seconds 11.6 hours
8 1209 6598400 9260184 0.6 seconds 1.7 days
9 3307 19577332 28837516 1.7 seconds 4.9 days
10 7741 50822560 79660076 4.6 seconds [ 10 days ? ]
11 17257 114619264 194279340 10.8 seconds [ 20 days ? ]
12 31851 221301008 415580348 21.7 seconds [ 50 days ? ]
13 53901 374704776 790285124 38.5 seconds [ 80 days ? ]
14 75248 533594528 1323879652 58.7 seconds [ 100 days ? ]
15 94572 667653642 1991533294 1.5 minutes [ 120 days ? ]
16 98237 697228760 2688762054 2.1 minutes [ 120 days ? ]
17 89342 628589440 3317351494 4.1 minutes [ 90 days ? ]
18 66951 468552896 3785904390 9.1 minutes [ 50 days ? ]
19 41664 287647616 4073552006 23.4 minutes [ 30 days ? ]
20 21481 144079832 4217631838 57.0 minutes [ 10 days ? ]
21 8680 55538224 4273170062 2.4 hours 2.5 days
22 2730 16099568 4289269630 5.2 hours 11.7 hours
23 937 4428800 4293698430 11.2 hours 2.2 hours
24 228 959328 4294657758 22.0 hours 33.2 minutes
25 103 283200 4294940958 1.7 days 4.0 minutes
26 21 22224 4294963182 2.9 days 42 seconds
27 10 3602 4294966784 4.7 days 2.4 seconds
28 3 512 4294967296 [ 7 days ? ] 0.1 seconds

The bracketed times are estimates based on the work involved: I did not wait that long for the intermediate search steps. The search algorithm is quite a bit worse than generate until there are very few functions left to find. However, it comes in handy just when it is most useful: when the generate algorithm has slowed to a crawl. If we run generate through formulas of size 22 and then switch to search for 23 onward, we can run the whole computation in just over half a day of CPU time.

The computation of a(5) identified the sizes of all 616,126 canonical Boolean functions of 5 inputs. In contrast, there are just over 200 trillion canonical Boolean functions of 6 inputs. Determining a(6) is unlikely to happen by brute force computation, no matter what clever tricks we use.

Adding XOR

We've assumed the use of just AND and OR as our basis for the Boolean formulas. If we also allow XOR, functions can be written using many fewer operators. In particular, a hardest function for the 1-, 2-, 3-, and 4-input cases—parity—is now trivial. Knuth examines the complexity of 5-input Boolean functions using AND, OR, and XOR in detail in The Art of Computer Programming, Volume 4A. Section 7.1.2's Algorithm L is the same as our Algorithm 3 above, given for computing 4-input functions. Knuth mentions that to adapt it for 5-input functions one must treat only canonical functions and gives results for 5-input functions with XOR allowed. So another way to check our work is to add XOR to our Algorithm 4 and check that our results match Knuth's.

Because the minimum formula sizes are smaller (at most 12), the computation of sizes with XOR is much faster than before:

————— # of Functions —————
Size Canonical All All, Cumulative Time
0 1 10 10
1 3 102 112 < 0.1 seconds
2 5 1140 1252 < 0.1 seconds
3 20 11570 12822 < 0.1 seconds
4 93 109826 122648 < 0.1 seconds
5 366 936440 1059088 0.1 seconds
6 1730 7236880 8295968 0.7 seconds
7 8782 47739088 56035056 4.5 seconds
8 40297 250674320 306709376 24.0 seconds
9 141422 955812256 1262521632 95.5 seconds
10 273277 1945383936 3207905568 200.7 seconds
11 145707 1055912608 4263818176 121.2 seconds
12 4423 31149120 4294967296 65.0 seconds

Knuth does not discuss anything like Algorithm 5, because the search for specific functions does not apply to the AND, OR, and XOR basis. XOR is a non-monotone function (it can both turn bits on and turn bits off), so there is no test like our “if fg OR g == fg” and no small set of “don't care” bits to trim the search for f. The search for an appropriate f in the XOR case would have to try all f of the right size, which is exactly what Algorithm 4 already does.

Volume 4A also considers the problem of building minimal circuits, which are like formulas but can use common subexpressions additional times for free, and the problem of building the shallowest possible circuits. See Section 7.1.2 for all the details.

Code and Web Site

The web site boolean-oracle.swtch.com lets you type in a Boolean expression and gives back the minimal formula for it. It uses tables generated while running Algorithm 5; those tables and the programs described in this post are also available on the site.

Postscript: Generating All Permutations and Inversions

The algorithms given above depend crucially on the step “for each function ff equivalent to f,” which generates all the ff obtained by permuting or inverting inputs to f, but I did not explain how to do that. We already saw that we can manipulate the binary truth table representation directly to turn f into ¬f and to compute combinations of functions. We can also manipulate the binary representation directly to invert a specific input or swap a pair of adjacent inputs. Using those operations we can cycle through all the equivalent functions.

To invert a specific input, let's consider the structure of the truth table. The index of a bit in the truth table encodes the inputs for that entry. For example, the low bit of the index gives the value of the first input. So the even-numbered bits—at indices 0, 2, 4, 6, ...—correspond to the first input being false, while the odd-numbered bits—at indices 1, 3, 5, 7, ...—correspond to the first input being true. Changing just that bit in the index corresponds to changing the single variable, so indices 0, 1 differ only in the value of the first input, as do 2, 3, and 4, 5, and 6, 7, and so on. Given the truth table for f(V, W, X, Y, Z) we can compute the truth table for f(¬V, W, X, Y, Z) by swapping adjacent bit pairs in the original truth table. Even better, we can do all the swaps in parallel using a bitwise operation. To invert a different input, we swap larger runs of bits.

Function Truth Table (f = f(V, W, X, Y, Z))
f(¬V, W, X, Y, Z) (f&0x55555555)<< 1 | (f>> 1)&0x55555555
f(V, ¬W, X, Y, Z) (f&0x33333333)<< 2 | (f>> 2)&0x33333333
f(V, W, ¬X, Y, Z) (f&0x0f0f0f0f)<< 4 | (f>> 4)&0x0f0f0f0f
f(V, W, X, ¬Y, Z) (f&0x00ff00ff)<< 8 | (f>> 8)&0x00ff00ff
f(V, W, X, Y, ¬Z) (f&0x0000ffff)<<16 | (f>>16)&0x0000ffff

Being able to invert a specific input lets us consider all possible inversions by building them up one at a time. The Gray code lets us enumerate all possible 5-bit input codes while changing only 1 bit at a time as we move from one input to the next:

0, 1, 3, 2, 6, 7, 5, 4,
12, 13, 15, 14, 10, 11, 9, 8,
24, 25, 27, 26, 30, 31, 29, 28,
20, 21, 23, 22, 18, 19, 17, 16

This minimizes the number of inversions we need: to consider all 32 cases, we only need 31 inversion operations. In contrast, visiting the 5-bit input codes in the usual binary order 0, 1, 2, 3, 4, ... would often need to change multiple bits, like when changing from 3 to 4.

To swap a pair of adjacent inputs, we can again take advantage of the truth table. For a pair of inputs, there are four cases: 00, 01, 10, and 11. We can leave the 00 and 11 cases alone, because they are invariant under swapping, and concentrate on swapping the 01 and 10 bits. The first two inputs change most often in the truth table: each run of 4 bits corresponds to those four cases. In each run, we want to leave the first and fourth alone and swap the second and third. For later inputs, the four cases consist of sections of bits instead of single bits.

Function Truth Table (f = f(V, W, X, Y, Z))
f(W, V, X, Y, Z) f&0x99999999 | (f&0x22222222)<<1 | (f>>1)&0x22222222
f(V, X, W, Y, Z) f&0xc3c3c3c3 | (f&0x0c0c0c0c)<<1 | (f>>1)&0x0c0c0c0c
f(V, W, Y, X, Z) f&0xf00ff00f | (f&0x00f000f0)<<1 | (f>>1)&0x00f000f0
f(V, W, X, Z, Y) f&0xff0000ff | (f&0x0000ff00)<<8 | (f>>8)&0x0000ff00

Being able to swap a pair of adjacent inputs lets us consider all possible permutations by building them up one at a time. Again it is convenient to have a way to visit all permutations by applying only one swap at a time. Here Volume 4A comes to the rescue. Section 7.2.1.2 is titled “Generating All Permutations,” and Knuth delivers many algorithms to do just that. The most convenient for our purposes is Algorithm P, which generates a sequence that considers all permutations exactly once with only a single swap of adjacent inputs between steps. Knuth calls it Algorithm P because it corresponds to the “Plain changes” algorithm used by bell ringers in 17th century England to ring a set of bells in all possible permutations. The algorithm is described in a manuscript written around 1653!

We can examine all possible permutations and inversions by nesting a loop over all permutations inside a loop over all inversions, and in fact that's what my program does. Knuth does one better, though: his Exercise 7.2.1.2-20 suggests that it is possible to build up all the possibilities using only adjacent swaps and inversion of the first input. Negating arbitrary inputs is not hard, though, and still does minimal work, so the code sticks with Gray codes and Plain changes.

by rsc (noreply@blogger.com) at May 18, 2011 07:56 PM

May 11, 2011

research!rsc

Irregular expression matching with the .NET stack

Python introduced a named capturing form in its regular expressions: (?P<x>...) is like (...) but you refer to it as \k'x' instead of \1. Perl and .NET picked it up without the Pythonic P. They also let you say (?'x'...) instead of (?<x>...). (Perl: there's more than one way to do it!)

You can use these during the match just like any backreference; the value of \k'x' is the most recent text matched by (?<x>...). So (?<d>[0-9])+\k'd' matches any digit string in which the last two digits are the same.

.NET introduces a variant on the named capture, (?<-x>...), which, during the match, deletes the last captured substring for x, exposing the one that was there before. This implies that the named captures are stored on an internal per-name stack and it lets you do nested matching: (?<d>[0-9])+(\k'd'(?<-d>))+ matches the longest prefix of an even-length palindrome of digits, so 12345543 in 1234554399.

There's a Perl addition that appears to have snuck into .NET undocumented, which is that you can say (?(cond)then|else) or just (?(cond)then) as a conditional expression: if cond is true, then it tries to match then, else else (or the empty string). The condition 1 means that there was a match for \1. In Perl the condition <name> means that there was a match for the named back-reference \k'name'. In .NET it appears that the syntax is just name, by closer analogy with the numbered backreferences.

If you want full palindromes, you can insert a condition that there must be no matches left for d on the stack, by using d as the condition and (?!) - which cannot match anything - as the true branch. The final thunderclap, then, is:

(?<d>[0-9])+(\k'd'(?<-d>))+(?(d)(?!))

I haven't tried this, since I don't have easy access to .NET.

This post expands on the technique to match well-formed XML.

by rsc (noreply@blogger.com) at May 11, 2011 03:12 AM

May 10, 2011

OS Inferno

Plan9front - очередной форк plan9

Не так давно Cinap Lenrek, автор Linux-эмулятора linuxemu, SMB-сервера cifsd, newboot и кучи другого p9-софта, создал форк Plan 9 под названием plan9front.

Это не было бы столь интересно, если бы автор не успевал "строчить" по 10-20 коммитов ежедневно, а список изменений не был бы таким занимательным:

* 9load заменен на более развитый 9boot.
* Файловая система fossil заменена на cwfs.
* Язык программирования и среда исполнения языка Go.
* Новый файловый сервер kbdfs.
* Возможность установки с USB CD-ROM, USB HDD и USB FLASH.
* SMB-сервер cifsd в комплекте.
* Python и Mercurial.
* Эмулятор tty.
* Эмулятор /dev/realmode realemu, позволяющий активировать графический режим на ранее не поддерживаемых видео-картах.
* Драйвера для следующих устройств: сетевая карта Broadcom BCM57xx, SATA-контроллеры Intel 82801FBM SATA, ntel 82801HB/HR/HH/HO SATA IDE, ntel 82801HBM SATA (ICH8-M), видео-карты AMD Geode LX, планшетов Wacom WACF004 и WACF008.

by j1m (noreply@blogger.com) at May 10, 2011 05:04 PM

May 07, 2011

James Tomaschke

Color Conversion



/* CMAP to RGB24 */
loadmemimage(cimg, cimg->r, screens[0], N);
memimagedraw(timg, timg->r, cimg, ZP, nil, ZP, S);

/* Memimage to Image */
unloadmemimage(timg, img->r, buf24, sizeof(buf24));
loadimage(img, img->r, buf24, sizeof(buf24));

by jtomaschke (noreply@blogger.com) at May 07, 2011 09:59 PM

Starting Over



Had some time off from work, so I decided to redo the port from scratch because I simply couldn't find the original from backup and came up with this today. Need to figure out a nice way of remapping CMAP8 to 24bit displays as well as supplement the keyboard driver again. Oh well, the second time around I am doing more with less code.

by jtomaschke (noreply@blogger.com) at May 07, 2011 01:07 AM

May 06, 2011

NineTimes - Inferno and Plan 9 News

Developing modules for Limbo in C

New article/tutorial about developing modules for Limbo in C explain how to create new module from scratch, and some important details about Inferno memory management.

by www-data at May 06, 2011 03:36 PM

May 02, 2011

newsham

Retro Cmd Line Fun

If you've been interested in checking out an old unix system but didn't want to go through the effort of installing an emulator, someone's put up a javascript pdp11 emulator running 6th edition.  It all runs in your browser (for some browsers, at least).

Or if you want a more versatile and guided experience of old computers, check the awesome Telehack page. This page isn't a historically accurate simulation; its more like a game. But it provides an experience similar to trying to hack from machine to machine with 1980s era networks and machines.

by newsham (noreply@blogger.com) at May 02, 2011 01:36 AM

April 24, 2011

newsham

seL4 download

You can now download the seL4 distribution from NICTA: http://ertos.nicta.com.au/software/seL4/
For those not yet familiar with it, seL4 is a variant of the L4 microkernel that has been formally verified.

by newsham (noreply@blogger.com) at April 24, 2011 02:19 AM

April 21, 2011

OS Inferno

Переезд Inferno Wiki

Перенес Inferno Wiki на новый адрес: inferno.execbit.ru. Больше никакой рекламы и зубодробительных доменов четвертого уровня. Все просто и логично.

В процессе переноса я исправил почти все статьи, так что теперь большинство из них вполне пригодны для чтения человеком. Некоторые статьи я еще не успел перенести, так что если кто-нибудь согласится помочь, это будет очень кстати. Так же добавил несколько новых статей. Приятного чтения/правки.

by j1m (noreply@blogger.com) at April 21, 2011 08:55 AM

April 12, 2011

Stanley Lieber

plan 9 + thinkpad x41 tablet

stanleylieber posted a photo:

plan 9 + thinkpad x41 tablet

the x41 has no internal optical drive. broadcom ethernet and atheros wifi are not supported. vesa 1024x768x32. booted from the docking station.

by stanleylieber at April 12, 2011 01:07 AM

April 05, 2011

OS Inferno

emuq - запуск Inferno без мороки

Вчера mjl анонсировал emuq - версию Inferno emu, которую не нужно распаковывать, компилировать, настраивать и т.д. Достаточно только скачать бинарник (весом 1.5 Мб) с официальной страницы проекта, запустить, изменить несколько настроек с помощью конфигуратора и на экране появится полноценный рабочий стол Inferno.

Внутри бинарника находится самый обычный emu со слегка измененным кодом инициализации. После запуска emuq подключается к venti-серверу, запрашивает у него vac-архив, содержащий коневую ФС Inferno, и подключает ее к корню c помощью vacsrv. При этом адрес venti-сервера и имя vac-архива указываются во время запуска emuq (тот самый конфигуратор), что дает возможность использовать его для запуска самых разных редакций Inferno.

К сожалению, работает emuq пока только в Windows.

UPD: скиншот

by j1m (noreply@blogger.com) at April 05, 2011 05:41 AM

April 02, 2011

OS Inferno

q или inferno portable apps

Новый проект mjl - реализация инструментов для создания самодостаточных Inferno-приложений, которые не потребуют для своего запуска ниче+го, кроме emu (или ядра Inferno, если она собрана как ОС).

q позволяет превратить любое приложение в файловый сервер, который будет хранить внутри себя все необходимые этому приложению (да и любые другие) файлы и каталоге. Во время запуска сервер подключит дерево этих файлов к текущему пространству имен и продолжит функционировать как оригинальная программа.

Для хранения файлов внутри сервера используется обычный байтовый массив (так же как это сделано в псевдо-устройстве root), поэтому после+ его создания файлы невозможно изменить или добавить.

by j1m (noreply@blogger.com) at April 02, 2011 07:17 AM

March 29, 2011

OS Inferno

Поддержка UNIX-сокетов в псевдо-устройстве #U

22 марта Noah Evans добавил в дерево исходников проекта inferno-npe патч, реализующий поддержку чтения и записи в сокеты, расположенные в файловой системе низлежащей ОС. Сегодня Чарльз Форсайт перенес этот патч в официальную ветку inferno-os.

Смысл всего этого в том, чтобы позволить Inferno напрямую работать с файловыми серверами из p9p (Plan 9 from User Space), которые используют сокеты в качестве интерфейса для обмена Styx-сообщениями (p9p работает в UNIX, поэтому в нем нет полноценной поддержки пространств имен).

by j1m (noreply@blogger.com) at March 29, 2011 01:16 PM

newsham

So cool - your own GSM BTS

A very cool blog post about setting up your own GSM base station, with video:
http://anders.com/cms/391/USRP/OpenBTS/GNU.Radio/Software.Radio/Ettus.Research/N210

The video is here:

<iframe allowfullscreen="" frameborder="0" height="349" src="http://www.youtube.com/embed/pTb1_v8M6iA" title="YouTube video player" width="560"></iframe>

by newsham (noreply@blogger.com) at March 29, 2011 05:45 AM

March 28, 2011

Stanley Lieber

equis + linuxemu + opera

stanleylieber posted a photo:

equis + linuxemu + opera

thinkpad t23 running plan 9 native. thanks to fgb's equis x11 server and cinap_lenrek's linuxemu, it is possible to run a reasonably modern web browser on plan 9.

by stanleylieber at March 28, 2011 02:07 AM

March 23, 2011

OS Inferno

VNC-вьюер

Очередная новинка от mjl. На этот раз он написал VNC-вьюер.

by j1m (noreply@blogger.com) at March 23, 2011 08:03 AM

March 22, 2011

research!rsc

Data Structures Go Programs

I've been writing about Go internals, which I'll keep doing, but I also want to spend some time on Go programs. Go's reflect package implements data reflection. Any value can be passed to reflect.NewValue (its argument type is interface{}, the empty interface) and then manipulated and examined by looking at the result, a reflect.Value. Typically, you use a type switch to determine what kind of value it is—a *IntValue, a *StructValue, a *PtrValue, and so on—and then walk it further. This is how fmt.Print prints arbitrary data passed to it.

Go allows programs to tag the fields in struct definitions with annotation strings. These annotations have no effect on the compilation but are recorded in the run-time type information and can be examined via reflection. The annotations can be hints for reflection-driven code traversing the data structures. This blog post shows two examples of such code: an XML parser that writes directly into data structures, and a template-based output generator that reads from them.

An XML Linked List

The Go XML package implements a function Unmarshal which parses XML directly into data structures. As a contrived but illustrative example, here is an XML encoding of a linked list:

var encoded = `
   <list><value>a</value>
       <list><value>b</value>
           <list><value>c</value>
           </list>
       </list>
   </list>
`

(That's a backquoted multi-line Go string literal.)

Given this Go data structure definition:

// http://gopaste.org/view/private:7v2CQ
type List struct {
   Value string
   List *List
}

func (l *List) String() string {
if l == nil {
 return "nil"
}
return l.Value + " :: " + l.List.String()
}

we can invoke xml.Unmarshal to turn the XML above into an instance of this data structure:

func main() {
var l *List
if err := xml.Unmarshal(strings.NewReader(encoded), &l); err != nil {
 log.Exit("xml.Unmarshal: ", err)
}
fmt.Println(l)
}

This program prints a :: b :: c :: nil. The call to xml.Unmarshal walked the XML and the data structure at the same time, matching the XML tags <value> and <list> against the names of the structure fields and allocating pointers where necessary.

There are decisions to be made when matching XML against the data structure that aren't easily expressed with just the usual struct information. For example, we might want to make the value an attribute instead of a sub-element. Because attributes and sub-elements might use the same names, the XML package requires struct fields holding attributes to be tagged with "attr". This data structure and XML encoding would also work in the program above:

// http://gopaste.org/view/private:2yR7E
type List struct {
   Value string "attr"
   List *List
}

var encoded = `
   <list value="a">
       <list value="b">
           <list value="c"/>
       </list>
   </list>
`

If a field has type string, it gets the character data from the element with that name, but sometimes you need to accumulate both character data and sub-elements. To do this, you can tag a field with "chardata". This data structure and XML encoding would also work:

// http://gopaste.org/view/private:U3ews
type List struct {
    Value string "chardata"
    List *List
}

var encoded = `<list>a<list>b<list>c</list></list></list>`

The annotations are trivial but important: where the encoding and the data structures are not a perfect match (and they rarely are), the annotations provide the hints necessary for the XML package to make them work together anyway.

Code Search Client

Let's look at real XML data. Google's Code Search provides regular expression search over the world's public source code. The search [file:/usr/sys/ken/slp.c expected] returns a single, famous result. Instead of using the web interface, though, we can issue the same query using the Code Search GData API and get this XML back (simplified and highlighted for structure):

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"
     xmlns:gcs="http://schemas.google.com/codesearch/2006"
     xml:base="http://www.google.com">
 <id>http://www.google.com/codesearch/feeds/search?q=file:/usr/sys/ken/slp.c+expected</id>
 <updated>2009-12-04T23:49:22Z</updated>
 <title type="text">Google Code Search</title>
  <generator version="1.0" uri="http://www.google.com/codesearch">Google Code Search</generator>
 <entry>
   <gcs:package name="http://www.tuhs.org" uri="http://www.tuhs.org"></gcs:package>
   <gcs:file name="Archive/PDP-11/Trees/V6/usr/sys/ken/slp.c"></gcs:file>
   <gcs:match lineNumber="325" type="text/html">&lt;pre&gt;  * You are not &lt;b&gt;expected&lt;/b&gt; to understand this.
&lt;/pre&gt;</gcs:match>
 </entry>
</feed>

From the API documentation or just by eyeballing, we can write data structures for the information we'd like to extract:

type Feed struct {
   XMLName    xml.Name "http://www.w3.org/2005/Atom feed"
   Entry []Entry
}

type Entry struct {
   XMLName xml.Name "http://www.w3.org/2005/Atom entry"
   Package Package
   File File
   Match []Match
}

type Package struct {
   XMLName xml.Name "http://schemas.google.com/codesearch/2006 package"
   Name string "attr"
   URI string "attr"
}

type File struct {
   XMLName xml.Name "http://schemas.google.com/codesearch/2006 file"
   Name string "attr"
}

type Match struct {
   XMLName xml.Name "http://schemas.google.com/codesearch/2006 match"
   LineNumber string "attr"
   Type string "attr"
   Snippet string "chardata"
}

The XMLName fields are not required. The XML package records the XML tag name in them. If there is a string annotation (as there is here), then the XML package requires that the XML element being considered has the given name space and tag name. That is, the annotation is both documentation and an assertion about what kind of XML goes into this structure.

A full Atom GData feed has many other fields, but these are the only ones that we care about. The XML parser will throw away data that doesn't fit into our structures.

We can fetch the feed via HTTP and unmarshal the HTTP response data directly into the data structures:

r, _, err := http.Get(`http://www.google.com/codesearch/feeds/search?q=` +
   http.URLEscape(flag.Arg(0)))
if err != nil {
   log.Exit(err)
}

var f Feed
if err := xml.Unmarshal(r.Body, &f); err != nil {
   log.Exit("xml.Unmarshal: ", err)
}

and then print the results:

for _, e := range f.Entry {
   fmt.Printf("%s\n", e.Package.Name)
   for _, m := range e.Match {
       fmt.Printf("%s:%s:\n", e.File.Name, m.LineNumber)
       fmt.Printf("%s\n", cutHTML(m.Snippet))
   }
   fmt.Printf("\n")
}

(The cutHTML function returns its argument with HTML tags stripped: the m.Snippet is HTML but we just want to print text.)

Compared to the xml.Unmarshal, that loop with all the fmt.Printf calls sure looks clunky. We can use a reflection-driven process to print the data too. The template package is a Go implementation of json-template. We can write a template like:

var tmpl = `{.repeated section Entry}
{Package.Name}
{.repeated section Match}
{File.Name}:{LineNumber}:
{Snippet|nohtml}
{.end}

{.end}`

and then print the data using that template:

t := template.MustParse(tmpl, template.FormatterMap{"nohtml": nohtml})
t.Execute(&f, os.Stdout)

The formatter map definitions arrange that the {Snippet|nohtml} in the template run the data through the nohtml function when printing. Both variants of our program produce the same output:

http://www.tuhs.org
Archive/PDP-11/Trees/V6/usr/sys/ken/slp.c:325:
 * You are not expected to understand this.

Full source code is at http://gopaste.org/view/private:7ybCj

Parsing HTML

I still have three more tricks up my sleeve.

First, in a pinch and if the HTML is mostly well formatted or you're just lucky, the XML parser can handle HTML too. Second, when the XML parser walks into a struct field, if the field is a pointer and is nil, it allocates a new value, as we saw in the linked list example. But if the field is a non-nil pointer already, the parser just follows the pointer. The same goes for growing slices. Third, if the XML parser cannot find a struct field for a particular subelement, before giving up it looks for a field named Any and uses that field instead.

Combined, these three rules mean that this structure:

type Form struct {
   Input []Field
   Any *Form
}

type Field struct {
   Name string "attr"
   Value string "attr"
}

can be used to save a tree of form and other information from a web page. When you unmarshal into it, you get a tree showing the structure of the web page, with each element becoming a *Form because the parser used the Any field.

But who wants to walk that tree just to find the form fields? Instead, we can set the Any pointer to point back at the Form it is in. Then as the XML parser thinks it is walking the data structure, in fact it will be visiting the same Form struct over and over. Each time it finds an <input>, the parser will append it to the Input slice.

// http://gopaste.org/view/private:b8Ya2
func main() {
    r, _, err := http.Get(`http://code.google.com/p/go/`)
    if err != nil {
        log.Exit(err)
    }

    var f Form
    f.Any = &f

    p := xml.NewParser(r.Body)
    p.Strict = false
    p.Entity = xml.HTMLEntity
    p.AutoClose = xml.HTMLAutoClose
    if err := p.Unmarshal(&f, nil); err != nil {
        log.Exit("xml.Unmarshal: ", err)
    }

    for _, fld := range f.Input {
        fmt.Printf("%s=%q\n", fld.Name, fld.Value)
    }
}

This program prints:

q=""
projectsearch="Search projects"

Further Reading

There are a few more examples of this technique in the standard Go libraries. The net package's DNS client uses this technique to build generic packing and unpacking routines for DNS packets. The JSON package provides approximately the same functionality as the XML package, but with JSON as the wire format. In fact, JSON is arguably more suited to the task. (Remember: “the essence of XML is this: the problem it solves is not hard, and it does not solve the problem well.”) The ASN1 package also works this way, though its generality is limited: it implements just enough to parse SSL/TLS certificates. There are also other ways to use reflection, which will have to wait for future posts.

It's actually a little surprising how often this technique gets used in the Go libraries. Part of the reason is that it's a shiny new hammer, but it seems to be a very useful shiny new hammer, because so much of modern programming is talking to other programs over a variety of wire encodings.

Unmarshalling directly into data structures via reflection, with just a few necessary hints as annotations, can be implemented in other languages, but the examples I've seen are not nearly as elegant or as concise as the Go versions.

Apologies to Jon Bentley.

by rsc (noreply@blogger.com) at March 22, 2011 06:28 PM

Go Data Structures: Interfaces

Go's interfaces—static, checked at compile time, dynamic when asked for—are, for me, the most exciting part of Go from a language design point of view. If I could export one feature of Go into other languages, it would be interfaces.

This post is my take on the implementation of interface values in the “gc” compilers: 6g, 8g, and 5g. Over at Airs, Ian Lance Taylor has written two posts about the implementation of interface values in gccgo. The implementations are more alike than different: the biggest difference is that this post has pictures.

Before looking at the implementation, let's get a sense of what it must support.

Usage

Go's interfaces let you use duck typing like you would in a purely dynamic language like Python but still have the compiler catch obvious mistakes like passing an int where an object with a Read method was expected, or like calling the Read method with the wrong number of arguments. To use interfaces, first define the interface type (say, ReadCloser):

type ReadCloser interface {
    Read(b []byte) (n int, err os.Error)
    Close()
}

and then define your new function as taking a ReadCloser. For example, this function calls Read repeatedly to get all the data that was requested and then calls Close:

func ReadAndClose(r ReadCloser, buf []byte) (n int, err os.Error) {
    for len(buf) > 0 && err == nil {
        var nr int
        nr, err = r.Read(buf)
        n += nr
        buf = buf[nr:]
    }
    r.Close()
    return
}

The code that calls ReadAndClose can pass a value of any type as long as it has Read and Close methods with the right signatures. And, unlike in languages like Python, if you pass a value with the wrong type, you get an error at compile time, not run time.

Interfaces aren't restricted to static checking, though. You can check dynamically whether a particular interface value has an additional method. For example:

type Stringer interface {
    String() string
}

func ToString(any interface{}) string {
    if v, ok := any.(Stringer); ok {
        return v.String()
    }
    switch v := any.(type) {
    case int:
        return strconv.Itoa(v)
    case float:
        return strconv.Ftoa(v, 'g', -1)
    }
    return "???"
}

The value any has static type interface{}, meaning no guarantee of any methods at all: it could contain any type. The “comma ok” assignment inside the if statement asks whether it is possible to convert any to an interface value of type Stringer, which has the method String. If so, the body of that statement calls the method to obtain a string to return. Otherwise, the switch picks off a few basic types before giving up. This is basically a stripped down version of what the fmt package does. (The if could be replaced by adding case Stringer: at the top of the switch, but I used a separate statement to draw attention to the check.)

As a simple example, let's consider a 64-bit integer type with a String method that prints the value in binary and a trivial Get method:

type Binary uint64

func (i Binary) String() string {
    return strconv.Uitob64(i.Get(), 2)
}

func (i Binary) Get() uint64 {
    return uint64(i)
}

A value of type Binary can be passed to ToString, which will format it using the String method, even though the program never says that Binary intends to implement Stringer. There's no need: the runtime can see that Binary has a String method, so it implements Stringer, even if the author of Binary has never heard of Stringer.

These examples show that even though all the implicit conversions are checked at compile time, explicit interface-to-interface conversions can inquire about method sets at run time. “Effective Go” has more details about and examples of how interface values can be used.

Interface Values

Languages with methods typically fall into one of two camps: prepare tables for all the method calls statically (as in C++ and Java), or do a method lookup at each call (as in Smalltalk and its many imitators, JavaScript and Python included) and add fancy caching to make that call efficient. Go sits halfway between the two: it has method tables but computes them at run time. I don't know whether Go is the first language to use this technique, but it's certainly not a common one. (I'd be interested to hear about earlier examples; leave a comment below.)

As a warmup, a value of type Binary is just a 64-bit integer made up of two 32-bit words (like in the last post, we'll assume a 32-bit machine; this time memory grows down instead of to the right):

Interface values are represented as a two-word pair giving a pointer to information about the type stored in the interface and a pointer to the associated data. Assigning b to an interface value of type Stringer sets both words of the interface value.

(The pointers contained in the interface value are gray to emphasize that they are implicit, not directly exposed to Go programs.)

The first word in the interface value points at what I call an interface table or itable (pronounced i-table; in the runtime sources, the C implementation name is Itab). The itable begins with some metadata about the types involved and then becomes a list of function pointers. Note that the itable corresponds to the interface type, not the dynamic type. In terms of our example, the itable for Stringer holding type Binary lists the methods used to satisfy Stringer, which is just String: Binary's other methods (Get) make no appearance in the itable.

The second word in the interface value points at the actual data, in this case a copy of b. The assignment var s Stringer = b makes a copy of b rather than point at b for the same reason that var c uint64 = b makes a copy: if b later changes, s and c are supposed to have the original value, not the new one. Values stored in interfaces might be arbitrarily large, but only one word is dedicated to holding the value in the interface structure, so the assignment allocates a chunk of memory on the heap and records the pointer in the one-word slot. (There's an obvious optimization when the value does fit in the slot; we'll get to that later.)

To check whether an interface value holds a particular type, as in the type switch above, the Go compiler generates code equivalent to the C expression s.tab->type to obtain the type pointer and check it against the desired type. If the types match, the value can be copied by by dereferencing s.data.

To call s.String(), the Go compiler generates code that does the equivalent of the C expression s.tab->fun[0](s.data): it calls the appropriate function pointer from the itable, passing the interface value's data word as the function's first (in this example, only) argument. You can see this code if you run 8g -S x.go (details at the bottom of this post). Note that the function in the itable is being passed the 32-bit pointer from the second word of the interface value, not the 64-bit value it points at. In general, the interface call site doesn't know the meaning of this word nor how much data it points at. Instead, the interface code arranges that the function pointers in the itable expect the 32-bit representation stored in the interface values. Thus the function pointer in this example is (*Binary).String not Binary.String.

The example we're considering is an interface with just one method. An interface with more methods would have more entries in the fun list at the bottom of the itable.

Computing the Itable

Now we know what the itables look like, but where do they come from? Go's dynamic type conversions mean that it isn't reasonable for the compiler or linker to precompute all possible itables: there are too many (interface type, concrete type) pairs, and most won't be needed. Instead, the compiler generates a type description structure for each concrete type like Binary or int or func(map[int]string). Among other metadata, the type description structure contains a list of the methods implemented by that type. Similarly, the compiler generates a (different) type description structure for each interface type like Stringer; it too contains a method list. The interface runtime computes the itable by looking for each method listed in the interface type's method table in the concrete type's method table. The runtime caches the itable after generating it, so that this correspondence need only be computed once.

In our simple example, the method table for Stringer has one method, while the table for Binary has two methods. In general there might be ni methods for the interface type and nt methods for the concrete type. The obvious search to find the mapping from interface methods to concrete methods would take O(ni × nt) time, but we can do better. By sorting the two method tables and walking them simultaneously, we can build the mapping in O(ni + nt) time instead.

Memory Optimizations

The space used by the implementation described above can be optimized in two complementary ways.

First, if the interface type involved is empty—it has no methods—then the itable serves no purpose except to hold the pointer to the original type. In this case, the itable can be dropped and the value can point at the type directly:

Whether an interface type has methods is a static property—either the type in the source code says interface{} or it says interace{ methods... }—so the compiler knows which representation is in use at each point in the program.

Second, if the value associated with the interface value can fit in a single machine word, there's no need to introduce the indirection or the heap allocation. If we define Binary32 to be like Binary but implemented as a uint32, it could be stored in an interface value by keeping the actual value in the second word:

Whether the actual value is being pointed at or inlined depends on the size of the type. The compiler arranges for the functions listed in the type's method table (which get copied into the itables) to do the right thing with the word that gets passed in. If the receiver type fits in a word, it is used directly; if not, it is dereferenced. The diagrams show this: in the Binary version far above, the method in the itable is (*Binary).String, while in the Binary32 example, the method in the itable is Binary32.String not (*Binary32).String.

Of course, empty interfaces holding word-sized (or smaller) values can take advantage of both optimizations:

Method Lookup Performance

Smalltalk and the many dynamic systems that have followed it perform a method lookup every time a method gets called. For speed, many implementations use a simple one-entry cache at each call site, often in the instruction stream itself. In a multithreaded program, these caches must be managed carefully, since multiple threads could be at the same call site simultaneously. Even once the races have been avoided, the caches would end up being a source of memory contention.

Because Go has the hint of static typing to go along with the dynamic method lookups, it can move the lookups back from the call sites to the point when the value is stored in the interface. For example, consider this code snippet:

1   var any interface{}  // initialized elsewhere
2   s := any.(Stringer)  // dynamic conversion
3   for i := 0; i < 100; i++ {
4       fmt.Println(s.String())
5   }

In Go, the itable gets computed (or found in a cache) during the assignment on line 2; the dispatch for the s.String() call executed on line 4 is a couple of memory fetches and a single indirect call instruction.

In contrast, the implementation of this program in a dynamic language like Smalltalk (or JavaScript, or Python, or ...) would do the method lookup at line 4, which in a loop repeats needless work. The cache mentioned earlier makes this less expensive than it might be, but it's still more expensive than a single indirect call instruction.

Of course, this being a blog post, I don't have any numbers to back up this discussion, but it certainly seems like the lack of memory contention would be a big win in a heavily parallel program, as is being able to move the method lookup out of tight loops. Also, I'm talking about the general architecture, not the specifics o the implementation: the latter probably has a few constant factor optimizations still available.

More Information

The interface runtime support is in $GOROOT/src/pkg/runtime/iface.c. There's much more to say about interfaces (we haven't even seen an example of a pointer receiver yet) and the type descriptors (they power reflection in addition to the interface runtime) but those will have to wait for future posts.

Code

Supporting code (x.go):

package main

import (
 "fmt"
 "strconv"
)

type Stringer interface {
 String() string
}

type Binary uint64

func (i Binary) String() string {
 return strconv.Uitob64(i.Get(), 2)
}

func (i Binary) Get() uint64 {
 return uint64(i)
}

func main() {
 b := Binary(200)
 s := Stringer(b)
 fmt.Println(s.String())
}

Selected output of 8g -S x.go:

0045 (x.go:25) LEAL    s+-24(SP),BX
0046 (x.go:25) MOVL    4(BX),BP
0047 (x.go:25) MOVL    BP,(SP)
0048 (x.go:25) MOVL    (BX),BX
0049 (x.go:25) MOVL    20(BX),BX
0050 (x.go:25) CALL    ,BX

The LEAL loads the address of s into the register BX. (The notation n(SP) describes the word in memory at SP+n. 0(SP) can be shortened to (SP).) The next two MOVL instructions fetch the value from the second word in the interface and store it as the first function call argument, 0(SP). The final two MOVL instructions fetch the itable and then the function pointer from the itable, in preparation for calling that function.

by rsc (noreply@blogger.com) at March 22, 2011 06:25 PM

March 18, 2011

OS Inferno

Книга о Inferno 2010 года издания

Книга на Амазон. Понятия не имею что это и почему понадобилось три автора для написания 88 страниц, но сам факт довольно интересен. Если у кого-то есть информация на этот счет, поделитесь. Интересно будет всем.

by j1m (noreply@blogger.com) at March 18, 2011 11:31 AM

March 16, 2011

johnny

Our daily poison - Notre poison quotidien

<object allowscriptaccess="always" classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=10,0,0,0" height="410" id="playerArte" width="640"><param name="allowFullScreen" value="true"><param name="allowScriptAccess" value="always"><param name="quality" value="high"><param name="movie" value="http://videos.arte.tv/videoplayer.swf?videorefFileUrl=http%3A%2F%2Fvideos%2Earte%2Etv%2Ffr%2Fdo%5Fdelegate%2Fvideos%2Fnotre%5Fpoison%5Fquotidien%2D3761854%2Cview%2CasPlayerXml%2Exml&amp;admin=false&amp;autoPlay=true&amp;mode=prod&amp;localizedPathUrl=http%3A%2F%2Fvideos%2Earte%2Etv%2Fcae%2Fstatic%2Fflash%2Fplayer%2F&amp;configFileUrl=http%3A%2F%2Fvideos%2Earte%2Etv%2Fcae%2Fstatic%2Fflash%2Fplayer%2Fconfig%2Exml&amp;videoId=3761854&amp;lang=fr&amp;embed=true&amp;autoPlay=false"><embed allowfullscreen="true" allowscriptaccess="always" height="410" name="playerArte" pluginspage="http://www.macromedia.com/go/getflashplayer" quality="high" src="http://videos.arte.tv/videoplayer.swf?videorefFileUrl=http%3A%2F%2Fvideos%2Earte%2Etv%2Ffr%2Fdo%5Fdelegate%2Fvideos%2Fnotre%5Fpoison%5Fquotidien%2D3761854%2Cview%2CasPlayerXml%2Exml&amp;admin=false&amp;autoPlay=true&amp;mode=prod&amp;localizedPathUrl=http%3A%2F%2Fvideos%2Earte%2Etv%2Fcae%2Fstatic%2Fflash%2Fplayer%2F&amp;configFileUrl=http%3A%2F%2Fvideos%2Earte%2Etv%2Fcae%2Fstatic%2Fflash%2Fplayer%2Fconfig%2Exml&amp;videoId=3761854&amp;lang=fr&amp;embed=true&amp;autoPlay=false" type="application/x-shockwave-flash" width="640"></embed></object>

by lighttpd at March 16, 2011 03:08 PM

OS Inferno

Google Summer Of Code 2011

Опубликован список идей для SoC 2011, касающихся Plan 9 и всех связанных с ней технологий. Выглядит он примерно так:

1. Переписать механизм генерации html-кода в wikifs(4).
2. Научить wikifs производить аутентификацию с помощью auth-сервера.
3. Портировать BSD NDISulator (для работы сетевых Windows-драйверов).
4. Научить компилятор kencc генерировать объектные файлы для Windows (как я понял это нужно, чтобы портировать некоторые компоненты Plan 9 в Windows).
5. Библиотека для декодирования asn.1 DER (часть проекта по портированию LDAP).
6. Реализовать NAT (уже не первый раз в GSoC, видимо никому не нужно).
7. Научить libdraw/libframe работать со шрифтами разной высоты.
8. x2apic и msi interrupts (в двух словах: улучшение работы Plan 9 в кластерах).
9. Работа над /sys/src/libc/amd64 (часть затянувшегося проекта по портированию на amd64).
10. Научить 9load-e820 переходить в 64-битный режим до загрузки ядра.
11 Файл-сервер SVGdraw на JavaScript (Очень интересная идея, суть которой в том, чтобы реализовать draw-сервер в браузере и позволить Plan 9/Inferno использовать его вместо нативного. Это позволит выводить всю графику на web-страницу без каких-либо костылей. Так же планируется добавить поддержку websockets к текущей реализации Styx на js.)
12 Реализовать клавиатурный файл-сервер.
13 Создать альтернативный менеджер окон (давно пора).
14 Поднять аналог kernel.org для Plan 9.
15 Закончить ratrace (аналог strace).
16 Драйвера для поддержи KVM/Virtio (для улучшения производительности Plan 9 в qemu).

Более подробную информацию можно получить пройдя по приведенной в начале поста ссылке.

by j1m (noreply@blogger.com) at March 16, 2011 11:10 AM

February 17, 2011

newsham

more encryption

Google has an encrypted search option at https://encrypted.google.com/.  Now you can search without all your queries and responses going out over the wire in the clear. You can easily configure chrome to use this as your default search by going to <toolbar>->options->Basics->Manage search engines and adding a new engine:</toolbar>

  1. EncryptedGoogle
  2. encrypted.google.com
  3. https://encrypted.google.com/search?{google:RLZ}{google:acceptedSuggestion}{google:originalQueryForSuggestion}sourceid=chrome&ie={inputEncoding}&q=%s
and making that your default. Don't forget to set up gmail to default to https, if you haven't already!

Hat tip to grey for pointing this out.

by newsham (noreply@blogger.com) at February 17, 2011 01:11 AM

February 11, 2011

newsham

Wifi MITM fun and games

People have been talking about wireless attacks at cafes for a long time, but here's an actual attack going on in the wild. Its a small device that hijacks your connection and silently modifies the news as you read it. Very insidious!

ps: I might have been tweeked myself.  the url is "newstweek" after all.  I don't have any verification that this is actually real.  treat accordingly.

by newsham (noreply@blogger.com) at February 11, 2011 11:58 PM

February 08, 2011

Eric Van Hensbergen

Standing Desk

Since building standing desks seem to be all the rage, I decided to put one together for work. I used an Ikea Fredrik as the base, but only used the two shelf pieces and discarded the main desk piece as I wanted it to fit in a corner. The Fredrik has multiple slots along the side allowing you to position the shelves wherever you'd like. One thing I would caution is decide once where you want the stuff as pulling it apart and reassembling is a bit of a chore and you have to built it from the bottom up. It's got quite a nice solution for cable routing and hiding power strips. I also used the Fredrik Accessory Pack mostly so I could add cup holders to the desk, but the hooks came in quite handy for wire routing and for hanging my backpack. The official instructions actually have variations which seem appropriate for standing desks. One thing I did notice with the way mine is configured is it a bit shaky if I type too hard. The vibration isn't pronounced, but its enough to slightly shake the LCD on my laptop.

I attached a 3M keyboard tray to the top shelf, but had to use some nylon spacers to attach it since there is a metal support beam running down the center of the shelf. I used two cheap LCD arms for the extra monitors -- although these tend to tilt slightly with the way I have them mounted. They also need some wood blocks added under the shelf so they have the right thickness to attach to. To top it off, I bought a cheap drafting chair for days when I feel a bit tired of standing -- but I went for one that looked uncomfortable so that I wouldn't be tempted to sit all that often.

I've been using it for a couple of weeks now and seem to be able to stand most of the day while being productive.





Posted by Picasa

by Eric Van Hensbergen (noreply@blogger.com) at February 08, 2011 01:27 AM

February 01, 2011

Eric Van Hensbergen

IBM Centennial Film: 100 X 100 - A century of achievements that have cha...

<iframe allowfullscreen="" frameborder="0" height="295" src="http://www.youtube.com/embed/39jtNUGgmd4?fs=1" width="480"></iframe>

by Eric Van Hensbergen (noreply@blogger.com) at February 01, 2011 08:43 PM

January 30, 2011

newsham

The Big Short

The Big Short by Michael Lewis is a great read. Its both entertaining, informative and insightful. This guy really knows how to write, and also has a great understanding of the bond markets and access to the people who understood what was going on in the subprime mortgage bond market.  I highly recommend it.

Here's some of the quotes I picked up from the read:


"It was as if the ordinary rules of finance had been suspended in response to a social problem. A thought crossed his [Vincent Daniel's] mind: How do you make
poor people feel wealthy when wages are stagnant? You give them cheap loans." - The Big short, M. Lewis, pg 14

"I now realized there was an entire industry, called consumer finance, that basically existed to rip people off." - Steve Eisman, quoted in The Big Short, M. Lewis, pg 20.

"The goal of the innovation [mortgage bonds], in short, was to make the financial markets more efficient.  Now, somehow, that same innovative spirit was being put to the oposite purpose: to hide the risk by complicating. The market was paying Goldman Sachs bond traders to make the market less eficient." - The Big Short, M. Lewis, pg 74

"She [S&P analyst] told them, for instance, that even though she was responsible for evaluating subprime mortgage bonds, she wasn't allowed by her bosses simply to downgrade  the ones she thought deserved to be downgraded. She submitted a list of the bonds she wished to downgrade to her superiors and received back a
list of what she was permitted to downgrade. "She said she'd submit a list of a hundred bonds and get back a list with twenty-five bonds on it, with no explanation of why," said Danny [Moses]" - M. Lewis, The Big Short, pg 102.

"In private transactions," said Charlie [Ledley], "you usually have an advisor on both sides that's sophisticted. You don't have people who just fundamentally know what something's worth. In public markets you have people doing things for all sorts of insane reasons." - M. Lewis, The Big Short, pg 110.

"You know how when you walk into a post office you realize there is such a difference between a government employee and other people, said Vinny [Daniel]. "The ratings agency people were like government employees." - M. Lewis, The Big Short, pg 156.

Vinny [Daniel], always darker, said, "There were more morons than crooks, but the crooks were higher ups." - on criminality vs. incompetence in the subprime mortgage business. M. Lewis, The Big Short, pg 158.

One of the reasons Wall Street [ie. financials] had cooked up this new industry [subprime securitization] was that its old-fashioned business was every day becoming less profitable. The profits in stockbroking, along with those in the more conventional sorts of bond broking, had been squashed by Internet competition.
  - M. Lewis, The Big Short, pg 172.

At which point Greg Lipmann just said "Dude, fuck your model. I'll make you a market.  They are seventy/seventy-seven. You have three choices. You can sell them back to me at seventy. You can buy some more at seventy-seven. Or you can give me my fucking one point two billion dollars. - On Morgan Stanley refusing to mark the Deutch Bank CDOs at a realistic price and pay the reserve due. M. Lewis, The Big Short, pg 213.

The subprime bonds beneath them [lower tranches in CDO] were either all bad or all good. The CDOs were worth either zero or 100. - On correlation. M. Lewis, The Big Short, pg 214.

"The upper classes of this country raped this country. You fucked people. You built a castle to rip people off. Not once in all these years have I come across a person inside a big Wall Street firm who was having a crisis of conscience. Nobody ever said, 'This is wrong.' And no one ever gave a shit about what I had to say." - What Steve Eisman wished he had said at Bear Stearns meeting where he predicted financial collapse while BS was going under. M. Lewis, The Big Short, pg 232.

"That he kept interest rates too low for too long is the least of it. I'm convinced that he knew what was happening in sub-prime, and he ignored it, because the consumer getting screwed was not his problem.  I sort of feel sorry for him because he's a guy who is really smart who was basically wrong about everything."
 - quoting Steve Eisman on Alan Greenspan, M. Lewis, The Big Short, pg 239.

One hallmark of the mania is the rapid rise in the incidence and complexity of fraud. - M. Lewis, The Big Short, pg 55.

by newsham (noreply@blogger.com) at January 30, 2011 06:15 AM

January 14, 2011

research!rsc

Knuth, Volume 4A

My copy arrived today. If you're looking to buy one, ordering directly from the publisher with coupon code KNUTH2010 [sic] gets you the book for $48.74 with free shipping. In contrast, Amazon and Barnes & Noble both charge more and won't ship for another week or two.

by rsc (noreply@blogger.com) at January 14, 2011 05:09 PM

January 11, 2011

OS Inferno

Новинки от mjl

Неутомимый mjl продолжает публиковать свои приложения для Inferno. На этот раз он приготовил несколько по-настоящему интересных вещей:

1. Менеджер окон qwm. В отличие от стандартного менеджера окон, qwm использует "тайловый" метод раскладки окон, так что внешне рабочий стол под его управлением больше похож на редактор acme или рабочий стол Unix под управлением dwm, awesome или ion3. Имеется поддержка виртуальных рабочих столов и клавиатурных комбинаций.

2. Vi-подобный текстовый редактор vixen. Подарок поклонникам vi(m) и ненавистникам acme.

3. Ланчер wmrun. Простая и удобная замена wm/sh.

4. Старый добрый find. Не в стиле inferno, зато удобно.

by j1m (noreply@blogger.com) at January 11, 2011 09:00 AM

January 03, 2011

research!rsc

The MOS 6502 and the Best Layout Guy in the World

The MOS 6502 was ubiquitous in its day. The 6502 and its slight variants were at the heart of the Apple II, the Atari 2600, the BBC Micro, the Commodore 64, and the Nintendo Entertainment System, among others. It's amazing to think that all five—each a very influential system in its own right—were all built around the same chip.

The 6502 was the brainchild of Chuck Peddle, at the time an engineer with Motorola. Peddle was one of the engineers who worked on the Motorola 6800, and one of his jobs was to, well, peddle the 6800 to customers. The customers loved everything about it except its $300 price tag. Peddle tried to convince management at Motorola to create a lower-cost microprocessor, but Motorola didn't want to have such a chip cut into their not insubstantial profits from the 6800, and they told him in no uncertain terms that they wouldn't build such a chip. In response, Peddle and a handful of other 6800 engineers left Motorola and built one themselves. It was the MOS Technology 6502 and sold for $25. Even though both the 6800 and 6502 had a clock rate of 1 MHz, the 6502 had a minimal instruction pipeline that overlapped the fetch of the next instruction with the execution of the current one when possible, giving it a significant performance boost. And of course it sold for ten times less. So it ended up everywhere.

The story of the 6502 makes up the first chapter of Brian Bagnall's On the Edge: the Spectacular Rise and Fall of Commodore. My favorite part of the description of the development of the 6502 is the actual chip layout. These days, you can't design and lay out a computer chip without a computer. An Intel Core 2 chip has hundreds of millions of transistors. The 6502 had 3,510, and an engineer—a person, not a computer—had to draw each one by hand to lay out the chip. Mainly it was a single engineer, Bill Mensch.

But it gets better. Once the layout was completed and double-checked—a process that meant months using a ruler!—it still had to be converted into a Rubylith photomask that would etch the right patterns onto the silicon. The photomask for the 6502 was the size of a large table—large enough that the engineers crawled around on top of it to perform the job of cutting the layout out of the mask, all the while being careful to wear clean socks with no holes, so that stray toenails didn't insert traces in the mask where they didn't belong.

The most amazing part about the whole process is that they got the 6502 right in one try. Quoting On the Edge:

Bil Herd summarizes the situation. “No chip worked the first time,” he states emphatically. “No chip. It took seven or nine revs [revisions], or if someone was real good they would get it in five or six.”

Normally, a large number of flaws originate from the layout design. After all, there are six layers (and six masks) that have to align with each other perfectly. Imagine designing a town with every conceivable layer of infrastructure placed one on top of another. Plumbing is the lowest layer, followed by the subway system, underground walkways, buildings, overhead walkways, and finally telephone wires. These different layers have to connect with each other perfectly; otherwise, the town will not function. The massive complexity of such a system makes it likely that human errors will creep into the design.

After fabricating a run of chips and probing them, the layout engineers usually have to make changes to their original design and the process repeats from the Rubylith down. “Each run is a couple of hundred thousand [dollars],” says Herd.

Implausibly, the engineers detected no errors in [Bill] Mensch's layout. “He built seven different chips without ever having an error,” says Peddle with disbelief in his voice. “Almost all done by hand. When I tell people that, they don't believe me, but it's true. This guy is a unique person. He is the best layout guy in the world.”

The first chapter of On the Edge is posted on Bagnall's web site in HTML or PDF. The chapter says that Peddle “created a concept called pipelining,” which could be interpreted as saying that the 6502 was the first pipelined processor and that Peddle invented it. Does anyone know?

Fast forward thirty years. Computers are now old enough that there can be significant interest in maintaining the history of these old, venerable designs. The actual paper designs of the 6502 are long gone.

A team of three people—Greg James, Barry Silverman, and Brian Silverman—accumulated a bunch of 6502 chips, applied sulfuric acid to them to strip the casing and expose the actual chips, used a high-resolution photomicroscope to scan the chips, applied computer graphics techniques to build a vector representation of the chip, and finally derived from the vector form what amounts to the circuit diagram of the chip: a list of all 3,510 transistors with inputs, outputs, and what they're connected to. Combining that with a fairly generic (and, as these things go, trivial) “transistor circuit” simulator written in JavaScript and some HTML5 goodness, they created an animated 6502 web page that lets you watch the voltages race around the chip as it executes. For more, see their web site visual6502.org.

Oh, and it actually works. They applied the same technique to build the transistor map for an Atari 10444D TIA chip, which connected the 6502 to the television in the original Atari 2600, and then they simulated both chips together and were able to run actual Atari 2600 games. So the transistor map is probably (very close to) correct. More impressively, they got to that point without debugging. Their SIGGRAPH 2010 abstract explains that there were only 8 errors in the map of 20,000 components, and all the errors were spotted during the vectorization process. History repeats itself.

Michael Steil took the circuit information and started looking closely at how the chip did what it did. At last week's Chaos Communication Congress 27C3 conference, he gave a 50-minute talk that introduces the 6502 architecture and then uses the circuit diagram to explain various details and undocumented features of the chip. The original announcement is at Steil's blog. There is a version of the talk on YouTube, and in the blog comments you'll find links to higher-resolution copies. It's well worth watching, in any form, and it's fun to see how much Peddle, Mensch, and the rest of the 6502 team packed into that tiny number of transistors.



P. S. If you liked this, you might also like Computing Machines. Also, Barry Silverman and Brian Silverman (mentioned above) were two of the people behind the PDP-1 simulator written in Java that runs Spacewar!

by Russ Cox (noreply@blogger.com) at January 03, 2011 04:00 PM

January 01, 2011

newsham

What you didnt know about the 6502

Pretty awesome talk (broken down into 6 parts on youtube) about reverse engineering the 6502. They went so far as to use hires pictures of the die and automatically extract circuits from them and then simulate the circuits directly:

<object height="385" width="480"><param name="movie" value="http://www.youtube.com/v/HW9AWBFH1sA?fs=1&amp;hl=en_US&amp;rel=0"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed allowfullscreen="true" allowscriptaccess="always" height="385" src="http://www.youtube.com/v/HW9AWBFH1sA?fs=1&amp;hl=en_US&amp;rel=0" type="application/x-shockwave-flash" width="480"></embed></object>

http://www.youtube.com/watch?v=HW9AWBFH1sA
http://www.youtube.com/watch?v=bBE4KHKzhKc
http://www.youtube.com/watch?v=tRBo7O_blVo
http://www.youtube.com/watch?v=H_15RtVbqGU
http://www.youtube.com/watch?v=N9DYmlprCKA
http://www.youtube.com/watch?v=eZOUuqc4pk8

and a page with a visual javascript emulator that shows you what is going on on the cpu as it executes:
http://visual6502.org/

by newsham (noreply@blogger.com) at January 01, 2011 08:41 PM

December 13, 2010

research!rsc

Visualizing TCP

This video depicts an HTTP GET by showing packets as balls flying between client and server, slowed 40x compared to real time. It's pretty nice for getting a visceral sense of the underlying TCP session: you can see individual rounds at the start and then as the session stabilizes the rounds begin to overlap enough that they disappear.

When I saw the video, it reminded me of the ever-present videos of sorting algorithms, like the ones on the Wikipedia heapsort and quicksort pages:

That in turn reminded me of a 2007 blog post by Aldo Cortesi about visualizing sorting algorithms using graphs instead of videos. He points out that the animations make time-based questions difficult to answer:

[It] is my measured opinion that this animation has all the explanatory power of a glob of porridge flung against a wall. To see why I say this, try to find rough answers to the following set of simple questions with reference to it:

  • After what percentage of time is half of the array sorted?
  • Can you find an element that moved about half the length of the array to reach its final destination?
  • What percentage of the array was sorted after 80% of the sorting process? How about 20%?
  • Does the number of sorted elements grow linearly or non-linearly with time (i.e. logarithmically or exponentially)?

If you thought that was harder than it needed to be, blame animation. First, while humans are great at estimating distances in space, they are pretty bad at estimating distances in time. This is why you had to watch the animation two or three times to answer the first question. When we translate time to a geometric length, as is done in any scientific diagram with a time dimension, this estimation process becomes easy. Second, many questions about sorting algorithms require us to actively compare the sorting state at two or more different time points. Since we don't have perfect memories, this is very, very hard in all but the simplest cases. This leaves us with a strangely one-dimensional view into an animation - we can see what's on screen at any given moment, but we have to strain to answer simple questions about, say, rates of change. Which is why the final question is hard to answer accurately.

Instead, he suggests plotting static graphs showing the paths of individual array elements over time. Here are his original visualizations of heapsort and quicksort:

The blog post is worth reading in full. It has much more detail and links to many other interesting blog posts and a new site dedicated to those visualizations, sortvis.org.

Remembering this made me wonder if the lesson of the sorting animations applied to the TCP animations too, so I collected my own tcpdumps of both sides of an HTTP GET, pulled out the trusty (but neglected) Unix tool join(1) to produce a dump with two timestamps on each packet, and then used Perl to turn it into a graph (raw PostScript):

Time marches to the right, with each row picking up where the previous one left off. Blue lines indicate packets from the client and black lines indicate packets from the server.

I think this kind of graph has the same important benefits that Cortesi realized with his sorting graphs. It's static, so you have time to study it, focus on anomalies, or notice patterns that aren't immediately apparent. The geometry of the lines, with flatter slopes corresponding to slower packets, makes it easy to compare speeds.

Here are just a few things I noticed.

The connection starts with the usual 3-way TCP handshake. The client sends the third packet in the handshake and the first data packet back to back, one of its few back-to-back transmissions. After the handshake and the initial GET, which probably fits entirely in that first packet, the client only receives data: the black lines are large payloads while the blue lines are ACK messages that trigger more payloads. Scanning across the client's view of the connection (the top edge of each row) we can see that the client is using the standard delayed ACKs, sending an ACK every two packets. Scanning across the server's view of the connection (the bottom edge of each row) we can see that the server usually responds to the acknowledgement of two packets by sending two more packets, leaving the TCP window size (the number of packets in flight) unchanged. Occasionally, though, the server grows the window size by responding with three packets instead of two. Three of these events are circled in rows 1 and 2. Occasionally the server only responds to an ACK with a single packet, but most of those appear to be times when the client was only acknowledging a single packet.

We can count the number of packets in flight at any given time by counting the number of black lines that cross a particular blue line, plus the number of packets sent when the blue line reaches the server. For example, three black lines cross the last complete blue line on row 1, and the server sends two more packets in response, so the window size then is five packets. By the end of row 2, the window size is seven packets.

The window size fluctuates a little once it stabilizes: at the end of row 10 it has grown to eight packets but at the end of row 20 it is back to seven. It's not clear what causes the server to shrink or grow the window size. I couldn't find any evidence of packet loss in the traces.

I set the time axis such that those initial packets travel on lines with a 2:1 slope. A glance down the page shows that while the packets sent by the client continue to travel at approximately that speed, the packets sent by the server gradually slow until row 3, where they appear to stabilize around a 1:2 slope, four times longer to send than the initial packets. Looking at the raw data, those later packets around 100 ms for a one-way trip compared to 20 ms during the original handshake, so our visual estimate is not far off. This is evidence of queueing along the path from server to client; the likely suspect is the server's upstream DSL connection.

The circled fragment on row 11 is more evidence of queuing. Something delays the ACK from client to server (note the more gradual slope) but the packets that the server sends in response are not noticeably delayed: they still had time to catch up with the bottlenecked packets.

The circled fragment on row 24 shows the opposite case. Most of the time the packet arrivals at the client are evenly spaced, but if a few packets from the server to the client get delayed, that delays the corresponding ACKs, which throws the conversation into a bursty behavior that takes a few rounds to convert to a steady state.

People more familiar with TCP might notice other interesting details in the plot. If you find something, please leave a comment.

Of course, the idea of plotting network packets against time in this way is not new: any class in network protocols uses diagrams like these. However, I have never seen a diagram like this one made from real timings on both sides, so that you can see the actual speeds of individual packets and compare the relative speeds of different packets, and I've never seen a diagram that was this zoomed out, so that you can see macro effects like the occasional congestion burp of rows 11 and 24. It would be interesting to have a tool that generated these automatically, but I rarely have a need to examine protocols at this level so I'm not likely to spend more time on it. If you know of (or make) such a tool, please leave a comment.

by Russ Cox (noreply@blogger.com) at December 13, 2010 02:00 PM

December 07, 2010

newsham

Honest Journalism

Is comedy the last remaining outlet for honest journalism? Check out Rap News.

<object height="340" width="560"><param name="movie" value="http://www.youtube.com/v/NXbCwq4ewBU?fs=1&amp;hl=en_US"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed allowfullscreen="true" allowscriptaccess="always" height="340" src="http://www.youtube.com/v/NXbCwq4ewBU?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" width="560"></embed></object>

by newsham (noreply@blogger.com) at December 07, 2010 07:26 PM

December 06, 2010

research!rsc

Yacc is Not Dead

The internet tubes have been filled with chatter recently about a paper posted on arXiv by Matthew Might and David Darais titled “Yacc is dead.” Unfortunately, much of the discussion seems to have centered on whether people think yacc is or should be dead and not on the technical merits of the paper.

Reports of yacc's death, at least in this case, are greatly exaggerated. The paper starts by arguing against the “cargo cult parsing” of putting together parsers by copying and pasting regular expressions from other sources until they work. Unfortunately, the paper ends up being an example of a much worse kind of cargo cult: claiming a theoretical grounding without having any real theoretical basis. That's a pretty strong criticism, but I think it's an important one, because the paper is repeating the same mistakes that led to widespread exponential time regular expression matching. I'm going to spend the rest of this post explaining what I mean. But to do so, we must first review some history.

Grep and yacc are the two canonical examples from Unix of good theory—regular expressions and LR parsing—producing useful software tools. The good theory part is important, and it's the reason that yacc isn't dead.

Regular Expressions

The first seminal paper about processing regular expressions was Janus Brzozowski's 1964 CACM article “Derivatives of regular expressions.” That formulation was the foundation behind Ken Thompson's clever on-the-fly regular expression compiler, which he described in the 1968 CACM article “Regular expression search algorithm.” Thompson learned regular expressions from Brzozowski himself while both were at Berkeley, and he credits the method in the 1968 paper. The now-standard framing of Thompson's paper as an NFA construction and the addition of caching states to produce a DFA were both introduced later, by Aho and Ullman in their various textbooks. Like many primary sources, the original is different from the textbook presentations of it. (I myself am also guilty of this particular misrepresentation.)

The 1964 theory had more to it than the 1968 paper used, though. In particular, because each state computed by Thompson's code was ephemeral—it only lasted during the processing of a single byte—there was no need to apply the more aggressive transformations that Brzozowski's paper included, like rewriting x|x into x.

A much more recent paper by Owens, Reppy, and Turon, titled “Regular-expression derivatives reexamined,” returned to the original Brzozowski paper and considered the effect of using Brzozowski's simplification rules in a regular expression engine that cached states (what textbooks call a DFA). There, the simplification has the important benefit that it reduces the number of distinct states, so that a lexer generator like lex or a regular expression matcher like grep can reduce the necessary memory footprint for a given pattern. Because the implementation manipulates regular expressions symbolically instead of the usual sets of graph nodes, it has a very natural expression in most functional programming languages. So it's more efficient and easier to implement: win-win.

Yacc is dead?

That seems a bit of a digression, but it is relevant to the “Yacc is dead” paper. That paper, claiming inspiration from the “Regular-expression derivatives reexamined” paper, shows that since you can define derivatives of context free grammars, you can frame context free parsing as taking successive derivatives. That's definitely true, but not news. Just as each input transition of an NFA can be described as taking the derivative (with respect to the input character) of the regular expression corresponding to the originating NFA state, each shift transition in an LR(0) parsing table can be described as taking the derivative (with respect to the shift symbol) of the grammar corresponding to the originating LR(0) state. A key difference, though, is that NFA states typically carry no state with them and can have duplicates removed, while a parser associates with each LR(0) parse states a parse stack, making duplicate removal more involved.

Linear vs Exponential Runtime

The linear time guarantee of regular expression matching is due to the fact that, because you can eliminate duplicate states while processing, there is a finite bound on the number of NFA states that must be considered at any point in the input. If you can't eliminate duplicate LR(0) states there is no such bound and no such linear time guarantee.

Yacc and other parser generators guarantee linear time by requiring that the parsing state automata not have even a hint of ambiguity: there must never be a (state, input symbol) combination where multiple possible next states must be explored. In yacc, these possible ambiguities are called shift/reduce or reduce/reduce conflicts. The art of writing yacc grammars without these conflicts is undeniably arcane, and yacc in particular does a bad job of explaining what went wrong, but that is not inherent to the approach. When you do manage to write down a grammar that is free of these conflicts, yacc rewards you with two important guarantees. First, the language you have defined is unambiguous. There are no inputs that can be interpreted in multiple ways. Second, the generated parser will parse your input files in time linear in the length of the input, even if they are not syntactically valid. If you are, say, defining a programming language, these are both very important properties.

The approach outlined in the “Yacc is dead” paper gives up both of these properties. The grammar you give the parser can be ambiguous. If so, it can have exponentially many possible parses to consider: they might all succeed, they might all fail, or maybe all but the very last one will fail. This implies a worst-case running time exponential in the length of the input. The paper makes some hand-wavy claims about returning parses lazily, so that if a program only cares about the first one or two, it does not pay the cost of computing all the others. That's fine, but if there are no successful parses, the laziness doesn't help.

It's easy construct plausible examples that take exponential time to reject an input. Consider this grammar for unary sums:

S → T
T → T + T | N
N → 1

This grammar is ambiguous: it doesn't say whether 1+1+1 should be parsed as (1+1)+1 or 1+(1+1). Worse, as the input gets longer there are exponentially many possible parses, approximately O(3n) of them. If the particular choice doesn't matter, then for a well-formed input you could take the first parse, whatever it is, and stop, not having wasted any time on the others. But suppose the input has a syntax error, like 1+1+1+...+1+1+1++1. The backtracking parser will try all the possible parses for the long initial prefix just in case one of them can be followed by the erroneous ++1. So when the user makes a typo, your program grinds to a halt.

For an example that doesn't involve invalid input, we can use the fact that any regular expression can be translated directly into a context free grammar. If you translated a test case that makes Perl's backtracking take exponential time before finding a match into a grammar, the translated case would make the backtracking parsers in this paper take exponential time to find a match too.

Exponential run time is a great name for a computer science lab's softball team but in other contexts is best avoided.

Approaching Ambiguity

The “Yacc is dead” paper observes correctly that ambiguity is a fact of life if you want to handle arbitrary context free grammars. I mentioned above the benefit that if yacc accepts your grammar, then (because yacc rejects all the ambiguous context free grammars, and then some), you know your grammar is unambiguous. I find this invaluable as a way to vet possible language syntaxes, but it's not necessary for all contexts. Sometimes you don't care, or you know the language is ambiguous and want all the possibilities. Either way, you can do better than exponential time parsing. Context free parsing algorithms like the CYK algorithm or the Generalized LR parsing can build a tree representing all possible parses in only O(n3) time. That's not linear, but it's a far cry from exponential.

Another interesting approach to handling ambiguity is to give up on context free grammars entirely. Parsing expression grammars, which can be parsed efficiently using packrat parsing, replace the alternation of context free grammars with an ordered (preference) choice, removing any ambiguity from the language and achieving linear time parsing. This is in some ways the same approach yacc takes, but some people find it easier to write parsing expression grammars than yacc grammars.

These are basically the only two approaches to ambiguity in context free grammars: either handle it (possible in polynomial time) or restrict the possible input grammars to ensure linear time parsing and as a consequence eliminate ambiguity. The obvious third approach—accept all context free grammars but identify and reject just the ambiguous ones—is an undecidable problem, equivalent to solving the halting problem.

Cargo cults

The “Yacc is dead” paper begins with a scathing critique of the practice of parsing context free languages with (not-really-)regular expressions in languages like Perl, which Larry Wall apparently once referred to as “cargo cult parsing” due to the heavy incidence of cut and paste imitation and copying “magic” expressions. The paper says that people abuse regular expressions instead of turning to tools like yacc because “regular expressions are `WYSIWYG'—the language described is the language that gets matched—whereas parser-generators are WYSIWYGIYULR(k)—`what you see is what you get if you understand LR(k).' ”

The paper goes on:

To end cargo-cult parsing, we need a new approach to parsing that:
  • handles arbitrary context-free grammars;
  • parses efficiently on average; and
  • can be implemented as a library with little effort.

Then the paper starts talking about the “Regular-expression derivatives reexamined” paper, which really got me excited. I hoped that just like returning to derivatives led to an easy-to-implement and oh-by-the-way more efficient approach to regular expression matching, I had high hopes that the same would happen for context free parsing. Instead, the paper takes one of the classic blunders in regular expression matching—search via exponential backtracking because the code is easier to write—and applies it to context free parsing.

Concretely, the paper fails at goal #2: “parses efficiently on average.” Most arbitrary context-free grammars are ambiguous, and most inputs are invalid, so most parses will take exponential time exploring all the ambiguities before giving up and declaring the input unparseable. (Worse, the parser couldn't even parse an unambiguous S-expression grammar efficiently without adding a special case.)

Instead of ending the supposed problem of cargo cult parsing (a problem that I suspect is endemic mainly to Perl), the paper ends up being a prime example of what Richard Feynman called “cargo cult science” (Larry Wall was riffing on Feynman's term), in which a successful line of research is imitated but without some key aspect that made the original succeed (in this case, the theoretical foundation that gave the efficiency results), leading to a failed result.

That criticism aside, I must note that the Haskell Scala code in the paper really does deliver on goals #1 and #3. It handles arbitrary context-free grammars, it takes little effort, and on top of that it's very elegant code. Like that Haskell Scala code, the C regular expression matcher in this article (also Chapter 9 of The Practice of Programming and Chapter 1 of Beautiful Code) is clean and elegant. Unfortunately, the use of backtracking in both and the associated exponential run time makes the code much nicer to study than to use.

Is yacc dead?

Yacc is an old tool, and it's showing its age, but as the Plan 9 manual says of lex(1), “The asteroid to kill this dinosaur is still in orbit.” Even if you think yacc itself is unused (which is far from the truth), the newer tools that provide compelling alternatives still embody its spirit. GNU Bison can optionally generate a GLR parser instead of an LALR(1) parsers, though Bison still retains yacc's infuriating lack of detail in error messages. (I use an awk script to parse the bison.output file and tell me what really went wrong.) ANTLR is a more full featured though less widespread take on the approach. These tools and many others all have the guarantee that if they tell you the grammar is unambiguous, they'll give you a linear-time parser, and if not, they'll give you at worst a cubic-time parser. Computer science theory doesn't know a better way. But any of these is better than an exponential time parser.

So no, yacc is not dead, even if it sometimes smells that way, and even if many people wish it were.

by rsc (noreply@blogger.com) at December 06, 2010 06:04 PM

December 04, 2010

research!rsc

Off to the Races

Go is defined to be a safe language. Indices into array or string references must be in bounds; there is no way to reinterpret the bits of one type as another, no way to conjure a pointer out of thin air; and there is no way to release memory, so no chance of “dangling pointer” errors and the associated memory corruption and instability.

In the current Go implementations, though, there are two ways to break through these safety mechanisms. The first and more direct way is to use package unsafe, specifically unsafe.Pointer. The second, less direct way is to use a data race in a multithreaded program.

If you were going to build an environment that ran untrusted Go code, you'd probably want to change the available packages to restrict or delete certain routines, like os.RemoveAll, and you'd want to disallow access to package unsafe. Those kinds of restrictions are straightforward.

The data races that can be used to break through the usual memory safety of Go are less straightforward. This post describes the races and how to rearrange the data structures involved to avoid them. Until the Go implementations have been tuned more, we won't be able to measure whether there is a significant performance difference between the current representation and the race-free implementation.

Package Unsafe

Here's a simple packaging of a type that lets you edit arbitrary memory locations, built using the standard package unsafe:

import "unsafe"

type Mem struct {
 addr *uintptr // actually == &m.data!
 data *uintptr
}

// Peek reads and returns the word at address addr.
func (m *Mem) Peek(addr uintptr) uintptr {
 *m.addr = addr
 return *m.data
}

// Poke sets the word at address addr to val.
func (m *Mem) Poke(addr, val uintptr) {
 *m.addr = addr
 *m.data = val
}

func NewMem() *Mem {
 m := new(Mem)
 m.addr = (*uintptr)(unsafe.Pointer(&m.data))
 return m
}

(The Go type uintptr is an unsigned integer the size of a pointer, like uint32 or uint64 depending on the underlying machine architecture.)

The key line is near the bottom, the use of the special type unsafe.Pointer to convert a **uintptr into a *uintptr. Dereferencing that value gives us m.data (actually a *uintptr) interpreted as a uintptr. We can assign an arbitrary integer to *m.addr, that changes m.data, and then we can dereference the integer as *m.data. In other words, the Mem struct gives us a way to convert between integers and pointers, just like in C. There are no races here: this is just something you can do by importing unsafe. The Mem wrapper is a bit convoluted—normally you'd just use unsafe directly—but we're going to drop in a different implementation of NewMem that doesn't rely on unsafe.

A Race

The current Go representation of slices and interface values admits a data race: because they are multiword values, if one goroutine reads the value while another goroutine writes it, the reader might see half of the old value and half of the new value.

Let's provoke the race using interface values. In Go, an interface value is represented as two words, a type and a value of that type. After these declarations:

var x *uintptr

var i interface{} = (*uintptr)(nil)
var j interface{} = &x

The data structures for i and j look like:

Suppose we kick off a goroutine that alternately assigns i and j to a new interface value k:

var k interface{}

func hammer() {
 for {
  k = i
  k = j
 }
}

After each statement executes, k will look like either i or j, but during the assignment, there will be a moment when k is half i and half j, one of these:

The top case gives us a **uintptr nil, which we could obtain more easily via legitimate means, but the bottom case gives us the value &x (actually a **uintptr) interpreted as a *uintptr. If we can catch the interface when it looks like the case on the right, we'll have rederived the conversion we used above via unsafe. Based on that insight, we can rewrite NewMem without unsafe:

func NewMem() *Mem {
 fmt.Println("here we go!")

 m := new(Mem)

 var i, j, k interface{}
 i = (*uintptr)(nil)
 j = &m.data

 // Try over and over again until we win the race.
 done := false
 go func(){
  for !done {
   k = i
   k = j
  }
 }()
 for {
  // Is k a non-nil *uintptr?  If so, we got it.
  if p, ok := k.(*uintptr); ok && p != nil {
   m.addr = p
   done = true
   break
  }
 }
 return m
}

The same kind of race happens in all of Go's mutable multiword structures: slices, interfaces, and strings. In the case of slices, the trick is to get a pointer from one slice and a cap from a different one. In the case of strings, the trick is to get a pointer from one string and the len from a different one. (The string race isn't as interesting, because strings cannot be written to, so it would only let you read memory, not write it.)

The Fix

The race is fundamentally caused by being able to observe partial updates to Go's multiword values (slices, interfaces, and strings): the updates are not atomic.

The fix is to make the updates atomic. In Go, the easiest way to do that is to make the representation a single pointer that points at an immutable structure. When the value needs to be updated, you allocate a new structure, fill it in completely, and only then change the pointer to point at it. This makes the assignment atomic: another goroutine reading the pointer at the same time sees either the new data or the old data, but not a mix, assuming the compiler is careful to read the pointer just once and then access both fields using the same pointer value.

(The red border indicates immutable data.)

For slices and strings, it makes sense to keep the multiword representation but put an immutable ”pointer and cap” stub structure between the slice and the underlying array. This keeps the same basic efficiency properties of slices at the cost of a few extra instructions on each indexing operation.

The idea here is to keep a structure with a mutable offset and length to support efficient slicing but replace the pointer with an immutable base+length pair. Any access to the underlying data must check the final offset against the immutable cap. Copying slice values is still not an atomic operation, but an invalid len will not keep an out-of-bounds index from being caught.

This representation requires a couple more assembly instructions, because each index must be checked against two bounds, first the relative len and then the absolute cap:

Compute x[i] in AX
RacyRace-free
LEAL x, SI
MOVL i, CX
CMPL CX, 4(SI)
JGE panic




MOVL 0(SI), DI
MOVL (4*CX)(DI), AX
LEAL x, BX
MOVL i, CX
CMPL CX, 4(BX)
JGE panic
ADDL 8(BX), CX
MOVL 0(BX), SI
CMPL CX, 4(SI)
JGE panic
MOVL 0(SI), DI
MOVL (4*CX)(DI), AX
 
 
// i >= len?
 
 
// i+off >= cap?
 
// &x[0] -> SI
// x[i] (or x[i+off]) -> DI
 

With suitable analysis, an optimizing compiler could cache 0(BX), 4(BX), 8(BX), and 4(SI), so in a loop, it is possible that the new representation would run at the same speed as the original.

An ambitious implementation might continue to use the current data structures for slices, interfaces, and strings stored on the stack, because data on the stack can only be accessed by the goroutine running on that stack. (Local variables whose addresses might escape to other goroutines are already allocated on the heap automatically, to avoid dangling pointer bugs after a function returns.)

Garbage Collection

This fix is feasible only because Go is a garbage-collected language: we can treat the red stub structures as immutable and trust that the garbage collector will recycle the memory only when nothing points to them anymore. It's much harder to build a safe language without a garbage collector to fall back on.

Security Implications

It is important to note that these races do not make the current implementations any less secure than they already are. The races allow clever programmers to subvert Go's memory safety, but a less clever programmer can still use the aptly-named package unsafe.

These races only matter if you are trying to build a Go service that can safely run arbitrary code supplied by untrusted programmers (and to the best of my knowledge, there are no such services yet). In that situation, you'd already need to change the implementations to disable access to the unsafe package and remove or restrict functions like os.Remove or net.Dial. Changing the data representations to be race free is just one more change you'd have to make. Now you know, not just that a change is needed but also what the change is.

The races exist because the data representations were chosen for performance: the race-free versions introduce an extra pointer, which carries with it the cost of extra indirection and extra allocation. Once the Go implementations are more mature, we'll be able to evaluate the precise performance impact of using the race-free data structures and whether to use them always or only in situations running untrusted code.

by rsc (noreply@blogger.com) at December 04, 2010 04:40 PM

November 04, 2010

johnny

debianinitscript to set up forwarding for services on an upnp-enabled network

This script sets up forwarding for a few basic services, it should be pretty easy to extend with anything needed.

Services are ssh, ftp (with ports specified for passive mode) and http

#! /bin/sh

### BEGIN INIT INFO
# Provides:          forwarding
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Script that sets up forwarding for ssh, ftp and web external acces
# Description:       Script that sets up forwarding for ssh, ftp and web external acces
### END INIT INFO

NAME=forwarding
DESCRIPTION="Script that sets up forwarding for ssh, ftp and web external acces"
CMD=/root/bin/upnpc-static

# Exit if the package is not installed
test -x $CMD || exit 0

# Load the VERBOSE setting and other rcS variables
[ -f /etc/default/rcS ] && . /etc/default/rcS

# Define LSB log_* functions.
# Depend on lsb-base (>= 3.0-6) to ensure that this file is present.
. /lib/lsb/init-functions

do_start () {
    log_daemon_msg "Starting $DESCRIPTION" "$NAME"
    CMD="$CMD -r 22 TCP 21 TCP 80 TCP"
    for port in $(seq 49152 49170); do
        CMD="$CMD $port TCP"
    done
    $CMD 1>&2 > /dev/null
    log_end_msg $?
    return $?
}

do_stop () {
    log_daemon_msg "Stopping $DESCRIPTION" "$NAME"
    CMD="$CMD -d 22 TCP 21 TCP 80 TCP"
    for port in $(seq 49152 49170); do
        CMD="$CMD $port TCP"
    done
    $CMD 1>&2 > /dev/null
    log_end_msg $?
    return $?
}

case "$1" in
    start)
    do_start
    ;;
    stop)
    do_stop
    ;;
    restart|force-reload)
    do_stop 
    do_start
    ;;

    *)
    log_success_msg "Usage: $0 {start|stop|restart|force-reload}"
    exit 1
esac

exit 0

by lighttpd at November 04, 2010 09:20 PM

November 01, 2010

johnny

wmii kvirc integration

bash function wi_findclient

  wi_findclient () {
    for i in $(wmiir ls /client); do
        if wmiir cat "/client/$i/props" |grep -q $1; then
            echo $i;
        fi;
    done
  }

I just added this to my ~/.bashrc

kvirc setup

in the event editor three events need to be added, i just named all of them wmii

OnHighlight event

run bash -i -c "echo urgent on | wmiir write /client/wi_findclient kvirc4/ctl";

OnQueryMessage event

run bash -i -c "echo urgent on | wmiir write /client/wi_findclient kvirc4/ctl";

OnWindowActivated event

run bash -c "echo urgent off | wmiir write /client/sel/ctl";

That should do it

by lighttpd at November 01, 2010 02:37 PM

October 21, 2010

newsham

Lions Commentary and x86

Lions Commentary is a great resource since it studies a real system that is small enough to be studied in a reasonable amount of time. Unfortunately the system it studies, UNIX v6, is old and runs on an old platform (PDP11) and uses an old version of the C language. I just stumbled across this neat system that addresses these problems: xv6 (http://pdos.csail.mit.edu/6.828/2009/xv6-book/index.html). Its a V6-like system written in modern C to run on an IA32 platform. It comes with a Lions-like commentary. So if you were interested in studying Lions but not interested in learning about the PDP11, you might want to check this out.
(Also worth checking out, the OCW page for MITs OS course using xv6, and the web page for the older course using Lions).

by newsham (noreply@blogger.com) at October 21, 2010 09:36 PM

October 19, 2010

newsham

Sixth edition UNIX in simh

I got into reading Lions Commentary again because I didn't finish it the first time. I installed the stock V6 image from simh's webpage to play with but found the image unsatisfying. So I found the tape image online and installed it fresh here. The original documentation is sufficient, but it doesn't directly address the details of how one installs in SIMH, so I thought I'd throw something together. Rather than just showing the instructions, I thought it would be fun to script them. That way you can reproduce them exactly and get a clean system. You can ignore the scripts if all you want is a clean system, or you can study them if you want to see what the process is like. Since part of the goal was to provide documentation, I tried to keep the scripts simple and readable.

The end result can be found here: http://www.thenewsh.com/~newsham/myv6/

There's a master shell script to pull together all the details. The installation steps are carried out by two expect scripts. One script prepares a minimal bootable disk image from the tape while the second script logs into the minimal system and configures and installs the rest of the software. When the process is done, there's a SIMH config file for running the image in multi-user mode.

A tarball of the whole shebang is available here: http://www.thenewsh.com/~newsham/myv6/myv6.tgz

Any comments for improvements are welcome.

by newsham (noreply@blogger.com) at October 19, 2010 12:26 AM

October 12, 2010

newsham

How the internet cost me $500.

So I got scammed on the internet.  Twice (kind of). The first case is pretty classic. There's an artist, Geoffrey Raymond, who makes some awesome paintings. I contacted him through his (awesome) blog and inquired about ordering a print of Blue Paulson. He said to paypal him $250 (the rate listed on his blog) for the print using his published blog contact email address. I did. He didn't respond right away so I inquired a few days later to make sure he got the order, he did, but he needed $25 to cover shipping, sounds fair to me, I promptly sent him $25 to cover shipping. Again I didn't hear back for a while, so I emailed him asking him why, and he said that he is having problems with his printer and it would take a while. So I waited a few weeks and again asked him for an ETA. Now he stops responding to me, but I notice his blog is still active. After waiting a little longer I open a paypal dispute. He doesn't respond to that. I escalate it to a paypal claim. He doesn't respond to the claim in the period paypal allots to him and I get back a mail from paypal saying they reviewed the dispute and they ruled in my favor! Yay!  Unfortunately the paypal email also says that I stated that a partial refund of 0 (zero!) dollars would be sufficient! WHAT!? Ok.. so they refunded me $0 for the picture I never got. Horray internet! Yay paypal.

So that's story number one. Story number two involves the popular Hyatt hotel chain. I placed a reservation for a room using my Hyatt Gold Passport points. The customer service is excellent, I'm a Gold Passport member, after all. Unfortunately I have to cancel the room and I'm already passed the cancellation deadline. OK, so I'm thinking I'm probably going to be out all those points. I call up the Hyatt Gold Passport customer service line to cancel my reservation and they say if I cancel, they will charge my credit card for the full dollar amount of the room. If I don't show up ("no show") they will charge my credit card for the full dollar amount of the room. The only way I can avoid losing money on my free Hyatt Gold Passport points room is to show up, checkin, and then check out. Ok, in fairness, this is only related to the internet since I booked online. Hyatt would have been happy to rip me off over the phone, too, I imagine. Yay Hyatt.

Moral of the story? I'm not sure.  Avoid buying stuff on the internet from artists types with paypal or staying at Hyatt hotels?

by newsham (noreply@blogger.com) at October 12, 2010 03:33 AM

October 05, 2010

Eric Van Hensbergen

Multi-pipes

Here's our poster we presented at OSDI last night on multi-pipes. Will be giving a short talk on them at IWP9 which you should be able to view next week here.

by Eric Van Hensbergen (noreply@blogger.com) at October 05, 2010 05:28 PM

September 26, 2010

newsham

Liar's Poker

Liar's Poker is an enjoyable, well written book from the mid 80's that's still very relevant today. Written from the perspective of a Salomon Brothers insider, it has lots to say about mortgage traders and wall street in general. Here are some fun quotes from the book:
In any market, as in any poker game, there is a fool.  The astute investor Warren Buffet is fond of saying that any player unaware of the fool in the market probably is the fool in the market. [..] Salomon bond traders knew about fools because that was their job. Knowing about markets is knowing about other people's weaknesses.
Best of all, he [guy teachings the new employees] gave us a rule of thumb about information in the markets that I later found useful: "Those who say don't know, and those who know don't say"
In the stock market the more elaborate and abstruse the mathematics the more uncertain and speculative the conclusion we draw therefrom... Whenever calculus is brought in, or higher algebra, you could take it as a warning signal that the operator was trying to substitute theory for experience" (quoting Benjamin Graham)
[Donnie Green] is remembered as the man who stopped a callow young salesman on his way out the door to catch a flight from New York to Chicago. Green tossed the salesman a ten-dollar bill. "Hey, take out some crash insurance for yourself in my name," he said. "Why?" asked the salesman. "I feel lucky," said Green.
I'm convinced that the worst thing a man can do with a telephone without breaking the law is to call someone he doesn't know and try to sell that person something he doesn't want. 
"Customers have very short memories." (quoting Tom Strauss, President of Salomon Brothers, on the topic of screwing customers.)
The nature of the game is zero sum. A sollar out of my customer's pocket was a dollar in ours, and vice versa.
 Investment banking in America is a long-standing oligopoly [...] American investors (the lenders of money) have been trained to think that they can only do business with a handful of large investment banks.
Note to members of all governments: Be wary of Wall Streeters threatening crashes. They are tempted to do this whenever you encroach on their turf. But they can't cause a crash any more than they can prevent one.

by newsham (noreply@blogger.com) at September 26, 2010 10:47 PM

September 20, 2010

9gridchan

Monday, 10 Sept 2010 - not DEFUNCT, DE-FUNKED, or de.FUNC(t)

Monday, 10 Sept 2010 - not DEFUNCT, DE-FUNKED, or de.FUNC(t)

The non-arrival of a long-ago promised update to the qemu images and gridtools, as well as the lack of any other posts or announcements, might lead a casual observer to conclude that 9gridchan.org is adrift and falling into the swamp of bitrotting and abandoned open-source projects. We (the royal we) categorically deny this, and will now justify our seeming Sloth and explain our current projects.

Why no update to the qemu images? That has a pretty simple answer: complexity creep. The prototype updated qemu images I was working on included my "rootlessboot" modifications to the startup process and then an additional parameterized set of scripts for users to create new system images with various roles and attributes. I had the sudden realization I was probably the only user for these features and most downloaders and users would simply perceive an incomprehensible divergence from standard Plan 9. My own use of these and other modifications is driven by the desire to really explore the possibilities of distributed network architecture, and I have the time and interest to set up networks of multiple machines and VMs and spend a lot of time configuring them. Adding a lot of new knobs was useful for that, but was not going to be a benefit for the majority of potential users.

So, I did what any flaky hobbyist developer does when they realize their project is drifting into the realm of "only its author has any clue how or even why to use it" - I slacked off and waited for a Better Idea to come along to refocus what I was doing.

The #plan9chan irc channel on freenode is where 9gridchan stays active (for sufficiently relaxed definitions of "active") even when we seem to be completely moribund judging by the website. A lot of the discussion is only vaguely related to plan 9 and grids, but that is a feature, not a bug (and also why we have a project-specific channel, since a lot of it would be noise, not signal, in the main #plan9). At some point in a discussion of information-based economies, a channel veteran mentioned Satoshi Nakamoto's Bitcoin software as an example of a decentralized system that might be of interest.

Sudden change of topic: I'm a long-time player of games like Chess, turn-based strategy video games, and Magic: the Gathering. I've often talked about making some kind of bizarre Plan 9 high-concept strategy game based on namespace operations. The closest I ever got to actually making anything was a few skeleton filesystems with large trees of folders with the names of different nouns and verbs, where you could bind them into semantically meaningful strings.

Sudden change of topic: In general I've taken the position that Plan 9 and 9p represent an alternative to our modern http-centric approach to network services and applications. I also felt that trying to use plan 9 as the backend for browser based services wouldn't promote and encourage the direct use of Plan 9, which is a major goal of this project.

Obligatory sudden integration of topics: So here's the idea - use a network of plan 9 nodes as backend game servers that are "factories" for unique digital objects that are game pieces for an in-browser strategy game. Plan 9 adoption is encouraged because running a plan 9 game node/server means that you are able to create your own customized game pieces. The consistency and fairness of the game is guaranteed using cryptographic principles inspired by Bitcoin. The game itself is similar to a collectibe card or miniatures game. Plan 9 systems create and distribute unique game pieces to web-browser client players who use a subset of the game pieces they control to engage in strategic combat. Game flavor will be customizable, but a fantasy-themed example may make the concept clear. A browser based player is like a knight entering combat; a Plan 9 node operator is like the king hosting the fight and the armorsmith providing the weapons.

Still mostly but not completely vaporware. There's nothing worth presenting to the public yet, but I've been working on this idea for several weeks now and have built working prototype code for the basic systems needed. contrib/mycroftiv/unreleased/rsagame has an already-outdated set of utilities and experiments, and a few hapless victims have poked at the "square character battling engine" in the browser which I could link a screenshot of but won't because it looks really dumb at the current point in time. What needs to be done is to integrate these two pieces, (converting some rc scripts to C in the process) and then replace the placeholder battle code with the worked-out-on-pen-and-paper game system, and update the browser client code to interact with it.

Can we get the Hollywood quick-sell version of the idea? Sure thing - its Plan 9 meets Bitcoin meets Magic:the Gathering meets Farmville!

How long before a beta? Will 9gridchan.org be hosting a server for the game? I'm extremely unreliable as to schedules, but I'd like to have something for people to poke at sometime during October. I'll definitely be hosting game server(s) but I might use a different domain name

Will it all still be free as in freedom and free as in beer? I have a strong commitment to open source code so all server and client side code will absolutely be available. I will, however, be using Affero GPL 3 for large portions of it. I'm aware this is a license with many restrictions, but the goal is to preserve user access to source code. I also have no current expectation that this game project will attract any users or popularity whatsoever so it is 99.99% likely to be monetarily free forever also. On the 0.001% chance that the game attracts an enthusiastic and growing user community that wants me to provide multiple game servers or add professional quality art assets to the game I might investigate ways of letting people pay me to provide a way of growing the project. Availability of source means free play would always exist and the intention of this project is just to create something cool and fun to promote Plan 9. I will admit though that figuring out a way to use an os-agnostic browser based hook to try to get people to start using plan 9 in order to achieve higher power/status in a videogame definitely appeals to me, bwahahaha!

Needless to say, we are still supporting all our current services and software.

by root at September 20, 2010 09:34 AM

September 09, 2010

newsham

House of Cards quotes

Just got done reading Cohan's House of Cards. The book is about Bear Stearn's demise. There are some great quotes in this book and I've saved a few:

Tannin, at BSAM, talking about his soon-to-fail hedge funds:

"Believe it or not -- I've been able to convince people to add more money -- which is what I'm doing as well.  No one has redeemed as far as I've seen."

Hedge fund manager's faith in his product:

Cioffi was growing increasingly nervous about his hedge fnds and initiated the process of removing $2 million of the $6 million that he personally had investe din the Enhanced Leverage Fund.

Marking the real market price:
After Goldman sent out the marks in the $0.50 to $0.55 range, Fanlo called Cohn and told him, "You're way off market. Everyone else is at 80, 85." Cohn then offere to sell Fanlo $10 billion of the paper at his $0.55 price and encouraged him to sell that in the market to all the other broker-dealers at the higher price they claimed to be marking the paper at. [...] "you can sell them to every one of those dealers," Cohn told him. "Sell 80, sell 77, sell 76, sell 75. Sell them all the way down to 60. And I'll sell them to you at my mark, at 55, because I was trying to get out. So if you can do that, you can make yourself $5 billion right now." Cohn had been trying to sell the securities at 55 for a period of time and people would just hang up on him. A few days later, Fanlo called Cohn back. "He cam back and said, 'I think your mark might be right,'" Cohn said.

Paul Friedman about BSAM's idea to spin off a company designed to hold all the toxic assets and IPO it:
"Can't say if Warren knew or if the trading desk knew, but personally I thought the notion of packaging the most opaque and complicated securities into a complicated structure and then selling it via an IPO to the widows and orphans of the world was on eof the most bizarre ideas I'd ever heard"

and later on the same deal:
"The wacko Everquest deal was saved from being the dumbest idea ever in the history of the world by not being done," Paul Friedman said. "I mean, 'Let's take CDO equity and residuals and all this stuff, and let's package it up an do a deal and sell it to Mom and Pop.' Nobody's ever come up with a stupider idea in history. Had the funds not blown up, Ithink there's a chance they would have actually tried to bring it to market. That's another indication that there were no grown-ups minding the shop."

by newsham (noreply@blogger.com) at September 09, 2010 02:06 AM

September 08, 2010

research!rsc

Webscript

Back in 2005, before JavaScript was required for everything on the web, I created a toy language called webscript, to let you script automatic web interactions. I wanted to use webscript to replace some clunky wget-based shell scripts we were using to do things like track packages and reboot computers. Two real webscript programs illustrate the language.

This used to work for tracking UPS packages:

#!./webscript

load "http://www.ups.com/WebTracking/track?loc=en_US"
find textbox "InquiryNumber1"
input "1z30557w0340175623"
find next checkbox
input "yes"
find prev form
submit
if(find "Delivery Information"){
 find outer table
 print
}else if(find "One or more"){
 print
}else{
 print "Unexpected results."
 find page
 print
}

And this used to work to connect to a APC power strip and reboot the computer named yoshimi:

#!./webscript

load "http://apc-reset/outlets.htm"
print
print "\n=============\n"
find "yoshimi"
find outer row
find next select
input "Immediate Reboot"
submit
print

I say that the scripts used to work because I haven't run them in a few years. One of the goals of webscript was to let the programmer describe the interactions with the web page instead of working with the raw form parameters, so that scripts would even as the hidden plumbing behind the web page changed. Even so the UPS script mentions InquiryNumber1.

Every few years I go looking for a replacement for webscript, but I never find anything. As more web sites require JavaScript to do anything useful, the job of implementing something like webscript gets harder. I think you'd essentially have to link against a browser to make it work.

Do you know of a tool like webscript that works with today's web pages? Do you know an easy way to make one? (I know about AppleScript and Chickenfoot, but I'd prefer something that can run and produce output as a command line program, without even displaying a web browser on my screen, so that it can be used in shell scripts and cron jobs.)

by Russ Cox (noreply@blogger.com) at September 08, 2010 01:56 PM

August 28, 2010

Command Center

Know your science

Except for the TV show "The Big Bang Theory", popular culture gets science wrong. We all know that.

But there's a way it tends to get science wrong that upsets me more than most. That is when it misuses the tools of science by willfully ignoring what science actually means.

One common example is celebrity equations, wherein some mathematical-looking expression mixes two or more celebrities together, as in (I'm making this one up and I'm not a cultural critic, let alone a comic, so please bear with me): Lady Gaga = (2*Madonna + Carrot Top)/3. Mathematically savvy readers will recognize that I normalized that equation. If you don't know what that means, you shouldn't be writing celebrity equations, because mathematical equations mean something, they're not just symbols. Like musical comedy based on bad notes, bogus mathematical equations are not funny, just lazy.

Some years ago I even wrote a letter to Entertainment Weekly when they had a long article full of egregious celebrity equations. To their credit, they published the letter and even mended their ways for a while. I quote the letter here:

According to EW math, the more buzz or intelligence you have, the less likely you are to be on the It List. That may be true, but I bet you didn't mean that. Your equation is art-directed nonsense. EW seems to think the joke is that the equations look cute: If Einstein is funny, his square root is hilarious...
In short, mathematics may look funny if you don't understand it but that doesn't make it funny if you misuse it in ignorance.

Another sort of abuse is comedy periodic tables: periodic tables of the vegetables, period table of the desserts, periodic table of the presidents, and on and on. There are zillions of them. I believe the vegetables one was the first widely distributed example.

What's wrong with them? Again, they miss the point about the one true periodic table, Mendeleev's periodic table of the elements. In fact, to put things with no structure into a periodic table not only misses the point of the periodic table, it misses the profound idea that some things have periods.

Mendeleev's table, by recognizing the periodic structure of the elements, predicted not only properties of the elements, but the very existence of undiscovered elements. It was a breakthrough.

The periodic table is not some artistic layout of letters, it's science at its very best, one of the great results of the 19th century and the birth of modern chemistry. It doesn't honor science to take, say, typefaces and put them in a funny-looking grid. That just mocks the idea that science can predict the way the world works.

Science is not arbitrary. Making arbitrary cultural artifacts by abusing scientific ideas is not just wrong, it's offensive. It cheapens science.

Another area of abuse is quantum mechanics, and a common victim is Heisenberg's uncertainty principle. Despite what some ill-informed academics would have you believe, Heisenberg's principle is not some general statment about weird shit happening in the world, it is a fantastically precise scientific statement about the limits of measurement of two simultaneous physical properties: position and momentum. It's not a metaphor!

What's really sad is that many of the commonest misuses of the terminology of quantum mechanics come from other areas of science and technology. For instance, there is a term in computer engineering called a Heisenbug, which refers to faults that are unpredictable, most often for bugs that go away when you examine them. It's a cute name but it isn't even a correct reference. The quantum mechanical property of things changing when you observe them is not the Heisenberg uncertainty principle, it's the observer effect. These two ideas are often confused but they are not the same. They're not even closely related.

The observer effect in quantum mechanics describes how the act of measuring a quantum system forces the system to cough up a measurable quantity, which triggers a "wave function collapse". Heisenberg's uncertainty principle says that the minimum product of the error in simultaneous measurement of a particle's position and momentum is Planck's constant divided by 4π, or as we write it in physics, ℏ/2. (By the way, that's an extremely small value.)

Not only are these very different ideas, neither of them has anything to do with computer bugs. The term Heisenbug is trendy but bogus and ignores some strange and beautiful ideas. It's no better informed than the square root of Einstein or the periodic table of the typefaces.

If you're going to use the terms of science to inform your world, please make a point to understand the science too. Your world will be richer for it.

by rob (noreply@blogger.com) at August 28, 2010 01:46 AM

August 27, 2010

Eric Van Hensbergen

4P: a four-protocol API for synthetic filesystems

I've been implementing a lot of synthetic filesystems lately, and it struck me that (at least in the Plan 9 world, but it also seems to hold true for many Linux synthetic file systems) there is a lot of boilerplate gunk in synthetic file systems which is largely unnecessary -- particularly for control interface synthetic file systems (as opposed to normal disk access file systems or some sort of data-exchange file system). For this I'm talking more about interfaces like /proc.

In any case, it seems to me what things largely boil down two is two basic operations, get and put, with perhaps two other protocol elements for flushing and outstanding request and reporting errors. By and large this is the same conclusion that the Octopus protocol [1] folks came to (although they have more complexity and operations than I think are necessary), and its also the fundamental observation underlying web-style RESTful transactions.

The advantage of a minimal interface for synthetic control file systems is that it should allow you to keep the server implementation extremely simple and it will minimize the over-the-wire transactions necessary to complete a given operation (which for many control interfaces is even more critical due to the synchronous nature of walk, open, read, close or walk, open, write, close operations -- particularly for large latency network links). This last point was, I believe the primary motivation for the Octopus folks.

So, punting on security for the moment, the basic principle is to have the 4 protocol operations (Get, Put, Flush, Error -- hence 4P). They follow the same basic protocol convention of 9P with a basic header of size[4] op[1] tag[2]. We ditch fids and the state associated with them for just sending the full path on every operation -- the basic observation being that in most synthetic control file systems we tend to open, read/write, and then close the file -- so there is no reason to have the file held open, and path resolution is simple enough to obviate the need for fid processing/tracking. The basic operations are similarly simplified:

size[4] TPut tag[2] flag[s] path[s] meta[s] data[s]
size[4] RPut tag[2]

Put ends up assuming the job of creation as well as writing data. RESTful semantics are assumed, so individual operations are treated as atomic and whole-file operations -- there is no partial update in the protocol, so no offsets. meta[s] is actually a subset of the typical stat style meta-data, as control systems typically only have user, group, and perm style metadata. flag[s] allows some extended operations and semantic features allow exclusive creation (don't create if it's already there), blocking (to wait for it to not be there before putting data we have), append operation (add to the end instead of just replacing), as well as specification of meta-data only put operations (which might change owner but not data for instance). Using a string for the flags allows file-specific flags (which might lead to nightmares, but I'm leaving it in for now), and partial support of flags is of course allowed (returning an error on an unsupported flag).

size[4] TGet tag[2] flag[s] path[s] nmsg[4] count[4]
size[4] RGet tag[2] meta[s] data[s] more[1]

Get has a similar simplified nature, although it does allow specification of partial reads (although always from the top of the
file) by using count[4]. Since gets can also be streaming (multiple reply messages, get allows you to specify the maximum
number of reply messages you wish to receive relating to this transaction) - the response more field lets the client know when there are more packets coming related to this transaction on the same tag. Get also accommodates flags allowing the specification of advanced functionality including removing the file after getting (linda style), truncating the file after getting, waiting for an update to be put into the file before processing a get (server relative time), only retrieving data, only retrieving metadata, and specification of streaming (allow up to nmesg[4] return messages on updates to the file).

From a client perspective we could build these as either a 9P gateway or a VFS file system (as well as building a client library for tighter binding with an application or application(s)). From a server perspective, we could build servers in a similar fashion to the golang http servers, allowing handlers to be registered for elements, using regular expressions for allowing parameters to be encoded into paths. To enhance modular construction, we could allow for hierarchical representation of the handler tables which should also make search more efficient.

Using such a system it would be possible to construct a hello world file server in a few lines of code, with more complicated dynamic control file systems able to be constructed with not too much additional effort since the system takes care of managing the path hierarchy for you. Its not intended for such a system to be used for typical storage file systems or anyplace where large amounts of data need to be processed or accessed with varying degrees of granularity -- but for command/control file systems this should provide a more optimal solution both in terms of time to develop as well as network latency and efficiency.

A big motivation here as well as this takes some of the state out of the 9P protocol, which might allow more complicated distributed dynamic command/control frameworks. I imagine this might be particularly useful for xcpu style distributed execution environments -- which is where we intend to play with the approach.

[1] Octopus Protocol: http://lsub.org/ls/export/op.pdf

by Eric Van Hensbergen (noreply@blogger.com) at August 27, 2010 09:51 PM

August 19, 2010

newsham

Return to sanity

I like how our system works. Occasionally we get it wrong. But we knew we would, and we have checks and balances to help even the scales. Today we see a good example of that. Some guy had lied about receiving some military medals and in knee-jerk reaction we made a new law to make that illegal. It shouldn't be, as N8 pointed out last year. Well, today the courts make it right: http://atwar.blogs.nytimes.com/2010/08/18/stolen-valor-act-is-declared-unconstitutional-by-circuit-court/?ref=world

by newsham (noreply@blogger.com) at August 19, 2010 05:46 PM

August 15, 2010

newsham

Trust cost of gas. $15 per gallon?

Interesting article that tries to factor in the external costs of gas to figure out what the true cost we pay for our energy: http://www.alternet.org/environment/147842/gas_is_really_costing_us_about_$15_a_gallon. Some of the items seem a little bit questionable, but its clear that the true cost is considerably higher than what we pay at the pump.

by newsham (noreply@blogger.com) at August 15, 2010 05:04 AM

August 14, 2010

NineTimes - Inferno and Plan 9 News

IWP9 5e Submissions Deadline Extended

Subject: [9fans] IWP9 2010 - Submissions deadline extended Date: Sat, 14 Aug 2010 13:56:32 -0700

Hello 9fans,

The deadline for paper and WIP submissions has been extended to Sept 6th. Please send in your submissions as soon as possible.

Thanks, -Skip

Post on 9fans

by www-data at August 14, 2010 09:31 PM

August 13, 2010

newsham

Terror Babies

Eek! Have you heard about this? They're sneaking babies into america to terrorize us! And you know whats really weird? CNN asking hard questions (this still happens?  Is it even allowed?)

<object height="344" width="425"><param name="movie" value="http://www.youtube.com/v/dQVfQCpYocQ?fs=1&amp;hl=en_US"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed allowfullscreen="true" allowscriptaccess="always" height="344" src="http://www.youtube.com/v/dQVfQCpYocQ?fs=1&amp;hl=en_US" type="application/x-shockwave-flash" width="425"></embed></object>

wow!

by newsham (noreply@blogger.com) at August 13, 2010 08:02 PM

OS Inferno

В Inferno интегрирован новый криптографический фреймворк

10 Августа Charles Forsyth добавил в репозиторий Inferno код модуля Crypt, который должен стать новым стандартным интерфейсом к крипто-функциям, используемым для аутентификации с использованием открытых/закрытых ключей. Коммит.

by j1m (noreply@blogger.com) at August 13, 2010 07:01 AM

Charles Forsyth представит доклад о текущем состоянии Inferno на IWP9 2010

Charles Forsyth в качестве представителя компании Vita Nuova выступит на конференции IWP9 2010 с отчетом о текущем положении дел в компании и проектах, связанных с Inferno и языком Limbo. Источник.

by j1m (noreply@blogger.com) at August 13, 2010 06:52 AM

August 11, 2010

NineTimes - Inferno and Plan 9 News

IWP9 5e update

Skip Tavakkolian has posted an update on IWP9 5e, the post to 9fans is included below.

Subject: [9fans] IWP9 2010 - Update and reminder Date: Sat, 7 Aug 2010 10:23:43 -0700

Hello 9fans,

IWP9 2010 is shaping up to be an exciting event. We are happy to announce the following talks:

• Sape Mullender et al. will present a paper on a new Operating System that Bell Labs is working on.

• Charles Forsyth will give a talk on the state of Inferno, Limbo and related projects at Vita Nuova.

• Geoff Collyer et al. will give a talk on ongoing Plan 9 efforts at Bell Labs, including the Blue Gene port and ports to ARM boards and plugs, Virtex 4 and 5 Power PC FPGA and others.

• Ron Minnich will lead a "hack session", putting Plan 9 on small devices -- primarily the Sheeva/Guru plugs; other boards like the BeagleBoard and the Gumstix Overo are also encouraged.

Please note that the deadline for submitting your drafts is August 15th.

Thanks, -Skip

by www-data at August 11, 2010 12:12 AM

August 10, 2010

newsham

Change you can believe in

Obama's been in office for a year and a half now. So why is it that scientists are still being persecuted by the government for speaking the truth? And by NOAA, of all agencies. And you thought the "war on science" was over.

by newsham (noreply@blogger.com) at August 10, 2010 08:47 PM

Choice Quote

From the great Dr. Krugman (http://www.pkarchive.org/column/032902.html):
While George Soros was spending lavishly to promote democracy abroad, Mr. Scaife was spending lavishly to undermine it at home.

by newsham (noreply@blogger.com) at August 10, 2010 06:20 AM

August 04, 2010

newsham

Conference calls

Conference calls, who doesn't love em?
<object height="344" width="425"><param name="movie" value="http://www.youtube.com/v/zbJAJEtNUX0&amp;hl=en_US&amp;fs=1"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed allowfullscreen="true" allowscriptaccess="always" height="344" src="http://www.youtube.com/v/zbJAJEtNUX0&amp;hl=en_US&amp;fs=1" type="application/x-shockwave-flash" width="425"></embed></object>

by newsham (noreply@blogger.com) at August 04, 2010 05:42 PM

July 31, 2010

OS Inferno

9buntu - Ubuntu с лицом Plan 9


Наткнулся на ISO-образ 9buntu, распространяемый широко известными в узких кругах ребятами с сайта suckless.org. Как можете видеть скриншот очень красноречив, однако что это, 9vx или доработанный p9p пока не ясно.

by j1m (noreply@blogger.com) at July 31, 2010 06:55 AM

July 30, 2010

johnny

motherfucking captive wifi portals

So i was at a conference three weeks ago called RMLL, they didn't have wi-fi at the "villages" we stayed in, just a captive portal with a 24 hour free trial (of course activated by sms so they have your cell number afterwards and can harass you with pr bullshit).

The thing is that their captive portal did a http permanent redirect to their portal, so now all of my feeds in liferea point to their fucking website. way to go motherfuckers, now i can manually go through my 100 feeds and rewrite the url? or what?

anyone who has a good beefy connection can ping -f wifirst.net

by lighttpd at July 30, 2010 02:05 PM

July 20, 2010

newsham

Protect Traditional Marriage

A new ballot initiative in California will strengthen traditional marriage by making it illegal to get a divorce. Here's a public service announcement to get you the facts:
<object height="340" width="560"><param name="movie" value="http://www.youtube.com/v/oqe4mx2N7E0&amp;hl=en_US&amp;fs=1"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed allowfullscreen="true" allowscriptaccess="always" height="340" src="http://www.youtube.com/v/oqe4mx2N7E0&amp;hl=en_US&amp;fs=1" type="application/x-shockwave-flash" width="560"></embed></object>

So if you're in California, make sure to get out there and put your name on the petition and spread the word to your friends. We can save traditional marriage with your help! For more information see http://rescuemarriage.org/

by newsham (noreply@blogger.com) at July 20, 2010 01:11 AM

July 05, 2010

newsham

BP, Photographers, and Terrorism

The police and DoHS detained a news photographers taking pictures of BP facilities from public property. How? Using anti-terrorism laws, of course. After reviewing his film and taking down his personal information they handed the information over directly to BP security. Pretty amazing stuff. There is something broken in America right now and it needs to be fixed very quickly. If anti-terrorism laws are going to be used against american citizens informing other american citizens of important events (in order to protect the image of large foreign companies), then we're going to have to get rid of those anti-terrorism laws.

In related news, the Coast Guard set up new rules that require permission to "to enter any safety zone" and forbid photographers from getting within 65 feet of oil booms. http://www.huffingtonpost.com/georgianne-nienaber/facing-the-future-as-a-me_b_634661.html  So much for freedom of the press (what little of it we have left).

by newsham (noreply@blogger.com) at July 05, 2010 06:21 PM