Quirky Java 22 Feb 2009

This term I’m taking a course on compiler construction, CS 444. For the course I, along with my group members Peter and Ian, am developing a compiler for a subset of Java. So far we have finished the scanner and parser. It’s been an illuminating experience learning about the quirkiness hidden deep in the Java Language Specification and I wanted to share a few of the more esoteric constructs we discovered.

I successfully compiled all of the following examples on my aging Windows XP laptop with javac 1.6.0_04. First, a single semicolon is a valid Java file! So

    // in Quirky.java
    ;
    

is perfectly valid, and so is

    // in Quirky.java
    ;
    class Quirky {
        ;
    }
    

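Javac will happily compile both files, although only the second declares a class for it to generate bytecode for. More single-character strangeness: a single $ or _ is, at least as far as javac 1.6 is concerned, a valid Java identifier. Hence, with a wanton willingness to abuse the language,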

    // in $.java
    class $ {
        class $$ {
            $ $($ $) { return null; }
        }
    }
    class _ {
        _ _(_ _) { return null; }
    }
    

can be compiled into $.class, $$$$.class and _.class! Finally, array brackets can appear in unexpected places, since

    // in Quirky.java
    class Quirky {
        int[] twoDim() [] { return null; }
    }
    

declares a method twoDim whose return type is the two-dimensional array type int[][].
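For anyone who wants to check two of these claims without reading bytecode, reflection will do. The following is only a minimal sketch, assuming a reasonably recent JDK: QuirkCheck and its two helper methods are illustrative names that are not part of the examples above, and the _ class is left out because Java 9 and later reject a lone underscore as an identifier.

```java
import java.lang.reflect.Method;

// QuirkCheck, returnTypeOf and nestedBinaryName are illustrative names (assumptions);
// Quirky, twoDim, $ and $$ mirror the examples in the post.

class Quirky {
    ; // a stray semicolon is accepted as an empty class member
    int[] twoDim() [] { return null; } // extra brackets after the parameter list
}

class $ {
    class $$ { }
}

class QuirkCheck {
    // Looks up a no-argument method on Quirky and reports its declared return type.
    static Class<?> returnTypeOf(String methodName) {
        try {
            Method m = Quirky.class.getDeclaredMethod(methodName);
            return m.getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException("no such method: " + methodName, e);
        }
    }

    // Binary name of the nested class; javac names the class file after it.
    static String nestedBinaryName() {
        return $.$$.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(returnTypeOf("twoDim").getSimpleName()); // prints int[][]
        System.out.println(nestedBinaryName()); // $$$$ when compiled in the default package
    }
}
```

The nested class's binary name is the outer name, a $ separator, and the inner name, which is why a class $$ inside a class $ ends up with four dollar signs.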