Idris 2 Bootstrap Compiler on the JVM with a JVM backend

I am excited to announce that Idris 2 bootstrap compiler can now run on the JVM along with a JVM backend. Here is a REPL session:

$ idris2
     ____    __     _         ___                                           
    /  _/___/ /____(_)____   |__ \                                          
    / // __  / ___/ / ___/   __/ /     Version 0.1.1-786152de1
  _/ // /_/ / /  / (__  )   / __/      https://www.idris-lang.org           
 /___/\__,_/_/  /_/____/   /____/      Type :? for help                     

[BOOTSTRAP VERSION: No longer developed, except as a bootstrapping step.]
Welcome to Idris 2 on the JVM (Private Build OpenJDK 64-Bit Server VM, 1.8.0_275).  Enjoy yourself!
Main> "hello world"
"hello world"
Main> take 10 [25..]
[25, 26, 27, 28, 29, 30, 31, 32, 33, 34]
Main> :t map
Prelude.map : Functor f => (a -> b) -> f a -> f b
Main> :exec printLn "Hello Idris 2 on the JVM - this is printed from executing JVM bytecode from memory"
"Hello Idris 2 on the JVM - this is printed from executing JVM bytecode from memory"
Main> 

You may notice that the banner includes the JVM version that the Idris compiler is running on. Idris 2 for JVM can run on Java 8 or above. Since the compiler itself runs on the JVM, the last :exec command at line 16 above can directly interpret the generated bytecode on the compiler JVM, and the generated bytecode is loaded from memory without any file system footprint.

To try it out

Prerequisite: Java 8 or above

  • Extract the zip from here
  • Add idris2-boot-jvm/bin from extracted directory to PATH
  • Run idris2. That’s all!

Features

  • Eliminate tail recursion using JVM’s GOTO
  • Trampoline non-self tail calls
  • Support file, directory, array, IORef and buffer primitives.
  • Analyse function dependencies to generate bytecode for only used functions from main
  • Limited type inference on Idris IR to avoid boxing and typecasting in the generated bytecode
  • Static and instance Java FFI method calls.
  • Debug information (Idris source file and line numbers) in the bytecode from Idris IR

Currently, the goal for this bootstrap compiler is to be able to compile current Idris 2 so the items here could definitely be improved a lot and other features may be added to support the absolute minimum for building Idris 2.

Example

module Main

import System.FFI

-- Tail call is eliminated with GOTO
sum : Int -> Int
sum n = go 0 n where
  go : Int -> Int -> Int
  go acc 0 = acc
  go acc n = go (acc + n) (n - 1)

mutual
  -- Mutually recursive functions are trampolined
  -- keeping it simple with int instead of Nat which is compiled to Java BigInteger
  isOdd : Int -> Bool
  isOdd 0 = False
  isOdd n = isEven (n - 1)

  isEven : Int -> Bool
  isEven 0 = True
  isEven n = isOdd (n - 1)

-- Java instance method calls
%foreign jvm "java/lang/String" ".length"
jvmStringLength : String -> Int

-- Java static method calls
%foreign jvm "java/lang/Integer" "toBinaryString"
intToBinaryString : Int -> String

main : IO ()
main = do
  printLn $ sum 5000
  printLn $ isEven 100000
  printLn $ intToBinaryString 512
  printLn $ jvmStringLength "hello"

Compile

idris2 Main.idr -o main

The classes will be compiled into build/exec/main_app.

$ tree build/exec/main_app/
build/exec/main_app/
├── Main.class
├── Prelude
│   ├── $col$col.class
│   ├── Just.class
│   ├── Nil.class
│   ├── Nothing.class
│   ├── Open.class
│   ├── PrefixMinus.class
│   └── Show$spcat$spcPrelude$dotidr$col1260$col1--1284$col1__mkShow.class
├── Prelude.class
└── PrimIO.class

It compiles down to three main modules Main, Prelude and PrimIO. Rest of the classes are data classes for Idris constructors like Just, Nothing, Nil, :: etc.

Run

$ java -cp /path/to/idris2-boot-jvm/lib/runtime-1.0-SNAPSHOT.jar:build/exec/main_app Main
12502500
True
"1000000000"
5

The above example demonstrates tail recursion, mutual recursion, Java FFI instance and static method calls.

sum a tail recursive function, would be compiled into a loop. Here is the decompiled code for the nested go function inside sum:

    public static Object $nested1201$307$go(Object arg$0, int arg$1, int arg$2) {
        while(true) {
            switch(arg$2) {
            case 0:
                return arg$1;
            default:
                arg$1 = Prelude.++_Num__Int(arg$1, arg$2);
                arg$2 = Prelude.--_Neg__Int(arg$2, 1);
            }
        }
    }

As we can see here, the tail recursive function call is removed in the default block instead the arguments are updated for next iteration. The loop would continue to run until the base case is reached where the result is returned.

For isEven and isOdd, the tail calls would be trampolined using JVM’s invokedynamic. Here is the decompiled bytecode.

    public static Object isOdd(int arg$0) {
        switch(arg$0) {
        case 0:
            return 1;
        default:
            return () -> {
                return Runtime.createThunk(isEven(Prelude.--_Neg__Int(arg$0, 1)));
            };
        }
    }

    public static Object isEven(int arg$0) {
        switch(arg$0) {
        case 0:
            return 0;
        default:
            return () -> {
                return Runtime.createThunk(isOdd(Prelude.--_Neg__Int(arg$0, 1)));
            };
        }
    }

Here since the tail call is not to itself, function call is turned into a thunk which would be eventually unwrapped in a loop.

Now let’s look at FFI calls in the above example. jvmStringLength function is an example for Java FFI instance method call. It calls length function on Java’s java.lang.String instance. Similarly intToBinaryString calls toBinaryString static function on class java.lang.Integer. Function name starting with a dot denotes an instance method call differentiating it from a static method call. Precise types can also be passed explicitly as in the example below with argument types and return type (full code here).

%foreign jvm' fileClass "getLine" fileClass "String"
prim__readLine : FilePtr -> PrimIO (Ptr String)

Next steps

Next step would be to use this bootstrap version to compile current Idris 2 and then porting the JVM backend from here which is on Idris 1 to Idris 2 so there are some interesting things ahead to work on. Meanwhile, if anyone gets a chance to use this boostrap JVM version, please share any feedback or any issues here that can help while we try to compile mainstream Idris 2 for the JVM. Thank you for reading and happy coding!

Comments