Hannes Mehnert edited this page Dec 22, 2015 · 71 revisions

This is an area to gather suggestions for projects involving Mirage and OCaml, suitable for different skill levels. Drop a line to [email protected] if you're interested in starting on one of these.

New Libraries

ANSI Terminal emulation

Currently, a Xen console is available. For interactive unikernels (for example, configuring the network from a terminal shell), terminal emulation would be useful. Once this is finished, a telnet server is straightforward to implement, and we can interactively explore unikernels! A recent mail thread already contains some advice.
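To give a feel for the output side of terminal emulation, here is a minimal sketch that builds a few standard ANSI (ECMA-48) control sequences as plain strings. The function names are hypothetical and are not taken from any existing Mirage library; a real emulator would also need to parse these sequences on input.

```ocaml
(* Hypothetical helpers that build ANSI (ECMA-48) control sequences.
   ESC is byte 27; "ESC [" is the Control Sequence Introducer (CSI). *)
let csi = "\027["

(* Erase the whole display. *)
let clear_screen = csi ^ "2J"

(* Move the cursor to a 1-based row and column. *)
let cursor_to ~row ~col = Printf.sprintf "%s%d;%dH" csi row col

(* Render [s] in bold, then reset all attributes. *)
let bold s = csi ^ "1m" ^ s ^ csi ^ "0m"
```

For example, `print_string (clear_screen ^ cursor_to ~row:1 ~col:1 ^ bold "hello")` would clear an ANSI-capable terminal and print a bold greeting in the top-left corner.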

Status: declarative terminal graphics library

Mentor: Hannes Mehnert (and others)

Difficulty: ★★☆☆☆

WebIDL

The modern web is becoming an application platform in addition to its function as a distributed document store. Instead of native operating system interfaces, the web browser exposes JavaScript APIs with various restrictions. We would like OCaml software to be able to easily transition between a native application and a web application. To accomplish this, a representation for browser interfaces is required.

This project would involve the development of a parser for WebIDL, a W3C standard interface specification language (http://www.w3.org/TR/WebIDL/), and a set of rules for mapping WebIDL signatures to OCaml interfaces. Further work could include the development of a js_of_ocaml binding generator or generated interface constraint assertions. These developments would help the OCaml community keep our interfaces up-to-date with the latest web API advancements and provide universal interfaces for common device capabilities like hardware-accelerated 3D rendering and device sensor support.

Mentor: David Sheets

Difficulty: ★★★☆☆

Blog and OPAM-aware static website generator

Static websites are sufficient for most of our needs as they are lightweight, easy to deploy and maintain, and trivial to compile into unikernels. Jekyll is one example of a static site generator (made popular by GitHub Pages) and it would be extremely useful to have a similar generator in OCaml. Such a generator would come with sensible defaults so that creating a site is quick (cf. Jekyll-Bootstrap) and would allow blog posts and pages to be written in Markdown. It would also make it easy for anyone who maintains OPAM packages to build pages that represent their libraries without much additional effort. The goal would be to achieve something as simple to set up and run as Jekyll. It's likely that this project would involve the creation of different libraries to deal with different aspects of the work, and a number of libraries already exist. The specific approach can be discussed with those interested in this project; the difficulty arises from getting the design correct, rather than just the writing of code. Use-cases for this tool (once it exists) would be personal websites and project websites, and the components would be useful for more substantive sites like OCaml.org or MirageOS.

Mentor: Amir Chaudhry and David Sheets

Difficulty: ★★★☆☆

Bigarray parser generator

FastParsers is a Scala parser library which uses macros to transform easy-to-write parser combinators into efficient recursive-descent backtracking parsers. The generated parsers are about 20x faster than Scala's parser combinator library even though the interface stays about the same.

An OCaml equivalent that uses Cstruct under the hood to do zero-copy parsing would permit a big speed boost in Mirage's protocol stacks.

Mentor: Jeremy Yallop, Anil Madhavapeddy, and Rudi Grinberg

Mentee: Runhang Li

Difficulty: ★★★★☆

Status: work in progress

Password based key derivation

How can private data be encrypted using a password that a human can enter and remember? The answer is to derive a key from the password, because passwords make poor keys on their own: they are biased towards printable alphanumeric ASCII characters. Key derivation also increases the computational power required to brute-force passwords, because the derivation function is deliberately expensive. The main target is PBKDF2 (specified in RFC 2898); bcrypt and scrypt are also of interest (best would be to implement all of them :).
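The structure of PBKDF2 can be sketched with the OCaml standard library alone. One loud assumption: the sketch uses HMAC-MD5 as the pseudorandom function purely because `Digest` (MD5) is the only hash shipped with the standard library. A real implementation for this project would plug in HMAC-SHA-1 or HMAC-SHA-256 from a crypto library, and the names `hmac_md5` and `pbkdf2` are chosen here for illustration only.

```ocaml
(* Sketch of PBKDF2 (RFC 2898) using only the standard library.
   ASSUMPTION: HMAC-MD5 is the PRF only because Stdlib.Digest is MD5;
   do not use MD5 for real key derivation. *)
let block_size = 64 (* MD5 block size in bytes *)
let hash_len = 16   (* MD5 output size in bytes *)

(* Byte-wise XOR of two strings of equal length. *)
let xor_strings a b =
  String.init (String.length a) (fun i ->
      Char.chr (Char.code a.[i] lxor Char.code b.[i]))

(* HMAC as defined in RFC 2104, instantiated with MD5. *)
let hmac_md5 ~key msg =
  let key = if String.length key > block_size then Digest.string key else key in
  let key = key ^ String.make (block_size - String.length key) '\000' in
  let ipad = xor_strings key (String.make block_size '\x36') in
  let opad = xor_strings key (String.make block_size '\x5c') in
  Digest.string (opad ^ Digest.string (ipad ^ msg))

(* Big-endian 32-bit encoding of the block index. *)
let int32_be i =
  String.init 4 (fun k -> Char.chr ((i lsr (8 * (3 - k))) land 0xff))

let pbkdf2 ~password ~salt ~iterations ~dk_len =
  if iterations < 1 || dk_len < 1 then invalid_arg "pbkdf2";
  (* T_i = U_1 xor U_2 xor ... xor U_c, where U_1 = PRF(P, S || INT(i))
     and U_j = PRF(P, U_{j-1}). *)
  let block i =
    let u1 = hmac_md5 ~key:password (salt ^ int32_be i) in
    let rec loop u acc n =
      if n = 0 then acc
      else
        let u' = hmac_md5 ~key:password u in
        loop u' (xor_strings acc u') (n - 1)
    in
    loop u1 u1 (iterations - 1)
  in
  let blocks = (dk_len + hash_len - 1) / hash_len in
  let buf = Buffer.create (blocks * hash_len) in
  for i = 1 to blocks do
    Buffer.add_string buf (block i)
  done;
  (* Truncate the concatenated blocks to the requested key length. *)
  Buffer.sub buf 0 dk_len
```

A call such as `pbkdf2 ~password:"hunter2" ~salt ~iterations:100_000 ~dk_len:32` would yield a 32-byte key; the iteration count is what makes brute-forcing expensive, and the salt should be random and stored alongside the derived data.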

Mentor: Hannes Mehnert

Difficulty: ★★☆☆☆

Storage

Local synchronisation between Git repositories

ocaml-git is an implementation of Git in pure OCaml. It currently supports only a subset of the Git protocols. It would be nice to extend it to support synchronisation between local repositories.

Related issue: mirage/ocaml-git#27

Mentor: Thomas Gazagnaire

Difficulty: ★☆☆☆☆

Garbage collector for Irmin

git gc is an efficient command which compresses "live" objects in Git repositories and removes unused ones. It works by traversing the object graph in a single pass to find the live objects (the roots being the objects pointed to by the references) and then using an efficient sliding-window compression method to generate a pack file. It would be nice to (i) re-implement the same algorithm to be used on any kind of Irmin store (not only Git ones) and evaluate it against git gc; and (ii) explore alternative GC strategies with different trade-offs.
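The liveness part of such a collector can be sketched as a plain mark-and-sweep over an object graph. This is only an illustration under stated assumptions: object ids are strings, the `children` function and the toy `store` are hypothetical stand-ins for a real Irmin backend, and the compression step is not shown.

```ocaml
module Ids = Set.Make (String)

(* Mark phase: collect every object reachable from the roots.
   [children id] returns the ids that object [id] points to. *)
let mark ~children ~roots =
  let rec visit live = function
    | [] -> live
    | id :: rest when Ids.mem id live -> visit live rest
    | id :: rest -> visit (Ids.add id live) (children id @ rest)
  in
  visit Ids.empty roots

(* Sweep phase: the objects in the store that were not marked. *)
let sweep ~all ~live = List.filter (fun id -> not (Ids.mem id live)) all

(* A toy store (hypothetical data): object id -> ids it references. *)
let store =
  [ ("c2", [ "c1"; "t2" ]); ("c1", [ "t1" ]); ("t1", []); ("t2", []);
    ("old", [ "t1" ]) ]

let children id = try List.assoc id store with Not_found -> []

(* Only "c2" is pointed to by a reference, so "old" is unreachable. *)
let live = mark ~children ~roots:[ "c2" ]
let garbage = sweep ~all:(List.map fst store) ~live
```

Here `garbage` contains only `"old"`: nothing reachable from the root refers to it, even though it refers to a live object itself.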

Related issue: mirage/irmin#96

Mentor: Thomas Gazagnaire

Difficulty: ★★★★☆

Network

Qubes FirewallVM

QubesOS is a desktop operating system that runs applications in different Linux VMs for added security. It also uses a separate Linux VM to implement the firewall. This VM is large and slow and runs a lot of C code. It would be interesting to replace this with a small, simple Mirage unikernel.

Initial Qubes support can be found in mirage-qubes.

Mentor: Thomas Leonard

Difficulty: ★★★☆☆

See discussion Unikernels and Qubes on the qubes-users list.

HTTP Performance and Profiling Harness

Cohttp is a very flexible implementation of the HTTP protocol, but we lack confidence in some of the edges of the implementation when it comes to clients and servers issuing odd requests. It would be useful to use some existing projects to issue a sequence of bad requests or responses and ensure that our Cohttp logic does something sensible.

Related issue: mirage/ocaml-cohttp#206

Mentor: Anil Madhavapeddy

Difficulty: ★★☆☆☆

Syslogd Unikernel

Syslog is a standard protocol used to convey event notification messages. It would be nice to have a unikernel that could act as a syslogd receiver. The unikernel would listen appropriately (initially in cleartext though SSL/TLS support is also desirable), receive and verify syslog messages, and then write them into an Irmin store.
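As a small illustration of the "receive and verify" step, the sketch below validates and decodes the leading priority field that both RFC 3164 and RFC 5424 messages start with. The function name `parse_pri` is hypothetical and is not taken from the syslog-message library; the rest of the message is returned unparsed.

```ocaml
(* Split the leading "<PRI>" of a syslog message into
   (facility, severity, rest of message).
   PRI = facility * 8 + severity, and valid PRI values are 0..191.
   Leading zeros are not rejected in this sketch. *)
let parse_pri msg =
  let is_digit c = c >= '0' && c <= '9' in
  let len = String.length msg in
  if len < 3 || msg.[0] <> '<' then None
  else
    match String.index_opt msg '>' with
    | Some close when close >= 2 && close <= 4 ->
      (* One to three characters between '<' and '>'. *)
      let digits = String.sub msg 1 (close - 1) in
      let all_digits = ref true in
      String.iter (fun c -> if not (is_digit c) then all_digits := false) digits;
      if not !all_digits then None
      else
        let pri = int_of_string digits in
        if pri > 191 then None
        else
          Some (pri / 8, pri mod 8, String.sub msg (close + 1) (len - close - 1))
    | _ -> None
```

For instance, a message starting with `<34>` decodes to facility 4 and severity 2; anything that fails this check could be dropped or logged as malformed before it reaches the store.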

Basic implementation underway https://github.com/verbosemode/syslog-message and https://github.com/verbosemode/syslogd-mirage by Jochen Bartl (@verbosemode).

Mentor: Richard Mortier

Difficulty: ★★☆☆☆

XMPP Server

XMPP is the Extensible Messaging and Presence Protocol. A privacy-preserving, robust server implementation purely in OCaml should be developed. As a basis, the streaming XML parser xmlm and the Unicode library uutf should be used. Only selected parts of the enhancement proposals are needed (look at some initial requirements here). Relies on the SASL library mentioned below.

Mentor: Hannes Mehnert

Difficulty: ★★★☆☆

Network Time Protocol (NTP)

The Network Time Protocol (NTP) is used to synchronise clocks on the Internet. A pure OCaml implementation of the protocol itself, together with integration into MirageOS, would let unikernels avoid relying on dom0 for clock information. This includes extending the current Clock API to be able to set the current time (offset).
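The core arithmetic a client performs on each NTP exchange is small enough to sketch. The four timestamps follow the usual NTP convention; representing them as floating-point seconds and the function names are assumptions made for this example (real NTP uses fixed-point timestamps).

```ocaml
(* Timestamps of one client/server exchange, in seconds:
   t1 = client transmit, t2 = server receive,
   t3 = server transmit, t4 = client receive. *)

(* Estimated offset of the server clock relative to the client clock. *)
let clock_offset ~t1 ~t2 ~t3 ~t4 = ((t2 -. t1) +. (t3 -. t4)) /. 2.0

(* Round-trip network delay, excluding server processing time. *)
let round_trip_delay ~t1 ~t2 ~t3 ~t4 = (t4 -. t1) -. (t3 -. t2)
```

With t1 = 100.0, t2 = 105.5, t3 = 105.75 and t4 = 101.25, the offset is 5.0 seconds and the delay is 1.0 second. An offset like this is the value a settable Clock API would need to accept.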

Mentor: Hannes Mehnert

Status: Kia is working on it via Outreachy at mirage-ntp

Difficulty: ★★☆☆☆

Simple Network Management Protocol (SNMP)

To monitor unikernels, statistics should be gathered (byte throughput etc.). The Simple Network Management Protocol (SNMP) is an established protocol to provide such information, and additionally defines so-called traps, which can be triggered when the system is under heavy pressure or in an alert state. A first prototype should hook up to MRTG or to rrdtool to graph monitored data.

Mentor: Hannes Mehnert

Difficulty: ★★☆☆☆

SSH Server

SSH is a common protocol used for secure shell access to remote machines. It would be very useful to have an implementation for Mirage, enabling unikernels to provide simple, secure command-line access. An early implementation by @avsm exists at @avsm/ocaml-ssh but requires considerable updating: swapping out ounix for Lwt, removing the MPL extensions in favour of Cstruct, using modules in preference to objects, and a number of other items.

Mentor: Richard Mortier

Difficulty: ★★★★★

Foreign function interface

Java backend for ocaml-ctypes

Bindings to C code written using ctypes are independent of the OCaml value representation, which means that we could in principle reuse them with OCaml-Java. Ctypes already supports multiple backends, for dynamic call generation and static code generation, so the architecture is in place to add additional backends for OCaml-Java support.

This message outlines what needs to be done to extend ctypes with OCaml-Java support.

Mentor: Jeremy Yallop

Mentee: Kenneth Adam Miller

Status: Work-in-Progress

Difficulty: ★★★☆☆

Multi-process/vm support for ocaml-ctypes

OCaml code that interfaces with C typically involves combining two runtimes in a single process. This weakens OCaml's safety guarantees, since a rogue C pointer can corrupt the OCaml heap. The high-level programming model available with ctypes makes it possible to switch between linking strategies without changing binding code; one such strategy is to run all foreign (C) code in a separate process or separate VM, limiting the damage it can do.

We have a prototype of this multi-process linking strategy, but work is needed to turn it into a fully usable system which supports easy switching between IPC mechanisms, robust error handling, etc.

Mentor: Jeremy Yallop

Mentee: Cosmin Boaca

Status: Work-in-Progress

Difficulty: ★★★★☆

Metaprogramming

Macros for OCaml

There are currently two ways of generating OCaml code from within OCaml programs: camlp4 (and its successor, ppx), which produces untyped syntax, and MetaOCaml, which produces typed code.

We have a design for an OCaml extension which combines advantages of the two approaches. The system will allow users to write typed MetaOCaml-style code generators that both interact cleanly with language abstractions like modules and run entirely during compilation. There are various applications within Mirage and more widely, including generic programming, HTML templates, foreign function interface generation and embedded DSLs.

There's an abstract with further details about the design, which is to be presented at OCaml 2015.

Mentor: Jeremy Yallop

Difficulty: ★★★★☆

Testing

Create a tiny VM for easy load testing

The goal of this project is to create a unikernel that can be configured to generate a specific I/O pattern, and to create configurations that mimic the boot sequence of Linux and Windows guests. The resulting unikernel will then enable cheap system load testing.

The first task is to generate an I/O trace from a VM. For this we could use 'xen-disk', a userspace Mirage application which acts as a block backend for Xen guests (see http://openmirage.org/wiki/xen-synthesize-virtual-disk). Following the wiki instructions, we could modify a 'file' backend to log the request timestamps, offsets and buffer lengths.

The second task is to create a simple kernel based on one of the MirageOS examples (see http://github.com/mirage/mirage-skeleton). The 'block' example shows how reads and writes are done. The previously generated log could be statically compiled into the kernel and executed to generate load.

Outcomes would be (1) a repository containing a unikernel (see http://github.com/mirage/mirage-skeleton) and (2) at least two I/O traces, one for Windows boot and one for Linux boot (any version).
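One possible shape for the logged trace is a line-oriented text format with one request per line. The record type and the format below are assumptions made for illustration (including the read/write flag, which the description above does not list explicitly); nothing here is taken from xen-disk.

```ocaml
(* Hypothetical trace entry: one logged block request. *)
type op = Read | Write

type entry = {
  time_ns : int64; (* time since the start of the trace, in nanoseconds *)
  op : op;
  offset : int64;  (* byte offset on the virtual disk *)
  length : int;    (* buffer length in bytes *)
}

(* Serialise an entry as "<time> <R|W> <offset> <length>". *)
let to_line e =
  Printf.sprintf "%Ld %c %Ld %d" e.time_ns
    (match e.op with Read -> 'R' | Write -> 'W')
    e.offset e.length

(* Parse a line produced by [to_line]; returns None on malformed input. *)
let of_line line =
  try
    Scanf.sscanf line "%Ld %c %Ld %d" (fun time_ns c offset length ->
        match c with
        | 'R' -> Some { time_ns; op = Read; offset; length }
        | 'W' -> Some { time_ns; op = Write; offset; length }
        | _ -> None)
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None
```

A replaying unikernel could then walk a list of such entries, wait until each `time_ns`, and issue the corresponding read or write against its block device.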

Mentor: David Scott

Difficulty: ★★★☆☆

Web stack testing

MirageOS has an emerging web toolstack that's broken up as a series of libraries -- for example, Cohttp, Uri, Cow, Ipaddr, RSS and Cowabloga. This project will get you familiar with them by building a protocol testing framework that can generate traffic using off-the-shelf tools such as httperf, and evaluate the results vs applications such as Apache or Nginx. Outcomes would be (1) a test harness for HTTP and (2) some results of the evaluation using the test harness.

Mentor: Anil Madhavapeddy

Difficulty: ★★★☆☆

Fuzz testing Xen with Mirage

We would like to use the Mirage/Xen libraries to fuzz test all levels of a typical cloud toolstack. Mirage has low-level bindings for Xen hypercalls, mid-level bindings for domain management, and high-level bindings to XCP for cluster management. This project would build a QuickCheck-style fuzzing mechanism that would perform millions of random operations against a real cluster, and identify bugs with useful backtraces. The first task would be to become familiar with a specification-based testing tool like Kaputt (see http://kaputt.x9c.fr/). The second task would be to choose an interface for testing; perhaps one of the hypercall ones. Outcomes would be (1) a repo containing a fuzz testing tool and (2) some unexpected behaviour with a backtrace (NB it's not required that we find a critical bug, we just need to show the approach works).
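The QuickCheck-style loop at the heart of such a tool can be sketched with the standard library's `Random` module. Everything below is hypothetical: the toy counter stands in for a real Xen interface, and `check` is a hand-rolled driver rather than the API of Kaputt or any other testing library.

```ocaml
(* Run [count] random trials with a fixed seed and return the first
   generated input that falsifies [prop], if any. *)
let check ?(seed = 42) ~count ~gen ~prop () =
  let st = Random.State.make [| seed |] in
  let rec go n =
    if n = 0 then None
    else
      let input = gen st in
      if prop input then go (n - 1) else Some input
  in
  go count

(* Hypothetical operations against a toy counter. *)
type op = Incr | Decr | Reset

(* Generate a random sequence of up to 19 operations. *)
let gen_ops st =
  List.init (Random.State.int st 20) (fun _ ->
      match Random.State.int st 3 with 0 -> Incr | 1 -> Decr | _ -> Reset)

(* System under test: a counter that must never drop below zero. *)
let run ops =
  List.fold_left
    (fun n -> function Incr -> n + 1 | Decr -> max 0 (n - 1) | Reset -> 0)
    0 ops

(* The invariant holds for this implementation, so no counterexample. *)
let result = check ~count:1000 ~gen:gen_ops ~prop:(fun ops -> run ops >= 0) ()
```

Because the seed is fixed, a failing run is reproducible, and the returned operation sequence plays the role of the backtrace mentioned above. A real tool would also shrink counterexamples to a minimal failing sequence.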

See also: http://wiki.xenproject.org/wiki/GSoC_2013#fuzz-testing-mirage

Mentor: Anil Madhavapeddy

Difficulty: ★★★☆☆

Authentication

SASL library

SASL is widely used for authenticating users, and we need it for MirageOS. Various mechanisms are available, ranging from MD5 digests and PLAIN to CRAM/SCRAM and EXTERNAL. Both the client side and the server side need to be implemented; some preliminary client code can be adapted from the XMPP library. For the server side it is crucial not to store passwords in clear text or hashed without salt. TLS client certificate integration (the SASL EXTERNAL mechanism) would be great to have as well!

Mentor: Hannes Mehnert

Difficulty: ★★★☆☆

Documentation and Outreach

Screencasts

As we produce more libraries we also try to produce material around them to ease the process of trying things out and getting more involved. Currently we do this through blog posts, examples on mirage-skeleton and links to implementations. It would be really helpful to add screencasts to this list of resources and we've made early steps already. This is a slightly unorthodox project as it's not about code but it would have a substantial and positive impact on the project. It's likely to be an ongoing project as screencasts about anything MirageOS-related are fair game! If you need some ideas, then things that would be useful are: installing the dev environment on various architectures, walk-throughs of the existing tutorials and demos, and examples of deployment steps (and many more). I'm offering support/guidance to anyone who'd like to have a go at this.

Mentor: Amir Chaudhry

Difficulty: ★☆☆☆☆


Completed projects!

Below is a list of projects that we've marked as 'completed'. However, as these are driven by real and ongoing needs there is likely to be scope to continue the work (and we simply haven't gotten around to defining a follow-on project). If you're interested in working on the continuation of any of these, please send a note to [email protected].

OCaml Implementation of libmacaroons

libmacaroons is an implementation of macaroons, a cryptographic bearer token construct (like cookies) that can be attenuated by third parties. Macaroons provide a decentralized authorization framework for access control which excels at delegation. The system is based on the libsodium library, which wraps djb's NaCl.

Macaroons was initially implemented in C and projects are underway to implement it in Java, JavaScript, Go, and Python. We would like an OCaml implementation of macaroons based on ocaml-sodium to use in Mirage.

Mentor: David Sheets

Difficulty: ★★★★☆

Pull request for macaroons 0.1.0

Irmin inside the browser

Irmin is a library for creating Git-like stores. It is written in pure OCaml and it should be possible to compile it to JavaScript to run in a browser (modulo implementing in JavaScript the few missing external symbols). But someone has to try it and fix the inevitable glitches that will happen. The ultimate goal is to have a version-controlled local storage, with asynchronous synchronisation between the browser and the server via Git over websockets.

Related issue: mirage/irmin#96

Mentor: Thomas Gazagnaire

Difficulty: ★★☆☆☆

See Thomas Leonard's email to the list

Encryption layer for Irmin

Irmin is a library for creating Git-like stores. It provides a nice abstraction on top of various lower-level backends (such as the Git format) and it is (relatively) easy to add new ones. It would be nice to design a new backend to support encryption.

Related issue: mirage/irmin#96, work done

Mentor: Thomas Gazagnaire

Difficulty: ★★★☆☆

Semantics of mergeable data-structures

See online

Mentor: Thomas Gazagnaire

Difficulty: ★★★★☆

Fix warnings in Xen C code

mirage-platform contains C code that produces various warnings when compiled (make xen-build). An easy but useful way to increase our confidence in the code would be to go through these and fix them.

Done by Len Maxwell; see https://github.com/mirage/mirage-platform/pull/141

Mentor: Thomas Leonard

Difficulty: ★☆☆☆☆

DHCP Server

DHCP is a common protocol for automatically discovering and managing network settings. Mirage already includes a minimal DHCP client in the mirage-tcpip repository (for configuring network settings on unikernels on networks that have a working DHCP server), but currently there is no implementation which allows Mirage to serve and manage DHCP leases for other hosts on a network. Even a minimal IPv4 implementation would be helpful for demonstration purposes.

Done by Christiano Haesbaert, https://github.com/haesbaert/charrua-core and related; and by Alistair Fisher, https://github.com/alistairfisher/irmin-dhcp.

Mentor: Richard Mortier

Difficulty: ★★☆☆☆

Deflate (zlib) in pure OCaml

Multiple pure-OCaml implementations of the zlib inflate algorithm already exist: for instance, in extlib or here. A pure implementation in Haskell of both inflate and deflate also exists and is available here. What is missing is a non-blocking, streaming implementation of both inflate and deflate in pure OCaml, in a style similar to jsonm and ocaml-imap. This is an independent project which will be useful on its own; it will also make it possible to use lzip in the browser with js_of_ocaml.

Mentor: Thomas Gazagnaire

Status: https://github.com/oklm-wsh/Decompress

Difficulty: ★★☆☆☆

SSL Command-line utilities

Basically all the utilities known from openssl (such as s_client, s_server, asn1parse, dgst, enc, and verify) should be implemented using the mirleft libraries (tls, asn.1, nocrypto, x.509) as standalone Unix executables (using cmdliner). Plus points for drop-in replacements (full argument compatibility where applicable). The same approach applies to other, non-security libraries (e.g. Syndic). Please contact the mailing list if you'd like to work on any of these.

Mentor: Hannes Mehnert (and others)

Work which has been done: rand certify tlsclient tlstunnel

Difficulty: ★☆☆☆☆
