Compiler Engineer Job

andychu edited this page Mar 31, 2022 · 84 revisions

Summary

The Oil project needs a compiler engineer with experience in C++ and garbage collection to help "finish" the project! We're funding it through a mix of grants and donations, and I encourage discussions about compensation. (More information is expected in April 2022.)

Overview

  • What is it?
    • Oil is a new Unix shell. It's our upgrade path from bash to a better shell and runtime! It's also for Python and JavaScript users who avoid shell.
  • What do we need done?
    • Write a 4K-8K line compiler in Python, and a 3K-10K line garbage-collected runtime in C++. (There is a lot of working code, some of which may need to be rewritten. We can talk about it.)
    • There is nothing fancy here -- it's very much a job in need of solid engineering!
  • Why are you doing it this way? What progress have you made?
  • How do I apply?
    • For now, send mail to andy@oilshell.org introducing yourself and describing your interest and experience. We can chat on https://oilshell.zulipchat.com/ and then have a video call.
    • I may want to do some kind of "paid interview" which involves making a failing test pass in Oil. Details to come on the blog.
  • How long does it last?
    • At least 3 months. It depends on funding, but I could easily imagine 12-24 months of work.

Code Overview

I made an HTML page that lists the code you'll be working with and working near: https://www.oilshell.org/release/latest/pub/metrics.wwz/line-counts/for-translation.html.

Note the line counts are quite small. This is not a 100K line project; it's more like 10K lines. (The big components are inputs and outputs to the compiler, not code we need to write.)

Skills Sought

In order of importance:

  1. Hard-won C++ experience and knowledge
    • Generating correct C++ code with a translator (i.e. C++ that works with all compilers)
    • Debugging it, analyzing its performance, and optimizing it
    • Comfortable using standard tools like gdb / CLion, ASAN, etc.
  2. Understanding of Garbage Collection
    • We have a working garbage collector, but I found this to be one of the most difficult parts of the project!
  3. Test-driven and terminal-based workflow (on some kind of Unix)
    • The job is very metrics-driven; the idea is to "make more tests pass". I've found that this strategy enables a lot of creativity and productivity!
  4. Type systems, and the relationship between types and garbage collection.
    • It's likely that we want to write our own type checker rather than relying on MyPy.
    • If you understand this Mozilla blog post, that's a good sign: Clawing Our Way Back to Precision (2013)
  5. Python
    • Most of the code is written in Python. However, I think this can be learned on the job, whereas the C++ parts can't.

General attributes desired:

  1. You should consider yourself a "finisher". You should be able to prioritize work and not get lost in micro-optimization (although there are plenty of opportunities for such skills on the project). This is not a research project; the goal is to make a production quality shell.
  2. You should have good communication skills, and be able to explain your work. (We encourage applicants in any country; however, English is used for all docs and communication.)
    • Bonus: if you can write nice blog posts. I frequently do this, e.g. with posts tagged #project-updates, and I find it helps me organize work and attract new contributors.
  3. Generally speaking, you should be excited about the high level goals of the Oil project. The blog should not be boring to you :-)

Good Signs ...

  • If you think our C++ is ugly! That means you have ideas on how to make it better. What exists is a proof of concept, designed to show the strategy will work and can perform well. There are many improvements that can be made. If you are convinced a complete rewrite is necessary, then please make a case that it's feasible, justified by a survey of the code.
  • If you enjoy debugging C++ code! And then writing tests to make sure the bug never comes back.
  • If you like using ASAN, profilers, and other such tools (e.g. uftrace). Maybe you have a nice debugger configuration.
    • (note: Oil has a GDB pretty printer for ASDL data structures)
  • If you can read the existing code in oil-native! If not, the job probably isn't a good fit.
  • If you understand how Rust is influenced by C++ (positively and negatively) and ML, that's a good sign. In a similar way, Oil is written with algebraic data types at the core, but we also want it to be efficient.
  • Understanding the Mozilla blog post above -- or better, pointing to even more relevant references!
    • This post is relevant since we also have a precise collector. I didn't find that many documents describing such issues on real-world, deployed language projects. Our GC is also meant to be 100% portable C++.
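
To make "precise" concrete, here is a toy sketch in Python. It is not Oil's collector (which is C++), and every name in it (`Obj`, `pointer_mask`, `mark`, `sweep`) is hypothetical. It assumes each object carries a bitmask, derived from its static type, saying which slots hold references. The marker follows only those slots, so it never has to guess whether an integer "looks like" a pointer, which is what a conservative collector does.

```python
from typing import List


class Obj:
    """Toy heap object (hypothetical layout, for illustration only)."""

    def __init__(self, slots: List[object], pointer_mask: int) -> None:
        self.slots = slots                # field values: ints or other Obj
        self.pointer_mask = pointer_mask  # bit i set => slots[i] is a reference
        self.marked = False


def mark(roots: List[Obj]) -> None:
    """Mark everything reachable from the explicitly registered roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue  # already visited; this also handles cycles
        obj.marked = True
        for i, slot in enumerate(obj.slots):
            # Precise: trust the type-derived mask, never inspect the value.
            if obj.pointer_mask & (1 << i) and slot is not None:
                stack.append(slot)


def sweep(heap: List[Obj]) -> List[Obj]:
    """Return the surviving objects and clear their mark bits."""
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False
    return live


leaf = Obj(slots=[42], pointer_mask=0b0)
node = Obj(slots=[7, leaf], pointer_mask=0b10)  # only slot 1 is a reference
garbage = Obj(slots=[99], pointer_mask=0b0)     # unreachable from the roots

heap = [leaf, node, garbage]
mark([node])
live = sweep(heap)
```

After this runs, `live` holds `leaf` and `node` but not `garbage`. The part that relates types to collection is `pointer_mask`: something (a translator or type checker) has to know each type's layout in order to produce it.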

Compiler Engineer Notes
