Transformers that read

Warning: highly WIP

This work investigates how we can extend transformer language models to operate with image inputs.