With this challenge we would like to see a little bit more about how you work and the way you make decisions. Specifically, the challenge will help us see:
- Your current technical knowledge
- Your thought process
- The way you work and organize
The challenge will also give you a glimpse of the kind of technical work we do on a daily basis and the kind of datasets we work with.
Please, feel free to surprise us, and showcase any skills that you think are important!
Keep in mind that there is no right or wrong answer. If you feel like your process isn't perfect, don't worry. This is just meant to be an exercise to help us gauge where you are in terms of your current capacity and be a talking point during the next interview.
The challenge consists of creating a data pipeline that takes a raster dataset, summarizes it by administrative regions and stores the results in a relational database. Specifically, we want to summarize the total ecosystem carbon of the northern lakes region in the USA using data from the National Forest Carbon Monitoring System.
The result must be a database with the total carbon values for each county in the states of Michigan, Wisconsin and Minnesota. To achieve this, we want you to create a simple Python pipeline that loads the rasters, computes the zonal statistics and loads the values into the database.
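To make the three stages concrete (load, zonal stats, load to db), here is a minimal sketch using only the standard library and toy data. It is not a reference solution: the grids, the county IDs, the `county_carbon` table and the helper names `zonal_sum` and `load_totals` are all hypothetical, and a real pipeline would read the raster and county boundaries with geospatial tooling of your choice.

```python
import sqlite3
from collections import defaultdict

# Toy stand-ins for the real inputs (assumption): a small carbon grid and a
# same-shaped grid of county IDs, as if the county polygons had been rasterized.
carbon = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, None],  # None marks a nodata pixel
]
zones = [
    ["26001", "26001", "55003"],
    ["26001", "55003", "55003"],
]


def zonal_sum(values, zone_ids):
    """Sum the valid pixel values that fall inside each zone."""
    totals = defaultdict(float)
    for value_row, zone_row in zip(values, zone_ids):
        for value, zone in zip(value_row, zone_row):
            if value is not None:  # skip nodata pixels
                totals[zone] += value
    return dict(totals)


def load_totals(conn, totals):
    """Write one row per zone into a hypothetical county_carbon table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS county_carbon ("
        "county_id TEXT PRIMARY KEY, total_carbon REAL NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO county_carbon VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


totals = zonal_sum(carbon, zones)
conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
load_totals(conn, totals)
```

The in-memory SQLite database is only a convenience for the sketch; any relational database works for the challenge.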
- Total ecosystem carbon raster from the National Forest Carbon Monitoring System.
- Use any administrative boundary source for USA counties you think is appropriate.
As a minimum, we expect you to deliver these 4 points:
- Deliver a simple and reproducible Python data pipeline.
- The pipeline must be easily reproducible end to end. This means that all the setup instructions or programs must be part of the deliverable.
- The results must be accurate and correct (watch out for the units; there are some clues in the metadata documents).
- Include instructions on how to query the results, so that, after executing the pipeline, we are able to perform such queries.
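As an illustration of the units and querying points above, the sketch below converts a per-pixel carbon density into a per-pixel total and then runs a sample query. Everything specific in it is an assumption to check against the metadata and your own schema: the unit (Mg of carbon per hectare), the 30 m pixel size, the `county_carbon` table, its columns and the sample values are made up for the example.

```python
import sqlite3


def density_to_total(density_mg_per_ha, pixel_size_m):
    """Convert a carbon density (Mg/ha) into the total carbon (Mg) of one square pixel."""
    pixel_area_ha = (pixel_size_m * pixel_size_m) / 10_000.0  # 1 ha = 10,000 m^2
    return density_mg_per_ha * pixel_area_ha


# Assumed example: a 30 m pixel covers 0.09 ha, so 100 Mg/ha is about 9 Mg.
pixel_total = density_to_total(100.0, 30.0)

# Hypothetical results table with invented values, only to show a query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE county_carbon (state TEXT, county TEXT, total_carbon_mg REAL)"
)
conn.executemany(
    "INSERT INTO county_carbon VALUES (?, ?, ?)",
    [
        ("Michigan", "Alcona", 120.0),
        ("Michigan", "Alger", 80.0),
        ("Wisconsin", "Ashland", 50.0),
    ],
)

# Total carbon per state, aggregated from the county rows.
by_state = conn.execute(
    "SELECT state, SUM(total_carbon_mg) FROM county_carbon "
    "GROUP BY state ORDER BY state"
).fetchall()
```

Summing densities directly, without the pixel-area factor, would overstate or understate the totals, which is why the conversion is worth stating explicitly in the deliverable.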
If you feel confident and want to go the extra mile, show us more skills, and surprise us, you can add some or all of the points below:
- Share an initial exploration of the input datasets with some visualization in a notebook or similar medium.
- A map with the results.
- Do you think something is missing, or that you could add useful features? Go for it!
Apart from Python, use any tools you are comfortable with.
- Be pragmatic and mindful of the trade-off between feature-completeness and complexity/performance. Completeness is better than show-offs. Keep it simple.
- About the use of AI assistance: as with any other tool, we do allow you to use it. Nonetheless, we expect the delivered project to be entirely yours, and we expect you to understand and be able to defend every aspect of your decisions. We want to know how you approached the problem, not how an LLM would, so keep such tools contained and under your control.
Deliver your solution as a link to a reproducible and self-contained repository on your preferred git platform (GitHub, GitLab, Codeberg...).
Based on our experience, we believe you shouldn't spend more than 6 hours. But ultimately, how much time you dedicate to the challenge is up to you. We will also be talking about allocated time during the interview.
Email us any questions and we will answer as soon as possible.
- In the upcoming interview we’ll focus on your coding challenge submission. We will expect you to explain your code to an audience that will include members of the Science team but also one or two people from other functional areas (Design, Front-End, Back-End, Project Management, etc.). We will ask you any clarifying questions we might have.
- This will be an opportunity for you to provide some more context about the challenge and the assumptions you made, and to add anything else you might want. The technical solution is not the only thing we value; we also value your approach and explanations.
- Finally, we will also allocate some time for you to ask questions about anything you would like to know more about (e.g. the role, how we work at Vizzuality, our culture, benefits).
- The interview will last at most 120 minutes.