Distributed Cron system with Prometheus monitoring (Main) #55

Open
4 tasks
berkeli opened this issue Dec 5, 2022 · 0 comments

berkeli commented Dec 5, 2022

Extras:

  • Kafka Chaos

    Try running multiple Kafka brokers and Zookeeper servers with our producers and consumers (using another of the conduktor/kafka-stack-docker-compose configurations). Experiment with taking Kafka and Zookeeper containers down.

    How many containers can be down before our system stops working?

    What happens to the Kafka system logs and the metrics that our binaries export? Did our alerts fire? If not, consider how they could be improved - remember, the point of them is to tell us when something's wrong!

  • Dealing with long-running jobs and load (challenging)

    What does our system do if someone submits a very long-running job? Try testing this with the sleep command.

    If this is an issue for the stable operation of our system, or for running jobs in a timely fashion, what can we do about this?

    If our system had problems, did our alerts fire?

    How can we prevent our consumers getting overloaded if compute-intensive jobs are submitted?
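
    One possible direction, sketched under assumptions: give every job a deadline and cap how many jobs a consumer runs at once. The names `runJob` and `slots`, the limit of 2, and the 200ms timeout are all hypothetical values chosen for illustration; they are not part of the existing system.

    ```go
    package main

    import (
    	"context"
    	"errors"
    	"fmt"
    	"os/exec"
    	"time"
    )

    // slots caps how many jobs run concurrently in this process.
    // The capacity of 2 is an arbitrary example value.
    var slots = make(chan struct{}, 2)

    // runJob runs one command and kills it if it outlives timeout.
    // It reports whether the job was cut off by the deadline.
    func runJob(parent context.Context, timeout time.Duration, name string, args ...string) (bool, error) {
    	// Wait for a free slot, giving up if the parent context is cancelled.
    	select {
    	case slots <- struct{}{}:
    	case <-parent.Done():
    		return false, parent.Err()
    	}
    	defer func() { <-slots }() // release the slot when the job ends

    	// The deadline starts once the job has a slot.
    	ctx, cancel := context.WithTimeout(parent, timeout)
    	defer cancel()

    	// CommandContext kills the process when ctx expires.
    	err := exec.CommandContext(ctx, name, args...).Run()
    	timedOut := errors.Is(ctx.Err(), context.DeadlineExceeded)
    	return timedOut, err
    }

    func main() {
    	timedOut, err := runJob(context.Background(), 200*time.Millisecond, "sleep", "5")
    	fmt.Println("timed out:", timedOut, "error:", err)
    }
    ```

    A timed-out job is a natural place to increment a metric that an alert can watch. Note that this only limits the number of concurrent jobs, not the CPU or memory each one uses.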

  • Security using Firecracker VMs (challenging)

    An earlier note mentioned that simply exec-ing submitted commands in this way has security issues.

    A better solution would be to use a [Firecracker VM](https://github.com/firecracker-microvm/firecracker/) to run the cron commands. Firecracker is an open-source virtualization technology that lets us start lightweight virtual machines very quickly and cheaply. It was developed at AWS to support services like AWS Lambda.

    Here are some demos and examples of projects built with Firecracker:

    https://stanislas.blog/2021/08/firecracker/
    https://jvns.ca/blog/2021/01/23/firecracker--start-a-vm-in-less-than-a-second/

    There is a [Firecracker SDK for Golang](https://github.com/firecracker-microvm/firecracker-go-sdk). If you have a significant amount of extra time available, updating the system to run commands in Firecracker VMs instead of exec-ing them directly would be a very good challenge.
@berkeli berkeli converted this from a draft issue Dec 5, 2022
@berkeli berkeli self-assigned this Dec 5, 2022
@berkeli berkeli added the Large day or more label Dec 5, 2022
@berkeli berkeli added this to the Sprint 4 milestone Dec 5, 2022
@berkeli berkeli changed the title Distributed Cron system with Prometheus monitoring Distributed Cron system with Prometheus monitoring (Main) Dec 5, 2022
@berkeli berkeli moved this from Todo to In Progress in Immersive Go Course Dec 8, 2022
@berkeli berkeli moved this from In Progress to In Review in Immersive Go Course Dec 9, 2022
@berkeli berkeli moved this from In Review to Done in Immersive Go Course Dec 20, 2022