long running processes

Hi,

Currently we have a specific topic/consumer that can run for more than 1 or 2 days (we have a lot of data coming from our IoT devices), and until now we've been handling this with Sidekiq. Since we already have Kafka in place, it makes sense to move our Sidekiq background jobs to Apache Kafka as well, because that would reduce our technology stack. But is it a good approach? We think so, but let me know.

From what I've found, there are a lot of "issues" with using Kafka for long-running processes, as described in this article:

https://medium.com/codex/dealing-with-long-running-jobs-using-apache-kafka-192f053e1691

I need some pointers on how to do this with racecar.

In the article, the second possible solution is to increase the timeout, which in racecar is the setting:

max_poll_interval 

Does this setting apply to all consumers, or can it be set for one specific consumer? I can't find a way to set it for just one!
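
For context, a minimal sketch of where this setting usually lives, assuming Racecar 2.x and a `config/racecar.rb` file; the six-hour value is purely illustrative. Configured this way it applies to every consumer started with this config. As far as I know, the underlying librdkafka setting `max.poll.interval.ms` is capped at one day, so raising the timeout alone may not cover a job that runs for two days.

```ruby
# config/racecar.rb (assumed location; adjust to your setup)
Racecar.configure do |config|
  # Maximum time allowed between polls, in seconds.
  # 6 hours here is an illustrative value, not a recommendation.
  config.max_poll_interval = 6 * 60 * 60
end
```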

The third solution is to pause and resume the consumer, or to call poll in a different thread to keep the consumer alive. I've tried both approaches, but both of them seem to fail somehow:

```
def process(message)
  @consumer.pause(message.topic, message.partition, message.offset)
  # execute something...
  @consumer.resume(message.topic, message.partition)
end
```

```
def process(message)
  running = true
  calling_home = Thread.new do
    while running
      @consumer.poll
      sleep 0.1
    end
  end
  # execute something...
  running = false
  calling_home.join
end
```
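
One thing the threaded attempt above does not handle is an exception in the work itself: `running = false` is never reached, so the polling thread would keep going. Below is a self-contained sketch of the same pattern with an `ensure` clause. `FakeConsumer` and `with_keepalive` are hypothetical names used only for illustration; they are not part of racecar, and the sketch says nothing about whether polling from a second thread is safe with the real client.

```ruby
# Hypothetical stand-in for a Kafka consumer; it only counts poll calls.
class FakeConsumer
  attr_reader :polls

  def initialize
    @polls = 0
    @lock = Mutex.new
  end

  def poll
    @lock.synchronize { @polls += 1 }
  end
end

# Runs the given block while a background thread keeps calling poll.
# The ensure clause stops the thread even if the block raises.
def with_keepalive(consumer, interval: 0.01)
  stop = Queue.new
  keepalive = Thread.new do
    # Keep polling until a stop token is pushed onto the queue.
    while stop.empty?
      consumer.poll
      sleep interval
    end
  end
  yield
ensure
  stop << :stop
  keepalive&.join
end

consumer = FakeConsumer.new
result = with_keepalive(consumer) do
  sleep 0.05 # stands in for the long-running work
  :done
end
```

The block's return value is passed through unchanged, so the wrapper can sit around existing processing code without altering its result.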

long running processes #278