RubyJob
RubyJob is a framework for running jobs.
The current version behaves much like Sucker Punch, in that it only supports an In-Memory Job Store implemented through a fast Fibonacci Heap.
The initial version runs 200% faster than Sucker Punch, capable of processing 1,000,000 simple jobs in 28 seconds vs. Sucker Punch’s 59 seconds (measured on a MacBook Pro 2.3GHz with 16GB of RAM).
Additional features are in the works, including:
- Support for external configuration of multiple queues & queue priorities
- Persistent Job Stores for:
- Redis
- Cassandra
- Batches & Job nesting
- Make retries more thread efficient, by avoiding
sleep
calls
Installation
Add this line to your application’s Gemfile:
gem 'ruby_job'
And then execute:
$ bundle
Or install it yourself as:
$ gem install ruby_job
Usage
A simple example
Define your worker class
class MyWorker
include RubyJob::Worker
def perform
#job code goes here
end
end
Setup your JobStore
MyWorker.jobstore = RubyJob::InMemoryJobStore.new
MyWorker.perform_async
Run your server
server = RubyJob::ThreadedServer.new(num_threads: 10, jobstore: MyWorker.jobstore)
server_thread = server.start
server_thread.join
Job Stores
A Job Store is an abstraction which allows us to keep track of the various jobs your application wants to run.
The abstract class JobStore
defines the following methods (each of which raises NotImplementedError
) that
must be defined in the subclass:
- enqueue(job)
- dequeue(job)
- fetch
- size
- pause_at(time)
- next_uuid
#enqueue(job)
The enqueue
method is responsible for adding the specified job to the Job Store. It is an error to
attempt to enqueue a job that is already enqeueued.
#dequeue(job)
The dequeue
method is responsible for removing the specified job from the Job Store. It is an error to
attempt to dequeue a job that has never been enqueued, or has been dequeued.
#fetch
The fetch
method is responsible for fetching the next job that needs to run from the Job Store.
The “next job to run” is defined as being the job with the earliest start_at
time that:
- is less than or equal to
Time.now
- is less than or equal to the time specified by the most recent invocation of
#pause_at
When no job matches these conditions, fetch
will wait until such conditions are met, by sleeping
the amount of time specified by the :wait_delay
option and retrying, if the :wait
option is set, or
will return nil
otherwise.
size
The size
method returns the number of jobs presently being tracked in the Job Store.
pause_at(time)
The pause_at
method effects the behaviour of #fetch
, as defined above. Essentially, it causes
the Job Store to behave as if it’s empty when the time specified is reached. Passing nil
unpauses the Job Store.
next_uuid
The next_uuid
method must return a unique identifier that will be assigned to the next job to be
enqueued to the Job Store. The identifier must be unique across the timespan that the Job Store
guarantees job tracking. For example, in the InMemoryJobStore
implementation, the next_uuid
is simply an auto-incrementing integer stored in the JobStore’s instance itself. This is sufficient
because the InMemoryJobStore
only guarantees tracking of jobs during the lifespan of the currently
running process. In an implementation that were to guarantee, say, tracking across server restarts
for many weeks or months in a high-volume environment, the identifier would likely need to be closer
to a true universally unique identifier.
Note: Due to its dynamic and non-statically-typed nature, Ruby doesn’t provide true abstract classes,
but implementing the JobStore
class this way does help simplify and improve tests for classes that have
dependencies on JobStore
subclasses. In particular, by leveraging RSpec’s verify_partial_doubles
capabilities,
tests can mock a JobStore
instance and rely on RSpec to verify that only valid methods have been called.
Setting up the default JobStore
Jobs are enqueued to the default JobStore of the worker class:
MyWorker.jobstore = RubyJob::InMemoryJobStore.new # attach the JobStore to the MyWorker class
If the worker class doesn’t have a JobStore attached to it, jobs will be enqueued to Worker.jobstore
.
Worker.jobstore = RubyJob::InMemoryJobStore.new # jobs will be queued here, if MyWorker doesn't have `jobstore` set.
Enqueuing jobs
There are 2 ways you can enqueue your jobs:
Using #perform_* (recommended approach)
MyWorker.jobstore = RubyJob::InMemoryJobStore.new
MyWorker.perform_async # will enqueue on `MyWorker.jobstore`, or `Worker.jobstore` if the former isn't set.
Note: you must ensure either MyWorker.jobstore
or Worker.jobstore
is set to a valid JobStore.
Using Job#enqueue
MyWorker.jobstore = RubyJob::InMemoryJobStore.new
job = Job.new(worker_class_name: 'MyWorker', args: [], start_at: Time.now)
job.enqueue
Dequeuing jobs
In some situations, it’s important to remove a previously enqueued job from the queue, so that it does not run in the future. To do so:
job.dequeue
Job arguments
Your Job class’ #perform
method signature is:
def perform(*args)
end
When you invoke #perform_async
(or similar methods), the arguments passed in will get sent to #perform
.
For example, MyWorker.perform_async(1, 'hello world!', x: 7)
will end up calling perform(1, 'hello world!', x: 7)
when the job runs.
Note: Whether and how the arguments are serialized depends on the JobStore being used. RubyJob::InMemoryJobStore
, the only JobStore
currently shipped out of the box with this gem, has no need to serialize the arguments, given that everything runs in a single operating system
process. However, keep in mind that the Job
class, which is used to represent instances of jobs to run, defines methods #to_json
and
.json_create(hash)
which use JSON.dump
and JSON.parse
, respectively, to marshall the arguments. If you’re going to implement
your own JobStore, feel free to avail yourself of these methods.
Pro Tip: In order to ensure your job code is portable across different JobStore implementations (e.g. in case at some point you think you’ll need a persistent backing store such as Redis or Cassandra to keep track of your mission critical jobs), ensure the arguments you pass serialize and deserialize as you’d expect.
Schedule a Job for execution (asynchronously)
Note: Jobs are scheduled to nearest millisecond of the specified start time.
Immediately (ASAP)
MyWorker.perform_async # schedule to run asynchonously, asap
Delayed
MyWorker.perform_in(5.5) # schedule to run asynchonously, in 5.5 seconds
At a specific time
MyWorker.perform_at(a_particular_time) # schedule to run asynchonously, at the specified time
Executing a Job immediately (synchronously)
MyWorker.perform # run the job synchronously now
Threaded Server (the job processor)
A threaded server is provided to process the queued jobs. It is instantiated by specifying the number of workers (threads) to spawn, and the JobStore it will be processing.
server = RubyJob::ThreadedServer.new(num_threads: 10, jobstore: MyWorker.jobstore)
Server options
server.set(wait: true, wait_delay: 0.5)
wait
[boolean]: determines whether the server should wait or exit when there aren’t any processable jobs in the queue. Defaults totrue
.wait_delay
[float]: if the server is going to wait, the number of seconds to delay before looking for jobs again. Defaults to0.5
.
Note: The wait
/wait_delay
parameters apply independently to each worker thread.
Starting the server
Queued jobs will only run when a Server, attached to the JobStore the jobs have been enqueued to, has been started.
server_thread = server.start
server_thread.join # if needed, depending on your use case
Halting the server
A running server can be halted as follows:
server.halt_at(Time.now + 5)
server.halt # equivalent to halt_at(Time.now)
Halting
causes the server to stop processing jobs scheduled to start after the specified halt time. Once the halt time has been
reached, the server waits if the wait
option is true
, or exits otherwise.
Halting the server can be useful in production, when you want to temporarily pause job processing.
Resuming the server
A halted server can be resumed with:
server.resume
server.resume_until(Time.now + 5) # equivalent to: resume && halt_at(Time.now + 5)
With resume
, the server picks up jobs from where it left off and keeps processing them as if it never stopped. Note that a server
that’s been halted for a significant amount of time will pick up old jobs that may have been intended to start significantly in the past, so
ensure you take that into account in your job processing code if you care about this situation.
Retries
By default, jobs that raise errors will be not be retried by default. To have jobs retry, the worker class must define a retry?
method that
returns a tuple indicating whether the job should be retried, and how long the retry delay should be: [do_retry, retry_delay]
MAX_RETRIES = 5
INITIAL_RETRY_DELAY = 0.5
def retry?(attempt:, error:)
# determine whether a retry is required, based on the attempt number and error passed in
do_retry = error.is_a?(MyRetriableError) && (attempt < MAX_RETRIES)
[do_retry, INITIAL_RETRY_DELAY * 2**(attempt-1)] # exponential backoff
end
attempt
starts at 1
and error
is the exception that was raised by the last attempt.
Note: the current implementation uses sleep
to implement the retry delay. This isn’t ideal, as it prevents the thread
processing the job from servicing another job that’s ready to run. In the future, this will be changed such that the job
is put back onto the job queue to start at a later time. Feel free to put together a PR if you’re interested in seeing this
change sooner rather than later.
Note: the retry delay is the time between the end of the last attempt and the start of the new attempt
Blog Posts
Development
After checking out the repo, run bin/setup
to install dependencies. Then, run rake spec
to run the tests. You can also run bin/console
for an
interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install
. To release a new version, update the version number in version.rb
, and
then run bundle exec rake release
, which will create a git tag for the version, push git commits and tags, and push the .gem
file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/mimperatore/ruby_job. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
License
The gem is available as open source under the terms of the GNU Lesser General Public License Version 3 (LGPLv3).
Code of Conduct
Everyone interacting in the RubyJob project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.
Author
Marco Imperatore, CEO, i-Clique Inc.
- Twitter: @marcoimperatore
- LinkedIn: @marcoimperatore