Learn Ruby on Rails Book

Background job processing using Sidekiq

Tasks like sending an email or SMS, generating a PDF, or generating an Excel file can be time-consuming for the server. While the server is busy with such a task, it can't process any other request, which reduces its throughput. This would mean we need a lot more servers to process incoming requests. A better strategy is to accept the request and store the work to be processed in the background. This way the server can respond immediately when someone requests, say, a password reset. The user gets a notification that the "password email is on its way", and the server processes the queued background jobs one by one.
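The idea can be sketched in plain Ruby. This is only an illustration built on Ruby's Queue class and a thread, not how Sidekiq is implemented. The names JOBS, RESULTS, WORKER and handle_reset_password are made up for this example.

```ruby
# A toy in-process job queue. It only illustrates the idea of
# "respond now, do the slow work later"; Sidekiq itself uses Redis.
JOBS = Queue.new   # thread-safe FIFO queue from Ruby's core library
RESULTS = []       # records the work done in the background

# Hypothetical request handler: enqueue the slow work and respond at once.
def handle_reset_password(email)
  JOBS << email
  "Password email is on its way"
end

# Background worker: processes jobs one by one until the queue is closed.
WORKER = Thread.new do
  # pop blocks while the queue is empty and returns nil once it is closed
  while (email = JOBS.pop)
    RESULTS << "sent reset email to #{email}"
  end
end

puts handle_reset_password("oliver@example.com")

JOBS.close    # no more jobs will arrive, so the worker loop can finish
WORKER.join   # wait for the background work to complete
puts RESULTS
```

The handler returns before the email is "sent"; the worker thread does that part on its own schedule.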

Rails provides Active Job to declare background jobs and run them on a variety of queueing backends.

Choosing a backend adapter

Active Job has built-in support for multiple queueing backends. Some of the prominent ones are Sidekiq, Resque and Delayed Job. The queue adapters section of the Rails guides has detailed information about all the queueing adapters that Rails supports by default.

We'll use Sidekiq as the queueing adapter for this application. Sidekiq uses Redis to store all its operational data, so let's set up Redis on the development machine. If you're using macOS, you can install Redis with the following command:

brew install redis

Start the Redis server:

brew services start redis

Now Redis should be up and running.

Open Gemfile and add the following line:

gem "sidekiq"

Install the gem:

bundle install

Now open the config/application.rb file. You should find a block of code where all the configurations are set. Add the following line in that block:

config.active_job.queue_adapter = :sidekiq

Configuring Sidekiq

Now let's add an initializer which can be used to configure Sidekiq to interact with our Redis queue:

touch config/initializers/sidekiq.rb

Add the following lines to sidekiq.rb:

Sidekiq.configure_client do |config|
  config.redis = { url: ENV['REDIS_URL'], size: 4, network_timeout: 5 }
end

Sidekiq.configure_server do |config|
  config.redis = { url: ENV['REDIS_URL'], size: 4, network_timeout: 5 }
end

Here the client is anything that pushes jobs to Redis. In our case that is the Rails application served by Puma. The server is the Sidekiq process, which pulls jobs from Redis. This means that when deployed, each of our web dynos on Heroku will use at most size connections to push jobs to Redis, no matter how many threads it has.

The initializer is meant for more complicated config which requires Ruby, for instance the Redis connection info or custom middleware.

Now let's add a config/sidekiq.yml, which is meant to be a persistent config for all options we can pass to sidekiq:

touch "config/sidekiq.yml"

Add the following to that file:

development:
  :concurrency: 1
production:
  :concurrency: 1
:queues:
  - default

We can change concurrency based on our needs. default is the name of the queue.
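To see how the environment sections relate to the shared options, here is a rough sketch that parses a file of the same shape with Ruby's standard YAML library. It is not Sidekiq's own loader. The helper options_for_environment and the production value of 5 are assumptions made for illustration.

```ruby
require "yaml"

# Same shape as config/sidekiq.yml; the production value is changed to 5
# here only so that the two environments give visibly different results.
SIDEKIQ_YML = <<~CONFIG
  development:
    :concurrency: 1
  production:
    :concurrency: 5
  :queues:
    - default
CONFIG

# Hypothetical helper: shared options first, then the section that
# belongs to the given environment layered on top.
def options_for_environment(environment, yaml_text = SIDEKIQ_YML)
  raw = YAML.safe_load(yaml_text, permitted_classes: [Symbol])
  env_section = raw.fetch(environment, {})
  # In this sketch every top-level String key is an environment section.
  shared = raw.reject { |key, _value| key.is_a?(String) }
  shared.merge(env_section)
end

p options_for_environment("development")
p options_for_environment("production")
```

Keys written with a leading colon, such as :concurrency, are read as Ruby symbols, while the environment names are plain strings.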

Next we need the Redis URL. On most systems a local Redis server listens on 127.0.0.1:6379. We'll use redis://127.0.0.1:6379/12, where the trailing 12 selects database number 12.

If we need to confirm that the port is correctly set, run the command redis-cli info and check for the field called tcp_port.
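If the parts of the URL are unclear, Ruby's standard URI library can split it up. The helper name redis_url_parts is made up for this sketch.

```ruby
require "uri"

# Hypothetical helper that breaks a Redis URL into host, port and
# database number (the number after the final slash).
def redis_url_parts(url)
  uri = URI.parse(url)
  {
    host: uri.host,
    port: uri.port,
    db: uri.path.delete_prefix("/").to_i
  }
end

p redis_url_parts("redis://127.0.0.1:6379/12")
```

The port reported here should match the tcp_port field shown by redis-cli info.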

Open a new tab in terminal app since we want to run Sidekiq in addition to the Rails server that is already running. In the new tab of the terminal execute the following command:

REDIS_URL="redis://127.0.0.1:6379/12" bundle exec sidekiq -e development -C config/sidekiq.yml

The REDIS_URL environment variable needs to be passed explicitly in local development. In production this value is usually set through the deployment environment's configuration.

Another way to persist the REDIS_URL is to run export REDIS_URL="redis://127.0.0.1:6379/12" every time a new shell is created. To automate it we can add this line to our ~/.bashrc or ~/.zshrc.

In most systems the command above will make Sidekiq work with Redis without any problems.

The Redis URL can differ from system to system. If the command above doesn't work, run Sidekiq without setting REDIS_URL so that it falls back to its default Redis connection:

bundle exec sidekiq -e development -C config/sidekiq.yml

Creating a job

Run the following command on your terminal:

rails generate job task_logger
create  test/jobs/task_logger_job_test.rb
create  app/jobs/task_logger_job.rb

You'll notice that it creates two files: task_logger_job.rb inside the app/jobs directory and its corresponding test file task_logger_job_test.rb inside the test/jobs directory.

The job file should look like this:

class TaskLoggerJob < ApplicationJob
end

ApplicationJob is an abstract class where we define configuration shared by all the jobs. It inherits from ActiveJob::Base. In short, it's analogous to the ApplicationRecord class that we have for models.

Active Job internally invokes a method named perform. The perform method is responsible for executing the entire business logic of the job. Let's add a perform method to our TaskLoggerJob class:

class TaskLoggerJob < ApplicationJob
  def perform
    puts "TaskLoggerJob is performed"
  end
end

We've just defined a new job.

Executing the job

Open your Rails console using rails c. Now as you'd expect we can execute the perform method that we've defined inside TaskLoggerJob just like any other instance method of a class. So let's see if that works.

TaskLoggerJob.new.perform
> "TaskLoggerJob is performed"

Great, this works. But you might wonder how this job is different from any other Ruby class. The answer is that this class can enqueue the job to the backend queue. Now in your Rails console, run the following:

TaskLoggerJob.perform_later
> Enqueued TaskLoggerJob

Notice the output here. Instead of printing the message from the perform method of the job, it shows that the job is enqueued. Now go to the terminal tab where Sidekiq is running. You should see that the job has run there and that the message "TaskLoggerJob is performed" is printed.

Let's see what happened here. We called perform_later, a class method available on every subclass of ActiveJob::Base. Since we have already configured the application to work with Sidekiq, the details of the job are stored in Redis. Sidekiq picks up the job as soon as it is available and executes its perform method.
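The flow of "store a description of the job, then let another process run it" can be sketched without Rails or Redis. Everything below is a simplified stand-in: FAKE_REDIS_QUEUE, ToyJob, ToyLoggerJob and work_one_job are hypothetical names, and the payload format is not the one Active Job or Sidekiq actually uses.

```ruby
require "json"

# Stand-in for the Redis list that holds pending jobs.
FAKE_REDIS_QUEUE = []

class ToyJob
  # Loosely mirrors perform_later: instead of running the job, store a
  # JSON description of it (class name plus arguments) in the queue.
  def self.perform_later(*args)
    FAKE_REDIS_QUEUE.push(JSON.generate("job_class" => name, "arguments" => args))
    "Enqueued #{name}"
  end
end

class ToyLoggerJob < ToyJob
  def perform(message)
    "ToyLoggerJob is performed: #{message}"
  end
end

# Stand-in for the Sidekiq process: take the oldest payload, rebuild the
# job object from its class name and call perform with the stored arguments.
def work_one_job
  payload = JSON.parse(FAKE_REDIS_QUEUE.shift)
  Object.const_get(payload["job_class"]).new.perform(*payload["arguments"])
end

puts ToyLoggerJob.perform_later("hello")
puts work_one_job
```

The important point is that the enqueuing side and the executing side only share the stored payload, which is why they can live in different processes.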

We can also control when the job runs by using the set method:

# By providing `wait_until` option, we are asking
# to not perform the job before the end of the day.
TaskLoggerJob.set(wait_until: Date.today.end_of_day).perform_later

# By providing `wait` option we are asking
# to perform after 1 minute.
TaskLoggerJob.set(wait: 1.minute).perform_later
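Conceptually, both options boil down to a "do not run before" timestamp stored alongside the job. A minimal sketch of that idea, using plain Ruby Time values and the hypothetical names SCHEDULE, schedule_job and pop_due_jobs:

```ruby
# Hypothetical in-memory schedule: each entry is [run_at, job_name].
SCHEDULE = []

# `wait` is a number of seconds from `now`; `wait_until` is an absolute time.
def schedule_job(job_name, now:, wait: 0, wait_until: nil)
  run_at = wait_until || (now + wait)
  SCHEDULE << [run_at, job_name]
  run_at
end

# Remove and return the names of all jobs whose time has come.
def pop_due_jobs(now)
  due, pending = SCHEDULE.partition { |run_at, _name| run_at <= now }
  SCHEDULE.replace(pending)
  due.map { |_run_at, name| name }
end

start = Time.at(0)
schedule_job("in a minute", now: start, wait: 60)
schedule_job("at a fixed time", now: start, wait_until: Time.at(3600))

p pop_due_jobs(Time.at(30))    # neither job is due yet
p pop_due_jobs(Time.at(60))    # the job scheduled with `wait` is due
p pop_due_jobs(Time.at(3600))  # the job scheduled with `wait_until` is due
```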

Active Job callbacks

There could be cases when we want to execute the perform method synchronously. Calling perform_now method executes the job instantly:

TaskLoggerJob.perform_now
"TaskLoggerJob is performed"

Is there any difference between behaviors of perform and perform_now? The answer is yes. The method perform_now is wrapped by the Active Job callbacks. Similar to controllers and models, we can define the following callbacks inside our jobs:

before_enqueue
around_enqueue
after_enqueue
before_perform
around_perform
after_perform

Let's add a before_perform and after_perform in our TaskLoggerJob class as follows:

class TaskLoggerJob < ApplicationJob
  # ... existing code
  before_perform :print_before_perform_message
  after_perform :print_after_perform_message

  def print_before_perform_message
    puts "Printing from inside before_perform callback"
  end

  def print_after_perform_message
    puts "Printing from inside after_perform callback"
  end
end

Now reload Rails console and compare results between perform and perform_now:

TaskLoggerJob.new.perform
"TaskLoggerJob is performed"

TaskLoggerJob.perform_now
"Printing from inside before_perform callback"
"TaskLoggerJob is performed"
"Printing from inside after_perform callback"

Here we invoked perform_now and we are seeing the messages being printed in Rails console. This is because the task is being performed synchronously in Rails console itself.

If we were to invoke perform_later instead, we would not see those messages in the Rails console. We would see them in the Sidekiq window.

The behavior however will be different in case of before_enqueue and after_enqueue callbacks. Since perform_now is run synchronously and there is no enqueueing of job, defining these callbacks will have no effect when perform_now is used.

Let's verify this behavior. Inside TaskLoggerJob, remove all existing callbacks and add the following code:

class TaskLoggerJob < ApplicationJob
  before_enqueue :print_before_enqueue_message
  after_enqueue :print_after_enqueue_message

  def perform
    puts "TaskLoggerJob is performed"
  end

  def print_before_enqueue_message
    puts "Printing from inside before_enqueue callback"
  end

  def print_after_enqueue_message
    puts "Printing from inside after_enqueue callback"
  end
end

Run the following code on Rails console:

TaskLoggerJob.perform_now
"TaskLoggerJob is performed"

TaskLoggerJob.perform_later
"Printing from inside before_enqueue callback"
"Printing from inside after_enqueue callback"

You'll notice that the messages from the enqueue callbacks got printed only when perform_later was called.
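The difference between the two families of callbacks can be modelled with a small plain-Ruby class. This is a simplified picture of the behavior described above, not Active Job's implementation. ToyCallbackJob and its log array are invented for the sketch.

```ruby
class ToyCallbackJob
  attr_reader :log

  def initialize
    @log = []
  end

  # The business logic itself; calling it directly skips every callback.
  def perform
    @log << "perform"
  end

  # Running now: only the perform callbacks wrap the work.
  def perform_now
    @log << "before_perform"
    perform
    @log << "after_perform"
    self
  end

  # Enqueuing: only the enqueue callbacks run here. The perform callbacks
  # run later, in whichever process picks the job up from the queue.
  def perform_later(queue)
    @log << "before_enqueue"
    queue << self
    @log << "after_enqueue"
    self
  end
end

p ToyCallbackJob.new.perform_now.log

pending = []
job = ToyCallbackJob.new.perform_later(pending)
p job.log

# Simulate the worker side picking up the job from the queue.
p pending.shift.perform_now.log
```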

Using job to log task attributes

Now let's use our TaskLoggerJob to actually log something. Let's log the details of the task after the task got created:

class Task < ApplicationRecord
  after_create :log_task_details

  # Existing code ...

  def log_task_details
    TaskLoggerJob.perform_later(self)
  end
end

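Notice that we have passed an argument to the perform_later method. The perform method that we define inside the job can take any number of arguments, provided Active Job can serialize them. Active Record objects such as our task are supported: they are passed by reference using GlobalID and loaded again when the job runs. In the above case, perform receives a task record as its argument.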
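Passing a record to a job means storing a reference to it and looking the record up again when the job runs, which is what Active Job does with GlobalID. A rough sketch of that idea follows. The toy:// reference format, ToyTask, TASKS and both helpers are made up here and are not the real GlobalID format or API.

```ruby
# Stand-ins for a model and its table.
ToyTask = Struct.new(:id, :title)
TASKS = { 1 => ToyTask.new(1, "Write chapter") }

# Turn a record into a small string that can be stored in the queue.
def serialize_argument(record)
  "toy://#{record.class.name}/#{record.id}"
end

# Turn the stored string back into the record by looking it up again.
def deserialize_argument(reference)
  id = reference.split("/").last.to_i
  TASKS.fetch(id)
end

reference = serialize_argument(TASKS[1])
puts reference
puts deserialize_argument(reference).title
```

One consequence of passing a reference is that the job sees the record as it is when the job runs, not as it was when the job was enqueued.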

So let's remove the callbacks that we added to TaskLoggerJob and define the perform method as follows:

class TaskLoggerJob < ApplicationJob
  def perform(task)
    puts "Created a task with following attributes :: #{task.attributes}"
  end
end

Let's also mention the default queue in which the task job needs to be run and the number of retries in case of failure:

class TaskLoggerJob < ApplicationJob
  sidekiq_options queue: :default, retry: 3
  queue_as :default

  def perform(task)
    puts "Created a task with following attributes :: #{task.attributes}"
  end
end

Now create a task in the browser and notice that the log is printed in the Sidekiq window printing all the attributes of the newly created task.
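The meaning of a retry count of 3 can be illustrated in plain Ruby: one initial attempt plus at most three more. This sketch retries immediately, whereas Sidekiq schedules retries with a growing delay. The helper run_with_retries is hypothetical.

```ruby
# Run the block, retrying up to max_retries times after the first failure.
# The block receives the current attempt number, starting at 1.
def run_with_retries(max_retries)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue StandardError
    retry if attempts <= max_retries
    raise # retries exhausted, let the error surface
  end
end

# A flaky piece of work that only succeeds on its third attempt.
result = run_with_retries(3) do |attempt|
  raise "temporary failure" if attempt < 3
  "succeeded on attempt #{attempt}"
end

puts result
```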

Let's make use of the queue adapters that we've defined and send email notifications in our next chapter.

Testing Sidekiq jobs

To write a Sidekiq test, first we need to require sidekiq/testing in task_logger_job_test.rb:

require "test_helper"
require "sidekiq/testing"

class TaskLoggerJobTest < ActiveJob::TestCase
end

Sidekiq provides a few modes of testing our workers. These are Sidekiq::Testing.inline!, Sidekiq::Testing.fake! and Sidekiq::Testing.disable!.

Requiring sidekiq/testing automatically enables the Sidekiq::Testing.fake! mode, which is the default.

For the purpose of testing the TaskLoggerJob, let's create a Log model which will store the message and id of the task:

bundle exec rails g model log

Add the following to the migration file with the suffix _create_logs.rb, which was generated by the above command:

class CreateLogs < ActiveRecord::Migration[6.1]
  def change
    create_table :logs do |t|
      t.integer :task_id
      t.text :message
      t.timestamps
    end
  end
end

Let's run the migration:

bundle exec rails db:migrate

Let's add validations to the Log model:

class Log < ApplicationRecord
  validates :message, presence: true
  validates :task_id, presence: true
end

Sometimes the tests fail because the test database is not prepared. If you see such an error, run the following command:

bundle exec rails db:test:prepare

Update the perform method in TaskLoggerJob:

  def perform(task)
    msg = "A task was created with the following title: #{task.title}"
    log = Log.create! task_id: task.id, message: msg

    puts log.message
  end

We know that our job needs to be performed once we create a new task, so our test cases need a reference to a task. Let's create one in the setup method, which runs before each test:

  def setup
    @task = create(:task)
  end

Let's add a test that uses fake mode and doesn't rely on the response or side effects of our job:

  test 'logger runs once after creating a new task' do
    assert_enqueued_with(job: TaskLoggerJob, args: [@task])
    perform_enqueued_jobs
    assert_performed_jobs 1
  end

In fake mode, jobs are pushed onto an array instead of being executed immediately. Jobs in that array can be queried, inspected, and optionally “drained” to process them. This mode is activated with the fake! directive (and is the default once sidekiq/testing is required). Testing this way promotes decoupled and faster tests, as the worker doesn’t have to perform any actual work. However, this mode isn’t appropriate for full-on integration testing or for situations where you want to process jobs during a test.

It is good for testing that the jobs have been enqueued properly and in other scenarios which don't require the result of execution.

But when working on real applications, these jobs and workers perform some side effects and we would be required to assure that those are performed as intended. In such scenarios we can use the inline mode that runs the job immediately instead of enqueuing it. Inline testing mode performs enqueued jobs synchronously within the same process. So let's use inline mode for real testing which needs results of jobs after executing the worker.
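The contrast between the two modes can be sketched with a tiny queue class. This is a simplified model of the behavior described above, not Sidekiq's testing code. ToyTestingQueue and its methods are made-up names.

```ruby
class ToyTestingQueue
  attr_reader :jobs, :mode

  def initialize
    @jobs = []     # in fake mode, pending jobs accumulate here
    @mode = :fake  # fake is the default, as with sidekiq/testing
  end

  def fake!
    @mode = :fake
  end

  def inline!
    @mode = :inline
  end

  # In inline mode the job runs immediately; in fake mode it is only stored.
  def push(&job)
    if mode == :inline
      job.call
    else
      jobs << job
    end
  end

  # Run every stored job, oldest first, leaving the array empty.
  def drain
    jobs.shift.call until jobs.empty?
  end
end

queue = ToyTestingQueue.new
log = []

queue.push { log << "ran after drain" }
puts log.size   # the job is stored but has not run
queue.drain

queue.inline!
queue.push { log << "ran inline" }
puts log
```

Fake mode suits assertions about what was enqueued, while inline mode suits assertions about what the job actually did.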

By default the mode is fake!. In most test suites we need a mix of both the inline! and fake! modes. So let's set the inline mode within the test and perform the job:

  test 'log count increments on running task logger' do
    Sidekiq::Testing.inline!
    assert_difference "Log.count", 1 do
      TaskLoggerJob.new.perform(@task)
    end
  end

The above test ensures that a new entry is added to the logs table, and thus that its count is incremented, when we perform the TaskLoggerJob.

The final test file should look like this:

require "test_helper"
require "sidekiq/testing"

class TaskLoggerJobTest < ActiveJob::TestCase
  def setup
    @task = create(:task)
  end

  test 'logger runs once after creating a new task' do
    assert_enqueued_with(job: TaskLoggerJob, args: [@task])
    perform_enqueued_jobs
    assert_performed_jobs 1
  end

  test 'log count increments on running task logger' do
    Sidekiq::Testing.inline!
    assert_difference "Log.count", 1 do
      TaskLoggerJob.new.perform(@task)
    end
  end
end

Voila! That's it. Now we have an idea of how to test our Sidekiq jobs. Sidekiq has some assertion methods and goodies of its own. You can refer to the official documentation to know more.

As a final note, just for understanding, the disabled mode signifies that Sidekiq is not in testing mode. Thus the jobs are pushed to Redis. We don't use this mode often while testing unless we want the tests to exercise the Redis part too.

Clearing Sidekiq queues

Open the terminal and run the following commands to clear Sidekiq queues:

redis-cli flushdb
redis-cli flushall

Usually, when a Sidekiq job fails, it is retried on a later run, since its state is stored in Redis.

Thus if we don't want it to run again, we can use the above commands to clear the queues.

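The flushdb command deletes all the keys of the currently selected database, while the flushall command removes all the keys of every database in Redis. Note that redis-cli selects database 0 unless another one is given with the -n option, so use redis-cli -n 12 flushdb to clear database 12 from the REDIS_URL used earlier.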
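The scope of the two commands can be pictured with a toy model of numbered databases. ToyRedis below is an invented in-memory stand-in written for this sketch. It is not a Redis client.

```ruby
class ToyRedis
  def initialize
    # Each database number maps to its own independent set of keys.
    @databases = Hash.new { |dbs, index| dbs[index] = {} }
    @selected = 0 # database 0 is selected by default
  end

  def select(index)
    @selected = index
    self
  end

  def set(key, value)
    @databases[@selected][key] = value
  end

  def keys(index = @selected)
    @databases[index].keys
  end

  # Clears only the currently selected database.
  def flushdb
    @databases[@selected].clear
  end

  # Clears every database.
  def flushall
    @databases.each_value(&:clear)
  end
end

redis = ToyRedis.new
redis.select(0).set("cache", "x")
redis.select(12).set("queue:default", "job payload")

redis.flushdb     # database 12 is selected, so only it is cleared
p redis.keys(12)
p redis.keys(0)

redis.flushall    # now database 0 is cleared as well
p redis.keys(0)
```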

These commands come in handy during testing, when some Sidekiq inline tests behave unexpectedly.

Now let's commit changes made in this chapter:

git add -A
git commit -m "Configured Sidekiq for background job processing"