Setting up Rake tasks

What is Rake and why do we need it?

In the previous chapter, we manually created some demo users for reviewing and testing the changes we made. What if we need to share the code with someone else, for example a peer who wants to review the application?

We need some way to automate setting up the project with all required configurations and necessary data for testing.

Can't we use migration scripts for creating necessary data? No! We can't.

Why? Because the test data shouldn't be created on our production server when we deploy the application.

When we deploy applications, all migrations are automatically run as part of our pipeline.

In the case of our previous chapter, these test users are purely for setting up the staging environment.

We can solve this problem using Rake tasks.

Rake stands for Ruby Make. It's a standalone Ruby utility that "replaces the Unix utility 'make', and uses a Rakefile and .rake files to build up a list of tasks".

Basically, it is a task runner for Ruby. Rails uses Rake Proxy to delegate some of its tasks to Rake.

We have used rails db:migrate in the previous chapters. When rails db:migrate is run, what happens internally is that Rails checks if db:migrate is supported natively. In this case db:migrate is not natively supported by Rails, so Rails delegates the execution to Rake via Rake Proxy.
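To make the idea of a task runner concrete, here is a minimal sketch of Rake used on its own, outside Rails. The task name hello and the GREETINGS constant are made up for illustration; in a real project the task would live in a Rakefile or a .rake file and be run from the command line as rake hello.

```ruby
require "rake"

# Collects output so the behaviour can be inspected; purely illustrative.
GREETINGS = []

# `desc` attaches a description to the task defined right after it.
desc "Say hello (hypothetical task)"
task :hello do
  GREETINGS << "Hello from Rake!"
end

# On the command line this would be `rake hello`; here the task is looked up
# in Rake's task registry and invoked programmatically.
Rake::Task[:hello].invoke
puts GREETINGS.last
```

A task is just a named block of Ruby code that Rake registers and runs on request, which is what lets Rails hand commands such as db:migrate over to it.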

Setting up Rake tasks

We can also write custom Rake tasks in a Rails application by creating files with the .rake extension in ./lib/tasks.

Often, when creating a new project, we need to set up some defaults, such as populating the database with default users. We can write such tasks in ./lib/tasks/setup.rake. Let's add the code below to our setup.rake:

task :populate_with_sample_data do
  puts 'Seeding with sample data...'
  User.create!(
    name: 'Oliver',
    email: '',
    password: 'welcome',
    password_confirmation: 'welcome'
  )
  puts 'Done! Now you can login with "" using password "welcome"'
end

It's a common practice, after cloning a repository for the first time, to run ./bin/setup to automatically fetch all the libraries, create the database, seed data, and so on. It therefore makes sense to invoke our setup task from ./bin/setup, since it also plays a role in bootstrapping the project.

Remove the following lines, including the comments, from the FileUtils.chdir APP_ROOT block in ./bin/setup:

puts "\n== Preparing database =="
system! "bin/rails db:prepare"

Then update the same block in the ./bin/setup file so that it includes the following lines:

FileUtils.chdir APP_ROOT do
  # rest of the code

  puts "\n== Executing yarn =="
  system!("bin/yarn")

  puts "\n== Executing rake setup =="
  system! "bundle exec rake setup"

  puts "\n== Removing old logs and temp files =="
  system! "bin/rails log:clear tmp:clear"

  # rest of the code
end
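The system! helper used in this file is defined near the top of the bin/setup that Rails generates. Its exact definition varies between Rails versions, so treat the following as a sketch of a comparable helper rather than the file's literal contents. It runs a command and aborts the whole script if the command fails, so later steps never run against a half-prepared project.

```ruby
require "rbconfig"

# Run a command; if it fails (non-zero exit status), stop the script.
# Assumption: modelled on the helper found in generated bin/setup files.
def system!(*args)
  system(*args) || abort("\n== Command #{args} failed ==")
end

# A command that succeeds: `system` returns true and the script continues.
# RbConfig.ruby is the path of the running Ruby, used here only so that the
# example does not depend on any particular shell command being installed.
system!(RbConfig.ruby, "-e", "exit 0")
puts "first command succeeded"
```

Plain system would return false on failure and let the script carry on. The bang variant turns a failed step into a hard stop.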

Note that running ./bin/setup will most likely wipe all the data in your database and add fresh seed data.

It's recommended to run this setup only once, right after cloning a repo.

There are, however, valid cases where we need to rerun it.

Let's say that, as a team, we decided to modify a migration file. We shouldn't modify a migration in the first place, but let's say it happened.

Then one of the easiest ways to rerun the updated migration is to run this setup again. Alternatively, you can roll back the migration and run it again manually.

Executing the Rake task

Run this command to execute our Rake task:

bundle exec rake populate_with_sample_data

You will get an error that looks something like this:

Seeding with sample data...
rake aborted!
NameError: uninitialized constant User

This is because Rake has no way of knowing about the models and classes defined in our Rails application unless the Rails environment is loaded. Without it, the User constant cannot be resolved.

To fix this problem, we need to load the Rails environment before the task runs. Update setup.rake with the code below:

task populate_with_sample_data: :environment do
  puts 'Seeding with sample data...'
  User.create!(
    name: 'Oliver',
    email: '',
    password: 'welcome',
    password_confirmation: 'welcome'
  )
  puts 'Done! Now you can login with "" using password "welcome"'
end
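The populate_with_sample_data: :environment syntax declares :environment as a prerequisite, and Rake runs prerequisites before the task body. The sketch below shows that ordering with plain Rake. The :environment task here is only a stand-in that records that it ran. The real one is provided by Rails and loads the application, which is what makes User available. The EVENTS constant is made up for illustration.

```ruby
require "rake"

# Records the order in which things happen.
EVENTS = []

# Stand-in for the :environment task that Rails provides.
task :environment do
  EVENTS << "environment loaded"
end

# :environment is listed as a prerequisite, so Rake runs it first.
task populate_with_sample_data: :environment do
  EVENTS << "sample data created"
end

Rake::Task["populate_with_sample_data"].invoke
puts EVENTS.inspect
```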

Now, we can run the command again:

bundle exec rake populate_with_sample_data

You can see that the previous error has disappeared. You might see another error that looks like this:

Seeding with sample data...
rake aborted!
ActiveRecord::RecordInvalid: Validation failed: Email has already been taken

This is perfectly fine; our Rake task is running as expected. We see this error because our table already contains a demo user with the same email. Adding a new User with the same email violates the uniqueness rule we have enforced on the email field.

If you get the email validation error, then it means that you can safely move forward to the next section.

If you want to see the validation pass, then change the emails of the sample users in the Rake setup file and run the command once again.

Finalizing the Rake setup

Let us add another task that destructively recreates the database. This avoids problems like the one we encountered in the previous section, which was caused by already existing data.

One more key thing to take care of is ensuring that we don't drop the database in the production environment. We will be uploading this file to both the production and staging environments.

But we will be running the Rake task only in the staging environment. Staging is where we test things out, so it acts as a simulated production environment. While testing, it's useful to have some defaults, such as default user logins.

Update the setup.rake file with the following lines of code:

desc 'drops the db, creates db, migrates db and populates sample data'
task setup: [:environment, 'db:drop', 'db:create', 'db:migrate'] do
  # The prerequisites above run first, in order; sample data is then added in development only.
  Rake::Task['populate_with_sample_data'].invoke if Rails.env.development?
end

task populate_with_sample_data: [:environment] do
  if Rails.env.production?
    puts "Skipping deleting and populating sample data in production"
  else
    create_sample_data!
    puts "sample data has been added."
  end
end

def create_sample_data!
  puts 'Seeding with sample data...'
  create_user! email: '', name: 'Oliver'
  create_user! email: '', name: 'Sam'
  puts 'Done! Now you can login with either "" or "", using password "welcome"'
end

# Merges the given options over the default password attributes and creates the user.
def create_user!(options = {})
  user_attributes = { password: 'welcome', password_confirmation: 'welcome' }
  attributes = user_attributes.merge options
  User.create! attributes
end
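The create_user! helper relies on Hash#merge to layer the caller's options over the default password attributes. Here is a minimal sketch of just that merging step. The sample_user_attributes helper, the DEFAULT_ATTRIBUTES constant and the "secret" override are hypothetical, introduced so that the example runs without a database.

```ruby
# Defaults shared by every sample user (same values as in setup.rake).
DEFAULT_ATTRIBUTES = { password: "welcome", password_confirmation: "welcome" }.freeze

# Hypothetical helper that only builds the attributes hash.
# create_user! would pass a hash like this to User.create!.
def sample_user_attributes(options = {})
  # merge returns a new hash; keys in `options` win over the defaults.
  DEFAULT_ATTRIBUTES.merge(options)
end

p sample_user_attributes(name: "Oliver")
p sample_user_attributes(name: "Sam", password: "secret")
```

Because merge returns a new hash instead of modifying the receiver, the defaults stay untouched between calls, and any key passed by the caller overrides the default of the same name.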

Running this Rake task will drop our database and recreate it from scratch with the demo data.

If you need to preserve your old data, back up the database files db/development.sqlite3 and db/test.sqlite3. If you're using PostgreSQL, this trick won't work; you'd have to dump the database manually and restore it.

Run this command to execute our Rake setup:

bundle exec rake setup

The above command should print output similar to the following in the console:

Dropped database 'db/development.sqlite3'
Dropped database 'db/test.sqlite3'
Created database 'db/development.sqlite3'
Created database 'db/test.sqlite3'
== 20210104080645 CreateTasks: migrating ======================================
-- create_table(:tasks)
   -> 0.0014s
== 20210104080645 CreateTasks: migrated (0.0015s) =============================

== 20210106115906 MakeTitleNotNullable: migrating =============================
-- change_column_null(:tasks, :title, false)
   -> 0.0086s
== 20210106115906 MakeTitleNotNullable: migrated (0.0086s) ====================

== 20210108115051 CreateUser: migrating =======================================
-- create_table(:users)
   -> 0.0014s
== 20210108115051 CreateUser: migrated (0.0015s) ==============================

Seeding with sample data...
Done! Now you can login with either "" or "", using password "welcome"

The Rails server should be restarted so that the latest data will be loaded.

Importance of ./bin/setup file

The ./bin/setup file helps set up the project for the development environment in one go. When we set up a new project on a machine, a lot of steps need to be taken care of, such as creating a database, running migrations, seeding the database, and installing dependencies. We define all these steps in the bin/setup file, so we only need to execute bin/setup to set up the project's base.

The bin/setup file performs the following operations:

  • Install all the dependencies

  • Create the database.yml file, if it does not already exist

  • Execute the rake setup task, which creates the database, runs all the migrations, and populates the sample data.

  • Clear the logs and the temp files

  • Restart the Rails server

We can add more operations to the setup file if they are needed to set up the project.

At BigBinary, the base of every project should get set up when we run the ./bin/setup file. Developers should not be forced to run migrations or add seed data manually while setting up the development environment. All they have to do is run ./bin/setup.

Dealing with stale Active Record cache

Consider a scenario where a migration creates a new column or table and then immediately populates it with some seed data. When the migration reaches the step that adds the seed data, Rails can throw an exception saying that the new column or table doesn't exist. This is because Rails uses the schema information held in the Active Record cache, which is not refreshed until all the migrations have finished running.

class CreateJobLevels < ActiveRecord::Migration
  def up
    # create_table adds the id primary key column automatically,
    # so it must not be declared again inside the block.
    create_table :job_levels do |t|
      t.string :name

      t.timestamps
    end

    %w{assistant executive manager director}.each do |type|
      JobLevel.create(name: type)
    end
  end

  def down
    drop_table :job_levels
  end
end

In the above example, Rails will throw an exception when new records are created in the job_levels table, because when it tries to create the records it uses the stale Active Record schema information, which doesn't include the job_levels table.

This issue can also occur when migrations are run as part of a chained task. For example, when running something like ./bin/setup or rails db:migrate db:seed, the schema is cached and sometimes the newly added columns or tables are not reflected in the cached schema.

According to the official Rails guide, we should invoke the reset_column_information method on the model to reset all the cached information about its columns, which causes them to be reloaded on the next request.

You can invoke the reset_column_information method in the up method of a migration that creates a new column or table and adds data to it in the same step. This ensures that the latest schema is loaded before the request to add data is processed, so errors caused by a stale Active Record cache are avoided.

The example above can be fixed like so:

class CreateJobLevels < ActiveRecord::Migration
  def up
    # create_table adds the id primary key column automatically.
    create_table :job_levels do |t|
      t.string :name

      t.timestamps
    end

    # reset_column_information is a class method on the model,
    # so it is called on JobLevel rather than on the migration.
    JobLevel.reset_column_information

    %w{assistant executive manager director}.each do |type|
      JobLevel.create(name: type)
    end
  end

  def down
    drop_table :job_levels
  end
end
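The mechanism behind the stale cache can be illustrated with a toy model in plain Ruby. This is not Active Record's actual implementation. FakeModel and SCHEMA are hypothetical stand-ins, and the only point is that a memoized column list keeps returning old data until it is explicitly reset.

```ruby
# Toy stand-in for a database schema: table name => column names.
SCHEMA = { "job_levels" => ["id"] }

# Hypothetical model class that memoizes its columns, loosely mirroring how
# column information is cached per model.
class FakeModel
  def self.table_name
    "job_levels"
  end

  # Reads the columns once and keeps a private copy.
  def self.columns
    @columns ||= SCHEMA.fetch(table_name).dup
  end

  # Drops the cached copy so the next call re-reads the schema.
  def self.reset_column_information
    @columns = nil
  end
end

FakeModel.columns                    # caches ["id"]
SCHEMA["job_levels"] << "name"       # a "migration" adds a column
stale = FakeModel.columns            # still ["id"]: the cache is stale
FakeModel.reset_column_information   # drop the cached columns
fresh = FakeModel.columns            # re-read: ["id", "name"]

puts "stale: #{stale.inspect}, fresh: #{fresh.inspect}"
```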

Note that the CreateJobLevels migration is shown only to illustrate the use case for the reset_column_information method. Please do not add this migration to the Granite application.

The schema cache can also be cleared by running the rails db:schema:cache:clear command, or regenerated with rails db:schema:cache:dump, from the terminal. But in a production environment this is not feasible, because you cannot pause the migrations to update the cache before the next request is processed. Hence, using the reset_column_information method is the right way to fix this issue.

We have successfully set up the Rake tasks for this project.

How to add TODO comments?

At BigBinary, we write code in a self-explanatory way: reading the code should feel more or less like reading an English sentence. If the sentence makes sense, the code explains itself, which is why we don't use comments to explain code. More importantly, comments don't scale. If we add a detailed comment describing how a utility function works and how it affects the functions built on top of it, then every change to that utility function forces us to update all the comments that refer to it. Thus it doesn't make sense to add comments that explain how the code works, unless there is no other way to convey the same information.

But there are cases where comments make sense. For example, we can add comments for work that needs to be done in the future. Such comments should be prefixed with the TODO: keyword, which makes it easy to jump to the comments that need attention.

For example:

# Wrong
# Update rake task to populate articles

# Correct
# TODO: Update rake task to populate articles

There are many ways to navigate through TODO: comments, such as the rake notes task mentioned in this thread. This command returns the list of all TODO comments in the Ruby files. We can also navigate to each TODO comment easily from our text editors. In VSCode, we can install the Todo Tree extension to jump through the TODO comments.
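The consistent TODO: prefix is what makes these comments easy for tools to find. As a rough illustration of the idea (not how rake notes or Todo Tree are implemented), the sketch below scans a string of Ruby source for such comments. The todo_comments helper and SAMPLE_SOURCE are hypothetical names.

```ruby
# Hypothetical helper: returns [line_number, text] pairs for every
# `# TODO:` comment found in the given source string.
def todo_comments(source)
  source.each_line.with_index(1).filter_map do |line, number|
    match = line.match(/#\s*TODO:\s*(.*)/)
    [number, match[1].strip] if match
  end
end

SAMPLE_SOURCE = <<~RUBY
  # Update rake task to populate articles
  def create_sample_data!
    # TODO: Update rake task to populate articles
  end
RUBY

todo_comments(SAMPLE_SOURCE).each do |number, text|
  puts "line #{number}: #{text}"
end
```

Only the prefixed comment on line 3 is reported. The unprefixed comment on line 1 is invisible to the scan, which is why the first form is marked as wrong above.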

Let's commit the changes:

git add -A
git commit -m "Added rake tasks"