
Setting up Rake tasks


What is Rake and why do we need it?

In the previous chapter, we manually created some demo users for reviewing and testing the changes we have made. What if we need to share the code with someone else and they need to review it too? For example, a peer trying to review your application.

We need some way to automate setting up the project with all required configurations and necessary data for testing.

Can't we use migration scripts for creating necessary data? No! We can't.

Why? Because the test data shouldn't be created on our production server when we deploy the application.

When we deploy applications, all migrations are automatically run as part of our pipeline.

In the case of our previous chapter, these test users are purely for setting up the staging environment.

We can solve this problem using Rake tasks.

Rake stands for Ruby Make. It's a standalone Ruby utility that "replaces the Unix utility 'make', and uses a Rakefile and .rake files to build up a list of tasks".

Basically, it is a task runner for Ruby. Rails uses a Rake Proxy to delegate some of its tasks to Rake.

We have used rails db:migrate in the previous chapters. When rails db:migrate is run, Rails first checks whether db:migrate is a command it supports natively. It is not, so Rails delegates the execution to Rake via the Rake Proxy.
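The fallback described above can be pictured with a toy dispatcher. This is only an illustrative sketch, not how Rails is actually implemented: the NATIVE_COMMANDS list and the dispatch method are hypothetical names invented for this example.

```ruby
# Toy illustration only: NATIVE_COMMANDS and dispatch are hypothetical and
# not part of Rails. Real Rails knows many more native commands.
NATIVE_COMMANDS = %w[server console generate new].freeze

# Returns a description of who would handle the given command.
def dispatch(command)
  if NATIVE_COMMANDS.include?(command)
    "rails runs '#{command}' itself"
  else
    # Anything not recognised natively is handed over to Rake.
    "delegated to Rake as 'rake #{command}'"
  end
end

puts dispatch('server')
puts dispatch('db:migrate')
```

The point is only the shape of the decision: a lookup first, and a hand-off to Rake when the lookup fails.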

Setting up Rake tasks

We can also write our own custom Rake tasks in a Rails application by creating files with the .rake extension in ./lib/tasks.

Often when creating a new project, we need to set up some defaults, such as populating the users table with default users. We can write such tasks in ./lib/tasks/setup.rake. Let's add the code below to our setup.rake:

task :populate_with_sample_data do
  puts 'Seeding with sample data...'
  User.create!(
    name: 'Oliver',
    email: 'oliver@example.com',
    password: 'welcome',
    password_confirmation: 'welcome'
  )
  puts 'Done! Now you can login with "oliver@example.com" using password "welcome"'
end

After cloning a repository for the first time, it's common practice to run ./bin/setup to automatically fetch all the libraries, create the database, seed data and so on. Since our setup.rake also plays a role in bootstrapping the project, it makes sense to invoke it from ./bin/setup.

Add the following lines to ./bin/setup, inside the APP_ROOT block which already exists:

puts "\n== Setting up the app =="
system! 'bundle exec rake setup'
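For context, system! is a small helper that the generated ./bin/setup already defines, so we do not need to write it ourselves. In many generated versions it looks roughly like the sketch below. Treat the exact body as an assumption and check your own ./bin/setup, since it differs between Rails versions.

```ruby
require 'rbconfig'

# Run a shell command and abort the whole script if the command fails.
# Assumption: modelled on the helper found in Rails-generated bin/setup files.
def system!(*args)
  system(*args) || abort("\n== Command #{args} failed ==")
end

# Usage: run a trivially successful command with the current Ruby interpreter.
system!(RbConfig.ruby, '-e', 'puts "setup step ok"')
```

Because of this helper, a failing bundle exec rake setup stops ./bin/setup immediately instead of letting later steps run against a half-prepared project.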

Note that if you run ./bin/setup, then most probably all your DB data will be wiped and new seed data will be added.

It's recommended to run this setup only the first time we clone a repo.

That said, there are valid cases where we need to rerun this setup.

Example:

Let's say as a team we decided we need to modify a migration file. We shouldn't modify a migration that has already been run in the first place. But let's say it happened.

Then one of the easiest ways to rerun the updated migration is by running this setup. Alternatively, you can roll back the migrations and run them again manually.

Executing the Rake task

Run this command to execute our Rake task:

bundle exec rake populate_with_sample_data

You will get an error saying something like:

Seeding with sample data...
rake aborted!
NameError: uninitialized constant User
....

This is because Rake has no way of knowing about the models and classes we have defined in our Rails application. Without loading the Rails environment, it can't resolve the User constant.

To fix this problem, we need to make the Rails environment a prerequisite of our Rake task. Update setup.rake with the code below:

task populate_with_sample_data: :environment do
  puts 'Seeding with sample data...'
  User.create!(
    name: 'Oliver',
    email: 'oliver@example.com',
    password: 'welcome',
    password_confirmation: 'welcome'
  )
  puts 'Done! Now you can login with "oliver@example.com" using password "welcome"'
end
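The task_name: :environment syntax declares :environment as a prerequisite, so Rake runs it before the task body. The sketch below shows that ordering with plain Rake outside Rails. The task names fake_environment and greet_user are hypothetical, and fake_environment merely stands in for the :environment task that Rails provides.

```ruby
require 'rake'

# Make `task` available outside a Rakefile.
extend Rake::DSL

EVENTS = []

# Stand-in for the :environment task that Rails defines (hypothetical name).
task :fake_environment do
  EVENTS << 'environment loaded'
end

# Same hash syntax as in setup.rake: task_name: :prerequisite
task greet_user: :fake_environment do
  EVENTS << 'greet_user ran'
end

Rake::Task['greet_user'].invoke
puts EVENTS.inspect
# => ["environment loaded", "greet_user ran"]
```

In a Rails app the real :environment task loads the application, which is what makes constants like User available inside the task body.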

Now, we can run the command again:

bundle exec rake populate_with_sample_data

The previous error has disappeared. You might instead see another error that looks like this:

Seeding with sample data...
rake aborted!
ActiveRecord::RecordInvalid: Validation failed: Email has already been taken
....

This is perfectly fine and our Rake task is running as expected. We see this error because we already have some demo users in our table with the same email. Adding a new User with the same email violates the uniqueness validation we have enforced on email, so create! raises ActiveRecord::RecordInvalid.

If you get the email validation error, then it means that you can safely move forward to the next section.

If you want to see the validation pass, then change the emails of the sample users in the Rake setup file and run the command once again.

Finalizing the Rake setup

Let us add another task that destructively recreates the database. This avoids problems like the one we encountered in the previous section, which was caused by already existing data.

One more key thing to take care of is to ensure that we don't drop the database in the production environment. We will be uploading this file to both the production and staging environments.

But we will be running the Rake task only in the staging environment. The reason is that the staging environment is where we test things out, and thus it acts as a simulated production environment. While testing, it's useful to have some defaults like the default user logins.

Update the setup.rake file with the following lines of code:

desc 'drops the db, creates db, migrates db and populates sample data'
task setup: [:environment, 'db:drop', 'db:create', 'db:migrate'] do
  # Sample data is populated automatically only in development.
  Rake::Task['populate_with_sample_data'].invoke if Rails.env.development?
end

task populate_with_sample_data: [:environment] do
  if Rails.env.production?
    puts "Skipping deleting and populating sample data in production"
  else
    create_sample_data!
    puts "sample data has been added."
  end
end

def create_sample_data!
  puts 'Seeding with sample data...'
  create_user! email: 'oliver@example.com', name: 'Oliver'
  create_user! email: 'sam@example.com', name: 'Sam'
  puts 'Done! Now you can login with either "oliver@example.com" or "sam@example.com", using password "welcome"'
end

# Creates a user with the default password, merged with the given attributes.
def create_user!(options = {})
  user_attributes = { password: 'welcome', password_confirmation: 'welcome' }
  attributes = user_attributes.merge options
  User.create! attributes
end

Running this Rake task will drop our database and recreate it from scratch with the demo data.
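The setup task calls Rake::Task['populate_with_sample_data'].invoke. One detail worth knowing is that invoke runs a given task at most once per Rake process unless the task is re-enabled. A minimal sketch with plain Rake, where demo_populate is a hypothetical task name:

```ruby
require 'rake'

# Make `task` available outside a Rakefile.
extend Rake::DSL

RUNS = []

task :demo_populate do
  RUNS << :populated
end

populate = Rake::Task['demo_populate']
populate.invoke    # runs the task body
populate.invoke    # does nothing: already invoked in this process
populate.reenable  # mark the task as runnable again
populate.invoke    # runs the task body a second time

puts RUNS.size
# => 2
```

This is why invoking the same task from several places in one rake run does not seed the data more than once.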

If you need to preserve your old data, back up the database files db/development.sqlite3 and db/test.sqlite3 before running the task. If you're using PostgreSQL, this trick won't work. You'd have to dump the database manually and restore it afterwards.
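For the SQLite case, backing up is just a file copy. Here is a minimal sketch of a helper that does this. The method name backup_sqlite_files and the ".bak" suffix are assumptions made for this example, not something the project already contains.

```ruby
require 'fileutils'

# Hypothetical helper: copy each existing SQLite file into backup_dir,
# appending ".bak" to the file name. Returns the list of copies made.
def backup_sqlite_files(paths, backup_dir)
  FileUtils.mkdir_p(backup_dir)
  paths.select { |path| File.exist?(path) }.map do |path|
    target = File.join(backup_dir, "#{File.basename(path)}.bak")
    FileUtils.cp(path, target)
    target
  end
end
```

For example, calling backup_sqlite_files(%w[db/development.sqlite3 db/test.sqlite3], 'tmp/db_backup') from the project root before running the setup task would leave copies in tmp/db_backup, a directory name chosen here only for illustration. Files that do not exist are skipped rather than raising an error.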

Run this command to execute our Rake setup:

1bundle exec rake setup

The above command should print something similar to this in the console:

Dropped database 'db/development.sqlite3'
Dropped database 'db/test.sqlite3'
Created database 'db/development.sqlite3'
Created database 'db/test.sqlite3'
== 20210104080645 CreateTasks: migrating ======================================
-- create_table(:tasks)
   -> 0.0014s
== 20210104080645 CreateTasks: migrated (0.0015s) =============================

== 20210106115906 MakeTitleNotNullable: migrating =============================
-- change_column_null(:tasks, :title, false)
   -> 0.0086s
== 20210106115906 MakeTitleNotNullable: migrated (0.0086s) ====================

== 20210108115051 CreateUser: migrating =======================================
-- create_table(:users)
   -> 0.0014s
== 20210108115051 CreateUser: migrated (0.0015s) ==============================
....
Seeding with sample data...
Done! Now you can login with either "oliver@example.com" or "sam@example.com", using password "welcome"

The Rails server should be restarted so that the latest data will be loaded.

Dealing with stale Active Record cache

Consider a scenario where a new column or table is created using a migration, and right after that the new column or table is populated with some seed data. When the code that adds the seed data runs, Rails can raise an exception saying that the new column or table doesn't exist. This is because Rails uses the schema from the Active Record cache, which does not get updated until all the migrations have finished running.

# Assumption: the version in brackets should match your Rails version.
class CreateJobLevels < ActiveRecord::Migration[6.1]
  def up
    # create_table adds the id primary key automatically.
    create_table :job_levels do |t|
      t.string :name

      t.timestamps
    end

    %w{assistant executive manager director}.each do |type|
      JobLevel.create(name: type)
    end
  end

  def down
    drop_table :job_levels
  end
end

In the above example, Rails can raise an exception when new records are created in the job_levels table, because when it tries to create the records it uses stale cached Active Record data that doesn't include the job_levels table.

This issue can also occur when migrations are run during a chained task. For example, when running something like ./bin/setup or rails db:migrate db:seed the schema is cached and sometimes the newly added columns or tables are not reflected in the cached schema.

According to the official Rails guide, we should invoke the reset_column_information method to reset all the cached information about columns, which will cause them to be reloaded on the next request.

reset_column_information is a class method on the model, so it is called as, for example, JobLevel.reset_column_information. You can invoke it in the up method of a migration that creates a new column or table and also adds data to it. This ensures that the latest schema is loaded before the data is added, so errors due to a stale Active Record cache are avoided.

The CreateJobLevels migration above can be fixed like so:

# Assumption: the version in brackets should match your Rails version.
class CreateJobLevels < ActiveRecord::Migration[6.1]
  def up
    # create_table adds the id primary key automatically.
    create_table :job_levels do |t|
      t.string :name

      t.timestamps
    end

    # Discard cached column data so the model sees the new table.
    JobLevel.reset_column_information

    %w{assistant executive manager director}.each do |type|
      JobLevel.create(name: type)
    end
  end

  def down
    drop_table :job_levels
  end
end
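To see why resetting the cached column information helps, here is a toy model in plain Ruby. It is not Active Record code: TOY_SCHEMA and ToyJobLevel are hypothetical names that only mimic the idea of memoized column information.

```ruby
# Toy stand-in for a database schema: table name => column names.
TOY_SCHEMA = { 'job_levels' => %w[id name] }

# Toy model that memoizes its column names on first use.
class ToyJobLevel
  def self.column_names
    # Cached after the first call, so later schema changes go unnoticed.
    @column_names ||= TOY_SCHEMA.fetch('job_levels').dup
  end

  def self.reset_column_information
    # Drop the cached copy so the next call reads the schema again.
    @column_names = nil
  end
end

ToyJobLevel.column_names             # caches the current columns
TOY_SCHEMA['job_levels'] << 'rank'   # a "migration" adds a column
p ToyJobLevel.column_names           # => ["id", "name"] (stale)
ToyJobLevel.reset_column_information
p ToyJobLevel.column_names           # => ["id", "name", "rank"]
```

The real method works on the same principle: it clears what the model has remembered about its table so that the next access reads the current schema.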

Note that the CreateJobLevels migration is shown only to illustrate the use of the reset_column_information method. Please do not add this migration to the Granite application.

The schema cache can also be cleared by running rails db:schema:cache:clear, or regenerated by running rails db:schema:cache:dump, from the terminal. But in a production environment it would not be feasible to do so, because you cannot pause the migrations to update the cache before the next request is processed. Hence, using the reset_column_information method is the right way to fix this issue.

We have successfully set up the Rake tasks for this project. Let's commit the changes:

git add -A
git commit -m "Added rake tasks"