---
title: "Caching result sets and collection in Rails 5"
description:
  "Rails 5 provides cache_key for ActiveRecord::Relation that can help in
  caching result of a collection of records."
canonical_url: "https://www.bigbinary.com/blog/activerecord-relation-cache-key"
markdown_url: "https://www.bigbinary.com/blog/activerecord-relation-cache-key.md"
---

# Caching result sets and collection in Rails 5

Rails 5 provides `cache_key` for `ActiveRecord::Relation`, which helps in
caching the result of a collection of records.

- Author: Mohit Natoo
- Published: February 2, 2016
- Categories: Rails 5, Rails

While developing a Rails application, you will often use one of these
[caching techniques](http://guides.rubyonrails.org/caching_with_rails.html) to
boost performance. In addition to these, Rails 5 provides a way of caching a
collection of records, thanks to the introduction of the following method:

```plaintext

ActiveRecord::Relation#cache_key

```

### What is collection caching?

Consider the following example where we are fetching a collection of all users
belonging to the city of Miami.

```ruby

@users = User.where(city: 'Miami')

```

Here `@users` is a collection of records and is an object of class
`ActiveRecord::Relation`.

Whether the above query returns the same result depends on the following
conditions.

- The query statement doesn't change. If we change the city name from "Miami" to
  "Boston", the result might change.
- No record is deleted. The count of records in the collection should stay the
  same.
- No record is added. The count of records in the collection should stay the
  same.

The Rails community
[implemented caching for a collection of records](https://github.com/rails/rails/pull/20884).
The `cache_key` method was added to `ActiveRecord::Relation`. It takes into
account several factors, including the query statement, the `updated_at` column
value and the count of records in the collection.

### Understanding ActiveRecord::Relation#cache_key

We have the object `@users` of class `ActiveRecord::Relation`. Now let's call
the `cache_key` method on it.

```ruby

 @users.cache_key
 => "users/query-67ed32b36805c4b1ec1948b4eef8d58f-3-20160116111659084027"

```

Let's try to understand each piece of the output.

**`users`** represents the kind of records we are holding. In this example we
have a collection of records of class `User`, hence `users`.

**`query-`** is a hardcoded value and is the same in all cases.

**`67ed32b36805c4b1ec1948b4eef8d58f`** is a digest of the query statement that
will be executed. In our example it is
`MD5("SELECT "users".* FROM "users" WHERE "users"."city" = 'Miami'")`.

**`3`** is the size of the collection.

**`20160116111659084027`** is the timestamp of the most recently updated record
in the collection. By default, the timestamp column considered is `updated_at`,
so the value is the most recent `updated_at` value in the collection.
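To make the layout concrete, here is a minimal plain-Ruby sketch that joins the four pieces described above. The helper `sketch_collection_key` is hypothetical and is not part of Rails. The SQL string is also only an approximation of what Rails digests, so the digest it prints is not expected to match the one shown earlier.

```ruby
require "digest/md5"

# Hypothetical helper, not a Rails API: joins the pieces described above.
def sketch_collection_key(table, sql, size, latest_updated_at)
  digest = Digest::MD5.hexdigest(sql)
  # 14 date/time digits followed by 6 microsecond digits
  timestamp = latest_updated_at.utc.strftime("%Y%m%d%H%M%S%6N")
  "#{table}/query-#{digest}-#{size}-#{timestamp}"
end

# Assumed SQL text, used only to have something to digest.
sql = %(SELECT "users".* FROM "users" WHERE "users"."city" = 'Miami')
latest = Time.utc(2016, 1, 16, 11, 16, 59, 84_027)

key = sketch_collection_key("users", sql, 3, latest)
puts key
```

Changing the SQL, the size, or the latest timestamp changes the resulting string, which is what makes it usable as a cache key.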

## Using ActiveRecord::Relation#cache_key

Let's see how to use `cache_key` to actually cache data.

In our Rails application, if we want to cache records of users belonging to
"Miami", we can take the following approach.

```ruby

# app/controllers/users_controller.rb

class UsersController < ApplicationController

  def index
    @users = User.where(city: 'Miami')
  end
end

# app/views/users/index.html.erb

<% cache(@users) do %>
  <% @users.each do |user| %>
    <p> <%= user.city %> </p>
  <% end %>
<% end %>

# 1st Hit
Processing by UsersController#index as HTML
  Rendering users/index.html.erb within layouts/application
   (0.2ms)  SELECT COUNT(*) AS "size", MAX("users"."updated_at") AS timestamp FROM "users" WHERE "users"."city" = ?  [["city", "Miami"]]
Read fragment views/users/query-37a3d8c65b3f0f9ece7f66edcdcb10ab-4-20160704131424063322/30033e62b28c83f26351dc4ccd6c8451 (0.0ms)
  User Load (0.1ms)  SELECT "users".* FROM "users" WHERE "users"."city" = ?  [["city", "Miami"]]
Write fragment views/users/query-37a3d8c65b3f0f9ece7f66edcdcb10ab-4-20160704131424063322/30033e62b28c83f26351dc4ccd6c8451 (0.0ms)
Rendered users/index.html.erb within layouts/application (3.7ms)

# 2nd Hit
Processing by UsersController#index as HTML
  Rendering users/index.html.erb within layouts/application
   (0.2ms)  SELECT COUNT(*) AS "size", MAX("users"."updated_at") AS timestamp FROM "users" WHERE "users"."city" = ?  [["city", "Miami"]]
Read fragment views/users/query-37a3d8c65b3f0f9ece7f66edcdcb10ab-4-20160704131424063322/30033e62b28c83f26351dc4ccd6c8451 (0.0ms)
  Rendered users/index.html.erb within layouts/application (3.0ms)

```

From the log above, we can see that on the first hit a `COUNT` query is fired to
get the `size` of the users collection and its latest `updated_at`.

Rails then writes a new cache entry under a `cache_key` built from the result of
that `COUNT` query.

On the second hit, Rails fires the `COUNT` query again and checks whether an
entry for the resulting cache key exists.

If the entry is found, Rails renders the cached fragment without firing the
`SELECT` query that loads the users.
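The read-then-write flow seen in the log can be sketched in plain Ruby. The class `SketchFragmentCache` below is a hypothetical in-memory stand-in for the Rails fragment cache, and the second key uses a made-up timestamp to imitate a record being updated.

```ruby
# Hypothetical in-memory stand-in for the fragment cache, not a Rails class.
class SketchFragmentCache
  attr_reader :renders

  def initialize
    @entries = {}
    @renders = 0
  end

  # Returns the stored fragment for the key, or runs the block once and stores it.
  def fetch(key)
    return @entries[key] if @entries.key?(key)

    @renders += 1
    @entries[key] = yield
  end
end

cache = SketchFragmentCache.new
key_v1 = "users/query-37a3d8c65b3f0f9ece7f66edcdcb10ab-4-20160704131424063322"

first  = cache.fetch(key_v1) { "<p> Miami </p>" * 4 } # 1st hit: miss, block runs
second = cache.fetch(key_v1) { "<p> Miami </p>" * 4 } # 2nd hit: served from the store

# A record is updated, so the timestamp part of the key changes (made-up value).
key_v2 = "users/query-37a3d8c65b3f0f9ece7f66edcdcb10ab-4-20160705090000000000"
third = cache.fetch(key_v2) { "<p> Miami </p>" * 4 }  # new key: miss again

puts cache.renders
```

Old entries are never overwritten in this sketch. A new key simply points at a new entry, which is why the key itself has to reflect every change that matters.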

### What if your table doesn't have an updated_at column?

As mentioned earlier, `cache_key` uses the `updated_at` column by default. It
also accepts a custom timestamp column as an argument, in which case the highest
value of that column among the records in the collection is used.

For example, if your business logic treats a column named `last_bought_at` in
the `products` table as the factor that decides caching, you can use the
following code.

```ruby

 products = Product.where(category: 'cars')
 products.cache_key(:last_bought_at)
 => "products/query-211ae6b96ec456b8d7a24ad5fa2f8ad4-4-20160118080134697603"

```
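The "highest value of that column" rule can be illustrated without a database. In this sketch `SketchProduct` and `latest_stamp` are hypothetical names and the rows are invented for illustration. Only the selection of the maximum and the timestamp formatting are shown.

```ruby
# Hypothetical stand-in for rows of the products table.
SketchProduct = Struct.new(:name, :last_bought_at)

# Picks the highest value of the given column and formats it like the key suffix.
def latest_stamp(records, column)
  latest = records.map { |record| record.public_send(column) }.max
  latest.utc.strftime("%Y%m%d%H%M%S%6N")
end

# Invented data: four cars with different purchase times.
cars = [
  SketchProduct.new("hatchback", Time.utc(2016, 1, 15, 12, 0, 0)),
  SketchProduct.new("sedan",     Time.utc(2016, 1, 17, 9, 30, 0)),
  SketchProduct.new("coupe",     Time.utc(2016, 1, 18, 8, 1, 34, 697_603)),
  SketchProduct.new("wagon",     Time.utc(2016, 1, 16, 18, 45, 0))
]

stamp = latest_stamp(cars, :last_bought_at)
puts stamp # 20160118080134697603
```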

### Edge cases to watch out for

Before you start using `cache_key`, there are some edge cases to watch out for.

Suppose you have an application with 5 entries in the `users` table whose `city`
is Miami.

#### Using limit puts incorrect size in cache key if collection is not loaded

If you want to fetch three users belonging to the city "Miami", you would
execute the following query.

```ruby

 users = User.where(city: 'Miami').limit(3)
 users.cache_key
 => "users/query-67ed32b36805c4b1ec1948b4eef8d58f-3-20160116144936949365"

```

Here the records in `users` have been loaded and there are only three of them,
hence the `cache_key` has 3 as the size of the collection.

Now let's try to execute the same query without fetching the records first.

```ruby

 User.where(city: 'Miami').limit(3).cache_key
 => "users/query-8dc512b1408302d7a51cf1177e478463-5-20160116144936949365"

```

You can see that the count in the cache key is 5 this time, even though we have
set a limit of 3. This is because the implementation of
`ActiveRecord::Base.collection_cache_key`
[executes the query without the limit](https://github.com/rails/rails/blob/39f383bad01e52c217c9007b5e9d3b239fe6a808/activerecord/lib/active_record/collection_cache_key.rb#L16)
to fetch the size of the collection.

#### Cache key doesn't change when an existing record from a collection is replaced

Suppose we want three users in descending order of id.

```ruby

 users1 = User.where(city: 'Miami').order('id desc').limit(3)
 users1.cache_key
 => "users/query-57ee9977bb0b04c84711702600aaa24b-3-20160116144936949365"

```

The above statement gives us users with ids `[5, 4, 3]`.

Now let's remove the user with id = 3.

```ruby

 User.find(3).destroy

 users2 = User.where(city: 'Miami').order('id desc').limit(3)
 users2.cache_key
 => "users/query-57ee9977bb0b04c84711702600aaa24b-3-20160116144936949365"

```

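Note that the `cache_key` of both `users1` and `users2` is exactly the same,
even though `users2` now holds a different set of records (the user with id 3
has been replaced by the next matching user). This is because none of the
parameters that affect the cache key has changed: neither the number of records,
nor the query statement, nor the timestamp of the latest record.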

There is [an ongoing discussion](https://github.com/rails/rails/pull/21503)
about adding the ids of the collection records to the cache key. This might help
solve the problems discussed above.
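The following plain-Ruby sketch shows why the two collections are indistinguishable, and how folding the ids into the key would separate them. `SketchUser` and both helper methods are hypothetical, and the id-based variant only illustrates the idea under discussion. It is not an existing Rails API. The timestamps are invented so that the newest record is present in both collections.

```ruby
require "digest/md5"

# Hypothetical stand-in for user rows.
SketchUser = Struct.new(:id, :updated_at)

older  = Time.utc(2016, 1, 16, 10, 0, 0)
newest = Time.utc(2016, 1, 16, 14, 49, 36, 949_365)

before_destroy = [SketchUser.new(5, newest), SketchUser.new(4, older), SketchUser.new(3, older)]
after_destroy  = [SketchUser.new(5, newest), SketchUser.new(4, older), SketchUser.new(2, older)]

# The record-dependent parts of the key described above: size and latest timestamp.
def size_and_stamp(records)
  [records.size, records.map(&:updated_at).max.strftime("%Y%m%d%H%M%S%6N")]
end

# Hypothetical variant that also digests the ids of the records.
def size_stamp_and_ids(records)
  size_and_stamp(records) + [Digest::MD5.hexdigest(records.map(&:id).join("-"))]
end

puts size_and_stamp(before_destroy) == size_and_stamp(after_destroy)         # true
puts size_stamp_and_ids(before_destroy) == size_stamp_and_ids(after_destroy) # false
```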

#### Using group query gives incorrect size in the cache key

Just like the `limit` case discussed above, `cache_key` behaves differently
depending on whether the data is loaded in memory.

Let's say that we have two users with `first_name` "Sam".

First, let's see the case where the collection is not loaded in memory.

```ruby

 User.select(:first_name).group(:first_name).cache_key
 => "users/query-92270644d1ec90f5962523ed8dd7a795-1-20160118080134697603"

```

In the above case, the size in `cache_key` is 1. For the system mentioned above,
the size you get will be either 1 or 5. That is, it is the size of an arbitrary
group.

Now let's see what happens when the collection is loaded first.

```ruby

 users = User.select(:first_name).group(:first_name)
 users.cache_key
 => "users/query-92270644d1ec90f5962523ed8dd7a795-2-20160118080134697603"

```

In the above case, the size in `cache_key` is 2. The count in the cache key
differs from the unloaded case, even though the query output is exactly the same
in both cases.

When the collection is loaded, the size is equal to the total number of groups.
So irrespective of which records are in each group, we may end up with the same
cache key value.
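The difference between the two sizes can be modelled in a few lines of plain Ruby. The names below are invented data rather than the rows used in the examples above, and the sketch only mirrors the behaviour described in this section: one count per group when the collection is not loaded, and the number of groups when it is.

```ruby
# Invented data: three users, two of them named "Sam".
first_names = %w[Sam Sam Alex]

# A grouped count produces one number per group.
group_sizes = first_names.group_by(&:itself).transform_values(&:size)

# Unloaded case as described above: the key ends up with the size of one group.
unloaded_candidates = group_sizes.values

# Loaded case as described above: the key uses the number of groups.
loaded_size = group_sizes.size

puts unloaded_candidates.inspect # [2, 1]
puts loaded_size                 # 2
```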

