Ruby 2.6 adds String#split with block

July 17, 2018

This blog is part of our Ruby 2.6 series.

Before Ruby 2.6, String#split returned array of split strings.

In Ruby 2.6, a block can be passed to String#split (Link is not available) which yields each split string and operates on it. This avoids creating an array and thus is memory efficient.

We will add method is_fruit? to understand how to use split with a block.

def is_fruit?(value)
%w(apple mango banana watermelon grapes guava lychee).include?(value)
end

Input is a comma separated string with vegetables and fruits names. Goal is to fetch names of fruits from input string and store it in an array.

String#split

input_str = "apple, mango, potato, banana, cabbage, watermelon, grapes"

splitted_values = input_str.split(", ")
=> ["apple", "mango", "potato", "banana", "cabbage", "watermelon", "grapes"]

fruits = splitted_values.select { |value| is_fruit?(value) }
=> ["apple", "mango", "banana", "watermelon", "grapes"]

Using split an intermediate array is created which contains both fruits and vegetables names.

String#split with a block

fruits = []

input_str = "apple, mango, potato, banana, cabbage, watermelon, grapes"

input_str.split(", ") { |value| fruits << value if is_fruit?(value) }
=> "apple, mango, potato, banana, cabbage, watermelon, grapes"

fruits
=> ["apple", "mango", "banana", "watermelon", "grapes"]

When a block is passed to split, it returns the string on which split was called and does not create an array. String#split yields block on each split string, which in our case was to push fruit names in a separate array.

Update

Benchmark

We created a large random string to benchmark performance of split and split with block

require 'securerandom'

test_string = ''

100_000.times.each do
test_string += SecureRandom.alphanumeric(10)
test_string += ' '
end

require 'benchmark'

Benchmark.bmbm do |bench|

bench.report('split') do
arr = test_string.split(' ')
str_starts_with_a = arr.select { |str| str.start_with?('a') }
end

bench.report('split with block') do
str_starts_with_a = []
test_string.split(' ') { |str| str_starts_with_a << str if str.start_with?('a') }
end

end

Results

Rehearsal ----------------------------------------------------
split              0.023764   0.000911   0.024675 (  0.024686)
split with block   0.012892   0.000553   0.013445 (  0.013486)
------------------------------------------- total: 0.038120sec

                       user     system      total        real
split              0.024107   0.000487   0.024594 (  0.024622)
split with block   0.010613   0.000334   0.010947 (  0.010991)

We did another iteration of benchmarking using benchmark/ips.

require 'benchmark/ips'
Benchmark.ips do |bench|

bench.report('split') do
splitted_arr = test_string.split(' ')
str_starts_with_a = splitted_arr.select { |str| str.start_with?('a') }
end

bench.report('split with block') do
str_starts_with_a = []
test_string.split(' ') { |str| str_starts_with_a << str if str.start_with?('a') }
end

bench.compare!
end

Results

Warming up --------------------------------------
               split     4.000  i/100ms
    split with block    10.000  i/100ms
Calculating -------------------------------------
               split     46.906  (± 2.1%) i/s -    236.000  in   5.033343s
    split with block    107.301  (± 1.9%) i/s -    540.000  in   5.033614s

Comparison:
    split with block:      107.3 i/s
               split:       46.9 i/s - 2.29x  slower

This benchmark shows that split with block is about 2 times faster than split.

Here is relevant commit and discussion for this change.

The Chinese version of this blog is available here.

If this blog was helpful, check out our full blog archive.

Ruby 2.6 adds String#split with block

String#split

String#split with a block

Update

Benchmark

Stay up to date with our blogs.