Ruby String Methods You're Probably Underusing — format, scan, match, and gsub with Blocks
String manipulation is an area where developers form habits early and rarely revisit them. split, include?, gsub, strip: these cover a lot of ground. But Ruby's String class has methods that go further and solve common problems more cleanly than the usual choices. scan for finding all matches, match for capturing groups, format for templated output, and gsub with a block or hash for transformation are not obscure; they are just underused.
format and % — Precise String Interpolation
String interpolation with #{} is fine for simple cases. format (aliased as sprintf and callable with %) gives you precise control over number formatting, padding, and alignment without a gem.
Example:
# Basic format specifiers
format("Hello, %s!", "Ada") # => "Hello, Ada!"
format("Pi is %.2f", 3.14159) # => "Pi is 3.14"
format("Hex: %x", 255) # => "Hex: ff"
format("Padded: %10s", "hello") # => "Padded:      hello"
format("Left: %-10s|", "hello") # => "Left: hello     |"
format("Zero: %05d", 42) # => "Zero: 00042"
Example:
# Named references (Ruby 1.9.2+)
format("Name: %{name}, Age: %{age}", name: "Ada", age: 36)
# => "Name: Ada, Age: 36"
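# Currency formatting (format has no thousands-separator flag, so gsub adds the commas)
prices = [9.9, 149.5, 1299.0]
prices.map { |price| format("$%.2f", price).gsub(/(\d)(?=(\d{3})+\.)/, '\1,') }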
# => ["$9.90", "$149.50", "$1,299.00"]
# Report column alignment
data = [["Alice", 98], ["Bob", 72], ["Charlie", 85]]
data.each do |name, score|
  puts format("%-10s %3d", name, score)
end
# Alice       98
# Bob         72
# Charlie     85
The % operator is shorthand: "%-10s %3d" % [name, score] is equivalent to format("%-10s %3d", name, score).
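As a small sketch of the shorthand (the names and scores are made up for illustration), the % operator takes an array for positional arguments and a hash for named ones. The %<name>s form, unlike %{name}, also accepts a width or precision after the name:

```ruby
# Positional arguments: pass an array on the right-hand side
row  = "%-10s %3d" % ["Alice", 98]
same = format("%-10s %3d", "Alice", 98)
puts row == same # prints true

# Named arguments: pass a hash. %<name>... is followed by a normal format spec
summary = "%<name>-8s|%<score>6.1f" % { name: "Ada", score: 91.27 }
puts summary # prints "Ada     |  91.3"
```

The explicit format call tends to read better once there are more than two or three arguments, since the array brackets are easy to forget with %.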
scan — Extract All Matches From a String
scan finds every occurrence of a pattern in a string and returns them as an array. It’s the method you want when match gives you only the first result and you need all of them.
Example:
text = "Prices: $12.50, $8.99, and $249.00"
# Extract all prices
text.scan(/\$[\d.]+/)
# => ["$12.50", "$8.99", "$249.00"]
# Extract capture groups
text.scan(/\$(\d+)\.(\d+)/)
# => [["12", "50"], ["8", "99"], ["249", "00"]]
# Each match becomes an array of capture groups
Example:
# Find all words starting with a capital letter
"Ruby is Fast. Rails is Great. Let's build.".scan(/\b[A-Z]\w*/)
# => ["Ruby", "Fast", "Rails", "Great", "Let"]
# Extract all emails from text
text = "Contact support@example.com or sales@example.com for help" # placeholder addresses
text.scan(/\b[\w.+-]+@[\w-]+\.[a-z]{2,}\b/)
# => ["support@example.com", "sales@example.com"]
# With a block — process each match as it's found
"hello world ruby".scan(/\b\w{5}\b/) { |m| puts m.upcase }
# HELLO
# WORLD
scan with capture groups returns an array of arrays — each inner array contains the captured groups from one match. Without capture groups, it returns an array of match strings.
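The difference between the two return shapes can be sketched as follows, reusing the price string from the earlier example. A non-capturing group, (?:...), lets you group part of a pattern without switching scan into array-of-arrays mode:

```ruby
text = "Prices: $12.50, $8.99, and $249.00"

# A non-capturing group (?:...) keeps scan returning whole matches
whole = text.scan(/\$\d+(?:\.\d+)?/)
# whole => ["$12.50", "$8.99", "$249.00"]

# Capture groups give one inner array per match, which destructures in a block
cents = text.scan(/\$(\d+)\.(\d+)/).map do |dollars, pennies|
  dollars.to_i * 100 + pennies.to_i
end
# cents => [1250, 899, 24900]
```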
match and match? — Named Captures and Existence Checks
match returns a MatchData object for the first match, giving you access to captured groups by index or name. match? returns a boolean without allocating a MatchData object — use it when you only need to know whether a match exists.
Example:
# Named captures — more readable than numbered groups
pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/
m = "Order placed on 2024-03-15.".match(pattern)
m[:year] # => "2024"
m[:month] # => "03"
m[:day] # => "15"
m[0] # => "2024-03-15" (full match)
Example:
# match? is faster because no MatchData object is allocated
"ada@example.com".match?(/\A[\w.+-]+@[\w-]+\.[a-z]+\z/) # => true (placeholder address)
# The =~ operator also matches (Ruby global $~, $1, $2... set as side effect)
"hello world" =~ /(\w+)\s(\w+)/
$1 # => "hello"
$2 # => "world"
# Prefer named captures over $1/$2 — they survive refactoring
Example:
# Conditional assignment guards against nil when nothing matches
if (m = "Ada Lovelace, 1815".match(/(?<name>[\w ]+), (?<year>\d+)/))
  name = m[:name]
  year = m[:year].to_i
  puts "#{name} was born in #{year}"
end
# Ada Lovelace was born in 1815
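Because match returns nil when nothing matches, calling m[:year] on a failed match raises NoMethodError. A minimal sketch of one way to handle that is below; parse_date is a hypothetical helper name, not part of Ruby. It also uses MatchData#named_captures, which returns the named groups as a hash with string keys:

```ruby
# Hypothetical helper: returns a hash of integers, or nil if no date is found
def parse_date(text)
  m = text.match(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/)
  return nil unless m

  # named_captures => {"year"=>"2024", "month"=>"03", "day"=>"15"}
  m.named_captures.transform_values(&:to_i)
end

found   = parse_date("Order placed on 2024-03-15.")
missing = parse_date("no date here")
# found   => {"year"=>2024, "month"=>3, "day"=>15}
# missing => nil
```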
gsub With a Block or Hash
gsub with a string replacement is well-known. gsub with a block or a hash is far more powerful — the block receives each match and can return any string, and a hash maps specific matches to replacements.
Example:
# Block form — transform each match
"hello world ruby".gsub(/\b\w+/) { |word| word.capitalize }
# => "Hello World Ruby"
# Access capture groups inside the block via $~
"2024-03-15".gsub(/(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/) do
  "#{$~[:day]}/#{$~[:month]}/#{$~[:year]}"
end
# => "15/03/2024"
Example:
# Hash form — map specific matches to replacements
"I love ruby and rails".gsub(/ruby|rails/, "ruby" => "Ruby", "rails" => "Rails")
# => "I love Ruby and Rails"
# Useful for HTML entity escaping
text = "5 < 10 & 'hello'"
text.gsub(/[<>&'"]/, "<" => "&lt;", ">" => "&gt;", "&" => "&amp;",
  "'" => "&#39;", '"' => "&quot;")
# => "5 &lt; 10 &amp; &#39;hello&#39;"
Example:
# Interpolation within gsub block — powerful for template processing
template = "Hello {{name}}, your order {{order_id}} is ready"
data = { "name" => "Ada", "order_id" => "ORD-12345" }
template.gsub(/\{\{(\w+)\}\}/) { data[$1] || $& }
# => "Hello Ada, your order ORD-12345 is ready"
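One detail of the hash form is worth a short sketch: when the pattern matches text that has no key in the hash, the match is replaced with an empty string rather than left alone. The word list below is made up for illustration. The block form with Hash#fetch and a fallback is one way to keep unknown matches intact:

```ruby
names = { "ruby" => "Ruby", "rails" => "Rails" }

# "rake" matches the pattern but has no key, so it is replaced with ""
lossy = "ruby rails rake".gsub(/ruby|rails|rake/, names)
# lossy => "Ruby Rails "

# Block form with a fallback keeps unknown matches as they were
safe = "ruby rails rake".gsub(/ruby|rails|rake/) { |word| names.fetch(word, word) }
# safe => "Ruby Rails rake"
```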
chars, bytes, and each_char
For character-level manipulation, Ruby provides several clean interfaces.
Example:
str = "Ruby"
str.chars # => ["R", "u", "b", "y"]
str.bytes # => [82, 117, 98, 121]
str.each_char.map { |c| c.ord } # => [82, 117, 98, 121]
# Counting specific characters
"hello world".chars.count { |c| "aeiou".include?(c) } # => 3 (vowels)
# Caesar cipher using chars
def caesar(str, shift = 3)
  str.chars.map do |c|
    c =~ /[a-z]/ ? ((c.ord - 97 + shift) % 26 + 97).chr : c
  end.join
end
caesar("hello") # => "khoor"
caesar("khoor", -3) # => "hello"
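For ASCII text, chars and bytes line up one to one, which is why the examples above give matching numbers. A short sketch with a non-ASCII character shows where they diverge, assuming the default UTF-8 encoding:

```ruby
word = "caf\u00e9" # "café", with é written as an escape so the encoding is unambiguous

char_count = word.chars.length # => 4
byte_count = word.bytes.length # => 5, because é takes two bytes in UTF-8
codepoints = word.each_char.map(&:ord)
# codepoints => [99, 97, 102, 233]
```

Use chars or each_char when the logic is about what a reader sees, and bytes when it is about storage or wire size.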
Pro-Tip: When extracting structured data from strings (log lines, CSV-like formats, configuration output), scan with named capture groups is often cleaner than splitting on delimiters. line.scan(/(?<key>\w+)=(?<value>[^,\s]+)/) extracts all key-value pairs from a string in one pass, handles variable field counts naturally, and stays readable when the format changes. Note that scan still returns positional arrays ([key, value] pairs here); the names document the pattern rather than label the result. The alternative of splitting, zipping, and transforming requires multiple steps and breaks on edge cases that the regex handles inline.
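A minimal sketch of that tip, using a made-up log line, looks like this. Since each match comes back as a two-element array, to_h turns the result directly into a hash:

```ruby
line = "level=warn, user=ada, latency_ms=42" # illustrative input

pairs = line.scan(/(?<key>\w+)=(?<value>[^,\s]+)/)
# pairs => [["level", "warn"], ["user", "ada"], ["latency_ms", "42"]]

fields = pairs.to_h
# fields => {"level"=>"warn", "user"=>"ada", "latency_ms"=>"42"}
```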
Conclusion
Ruby’s string toolkit extends well past the methods that get used automatically. format for controlled number and text formatting, scan for extracting all pattern matches, named captures in match for readable group access, and gsub with blocks or hashes for context-aware transformations — each of these replaces a pattern that developers often implement manually with loops and conditionals. Getting comfortable with the full string interface means writing cleaner, shorter code for text manipulation problems that come up constantly in production applications.
FAQs
Q1: What’s the difference between match and =~?
match returns a MatchData object (or nil). =~ returns the character position of the match (or nil) and sets global variables $~, $1, $2, etc. as side effects. match is generally preferred in modern Ruby — it’s more object-oriented and doesn’t rely on global state. Use match? when you only need a boolean.
Q2: When should I use scan vs scan with a block?
Use scan without a block to collect all matches into an array for further processing. Use scan with a block when you want to process each match immediately without building an intermediate array — useful for large strings where memory allocation matters.
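As a rough illustration of the block form, the sketch below records each match together with its offset. Inside the block, Regexp.last_match (the same object as $~) refers to the current match, and scan itself returns the original string rather than an array:

```ruby
text = "a1 b22 c333"
positions = []

returned = text.scan(/\d+/) do |digits|
  # Regexp.last_match holds the MatchData for the match being yielded
  positions << [digits, Regexp.last_match.begin(0)]
end

# positions => [["1", 1], ["22", 4], ["333", 8]]
# returned is the receiver itself, not an array of matches
```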
Q3: Does format handle localization (thousands separators, decimal points)?
format follows C printf conventions, which do not vary by locale in Ruby, and it has no flag for thousands separators. For locale-aware number formatting (European 1.234,56 style), use helpers such as number_to_currency or number_with_delimiter in Rails, which read their separators from i18n locale data.
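Outside Rails, a small helper can cover the simple cases. The following is only a sketch: format_number and its keyword arguments are hypothetical names, and it handles grouping by threes only, not full locale rules:

```ruby
# Hypothetical helper: fixed-precision number with configurable separators
def format_number(value, delimiter: ",", separator: ".", precision: 2)
  whole, fraction = format("%.#{precision}f", value).split(".")
  # Insert the delimiter before every group of three digits counted from the right
  whole = whole.gsub(/(\d)(?=(\d{3})+\z)/, "\\1#{delimiter}")
  [whole, fraction].compact.join(separator)
end

us_style = format_number(1299.0)
eu_style = format_number(1234.56, delimiter: ".", separator: ",")
# us_style => "1,299.00"
# eu_style => "1.234,56"
```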
Q4: Can gsub with a hash handle overlapping matches?
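gsub processes the string left to right and doesn't re-process replaced content. The regex alone decides what matches: at each position the first alternative that matches wins, and the hash is then looked up with the matched text. A match with no corresponding key is replaced with an empty string. For complex replacement rules with ordering requirements, the block form gives more control.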
Q5: Is there a performance consideration for scan on very large strings?
scan allocates an array for all matches. For very large input, string.each_line.flat_map { |line| line.scan(pattern) } keeps each scan call small but still collects every match into one array. To avoid that, use the block form, which yields each match without building the full array in memory.
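A sketch of the streaming approach, with a made-up log string standing in for a large input, combines each_line with the block form so that only a running tally is kept:

```ruby
log = "ok\nERROR disk full\nok\nERROR timeout\nERROR disk full\n" # illustrative data
counts = Hash.new(0)

log.each_line do |line|
  # With a capture group, the block receives an array of groups per match
  line.scan(/ERROR (\w+)/) { |(kind)| counts[kind] += 1 }
end

# counts => {"disk"=>2, "timeout"=>1}
```

This only works when no match spans a line break; patterns that cross lines need the whole string.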