The DATA Constant and __END__ Marker
Ruby has a little-known feature inherited from Perl: the ability to embed arbitrary data at the end of a source file. A line containing only __END__ stops the parser, and everything after that line becomes accessible through the DATA constant.
Basic Usage
puts DATA.read
__END__
This text appears after the code ends.
You can put anything here:
- Configuration data
- Sample inputs
- ASCII art
- Whatever you want!
The DATA constant is a File object opened on the script itself and positioned at the first byte after the __END__ line. You can use the standard read methods on it: read, readline, each_line, etc.
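Because DATA is an ordinary stream, code that consumes it can be written against any IO-like object. The following is a minimal sketch under that assumption: `read_titled_list` is a hypothetical helper, and a StringIO stands in for DATA so the snippet runs without an __END__ section.

```ruby
require 'stringio'

# Hypothetical helper: treats the first line as a title and the rest as items.
def read_titled_list(io)
  title = io.readline.chomp          # consumes only the first line
  items = io.each_line.map(&:chomp)  # continues from the current position
  { title: title, items: items }
end

# In a real script you would pass DATA; StringIO is a stand-in here.
stand_in = StringIO.new("Groceries\nmilk\neggs\n")
p read_titled_list(stand_in)
```

Each read advances the stream position, which is why readline followed by each_line never sees the first line twice.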
Why Would You Use This?
Embedding Test Data
Single-file scripts can include their test data inline:
def parse_csv(data)
  data.lines.map { |line| line.split(',').map(&:strip) }
end
if __FILE__ == $0
  result = parse_csv(DATA.read)
  puts result.inspect
end
__END__
name,age,city
Alice,30,Brooklyn
Bob,25,Queens
Charlie,35,Manhattan
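If the embedded data has a header row, the parsed rows can be turned into hashes for easier access. This is only a sketch: `rows_as_hashes` is a hypothetical helper built on the `parse_csv` method above, and the naive comma split does not handle quoted fields. A string stands in for `DATA.read`.

```ruby
# Same naive parser as in the example above.
def parse_csv(data)
  data.lines.map { |line| line.split(',').map(&:strip) }
end

# Hypothetical helper: pairs each data row with the header row.
def rows_as_hashes(data)
  header, *rows = parse_csv(data)
  rows.map { |row| header.zip(row).to_h }
end

# In a real script this string would come from DATA.read.
sample = "name,age,city\nAlice,30,Brooklyn\nBob,25,Queens\n"
rows_as_hashes(sample).each do |row|
  puts "#{row['name']} lives in #{row['city']}"
end
```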
Self-Contained Scripts
Distributing a single-file tool becomes easier when configuration or assets are embedded:
require 'yaml'
config = YAML.load(DATA)
puts "Connecting to #{config['host']}"
__END__
host: example.com
port: 5432
timeout: 30
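A slightly more defensive variant might parse the embedded YAML with `YAML.safe_load` and merge it over defaults. In this sketch, `default_config` and `load_config` are hypothetical names and the default values are made up for illustration; a StringIO stands in for DATA.

```ruby
require 'yaml'
require 'stringio'

# Hypothetical defaults, not taken from the article.
def default_config
  { 'port' => 5432, 'timeout' => 10 }
end

# Reads YAML from any IO-like object (DATA in a real script) and fills in defaults.
def load_config(io)
  parsed = YAML.safe_load(io.read) || {} # empty input parses to nil or false
  default_config.merge(parsed)
end

# A StringIO stands in for DATA so the sketch runs without an __END__ section.
stand_in = StringIO.new("host: example.com\ntimeout: 30\n")
config = load_config(stand_in)
puts "Connecting to #{config['host']}:#{config['port']} (timeout #{config['timeout']}s)"
```

Values from the embedded section win over the defaults, so the section only needs to list what differs.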
Documentation
You can embed long help text or documentation without cluttering the code:
#!/usr/bin/env ruby
def show_help
  puts DATA.read
  exit
end
show_help if ARGV.include?('--help')
# ... actual program logic ...
__END__
MyScript v1.0
Usage:
myscript [options] file
Options:
--help Show this help
--verbose Enable verbose output
Examples:
myscript input.txt
myscript --verbose data.csv
Important Details
One Per File
Only the first line consisting of exactly __END__ counts. Everything after it is DATA, including any later __END__ lines, so there is no built-in way to have multiple sections.
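Multiple sections can still be approximated by convention: pick a delimiter and split the stream yourself. The sketch below assumes a made-up rule that a line of the form `@@ name` starts a section (similar in spirit to the inline-template markers Sinatra uses). `parse_sections` is a hypothetical helper, and a StringIO stands in for DATA.

```ruby
require 'stringio'

# Assumed convention: a line of the form "@@ name" starts a new section.
# Lines before the first marker are ignored.
def parse_sections(io)
  sections = {}
  current = nil
  io.each_line do |line|
    if (match = line.match(/\A@@\s*(\S+)\s*\z/))
      current = match[1]
      sections[current] = +""   # unfrozen empty string to append to
    elsif current
      sections[current] << line
    end
  end
  sections
end

# In a real script you would call parse_sections(DATA).
stand_in = StringIO.new("@@ greeting\nHello!\n@@ farewell\nGoodbye.\nSee you soon.\n")
sections = parse_sections(stand_in)
puts sections['farewell']
```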
Only in the Main File
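The DATA constant is defined only if __END__ appears in the main executed file, the one passed to the ruby command. In a required library, __END__ still ends the code, but the text after it is not exposed. Because DATA is a top-level constant, any reference to it from library code sees the main file's data: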
# main.rb
require './library'
puts DATA.read # Works if main.rb has __END__
__END__
Main file data
# library.rb
# This file's own section is never exposed through DATA, even though we have:
__END__
Library data (not accessible)
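A library that wants its own trailing section can read its source file directly and skip past the marker. This is a sketch of that workaround, not a built-in facility: `read_end_section` is a hypothetical helper, and the temporary file exists only to make the demo runnable. In a real library you would call `read_end_section(__FILE__)`.

```ruby
require 'tempfile'

# Hypothetical helper: returns the text after the first line that is exactly
# __END__, or nil if the file has no such line.
def read_end_section(path)
  lines = File.readlines(path)
  marker = lines.index { |line| line.chomp == '__END__' }
  marker && lines[(marker + 1)..].join
end

# Demo: write a throwaway "library" file and read its trailing section.
Tempfile.create(['library', '.rb']) do |file|
  file.write("VALUE = 1\n__END__\nLibrary data\n")
  file.flush
  puts read_end_section(file.path)
end
```

Reading the whole file is wasteful for very large sources, but it keeps the sketch short and avoids depending on DATA at all.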
It’s a Real File Object
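Since DATA is a file object, you can seek within it and read it multiple times. One catch: DATA is the handle Ruby used to read the script itself, so DATA.rewind jumps to the top of the source file, not to the start of the data. Save the starting offset and seek back to it instead:
data_start = DATA.pos   # offset of the first byte after __END__
section1 = DATA.readline
DATA.seek(data_start)   # DATA.rewind would return to line 1 of the script
all = DATA.read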
__END__
First line
Second line
Third line
Comparison with Heredocs
Why use __END__ instead of heredocs?
# Heredoc
text = <<~TEXT
Some long text
Multiple lines
Embedded in code
TEXT
# vs __END__
text = DATA.read
__END__
Some long text
Multiple lines
Separated from code
Heredocs are better for small snippets. __END__ is better when:
- Data is large enough to distract from code logic
- You want syntax highlighting to stop (some editors dim post-__END__ content)
- The data is conceptually an attachment rather than part of the program flow
Real-World Usage
This feature appears in several contexts:
- Single-file gems: Embedding gem metadata or documentation
- Installation scripts: Including configuration templates
- Code golf: Minimizing file count in challenges
- Teaching examples: Keeping example data with code
Modern Alternatives
For most use cases, modern Ruby developers prefer:
- External data files (more flexible)
- Embedded resources via constants or methods
- Configuration gems (better structure)
But __END__ remains useful for truly self-contained scripts where distributing a single file is important.
Try It Yourself
The __END__ marker is one of those features you might never need, but when you do need it, nothing else works quite as well. It’s a reminder that Ruby values practicality—even when that means including a quirky Perl-ism that makes certain problems trivial to solve.
Next time you’re writing a throwaway script that needs some test data, give __END__ a try. You might find it’s exactly the right tool for the job.