each_line

A text-processing command-line tool driven by Ruby's #each_line.

Why this project?

I've been juggling various command-line tools for a number of years. Tools like xargs, grep, awk, and sed can certainly get the job done, but I found myself constantly going back to the documentation for some of them. Whenever I needed to do anything more sophisticated than basic processing, I found it more productive to pipe the text into ruby instead. I use Ruby a lot, so I'm much more comfortable with its APIs too.

e.g.

$ (echo hello; echo world) | ruby -lpe '$_ = $_[3]'
l
l

# Or
$ (echo hello; echo world; echo jello) | ruby -e 'puts $stdin.each_line.select { |line| line =~ /ello/ }'
hello
jello

This worked better, but ruby still felt a bit cumbersome. Sometimes I wanted to filter. Sometimes I wanted to map. Sometimes I wanted to reduce things into JSON. I always forgot to puts my result and had to skip back to the beginning of the iteration. ruby had all the features, but again I found myself fighting the tools.

Thus the goal of this project is to provide another interface for Ruby that I think strikes a better balance for ad-hoc text processing on the command-line.

Installation

gem install each_line

Examples

# Run a script directly against #each_line
$ (echo hello; echo world) | each_line 'to_a.map(&:strip).join(",")'
hello,world

# If your Ruby supports numbered parameters then you might find that this interface is all you really need
$ (echo hello; echo world; echo jello) | each_line 'select { _1 =~ /ello/ }'
hello
jello

# Run a block method (e.g. map, select, etc) on #each_line
# Use the -m option to specify the method to be run
# The first argument will be treated as the block body
# By default block args are positionally bound to $_1, $_2, etc
# If your version of Ruby supports numbered parameters (e.g. _1, _2, etc) you may find it easier to do something along the lines of the example above
$ (echo hello; echo world; echo jello) | each_line -m select '$_1 =~ /ello/'
hello
jello

$ (echo hello; echo world; echo jello) | each_line -m reject '$_1 =~ /ello/'
world
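
For readers mapping the -m option back to plain Ruby, the two commands above correspond roughly to calling select and reject on an #each_line enumerator. The following is a minimal sketch of that equivalence, not the tool's actual implementation; select_lines and reject_lines are hypothetical helpers, and StringIO stands in for piped standard input.

```ruby
require "stringio"

# Hypothetical helper: keep lines matching the pattern,
# roughly what `-m select '$_1 =~ /ello/'` expresses.
def select_lines(text, pattern)
  StringIO.new(text).each_line.select { |line| line =~ pattern }
end

# Hypothetical helper: drop lines matching the pattern,
# roughly what `-m reject '$_1 =~ /ello/'` expresses.
def reject_lines(text, pattern)
  StringIO.new(text).each_line.reject { |line| line =~ pattern }
end

puts select_lines("hello\nworld\njello\n", /ello/)
puts reject_lines("hello\nworld\njello\n", /ello/)
```

Each line keeps its trailing newline, just as it does when yielded by #each_line.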

# The -m option can also specify a chain of methods which can be useful in some situations
# Any block or args specified will be passed to the last method in the chain
$ (echo hello; echo world; echo jello) | each_line -m with_index.map '"#{$_2}\t#{$_1}"'
0       hello
1       world
2       jello
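
The method chain above reads much like the same chain written in plain Ruby, with the block attached to the last method. A rough sketch of the equivalent code follows; number_lines is a hypothetical helper and StringIO stands in for standard input, so treat it as an illustration rather than what the tool runs internally.

```ruby
require "stringio"

# Hypothetical helper: pairs each line with its index, the way
# `-m with_index.map '"#{$_2}\t#{$_1}"'` does with $_1 and $_2.
def number_lines(text)
  StringIO.new(text).each_line.with_index.map do |line, idx|
    "#{idx}\t#{line}"
  end
end

puts number_lines("hello\nworld\njello\n")
```

The block receives the line first and the index second, which is why the example refers to the index as $_2.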

# The chain of methods can also accept args and blocks
# The last method in the chain must use the -a and -b options
$ (echo hello; echo world; echo jello) | each_line -m 'each_slice(2).map' '$_1.inspect'
["hello\n", "world\n"]
["jello\n"]

# A more complicated example using #reduce
# The -a option can be used to specify args for reduce; args should be comma-separated
$ (echo hello; echo world; echo jello) | each_line -m reduce -a '{}' '$_1[$_2.strip] = $_2[3]; $_1'
{"hello"=>"l", "world"=>"l", "jello"=>"l"}

# Block args can also be given custom names so they are harder to mix up
# Use either -v or --block_vars and pass a comma-separated list of arg names
$ (echo hello; echo world; echo jello) | each_line -m reduce -a '{}' -v '$acc,$line' '$acc[$line.strip] = $line[3]; $acc'
{"hello"=>"l", "world"=>"l", "jello"=>"l"}

# A finalizer can be specified as well
# The -f option can accept a script to run
# Additionally the -r and -I options can be used to require libraries and modify the load-path similarly to how they do with the `ruby` executable
$ (echo hello; echo world; echo jello) | each_line -m reduce -a '{}' -v '$acc,$line' '$acc[$line.strip] = $line[3]; $acc' -f to_json -rjson
{"hello":"l","world":"l","jello":"l"}
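
Put together, the reduce examples with -a, -v, -f, and -rjson amount to something like the plain Ruby below. This is only a sketch under the assumption that the finalizer is called on the result of the method chain; fourth_char_by_line is a hypothetical helper and StringIO stands in for standard input.

```ruby
require "json"      # what -rjson would load
require "stringio"

# Hypothetical helper: builds a hash from each stripped line to the
# character at index 3, mirroring the reduce block in the example.
def fourth_char_by_line(text)
  StringIO.new(text).each_line.reduce({}) do |acc, line|
    acc[line.strip] = line[3]
    acc # the block must return the accumulator for the next step
  end
end

# The to_json call plays the role of the `-f to_json` finalizer.
puts fourth_char_by_line("hello\nworld\njello\n").to_json
```

The `{}` passed to reduce is the initial accumulator, which is what the -a option supplies in the command-line version.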

# An initializer can be specified
# This is an arbitrary chunk of code that will be run once before any scripting is done
# You can use the -i option to specify an initializer
$ (echo hello; echo world; echo jello) | each_line -i '$lookup = %w(alpha beta charlie)' -m each_with_index.map -v '$el,$idx' '$lookup[$idx]'
alpha
beta
charlie

# The -d option can be used to specify a delimiter other than new-line
$ echo hello world jello | each_line -d ' ' -m map '$_1.gsub /e/, "q"'
hqllo
world
jqllo

$ echo hello,world,jello | each_line -d ',' -m map '$_1.gsub /e/, "q"'
hqllo,
world,
jqllo
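
The trailing commas in the output above come from how Ruby's #each_line treats a custom separator: each record keeps the separator it ended with. The sketch below shows that behavior with String#each_line; split_records is a hypothetical helper, and the string stands in for standard input. It assumes the -d value is handed to #each_line as the separator, which the source does not confirm.

```ruby
# Hypothetical helper: splits text into records on the given
# separator, leaving the separator attached to each record.
def split_records(text, delimiter)
  text.each_line(delimiter).to_a
end

records = split_records("hello,world,jello\n", ",")
puts records.map { |record| record.gsub(/e/, "q") }
```

The last record ends with the newline from echo rather than a comma, so it prints without one.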

# You can pass the --strip option which will strip off the delimiter before it is passed to your block
$ echo hello,world,jello | each_line --strip -d ',' -m map '$_1.inspect'
"hello"
"world"
"jello\n"

$ (echo hello; echo world; echo jello) | each_line --strip -m map '$_1.inspect'
"hello"
"world"
"jello"