GitHub - learnbyexample/regexp-cut: Use awk to provide cut like syntax for field extraction

regexp-cut

Uses awk to provide cut-like syntax for field extraction. The command name is rcut.

⚠️ ⚠️ Work under construction!

Motivation

cut's syntax is handy for many field extraction problems, but it doesn't allow multi-character or regexp delimiters. This project aims to provide cut-like syntax for those cases. It currently uses mawk in a bash script.

ℹ️ Note that rcut isn't feature-compatible with the cut command, nor a replacement for it. rcut helps when you need features such as a regexp field separator.

Features

⚠️ ⚠️ Work under construction!

Examples

$ cat spaces.txt
1 2 3
x y z
i j k

by default, it uses awk's space/tab field separation and trimming

unlike cut, order matters

$ rcut -f3,1 spaces.txt
3 1
z x
k i
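For comparison, the same reordering can be written directly in awk. This is only a sketch of an equivalent one-liner, not rcut's actual implementation; the `reorder` helper name is hypothetical, and the irregular spacing in the sample input is an assumption, since the exact whitespace of spaces.txt is not shown here.

```shell
# Hypothetical helper: print field 3, then field 1, of each input line.
# awk's default splitting treats runs of spaces/tabs as one separator
# and ignores leading and trailing whitespace.
reorder() {
    awk '{print $3, $1}'
}

# Sample input with assumed irregular spacing
printf '  1 2\t3\nx y z\n i   j \tk\n' | reorder
# → 3 1
# → z x
# → k i

# cut, by contrast, emits the selected fields in input order
printf '1 2 3\n' | cut -d' ' -f3,1
# → 1 3
```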

multi-character delimiter

$ echo 'apple:-:fig:-:guava' | rcut -d:-: -f2
fig

regexp delimiter

$ echo 'Sample123string42with777numbers' | rcut -d'[0-9]+' -f1,4
Sample numbers
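The two delimiter examples above correspond to awk's `-F` option, which treats a field separator longer than one character as a regular expression. A minimal sketch in plain awk, shown for comparison and not taken from the rcut script:

```shell
# Multi-character delimiter: ':-:' contains no regexp metacharacters,
# so it matches itself literally.
echo 'apple:-:fig:-:guava' | awk -F':-:' '{print $2}'
# → fig

# Regexp delimiter: any run of digits separates fields.
echo 'Sample123string42with777numbers' | awk -F'[0-9]+' '{print $1, $4}'
# → Sample numbers
```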

fixed string delimiter

$ echo '123)(%)#^&(@#.{1}\xyz' | rcut -Fd')(%)#^&(@#.{1}' -f1,2 -o,
123,xyz
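One way to split on a fixed string in awk, without escaping regexp metacharacters, is to search with `index()` instead of relying on `FS`. The sketch below is an illustration under that assumption; `split_fixed` is a hypothetical helper and may differ from how rcut handles `-F` internally.

```shell
# Hypothetical helper: split_fixed DELIM
# Prints fields 1 and 2 of each input line, comma-separated,
# treating DELIM as a literal string.
split_fixed() {
    # The delimiter is passed through the environment because awk
    # processes backslash escapes in -v assignments but not in ENVIRON.
    DELIM="$1" awk '
    {
        d = ENVIRON["DELIM"]
        rest = $0
        n = 0
        split("", f)    # clear fields left over from the previous line
        # index() is a plain substring search, so no escaping is needed
        while (length(d) > 0 && (p = index(rest, d)) > 0) {
            f[++n] = substr(rest, 1, p - 1)
            rest = substr(rest, p + length(d))
        }
        f[++n] = rest
        print f[1] "," f[2]
    }'
}

# ".*" is matched literally here, not as a regexp
printf '%s\n' 'a.*b.*c' | split_fixed '.*'
# → a,b
```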

multiple ranges can be specified, order matters

$ printf '1 2 3 4 5\na b c d e\n' | rcut -f2-3,5,1,2-4
2 3 5 1 2 3 4
b c e a b c d
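A field list with ranges can be expanded once into an ordered list of field numbers and then applied to every line. The following awk sketch illustrates that idea; `pick` is a hypothetical helper that handles only the `N` and `N-M` forms, and it is not claimed to match rcut's own parsing.

```shell
# Hypothetical helper: pick SPEC
# Prints the fields named by SPEC (e.g. "2-3,5,1") in the order given.
pick() {
    awk -v spec="$1" '
    BEGIN {
        # Expand the spec into want[1..k], preserving order and repeats
        n = split(spec, parts, ",")
        for (i = 1; i <= n; i++) {
            m = split(parts[i], r, "-")
            lo = r[1] + 0
            hi = (m == 2 ? r[2] : r[1]) + 0
            for (j = lo; j <= hi; j++)
                want[++k] = j
        }
    }
    {
        out = ""
        for (i = 1; i <= k; i++)
            out = (i == 1 ? "" : out OFS) $(want[i])
        print out
    }'
}

printf '1 2 3 4 5\na b c d e\n' | pick 2-3,5,1,2-4
# → 2 3 5 1 2 3 4
# → b c e a b c d
```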

last field

$ printf 'apple ball cat\n1 2 3 4 5' | rcut -nf-1
cat
5

except last two fields

$ printf 'apple ball cat\n1 2 3 4 5' | rcut -cnf-2:
apple
1 2 3
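In plain awk, both results can be expressed relative to `NF`, the number of fields on the current line. This is a sketch of the equivalent logic with hypothetical helper names, not an excerpt from rcut.

```shell
# Hypothetical helper: print the last field of each line
last_field() {
    awk '{print $NF}'
}

# Hypothetical helper: print every field except the last two.
# Lines with two or fewer fields produce no output.
drop_last_two() {
    awk '{
        for (i = 1; i <= NF - 2; i++)
            printf "%s%s", $i, (i < NF - 2 ? OFS : ORS)
    }'
}

printf 'apple ball cat\n1 2 3 4 5\n' | last_field
# → cat
# → 5

printf 'apple ball cat\n1 2 3 4 5\n' | drop_last_two
# → apple
# → 1 2 3
```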

suppress lines without input field delimiter

$ printf '1,2,3,4\nhello\na,b,c\n' | rcut -sd, -f2
2
b
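A line that lacks the delimiter has exactly one field, so a comparable filter in plain awk is a condition on `NF`. A minimal sketch, with a hypothetical helper name and no claim about how rcut implements `-s`:

```shell
# Hypothetical helper: print field 2 of comma-separated lines,
# skipping lines that contain no comma (NF is 1 for those).
second_if_delimited() {
    awk -F, 'NF > 1 {print $2}'
}

printf '1,2,3,4\nhello\na,b,c\n' | second_if_delimited
# → 2
# → b
```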

-g option will switch to gawk

$ echo '1aa2aa3' | rcut -gd'a{2}' -f2
2

See Examples.md for many more examples.

Tests

You can use script.awk to check if all the example code snippets are working as expected.

$ cd examples/
$ awk -f script.awk Examples.md

TODO

Similar tools

Contributing

License

This project is licensed under the MIT License; see the LICENSE file for details.