There are a bunch of utilities that I use daily, but that I learned how to use embarrassingly late in my programming journey. I plan to do a sequence of short tutorials highlighting the use of these tools in the hope that others can skip the long, frustrating discovery process that I went through before I knew them.
cut is part of coreutils, the set of basic utilities that *nix systems expect to be in place at all times (the BSDs and macOS ship their own implementations of it). You should already have it if you’re running Linux, any BSD, macOS, or Windows Subsystem for Linux. It reads lines of input from stdin (or from files named as arguments) and cuts them into pieces according to the flags that you provide. It writes the resulting pieces to stdout. A simple example:
Let us say that we have this excellent poem by Dylan Thomas in a file called thomas.txt:
Do not go gentle into that good night,
Old age should burn and rave at close of day;
Rage, rage against the dying of the light.

Though wise men at their end know dark is right,
Because their words had forked no lightning they
Do not go gentle into that good night.

Good men, the last wave by, crying how bright
Their frail deeds might have danced in a green bay,
Rage, rage against the dying of the light.

Wild men who caught and sang the sun in flight,
And learn, too late, they grieved it on its way,
Do not go gentle into that good night.

Grave men, near death, who see with blinding sight
Blind eyes could blaze like meteors and be gay,
Rage, rage against the dying of the light.

And you, my father, there on the sad height,
Curse, bless, me now with your fierce tears, I pray.
Do not go gentle into that good night.
Rage, rage against the dying of the light.
We could run:
$ cut -d ' ' -f 1,3,5,6,8 < thomas.txt
Do go into that night,
Old should and rave close
Rage, against dying of light.

Though men their end dark
Because words forked no they
Do go into that night.

Good the wave by, how
Their deeds have danced a
Rage, against dying of light.

Wild who and sang sun
And too they grieved on
Do go into that night.

Grave near who see blinding
Blind could like meteors be
Rage, against dying of light.

And my there on sad
Curse, me with your tears,
Do go into that night.
Rage, against dying of light.
The -d flag specifies a “delimiter”, which is the symbol that cut will use to break lines of input into pieces. In this case, I’ve specified that the delimiter should be a space. The space has to be quoted so that my shell passes it to cut as an argument instead of treating it as ordinary whitespace between arguments.
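The delimiter can be any single character, not only a space. Here is a minimal sketch that splits on colons instead; the record is a made-up, passwd-style line used purely for illustration, not a real account entry.

```shell
# A made-up, passwd-style record; every value here is illustrative.
record='alice:x:1000:1000:Alice Example:/home/alice:/bin/sh'

# Split on colons and keep the first field (the user name).
user=$(printf '%s\n' "$record" | cut -d ':' -f 1)
echo "$user"          # alice

# Split on colons and keep the seventh field (the login shell).
login_shell=$(printf '%s\n' "$record" | cut -d ':' -f 7)
echo "$login_shell"   # /bin/sh
```

Quoting the colon is not strictly required, but quoting the delimiter every time is a habit that avoids surprises with characters the shell treats specially.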
The -f flag specifies the “fields” that we want to keep. These “fields” are the pieces that cut broke the input into. We refer to them like elements of an array, though the first element is at position 1 instead of 0. The example above asks for the first, third, fifth, sixth, and eighth fields, which produced the output shown. When the delimiter is a space, the fields are essentially words (plus surrounding punctuation).
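A few details of field selection are easier to see on a tiny input. The sketch below uses made-up sample lines and shows that numbering starts at 1, that cut keeps fields in their original order no matter how they are listed, and that a line with no delimiter at all is passed through untouched.

```shell
line='one two three four'

# Fields are numbered from 1, so this keeps "one" and "three".
picked=$(printf '%s\n' "$line" | cut -d ' ' -f 1,3)
echo "$picked"      # one three

# Listing the fields in a different order does not reorder the output.
reordered=$(printf '%s\n' "$line" | cut -d ' ' -f 3,1)
echo "$reordered"   # one three

# A line that contains no delimiter is printed unchanged.
untouched=$(printf '%s\n' 'nodelimiterhere' | cut -d ' ' -f 2)
echo "$untouched"   # nodelimiterhere
```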
We don’t always need to be so verbose though. The fields can be specified using several different syntaxes:
-f N- takes fields from the Nth field onward
-f -N takes fields through the Nth field (inclusive)
-f N-M takes fields from the Nth through the Mth (inclusive)
These can all be combined with comma separation, so you can do:
$ cut -d ' ' -f -3,5,7- < thomas.txt
Do not go into good night,
Old age should and at close of day;
Rage, rage against dying the light.

Though wise men their know dark is right,
Because their words forked lightning they
Do not go into good night.

Good men, the wave crying how bright
Their frail deeds have in a green bay,
Rage, rage against dying the light.

Wild men who and the sun in flight,
And learn, too they it on its way,
Do not go into good night.

Grave men, near who with blinding sight
Blind eyes could like and be gay,
Rage, rage against dying the light.

And you, my there the sad height,
Curse, bless, me with fierce tears, I pray.
Do not go into good night.
Rage, rage against dying the light.
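The range forms are easier to compare on a short line where each field is a single letter. This is only a sketch on a made-up input, with one command per syntax and a final one that combines them.

```shell
line='a b c d e f g h'

# N-  : from the sixth field onward.
from_sixth=$(printf '%s\n' "$line" | cut -d ' ' -f 6-)
echo "$from_sixth"      # f g h

# -N  : through the third field, inclusive.
through_third=$(printf '%s\n' "$line" | cut -d ' ' -f -3)
echo "$through_third"   # a b c

# N-M : the third through the fifth field, inclusive.
middle=$(printf '%s\n' "$line" | cut -d ' ' -f 3-5)
echo "$middle"          # c d e

# Comma-separated combination of all three styles.
combined=$(printf '%s\n' "$line" | cut -d ' ' -f -2,4,7-)
echo "$combined"        # a b d g h
```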
Okay, so when would I actually use this?
Well, one thing I end up doing is manipulating CSV data in my terminal. Let’s say that we have this example CSV file called example.csv:
tool,source,language
cut,coreutils,c
tr,coreutils,c
sponge,moreutils,c
rg,ripgrep,rust
fzf,fzf,go
What if we want to extract only the tool name and language from this file into a new file? We could open it in some kind of spreadsheet editor and delete the center column, but what if this file is huge or we don’t have a spreadsheet editor available? Cut makes this simple:
$ cut -d ',' -f 1,3 < example.csv
tool,language
cut,c
tr,c
sponge,c
rg,rust
fzf,go
Note that the delimiter (-d flag) now uses a comma instead of a space. We could use this technique to take any number of columns from the file. It’s actually quite fast, as cut is able to stream the data one line at a time.
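To actually land the result in a new file, the output can be redirected. The sketch below recreates example.csv in a temporary directory so that it is self-contained; the output name tool_language.csv is a hypothetical choice, not something from the original example.

```shell
# Recreate the example file in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/example.csv" <<'EOF'
tool,source,language
cut,coreutils,c
tr,coreutils,c
sponge,moreutils,c
rg,ripgrep,rust
fzf,fzf,go
EOF

# Keep columns 1 and 3 and write them to a new file.
# (tool_language.csv is a hypothetical file name.)
cut -d ',' -f 1,3 < "$workdir/example.csv" > "$workdir/tool_language.csv"

# Show the result.
cat "$workdir/tool_language.csv"
```

One caveat: cut splits on every comma it sees, so this only works for simple CSV files like this one. A quoted field that itself contains a comma would be split in the wrong place.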
Cut can also take specific characters or bytes instead of delimited fields. I use these features less often, so I won’t cover them now. Perhaps in a future post.