sort -k

TIL: when you call sort -k3, you’re not sorting by the third field alone, but by everything from the start of the third field to the end of the line.
Not only that: when two keys tie, sort by default falls back to comparing the whole line, so the first field ends up deciding the order.

Consider this example.

$ cat data
theta AAA 2
gamma AAA 2
alpha BBB 2
alpha AAA 3

Sorting with -k2 gives:

$ sort data -k2 --debug
sort: using simple byte comparison
gamma AAA 2
     ______
___________
theta AAA 2
     ______
___________
alpha AAA 3
     ______
___________
alpha BBB 2 
     _______
____________

Notice I’ve also added --debug, which underlines the parts of each line used in the comparisons.
So, first comes “AAA 2”, then “AAA 3”, and finally “BBB 2”.
For the two lines whose key is “AAA 2”, the whole line is compared as a last resort (the second underline in the debug output), so “gamma” comes before “theta”.

Forget about the ties for now.
To consider field 2 only, rather than field 2 and all following fields, you need to specify a stop. This is done by adding “,2” to the -k switch. More generally, -km,n means “sort by fields m through n, boundaries included”.

$ sort data -k2,2 --debug
sort: using simple byte comparison
alpha AAA 3
     ____
___________
gamma AAA 2
     ____
___________
theta AAA 2
     ____
___________
alpha BBB 2 
     ____
____________

As you can see, only field 2 is taken into account at first.
The line with “AAA 3” now comes before those with “AAA 2” because all three keys tie on “AAA”, so the whole line is compared as a last resort, and “alpha” sorts before “gamma” and “theta”.
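
The -km,n form can be checked on the same sample data. The following is only a sketch: the variable names are mine, the data is piped in rather than read from a file, and LC_ALL=C is set so that the byte-wise ordering does not depend on your locale. Since every line has exactly three fields, -k2 and -k2,3 should select the same key, while -k2,2 narrows it.

```shell
#!/bin/sh
# Sample data from the post (variable names here are illustrative only).
data='theta AAA 2
gamma AAA 2
alpha BBB 2
alpha AAA 3'

# LC_ALL=C forces plain byte comparison, independent of the user's locale.
open_ended=$(printf '%s\n' "$data" | LC_ALL=C sort -k2)       # field 2 to end of line
explicit_stop=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,3)  # fields 2 through 3
field_two_only=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2) # field 2 alone

# With three fields per line, the first two keys are identical.
if [ "$open_ended" = "$explicit_stop" ]; then
    echo "-k2 and -k2,3 give the same order"
fi

# Narrowing the key to field 2 moves "alpha AAA 3" to the top.
printf '%s\n' "$field_two_only"
```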

Taking this a step further: to consider only field 2 and keep tied lines in their original order, that is, to have a stable sort, pass the -s switch, which disables the last-resort comparison.

$ sort data -k2,2 -s --debug
sort: using simple byte comparison
theta AAA 2
     ____
gamma AAA 2
     ____
alpha AAA 3
     ____
alpha BBB 2 
     ____

This looks similar to the first snippet, but the first two lines of the output are swapped: here they appear in their original order.
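
To see the effect of -s side by side with the default, here is a small sketch that runs both on the same input. It assumes a sort that supports -s (GNU coreutils does; the switch is not part of POSIX), and the variable names are made up for illustration.

```shell
#!/bin/sh
# Same sample data as in the post.
data='theta AAA 2
gamma AAA 2
alpha BBB 2
alpha AAA 3'

# Default: lines that tie on field 2 are ordered by a whole-line comparison.
default_order=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2)

# -s (assumed available, e.g. GNU sort): tied lines keep their input order.
stable_order=$(printf '%s\n' "$data" | LC_ALL=C sort -s -k2,2)

printf 'default:\n%s\n\nstable:\n%s\n' "$default_order" "$stable_order"
```

In the default run the three “AAA” lines come out as alpha, gamma, theta; in the stable run they stay as theta, gamma, alpha, exactly as they appear in the input.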
