sorting - Why is sort -k #.#n different from sort -k #.# -n?


[Edited to add -b, which I had already tried to no effect.]

I had a file with a column of numbers in parentheses that I wanted to sort on, like this:

    x (10)
    x (11)
    x (1)
    x (2)

I thought sort -b -k 2.2n would work. It didn't. I discovered that sort -b -k 2.2 -n does work, generating the desired output:

    x (1)
    x (2)
    x (10)
    x (11)

Can anyone explain why? I know that -n treats all columns as numeric (not just the selected ones), but I'm surprised it makes a difference here. I thought -k 2.2n would sort the second column numerically, starting at the second position, so I don't understand why it didn't work. (Though sort's nuances have eluded me before.)

This is "sort (GNU coreutils) 5.93", if that makes a difference. [I later tried it on a machine with coreutils 8.5 and saw the same results.]

coreutils 5.93 is old. Newer versions have a nice --debug option that shows how the field selectors apply to each line.

The problem you're having is that sort's field splitting, by default, includes the whitespace between fields as part of the following field. In your case, the second field is " (1)" with a leading space, so -k2.2 selects a substring starting with the parenthesis, which fails to be recognized as a number.

You can fix that by adding the b flag. sort -k2.2nb or sort -k2.2 -n -b should work.
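As a rough illustration of the field-splitting point, the sketch below runs the question's four sample lines through both key forms. It assumes GNU sort; LC_ALL=C and the variable names are my own additions, used only to make the fallback order predictable.

```shell
# Sample data from the question.
input=$(printf '%s\n' 'x (10)' 'x (11)' 'x (1)' 'x (2)')

# Without b, field 2 is " (10)" (leading blank included), so character 2
# is "(" and no line yields a number; sort falls back to whole-line order.
without_b=$(printf '%s\n' "$input" | LC_ALL=C sort -k2.2n)

# With b on the key, leading blanks are skipped before counting characters,
# so character 2 is the first digit and the numeric sort works.
with_b=$(printf '%s\n' "$input" | LC_ALL=C sort -k2.2nb)

printf 'without b:\n%s\n\nwith b:\n%s\n' "$without_b" "$with_b"
```

The first result should come out in plain byte order (1, 10, 11, 2) and the second in numeric order (1, 2, 10, 11).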

On a more recent coreutils (8.23), I can't reproduce the behavior of sort -k2.2 -n; it behaves the same as sort -k2.2n, failing to match any numbers and falling back on the "whole line string comparison" sort.

Update (3): I have reproduced your result in versions 5.93, 8.13, and 8.23. And there is a definite explanation for it.

Both of my suggestions (sort -k2.2nb and sort -k2.2 -n -b) work. sort -b -k2.2n doesn't. In a version that supports --debug, it says:

    $ printf '%s\n' 'x ('{1,2,10,11}')'|sort -k 2.2n -b --debug
    sort: using simple byte comparison
    sort: leading blanks are significant in key 1; consider also specifying 'b'
    sort: key 1 is numeric and spans multiple fields
    sort: option '-b' is ignored
    x (1)
      ^ no match for key
    _____
    x (10)
      ^ no match for key
    ______
    x (11)
      ^ no match for key
    ______
    x (2)
      ^ no match for key
    _____
    $

The "option '-b' is ignored" message is the explanation of what's happening. When you attach even one flag (the n) to a key specification given with -k, all globally specified flags are ignored for that key.
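A small sketch of that rule, again assuming GNU sort and using variable names of my own: the same two options, b and n, are given in three different places, and only the first arrangement loses the b.

```shell
# Sample data from the question.
input=$(printf '%s\n' 'x (10)' 'x (11)' 'x (1)' 'x (2)')

# n is attached to the key, so the key stops inheriting global options
# and the global -b is ignored for it.
global_b_ignored=$(printf '%s\n' "$input" | LC_ALL=C sort -b -k2.2n)

# The key carries no options of its own, so it inherits both -b and -n.
all_global=$(printf '%s\n' "$input" | LC_ALL=C sort -b -n -k2.2)

# Both options are attached to the key itself.
all_on_key=$(printf '%s\n' "$input" | LC_ALL=C sort -k2.2bn)

printf '%s\n\n%s\n\n%s\n' "$global_b_ignored" "$all_global" "$all_on_key"
```

Only the last two should produce the numeric order 1, 2, 10, 11. The practical takeaway is to keep the options either all global or all on the key.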

The GNU info documentation for coreutils makes this point (the important part is the clause beginning with "otherwise"):

The following options affect the ordering of output lines. They may be specified globally or as part of a specific key field. If no key fields are specified, global options apply to comparison of entire lines; otherwise the global options are inherited by key fields that do not specify any special options of their own. In pre-POSIX versions of ‘sort’, global options affect only later key fields, so portable shell scripts should specify global options first.

(And -b is in the list that follows.)

The man page wording is less clear:

KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end. If neither -t nor -b is in effect, characters in a field are counted from the beginning of the preceding whitespace. OPTS is one or more single-letter ordering options [bdfgiMhnRrV], which override global ordering options for that key. If no key is given, use the entire line as the key.

You could interpret that as meaning that each single-letter option overrides only the global option of the same letter. You'd be wrong...

The POSIX definition is pretty clear (the important piece is the last sentence):

The following options shall override the default ordering rules. When ordering options appear independent of any key field specifications, the requested field ordering rules shall be applied globally to all sort keys. When attached to a specific key (see -k), the specified ordering options shall override all global ordering options for that key.

The only unclear part is that -b doesn't appear in the list following that paragraph. There's a single sentence after that list, introducing another list containing the -b and -t options, so you might argue that -b isn't within the scope of the "override all global ordering options" rule.

After all, -b isn't really an ordering option; it's a field-splitting option, and when you attach it to a key definition you can apply it separately to the beginning, the end, or both. That's not true of any of the other options. But at least in the GNU implementation, it does follow the "override all global ordering options" rule.
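To make the "beginning, end, or both" remark concrete, here is a hedged sketch, assuming GNU sort's key parsing, that attaches b to only one end of the key at a time. The explicit ",2" end position and the variable names are illustrative choices of mine, not part of the original commands.

```shell
# Sample data from the question.
input=$(printf '%s\n' 'x (10)' 'x (11)' 'x (1)' 'x (2)')

# b on the start position only: blanks are skipped before locating
# character 2 of field 2, so the key begins at the first digit.
b_on_start=$(printf '%s\n' "$input" | LC_ALL=C sort -k2.2b,2n)

# b on the end position only: the start position still counts the leading
# blank, so the key begins at "(" and no number is recognized.
b_on_end=$(printf '%s\n' "$input" | LC_ALL=C sort -k2.2,2bn)

printf 'b on start:\n%s\n\nb on end:\n%s\n' "$b_on_start" "$b_on_end"
```

Under those assumptions, only the first command sorts numerically. The n applies to the whole key in both, wherever it is written.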

