Linux Admin – sort Command

sort has several optimizations for sorting by data type. The command writes the sorted concatenation of all input files to standard output. Be wary, however: complex sort operations on large files of a few gigabytes can degrade system performance.

When running a production server with limited CPU and/or memory availability, it is recommended to offload these larger files to a workstation for sorting operations during peak business hours.
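When a large sort has to run on the server anyway, its footprint can be limited. The following is a minimal sketch, assuming GNU coreutils sort (the -S, -T and --parallel options are GNU extensions) and a small generated file standing in for a multi-gigabyte one. The file names and the 64M buffer size are illustrative assumptions only.

```shell
# Hypothetical input: 1000 numbers in descending order stand in for a huge file.
big=$(mktemp)
seq 1000 -1 1 > "$big"

# nice -n 19    : run at the lowest CPU priority
# -S 64M        : cap the in-memory sort buffer at 64 MiB
# -T DIR        : directory for temporary spill files
# --parallel=2  : use at most two sort threads
nice -n 19 sort -n -S 64M -T "${TMPDIR:-/tmp}" --parallel=2 "$big" -o "$big.sorted"

first=$(head -n 1 "$big.sorted")
echo "smallest value: $first"

rm -f "$big" "$big.sorted"
```

Suitable values depend on the memory and CPU the server can spare, so treat the numbers above as placeholders.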

Switch    Action
-b        Ignore leading blanks (whitespace), not blank lines
-d        Dictionary order; consider only blanks and alphanumeric characters
-f        Ignore case, folding lowercase characters to uppercase
-g        General numeric sort
-M        Month sort
-h        Human-readable numeric sort (for example 1K, 1M, 1G)
-R        Random sort
-m        Merge already sorted files
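A few of these switches are easiest to understand by trying them on tiny inputs. The sketch below uses made-up sample values (not taken from names.txt) and sets LC_ALL=C so that the ordering does not depend on the system locale.

```shell
# Sample values are invented for illustration.
export LC_ALL=C

# -h: compare human-readable sizes, so 512 < 10K < 2M < 1G
sizes=$(printf '2M\n512\n1G\n10K\n' | sort -h)
printf '%s\n' "$sizes"

# -M: compare month-name abbreviations in calendar order
months=$(printf 'Mar\nJan\nFeb\n' | sort -M)
printf '%s\n' "$months"

# -f: fold lowercase to uppercase before comparing,
#     so "Cherry" no longer sorts ahead of "apple"
folded=$(printf 'banana\nCherry\napple\n' | sort -f)
printf '%s\n' "$folded"
```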

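Feel free to copy the tabular text below and follow along with our sort examples. Be sure each column is separated by a single tab character. The outputs in this chapter contain only the data rows, so leave the header row out of names.txt.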

first name	last name	office
Ted	Daniel	101
Jenny	Colon	608
Dana	Maxwell	602
Marian	Little	903
Bobbie	Chapman	403
Nicolas	Singleton	203
Dale	Barton	901
Aaron	Dennis	305
Santos	Andrews	504
Jacqueline	Neal	102
Billy	Crawford	301
Rosa	Summers	405
Kellie	Curtis	903
Matt	Davis	305
Gina	Carr	902
Francisco	Gilbert	101
Sidney	Mack	901
Heidi	Simmons	204
Cristina	Torres	206
Sonya	Weaver	403
Donald	Evans	403
Gwendolyn	Chambers	108
Antonia	Lucas	901
Blanche	Hayes	603
Carrie	Todd	201
Terence	Anderson	501
Joan	Parsons	102
Rose	Fisher	304
Malcolm	Matthews	702

Using sort in its most basic, default form −

[root@centosLocal centos]# sort ./Documents/names.txt
Aaron         Dennis         305
Antonia       Lucas          901
Billy         Crawford       301
Blanche       Hayes          603
Bobbie        Chapman        403
Carrie        Todd           201
Cristina      Torres         206
Dale          Barton         901
Dana          Maxwell        602
Donald        Evans          403
Francisco     Gilbert        101
Gina          Carr           902
Gwendolyn     Chambers       108
Heidi         Simmons        204
Jacqueline    Neal           102
Jenny         Colon          608
Joan          Parsons        102
Kellie        Curtis         903
Malcolm       Matthews       702
Marian        Little         903
Matt          Davis          305
Nicolas       Singleton      203
Rosa          Summers        405
Rose          Fisher         304
Santos        Andrews        504
Sidney        Mack           901
Sonya         Weaver         403
Ted           Daniel         101
Terence       Anderson       501
[root@centosLocal centos]#

Sometimes we will want to sort a file on a column other than the first. A sort can be applied to other columns with the -t and -k switches.

-t : define the field delimiter

-k : key to sort by (think of this as a column number, counted using the delimiter)

-n : sort in numeric order

Note − In some examples, we have used cat piped into grep. This was to demonstrate the concept of piping commands. Piping cat into grep starts an extra process and pushes the whole file through a pipe, which adds needless load with large files, especially when combined with complex sorting. This will make veteran Linux administrators cringe.

Now that we have a good idea of how the pipe character works, this poor practice will be avoided in the chapters to follow. The key to keeping system resource usage low with commands like sort is learning to use them efficiently.
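The difference can be seen side by side. This is a minimal sketch using three colon-separated sample rows in a temporary file (the file and the ':101$' pattern are assumptions for illustration); both pipelines produce the same result, but the second starts one process fewer.

```shell
export LC_ALL=C

# Hypothetical sample file with three rows.
names=$(mktemp)
printf 'Ted:Daniel:101\nJenny:Colon:608\nFrancisco:Gilbert:101\n' > "$names"

# Wasteful: cat reads the file, then grep reads it again from the pipe.
piped=$(cat "$names" | grep ':101$' | sort)

# Leaner: grep opens the file itself, so cat is not needed.
direct=$(grep ':101$' "$names" | sort)

printf '%s\n' "$direct"
# Francisco:Gilbert:101
# Ted:Daniel:101

rm -f "$names"
```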

[root@centosLocal centos]# sort -t '	' -k 3n ./Documents/names.txt
Ted           Daniel           101
Francisco     Gilbert          101
Jacqueline    Neal             102
Joan          Parsons          102
Gwendolyn     Chambers         108
Carrie        Todd             201
Nicolas       Singleton        203
Heidi         Simmons          204
Cristina      Torres           206
Billy         Crawford         301
Rose          Fisher           304
Aaron         Dennis           305
Matt          Davis            305
Bobbie        Chapman          403
Donald        Evans            403
Sonya         Weaver           403
Rosa          Summers          405
Terence       Anderson         501
Santos        Andrews          504
Dana          Maxwell          602
Blanche       Hayes            603
Jenny         Colon            608
Malcolm       Matthews         702
Antonia       Lucas            901
Dale          Barton           901
Sidney        Mack             901
Gina          Carr             902
Kellie        Curtis           903
Marian        Little           903
[root@centosLocal centos]#

Now we have our list sorted by office number. The astute reader will notice something out of the ordinary after the -t switch: single quotes separated by what appears to be a few spaces. This is actually a literal Tab character sent to the shell. A literal Tab can be typed in the Bash shell by pressing Ctrl+V and then Tab.

Most shells interpret the Tab key as a command; in Bash, for example, it triggers auto-completion. The shell needs an escape sequence to recognize a literal Tab character. This is one reason why Tabs are not the best choice of delimiter on Linux. Generally speaking, it is best to avoid both spaces and tabs as delimiters, as they can cause issues in shell scripts.
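In a script, where Ctrl+V is not available, the Tab can be produced by a command instead. The sketch below is one way to do it, assuming a POSIX shell and three sample rows in a temporary file; the variable names are illustrative. In Bash specifically, the ANSI-C quoting form $'\t' gives the same character.

```shell
# Hypothetical three-row, tab-separated sample file.
names=$(mktemp)
printf 'Ted\tDaniel\t101\nJenny\tColon\t608\nDana\tMaxwell\t602\n' > "$names"

# printf expands \t to a real Tab; the variable then holds that one character.
tab=$(printf '\t')

# -k 3,3n limits the key to the third field only and compares it numerically.
by_office=$(sort -t "$tab" -k 3,3n "$names")
printf '%s\n' "$by_office"

first_row=$(printf '%s\n' "$by_office" | head -n 1)
rm -f "$names"
```

Writing the key as 3,3 rather than 3 stops the key at the end of the third field, which matters once a file has more columns after the one being sorted.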

Let us fix our names.txt file.

[root@centosLocal centos]# sed -i 's/\t/:/g' ./Documents/names.txt && cat ./Documents/names.txt
Ted:Daniel:101
Jenny:Colon:608
Dana:Maxwell:602
Marian:Little:903
Bobbie:Chapman:403
Nicolas:Singleton:203
Dale:Barton:901
Aaron:Dennis:305
Santos:Andrews:504
Jacqueline:Neal:102
Billy:Crawford:301
Rosa:Summers:405
Kellie:Curtis:903:
Matt:Davis:305
Gina:Carr:902
Francisco:Gilbert:101
Sidney:Mack:901
Heidi:Simmons:204
Cristina:Torres:206
Sonya:Weaver:403
Donald:Evans:403
Gwendolyn:Chambers:108
Antonia:Lucas:901
Blanche:Hayes:603
Carrie:Todd:201
Terence:Anderson:501
Joan:Parsons:102
Rose:Fisher:304
Malcolm: Matthews:702
[root@centosLocal centos]#
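With a colon as the delimiter, the -t argument no longer needs any special quoting. A minimal sketch, assuming three sample rows in a temporary file rather than the full names.txt:

```shell
export LC_ALL=C

# Hypothetical three-row, colon-separated sample file.
names=$(mktemp)
printf 'Ted:Daniel:101\nJenny:Colon:608\nDana:Maxwell:602\n' > "$names"

# Sort by last name (field 2 only).
by_last=$(sort -t : -k 2,2 "$names")
printf '%s\n' "$by_last"
# Jenny:Colon:608
# Ted:Daniel:101
# Dana:Maxwell:602

# Sort by office number (field 3), numerically, highest first.
by_office_desc=$(sort -t : -k 3,3nr "$names")
printf '%s\n' "$by_office_desc"
# Jenny:Colon:608
# Dana:Maxwell:602
# Ted:Daniel:101

rm -f "$names"
```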

Now, it will be much easier to work with the text file. If someone demands it be returned to Tab-delimited for another application (this is common), we can accomplish that task easily as −

sed -i 's/:/\t/g' ./Documents/names.txt

Common end-user applications work well with Tabs as a delimiter (an accountant does not want to see a colon separating data columns while working on spreadsheets). So learning to transform characters back and forth is good practice; it comes up often.
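For a single-character swap like this, tr is an alternative that leaves the original file untouched and writes the result to standard output. The following sketch assumes two sample rows in a temporary file and shows the round trip in both directions.

```shell
# Hypothetical two-row, colon-separated sample file.
colon_file=$(mktemp)
printf 'Ted:Daniel:101\nJenny:Colon:608\n' > "$colon_file"
original=$(cat "$colon_file")

# Colon to Tab: tr itself interprets the \t escape.
tabbed=$(tr ':' '\t' < "$colon_file")

# Tab back to colon: the round trip restores the original text.
back=$(printf '%s\n' "$tabbed" | tr '\t' ':')

printf '%s\n' "$back"
# Ted:Daniel:101
# Jenny:Colon:608

rm -f "$colon_file"
```

Unlike sed -i, this never modifies the source file, so the converted copy can be redirected to a new file for the end user.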

Note − Office staff typically use word processors and spreadsheets with a graphical user interface, running on Windows. Hence, it is common for Linux administrators to get good at these transformations to accommodate office end users (most times, our boss will be an end user).

We introduced a command called sed. sed is a stream editor and can be used as a non-interactive text editor for manipulating streams of text and files. We will learn more about sed later. For now, keep in mind that by using sed we avoided the need to pipe several filter commands together when changing our text file, making the most efficient use of the tools at hand.

We also introduced a Bash shell operator: &&. && will run the second command only if the first command completes with a successful exit status of 0.

[root@centosLocal centos]# ls /noDir && echo "You cannot see me"
ls: cannot access /noDir: No such file or directory
[root@centosLocal centos]# ls /noDir ; echo "You cannot see me"
ls: cannot access /noDir: No such file or directory
You cannot see me
[root@centosLocal centos]#

In the above code, note the difference between && and ;. The first runs the second command only when the first has completed successfully, while ; simply runs the commands one after another regardless of exit status. More on this when we get to scripting shell commands.
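The same behavior can be captured in a script. This is a minimal sketch, assuming /noDir does not exist (as in the session above); the variable names are illustrative, and error messages from ls are discarded so only the echo output is compared.

```shell
status=0

# &&: echo runs only if ls succeeds, so nothing is captured here.
# The trailing "|| status=$?" records the failing exit code of the group.
and_output=$(ls /noDir 2>/dev/null && echo "You cannot see me") || status=$?

# ; : echo runs regardless of whether ls succeeded.
seq_output=$(ls /noDir 2>/dev/null ; echo "You cannot see me")

echo "with &&: '$and_output' (exit status $status)"
echo "with ; : '$seq_output'"
```

The exact non-zero status reported by ls varies between implementations, so scripts should test for "not 0" rather than a specific number.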