General Purpose Tools
Diomidis Spinellis
Department of Management Science and Technology
Athens University of Economics and Business
Athens, Greece
dds@aueb.gr
The Unix Toolchest
Available under Unix, GNU/Linux, *BSD.
For Windows:
Working with Unix Tools
-  Fetch or generate data
 
-  Selection 
 
-  Processing 
 
-  Summarizing 
 
-  Program development 
 
Data Fetching and Generation
-  nm: object files
 
-  nm, ldd: executables
 
-  tar: archives
 
-  ar, jar: libraries
 
-  find: directory hierarchies
 
-  wget: the web
 
-  cvs or svn log or annotate: version control
 
-  jot or dd: generate artificial data
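A few of the tools above can be tried on a throwaway directory. This is a minimal sketch; the scratch directory, file names, and file contents are made up for illustration and do not come from the slides.

```shell
#!/bin/sh
# Hypothetical scratch tree; every name below is an illustration only.
work=$(mktemp -d)
mkdir -p "$work/src/util"
printf 'class A {}\n' > "$work/src/A.java"
printf 'class B {}\n' > "$work/src/util/B.java"
printf 'notes\n' > "$work/src/README"

# find: walk a directory hierarchy and list the Java files.
java_files=$(cd "$work" && find src -name '*.java' -print | sort)
printf '%s\n' "$java_files"

# tar: pack the tree into an archive, then list the archive's members.
(cd "$work" && tar -cf src.tar src)
tar_count=$(tar -tf "$work/src.tar" | grep -c '\.java$')
echo "Java files in archive: $tar_count"

# dd: generate 1 KiB of artificial (zero-filled) data.
dd if=/dev/zero of="$work/zeros.bin" bs=1024 count=1 2>/dev/null
zero_bytes=$(wc -c < "$work/zeros.bin" | tr -d ' ')
echo "Generated bytes: $zero_bytes"

rm -rf "$work"
```

The same idea applies to the other sources: each tool turns a repository, archive, or binary into lines of text that the rest of the pipeline can process.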
 
Selection
-  awk: delimited records
 
-  cut: fixed width files
 
-  sed: select row parts with regular expressions
 
-  grep: select rows with regular expressions
 
-  Combine the above in pipelines 
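The selection tools can be compared side by side on the same input. The sketch below assumes made-up, passwd-like records; the variable names are hypothetical.

```shell
#!/bin/sh
# Made-up sample data in colon-delimited, passwd-like form.
data='root:x:0:0:System Administrator:/root:/bin/sh
alice:x:1001:100:Alice Smith:/home/alice:/bin/bash
bob:x:1002:100:Bob Jones:/home/bob:/bin/zsh
daemon:x:1:1:Daemon:/:/usr/sbin/nologin'

# grep: select the rows whose home directory is under /home.
home_rows=$(printf '%s\n' "$data" | grep ':/home/')

# awk: select fields of delimited records (login name and numeric id).
ids=$(printf '%s\n' "$data" | awk -F: '$3 >= 1000 { print $1, $3 }')

# cut: select a fixed character range (first three characters of each line).
prefixes=$(printf '%s\n' "$data" | cut -c1-3)

# sed: select part of each row with a regular expression
# (here, the last path component of the login shell).
shells=$(printf '%s\n' "$data" | sed -n 's|.*/\([^/:]*\)$|\1|p')

# Combined pipeline: login shells of the users who live under /home.
home_shells=$(printf '%s\n' "$data" |
  grep ':/home/' |
  awk -F: '{ print $7 }' |
  sed 's|.*/||')

printf '%s\n' "$ids" "$home_shells"
```

Each stage narrows the data a little, which keeps the individual commands simple enough to check by eye.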
 
Processing
-  sort: order by multiple keys 
 
-  uniq: remove duplicates, count 
 
-  diff: compare sequential files
 
-  comm: compare ordered lists
 
-  tr: map between character sets
 
-  recode: map between text representations
 
-  tac/rev: reverse order of lines, characters
 
-  rs: reshape arrays
 
-  join: relational join
 
-  Scripting languages handle more complex tasks
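Several of the processing tools expect sorted input, so they are often used together with sort. A minimal sketch follows, with made-up names and counts; the file names are hypothetical.

```shell
#!/bin/sh
# Use the C locale so that sort, comm, and join agree on the ordering.
LC_ALL=C
export LC_ALL
work=$(mktemp -d)

# Made-up data: one committer name per line, with repeats.
printf '%s\n' bob alice bob carol alice bob > "$work/commits"

# sort + uniq -c + sort -rn: frequency table, most frequent first.
freq=$(sort "$work/commits" | uniq -c | sort -rn | awk '{ print $2, $1 }')

# comm: compare two ordered lists; -12 keeps only the lines common to both.
printf '%s\n' alice bob carol > "$work/team_a"
printf '%s\n' bob carol dave > "$work/team_b"
both=$(comm -12 "$work/team_a" "$work/team_b")

# join: relational join of two files sorted on their first field.
printf '%s\n' 'alice Athens' 'bob Berlin' > "$work/cities"
printf '%s\n' 'alice 2' 'bob 3' 'carol 1' > "$work/counts"
joined=$(join "$work/cities" "$work/counts")

# tr: map lowercase characters to uppercase.
upper=$(printf 'unix\n' | tr 'a-z' 'A-Z')

printf '%s\n' "$freq" "$both" "$joined" "$upper"
rm -rf "$work"
```

The sort | uniq -c | sort -rn idiom recurs in many pipelines, including the Outwit fax example later in these notes.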
 
Summarizing
-  wc: count lines, characters
 
-  head: first elements
 
-  tail: last elements
 
-  fmt: format into lines
 
-  awk with an END block
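The summarizing tools reduce a stream to a handful of values. The sketch below uses five made-up numbers; the variable names are hypothetical.

```shell
#!/bin/sh
# Made-up measurements, one per line.
data=$(printf '%s\n' 12 7 30 5 21)

# wc: count the lines.
count=$(printf '%s\n' "$data" | wc -l | tr -d ' ')

# head: the first elements (here, the two largest after a reverse numeric sort).
top2=$(printf '%s\n' "$data" | sort -rn | head -n 2)

# tail: the last element.
last=$(printf '%s\n' "$data" | tail -n 1)

# awk with an END block: accumulate per line, report once at the end.
stats=$(printf '%s\n' "$data" | awk '{ sum += $1 } END { print sum, sum / NR }')

echo "count=$count last=$last stats=$stats"
printf '%s\n' "$top2"
```

The END block runs after the last input line, which makes it the natural place for totals and averages.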
 
Plumbing
-  Pass data through pipeline
 
-  tee: tap output for further processing
 
-  xargs: apply command to numerous arguments
 
-  Pipe into a while loop for complicated processing 
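The plumbing elements can be sketched in one short script. The scratch directory and file names are made up, and the file names are assumed to contain no whitespace (plain xargs splits its input on whitespace).

```shell
#!/bin/sh
work=$(mktemp -d)
printf 'one\ntwo words\n' > "$work/a.txt"
printf 'three\n' > "$work/b.txt"

# tee: save the intermediate file list while the pipeline keeps going.
# xargs: turn the listed names into arguments of a single cat command.
total=$(find "$work" -name '*.txt' -print |
  sort |
  tee "$work/filelist" |
  xargs cat |
  wc -l |
  tr -d ' ')
listed=$(wc -l < "$work/filelist" | tr -d ' ')

# while loop: per-line processing that needs more than one command.
report=$(
  cat "$work/filelist" |
  while read -r f; do
    words=$(wc -w < "$f" | tr -d ' ')
    echo "$(basename "$f") $words"
  done
)

echo "files=$listed lines=$total"
printf '%s\n' "$report"
rm -rf "$work"
```

The loop body runs in a subshell in most shells, so its results are passed on through standard output and not through variables.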
 
Example: Analyze Java Files
Examine all Java files located in the directory src, and print the ten files with the highest number of occurrences of a method call to substring.
find src -name '*.java' -print |
xargs fgrep -c .substring |
sort -t: -rn -k2 |
head -10
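The pipeline can be tried on a small made-up tree. This is a sketch under assumptions: the three Java files are invented, and grep -F stands in for fgrep, which newer GNU grep releases flag as obsolescent. Note that -c counts matching lines, so two calls on one line count once.

```shell
#!/bin/sh
# Hypothetical source tree with 2, 1, and 0 lines that call substring.
work=$(mktemp -d)
mkdir -p "$work/src"
printf 'a.substring(1);\nb.substring(2);\n' > "$work/src/Two.java"
printf 'c.substring(3);\n' > "$work/src/One.java"
printf 'int x;\n' > "$work/src/Zero.java"

# grep -c prints file:count; sort on the second colon-separated field,
# numerically and in reverse, then keep the top entry.
top=$(cd "$work" && find src -name '*.java' -print |
  xargs grep -F -c .substring |
  sort -t: -rn -k2 |
  head -n 1)
echo "$top"

rm -rf "$work"
```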
Example: Determine Commit Times
find . -type f |
grep -v CVS |
xargs cvs -d /home/ncvs log -SN 2>/dev/null |
awk '/^date/{
    hour = substr($3, 1, 2)
    lines[$5 " " hour] += $9
  }
 END {
    for (i in lines) print i, lines[i]
  }' |
sed 's/;//g' |
sort >dev-times-lines.dat
join dev-times-lines.dat devlong >dev-times-lines-long.dat
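The aggregation step can be checked without a CVS repository by feeding awk a few hand-written log lines. The sample lines below are made up and assume the older cvs log layout (date, time, author, state, lines added and removed).

```shell
#!/bin/sh
# Made-up cvs log "date" lines; authors and numbers are illustrative only.
log='date: 2005/03/01 09:15:02;  author: alice;  state: Exp;  lines: +10 -2
date: 2005/03/01 09:47:30;  author: alice;  state: Exp;  lines: +5 -1
date: 2005/03/02 22:05:11;  author: bob;  state: Exp;  lines: +7 -0'

# Sum the added lines ($9) per author ($5) and hour of day (from $3),
# strip the trailing semicolons, and sort for a later join.
times=$(printf '%s\n' "$log" |
  awk '/^date/ {
      hour = substr($3, 1, 2)
      lines[$5 " " hour] += $9
    }
    END {
      for (i in lines) print i, lines[i]
    }' |
  sed 's/;//g' |
  sort)
printf '%s\n' "$times"
```

The associative array is keyed on the author and hour pair, so every commit in the same hour by the same author adds to one total.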
Example Result: Plotted Commit Times
Program Development
-  Editors 
 
-  GNU Compiler Collection 
 
-  Language development tools (bison, flex) 
 
-  Tracers and debuggers 
 
-  Profilers 
 
-  Code formatting tools (cb, indent) 
 
-  Version control systems  
 
(More on the above later.)
Outwit Tools
- winclip - access the Windows clipboard 
 
- winreg - manipulate the Windows registry 
 
- docprop - read document properties 
 
- odbc - select data from relational databases 
 
- readlink - resolve shell shortcuts 
 
- readlog - access the Windows event log 
 
Outwit Examples
Create a list of fax recipients ordered by the number of faxes they have received.
readlog Application |
awk -F: '/Outbound: Information: Fax Sent/{print $12}' |
sort |
uniq -c |
sort -rn
Extract the email addresses of all records in the table users of the database userDB, and send the file message to them by email.
fmt message |
mail $(odbc DSN=userDB "select email from users")
An Editor Checklist
- Regular expressions
 
- Syntax coloring
 
- Folding
 
- Error highlighting
 
- Indentation
 
- Marking of matching delimiters
 
- On-line help for API elements
 
- Code browsing
 
- Multiple buffers
 
- Macros
 
- Refactoring support
 
Two candidates: Emacs, vim.
Further Reading
- Alfred V. Aho, Brian W. Kernighan, and Peter J. Weinberger.
Awk—a pattern scanning and processing language.
In Unix Programmer's Manual [Unix Programmer's Manual, 1979].
Also available online http://plan9.bell-labs.com/7thEdMan/.
 
- S. R. Bourne.
An introduction to the UNIX shell.
In Unix Programmer's Manual [Unix Programmer's Manual, 1979].
Also available online http://plan9.bell-labs.com/7thEdMan/.
 
- Brian W. Kernighan and Rob Pike.
The UNIX Programming Environment.
Prentice Hall, Englewood Cliffs, NJ, 1984.
 
- Geoffrey J. Noer.
Cygwin32: A free Win32 porting layer for UNIX applications.
In Susan Owicki and Thorsten von Eicken, editors, Proceedings of the 2nd USENIX Windows NT Symposium, Berkeley, CA, August 1998. USENIX Association.
 
- Eric Steven Raymond.
The Art of Unix Programming.
Addison-Wesley, 2003.
 
- Dennis M. Ritchie and Ken Thompson.
The UNIX time-sharing system.
Communications of the ACM, 17(7):365–375, July 1974.
 
- Diomidis Spinellis.
Outwit: Unix tool-based programming meets the Windows world.
In Christopher Small, editor, USENIX 2000 Technical Conference Proceedings, pages 149–158, Berkeley, CA, June 2000. USENIX Association.
 
- Diomidis Spinellis.
Dear editor.
IEEE Software, 22(2):14–15, March/April 2005.
(doi:10.1109/MS.2005.36 (http://dx.doi.org/10.1109/MS.2005.36))
 
- Diomidis Spinellis.
Software engineering glossary, version control, part 2.
IEEE Software, 22(6):c2–c3, November/December 2005.
(doi:10.1109/MS.2005.169 (http://dx.doi.org/10.1109/MS.2005.169))
 
- Diomidis Spinellis.
Software engineering glossary, version control, part I.
IEEE Software, 22(5):107, September/October 2005.
(doi:10.1109/MS.2005.141 (http://dx.doi.org/10.1109/MS.2005.141))
 
- Diomidis Spinellis.
Version control systems.
IEEE Software, 22(5):108–109, September/October 2005.
(doi:10.1109/MS.2005.140 (http://dx.doi.org/10.1109/MS.2005.140))
 
- Diomidis Spinellis.
Working with Unix tools.
IEEE Software, 22(6):9–11, November/December 2005.
(doi:10.1109/MS.2005.170 (http://dx.doi.org/10.1109/MS.2005.170))
 
- UNIX Programmer's Manual. Volume 2—Supplementary Documents.
Bell Telephone Laboratories, Murray Hill, NJ, seventh edition, 1979.
Also available online http://plan9.bell-labs.com/7thEdMan/.
 
Exercises and Discussion Topics
-  Acquaint yourself with the command line of the computer you're using. If it doesn't have ready access to the Unix tools, install them.
 
-  Evaluate the editor you're using, and adopt a new one if needed.