23 July 2010

Changing Shell in Restrictive Environments (w/o `chsh`)

In certain restrictive environments you may not have access to the `chsh` command, and therefore you may resign yourself to having to do annoying things like invoking `~/bin/bash` from "~/.cshrc".

This method can be problematic as you are creating a child process of the (in the example above) csh parent process.  You will quickly get tired of having to exit out of bash and then csh, or the handful of other quirks and oddities that may result.

A much better solution is to make use of the "exec()" system call by way of a shell built-in.  This call can be used to replace an existing process rather than creating a child process.

Here is an example of something you might include at the top of your "~/.cshrc":
# Replace current shell with Bash if ~/bin/bash exists and is executable:
if ( -x ~/bin/bash ) then
   exec ~/bin/bash
endif

From the Bash `help` built-in command:
$ help exec
exec: exec [-cl] [-a name] file [redirection ...]
    Exec FILE, replacing this shell with the specified program.
    If FILE is not specified, the redirections take effect in this
    shell.  If the first argument is `-l', then place a dash in the
    zeroth arg passed to FILE, as login does.  If the `-c' option
    is supplied, FILE is executed with a null environment.  The `-a'
    option means to make set argv[0] of the executed process to NAME.
    If the file cannot be executed and the shell is not interactive,
    then the shell exits, unless the shell option `execfail' is set.
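
If you want to convince yourself that `exec` really replaces the process rather than spawning a child, here is a minimal sketch.  It assumes only a POSIX `sh` on the PATH; the variable name `pids` is mine, for illustration.  The inner shell prints its PID, execs a second `sh`, and that second shell prints its PID as well:

```shell
# The first sh prints its own PID, then uses exec to replace itself with a
# second sh.  The second sh reports the same PID, because exec reuses the
# existing process instead of forking a new one.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
echo "$pids"
```

Both lines of output should show the same number.  Drop the `exec` and the second number will normally differ, which is the child-process situation described above.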

05 June 2010

Another Update

Well, it has been two months since I started at Cisco Systems, and I will be relocating down to San Jose in early July.

The night class is over as of last week and I will be able to get back to creating some interesting posts with my reclaimed time fairly soon.

I'm looking forward to creating a set of posts on implementing various MPLS applications on Linux.

10 April 2010

New Job at Cisco + Night Class

To the folks that have been following my blog since I started it just last month: I will begin posting regularly after a brief hiatus.

I am in the process of adapting to a new job at Cisco Systems, and I am also trying to balance a night class.  As I get settled into the job and the night class finishes up I will have more free cycles to provide more thoughtful posts in the future.

Stay tuned!

30 March 2010

xargs: Executing a Fixed # of Processes in Parallel

`xargs` is a really great GNU utility that reads items from standard input and executes a command one or more times, using the blank-delimited input as the final arguments.

Many times you can achieve the same result by using `find -exec`, but there is one thing that I really like about `xargs`:  concurrency via the --max-procs option.

I will provide a few basic `xargs` examples, and I will then conclude with an example of using the --max-procs option.

Pass the first field of the first ten lines of '/etc/passwd' (the user names) as arguments to the `id` command (-I causes newline to become the separator):
$ cut -f1 -d: < /etc/passwd | head -10 | xargs -I '{}' id '{}'
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel) context=user_u:system_r:unconfined_t
uid=1(bin) gid=1(bin) groups=1(bin),2(daemon),3(sys) context=user_u:system_r:unconfined_t
uid=2(daemon) gid=2(daemon) groups=1(bin),2(daemon),4(adm),7(lp) context=user_u:system_r:unconfined_t
uid=3(adm) gid=4(adm) groups=3(sys),4(adm) context=user_u:system_r:unconfined_t
uid=4(lp) gid=7(lp) groups=7(lp) context=user_u:system_r:unconfined_t
uid=5(sync) gid=0(root) groups=0(root) context=user_u:system_r:unconfined_t
uid=6(shutdown) gid=0(root) groups=0(root) context=user_u:system_r:unconfined_t
uid=7(halt) gid=0(root) groups=0(root) context=user_u:system_r:unconfined_t
uid=8(mail) gid=12(mail) groups=12(mail) context=user_u:system_r:unconfined_t
uid=9(news) gid=13(news) groups=13(news) context=user_u:system_r:unconfined_t

Pass each line of the output of `pgrep ssh` to `ps` as command-line arguments:
$ pgrep ssh
3312
17036
21839
26978
27535
27538
31416
31419
31833
31837
32539
32542
$ pgrep ssh | xargs ps
  PID TTY      STAT   TIME COMMAND
 3312 ?        Ss     0:00 /usr/sbin/sshd
17036 ?        Ss     0:00 /usr/bin/ssh-agent /usr/bin/dbus-launch --exit-with-session /etc/X11/xinit/Xclients
21839 pts/7    S+     0:00 ssh steve@saturn01
26978 pts/8    S+     0:00 ssh steve@venus01
27535 ?        Ss     0:00 sshd: steve [priv]
27538 ?        S      0:02 sshd: steve@pts/8
31416 ?        Ss     0:01 sshd: steve [priv]
31419 ?        S      1:00 sshd: steve@pts/14
31833 ?        Ss     0:01 sshd: steve [priv]
31837 ?        S      0:02 sshd: steve@pts/15
32539 ?        Ss     0:04 sshd: steve [priv]
32542 ?        S      0:14 sshd: steve@pts/12

Here is an example of how you could use either `xargs` or `find -exec` to achieve the same result:
$ find -type f | xargs md5sum
d41d8cd98f00b204e9800998ecf8427e  ./test_file3
d41d8cd98f00b204e9800998ecf8427e  ./test_file1
d41d8cd98f00b204e9800998ecf8427e  ./test_file2
d41d8cd98f00b204e9800998ecf8427e  ./test_file4
$ find -type f -exec md5sum '{}' \;
d41d8cd98f00b204e9800998ecf8427e  ./test_file3
d41d8cd98f00b204e9800998ecf8427e  ./test_file1
d41d8cd98f00b204e9800998ecf8427e  ./test_file2
d41d8cd98f00b204e9800998ecf8427e  ./test_file4
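
One caveat with the `find | xargs` form: because `xargs` splits its input on blanks, a file name containing a space gets broken into several arguments.  The sketch below is my own illustration, not part of the original example.  It assumes a `find` that supports `-print0` and an `xargs` that supports `-0` (both are extensions found in the GNU and BSD versions), and it uses a hypothetical scratch directory with made-up file names:

```shell
# Hypothetical scratch directory and file names, used only for illustration.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space"

# Blank-delimited input: "with space" is split into two separate arguments.
split_count=$(find "$dir" -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-delimited input: each file name arrives as exactly one argument.
safe_count=$(find "$dir" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "split=$split_count safe=$safe_count"
rm -rf "$dir"
```

With two files on disk, the first pipeline hands `echo` three arguments while the second hands it two.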

Finally, the example we have all been waiting for.  One of the things I really like about `xargs` is the ability to define how many processes can be invoked concurrently while executing command lines built from the standard input.  By default `xargs` runs at most one process at a time (--max-procs=1).

Here is an example of using `xargs` to allow four concurrent processes to execute at a time, but no more:
$ find -type f | xargs --max-procs=4 -I '{}' -i sh -c "echo '{}' ; sleep 5"
./test_file10
./test_file8
./test_file3
./test_file12
<~5 second delay for each>
./test_file1
./test_file6
./test_file13
./test_file11
<~5 second delay for each>
./test_file7
./test_file2
./test_file14
./test_file5

Here is the output of the process table at 5 second intervals for the previous example:
$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
steve     1078 12879  0 14:48 pts/10   00:00:00 xargs --max-procs=4 -I {} -i sh -c echo '{}' ; sleep 5
steve     1079  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file10' ; sleep 5
steve     1080  1079  0 14:48 pts/10   00:00:00 sleep 5
steve     1081  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file8' ; sleep 5
steve     1082  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file3' ; sleep 5
steve     1083  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file12' ; sleep 5
steve     1084  1083  0 14:48 pts/10   00:00:00 sleep 5
steve     1085  1082  0 14:48 pts/10   00:00:00 sleep 5
steve     1086  1081  0 14:48 pts/10   00:00:00 sleep 5
steve     1087 12879  0 14:48 pts/10   00:00:00 ps -f
steve    12876 29766  0 09:41 pts/10   00:00:00 su -
steve    12879 12876  0 09:41 pts/10   00:00:00 -bash
$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
steve     1078 12879  0 14:48 pts/10   00:00:00 xargs --max-procs=4 -I {} -i sh -c echo '{}' ; sleep 5
steve     1090  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file1' ; sleep 5
steve     1091  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file6' ; sleep 5
steve     1092  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file13' ; sleep 5
steve     1093  1090  0 14:48 pts/10   00:00:00 sleep 5
steve     1094  1091  0 14:48 pts/10   00:00:00 sleep 5
steve     1095  1092  0 14:48 pts/10   00:00:00 sleep 5
steve     1096  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file11' ; sleep 5
steve     1097  1096  0 14:48 pts/10   00:00:00 sleep 5
steve     1100 12879  0 14:48 pts/10   00:00:00 ps -f
steve    12876 29766  0 09:41 pts/10   00:00:00 su -
steve    12879 12876  0 09:41 pts/10   00:00:00 -bash
$ ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
steve     1078 12879  0 14:48 pts/10   00:00:00 xargs --max-procs=4 -I {} -i sh -c echo '{}' ; sleep 5
steve     1102  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file7' ; sleep 5
steve     1103  1102  0 14:48 pts/10   00:00:00 sleep 5
steve     1104  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file2' ; sleep 5
steve     1105  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file14' ; sleep 5
steve     1106  1078  0 14:48 pts/10   00:00:00 sh -c echo './test_file5' ; sleep 5
steve     1107  1104  0 14:48 pts/10   00:00:00 sleep 5
steve     1108  1105  0 14:48 pts/10   00:00:00 sleep 5
steve     1109  1106  0 14:48 pts/10   00:00:00 sleep 5
steve     1112 12879  0 14:48 pts/10   00:00:00 ps -f
steve    12876 29766  0 09:41 pts/10   00:00:00 su -
steve    12879 12876  0 09:41 pts/10   00:00:00 -bash
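
If you would rather measure the effect than watch the process table, here is a rough timing sketch.  The four one-second "jobs" are placeholders of my own, `-P` is the short form of --max-procs, and `date +%s` is assumed to be available (it is on GNU and BSD systems):

```shell
start=$(date +%s)
# Four hypothetical one-second jobs, all allowed to run at once.
# Each input item is passed to sh as $1 (the "_" fills $0).
out=$(printf '%s\n' 1 2 3 4 | xargs -P 4 -n 1 sh -c 'sleep 1; echo "job $1 done"' _)
end=$(date +%s)
elapsed=$((end - start))

echo "$out"
echo "elapsed: ${elapsed}s"
```

With `-P 4` the whole batch should finish in roughly one second; change it to `-P 1` and the same batch should take about four.  The order of the "done" lines is not guaranteed when jobs run in parallel.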

Hopefully you have found this post useful, and it has perhaps shown you one of the lesser-known perks of using `xargs`.

26 March 2010

Two Types of "For" Loops in Bash

There are two ways to structure "for" loops in Bash.  One way allows for the iteration over a list of items, and the other way is by using the more traditional arithmetic expression.

Using "for" to iterate over a list of items is the method I prefer, and the following examples show why.

Method #1
By specifying a small list of numbers you could iterate over them:
$ for each in 1 2 3 4; do echo "Number: $each"; done
Number: 1
Number: 2
Number: 3
Number: 4

You could specify a path using brace expansion to create the list:
$ for each in ./{test_file1,test_file{2,3,4,5}}; do echo "Name: $each"; done
Name: ./test_file1
Name: ./test_file2
Name: ./test_file3
Name: ./test_file4
Name: ./test_file5

You could also use command substitution to create the list:
$ for each in $(echo "this is a test"); do echo "Result: $each"; done
Result: this
Result: is
Result: a
Result: test

And here is the example that will indicate why I rarely need to use the type of "for" loop described in Method 2:
$ for each in $(seq 1 5); do echo "Number: $each"; done
Number: 1
Number: 2
Number: 3
Number: 4
Number: 5

By using command substitution and the `seq` GNU utility you can create a list of numbers (in forward or reverse order) and use Method 1 to iterate over each item in the list. Unless you needed to loop over a complicated arithmetic expression, using `seq` would normally be sufficient.
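
As a quick sketch of the reverse-order and stepped cases mentioned above (assuming the `seq` from GNU coreutils, which takes FIRST INCREMENT LAST; the variable name `countdown` is just for illustration):

```shell
# Count down from 5 to 1 by giving seq a negative increment.
countdown=""
for each in $(seq 5 -1 1); do
    countdown="$countdown$each "
done
echo "Countdown: $countdown"

# Count upward from 1 to 10 in steps of 3.
for each in $(seq 1 3 10); do
    echo "Step: $each"
done
```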

Method #2
The more traditional arithmetic type of "for" loop can also be used, as shown in the following example:
$ for (( each=1 ; each<=32 ; each=$each*2)); do echo "Number: $each"; done
Number: 1
Number: 2
Number: 4
Number: 8
Number: 16
Number: 32

Method 2 allows for the use of complicated arithmetic expressions, whereas Method 1 relies on the construction of a list of items to iterate over.  I would imagine that if you needed to loop over a very large range of numbers Method 2 would consume substantially less memory and fewer CPU cycles compared to using Method 1 and some form of command substitution.
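
To give one example of an expression that would be awkward to build as a list, here is a small Bash-only sketch in which two counters move toward each other inside a single loop header.  The variable names `low`, `high` and `pairs` are my own:

```shell
# Bash-only syntax: the comma operator lets one loop header drive two counters.
pairs=""
for (( low=0, high=6; low<high; low++, high-- )); do
    pairs="$pairs$low-$high "
done
echo "Pairs: $pairs"
```

The loop stops as soon as the two counters meet, so no list of pairs has to be generated up front.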

25 March 2010

strace: A Great Troubleshooting Tool

`strace` is a utility that allows for the tracing of system calls and signals.  There are several options that can be used in order to only display specific system calls, display statistics, and otherwise modify the format of the output.

You can attach to an existing PID or you can specify a command and arguments to be traced as a part of the `strace` command line:
usage: strace [-dffhiqrtttTvVxx] [-a column] [-e expr] ... [-o file]
              [-p pid] ... [-s strsize] [-u username] [-E var=val] ...
              [command [arg ...]]
   or: strace -c -D [-e expr] ... [-O overhead] [-S sortby] [-E var=val] ...
              [command [arg ...]]
-c -- count time, calls, and errors for each syscall and report summary
-f -- follow forks, -ff -- with output into separate files
-F -- attempt to follow vforks, -h -- print help message
-i -- print instruction pointer at time of syscall
-q -- suppress messages about attaching, detaching, etc.
-r -- print relative timestamp, -t -- absolute timestamp, -tt -- with usecs
-T -- print time spent in each syscall, -V -- print version
-v -- verbose mode: print unabbreviated argv, stat, termio[s], etc. args
-x -- print non-ascii strings in hex, -xx -- print all strings in hex
-a column -- alignment COLUMN for printing syscall results (default 40)
-e expr -- a qualifying expression: option=[!]all or option=[!]val1[,val2]...
   options: trace, abbrev, verbose, raw, signal, read, or write
-o file -- send trace output to FILE instead of stderr
-O overhead -- set overhead for tracing syscalls to OVERHEAD usecs
-p pid -- trace process with process id PID, may be repeated
-D -- run tracer process as a detached grandchild, not as parent
-s strsize -- limit length of print strings to STRSIZE chars (default 32)
-S sortby -- sort syscall counts by: time, calls, name, nothing (default time)
-u username -- run command as username handling setuid and/or setgid
-E var=val -- put var=val in the environment for command
-E var -- remove var from the environment for command

Today at work a colleague was trying to figure out where certain configuration files for a third party application were located on the filesystem.  After he spent some time hunting around for them I pointed him to `strace` as a way to see which files the particular application was attempting to open().  He was then able to determine where the configuration files were located.

To trace all of the system calls which take a file name as an argument, you can use the '-e trace=file' option.  You can more specifically trace only the open() call by using '-e trace=open'.

Here is an example of using `strace` to trace the open() system calls by the MySQL client:
$ strace -e trace=open mysql
<...>
open("/etc/my.cnf", O_RDONLY|O_LARGEFILE) = 3
open("/etc/nsswitch.conf", O_RDONLY)    = 3
open("/usr/lib/mysql/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
open("/lib/libnss_files.so.2", O_RDONLY) = 3
open("/etc/services", O_RDONLY|O_CLOEXEC) = 3
open("/usr/share/mysql/charsets/Index.xml", O_RDONLY|O_LARGEFILE) = 4
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.1.42 Source distribution

open("/root/.mysql_history", O_RDONLY|O_LARGEFILE) = 4
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

open("/etc/localtime", O_RDONLY)        = 4
open("/usr/share/terminfo/x/xterm", O_RDONLY|O_LARGEFILE) = 4
open("/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE) = 4
open("/etc/inputrc", O_RDONLY|O_LARGEFILE) = 4
open("/usr/lib/gconv/gconv-modules.cache", O_RDONLY) = 4
mysql>
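
When the trace gets long, it helps to save it with the `-o file` option shown in the usage text and filter it afterwards.  The sketch below does not run `strace` itself; it works on three lines copied from the trace above (held in a variable I named `trace`) and keeps only the paths that were opened successfully:

```shell
# Sample lines copied from the trace above; in practice these would come
# from a file written with "strace -o <file> -e trace=open <command>".
trace='open("/etc/my.cnf", O_RDONLY|O_LARGEFILE) = 3
open("/usr/lib/mysql/libnss_files.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/etc/nsswitch.conf", O_RDONLY)    = 3'

# Drop the calls that failed (return value -1), then extract the quoted path.
opened=$(printf '%s\n' "$trace" | grep -v '= -1 ' | sed -n 's/^open("\([^"]*\)".*/\1/p')
echo "$opened"
```

Inverting the `grep` (dropping the `-v`) gives the opposite view: the paths the program looked for but could not find.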

`strace` is an enormously useful diagnostic tool.  I highly recommend that any Linux administrator who hasn't taken the time to become familiar with it make the time.

24 March 2010

Commonly Used Gzip Utilities

There are several useful utilities that are commonly included in the gzip package on most Linux distributions.  Most folks are familiar with `gzip`, so I will focus on some of the others.  Most of these commands could be performed by first extracting the archive and running the equivalent normal (non-z prefixed) utility.

These utilities typically work on both compressed files as well as uncompressed files.  This allows you to easily perform operations on combinations of compressed and uncompressed files.

`zcat` is identical to `gunzip -c` and uncompresses a list of files (or its standard input) and writes it to standard output:
$ zcat /usr/share/man/man1/gzip.1.gz | groff -Tascii -man
GZIP(1)                                                                GZIP(1)



NAME
       gzip, gunzip, zcat - compress or expand files

SYNOPSIS
       gzip [ -acdfhlLnNrtvV19 ] [-S suffix] [ name ...  ]
       gunzip [ -acfhlLnNrtvV ] [-S suffix] [ name ...  ]
<...>

`zmore` and `zless` provide an easy way to page through the contents of a gzip archive.  Try using one of them instead of `zcat` in the previous example.

`zgrep`, as you might expect, allows you to search compressed files for a regular expression by invoking `grep`.
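
Here is a small sketch of `zcat` and `zgrep` handling a compressed file and its uncompressed twin the same way.  The scratch directory, the file name "words.txt" and its contents are made up for illustration, and the gzip package's versions of these utilities are assumed:

```shell
# Hypothetical scratch directory and sample data, used only for illustration.
dir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$dir/words.txt"
gzip -c "$dir/words.txt" > "$dir/words.txt.gz"   # keep the original alongside

# zcat writes the decompressed data to standard output.
line_count=$(zcat "$dir/words.txt.gz" | wc -l | tr -d ' ')

# zgrep accepts the plain file and the compressed file alike;
# -c is passed through to grep and counts the matching lines.
plain_hits=$(zgrep -c 'a$' "$dir/words.txt")
gz_hits=$(zgrep -c 'a$' "$dir/words.txt.gz")

echo "lines=$line_count plain=$plain_hits gz=$gz_hits"
rm -rf "$dir"
```

All three words end in "a", so both counts should match the line count.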

`zforce` takes a list of files and adds the ".gz" extension to any gzip-compressed file that lacks it.  This prevents `gzip` from inadvertently compressing files twice.

`zdiff` and `zcmp` invoke `diff` or `cmp` on the uncompressed contents of two files, either of which may be compressed.

`gzexe` is an interesting one.  It attempts to compress an executable in place, and have it automatically decompress and execute at runtime:
$ ls -l debugfs
-rwxr-xr-x. 1 root root 88528 Mar 24 21:08 debugfs
$ gzexe debugfs
debugfs:     57.0%
$ ls -ls debugfs*
40 -rwxr-xr-x. 1 root root 38871 Mar 24 21:09 debugfs
88 -rwxr-xr-x. 1 root root 88528 Mar 24 21:08 debugfs~

The original file is backed up to "debugfs~", and if you open up the new copy of "debugfs" you will notice that about the first 43 lines of the file contain a shell script that decompresses the remainder of the file and then executes it.

Hopefully this post has at least exposed a little bit more about those oddly (but appropriately) named extra gzip utilities.