Thursday, August 8, 2013

Newbie help: Finding yourself (assuming you're a script)

Sometimes a script needs to figure out its own filename.  This is useful when creating error output and logs -- basically, anytime that the source of a chunk of output might be ambiguous.  This is done via basename:

myname=$(basename "$0")

Why basename $0?  If you think about handling arguments in shell scripts, you'll recall that the first argument is $1, the second is $2, and so on.  So the path used to invoke the script is the zeroth argument.  Basename simply strips that string down to the filename.  (Note that the shell does not allow spaces around the = in an assignment.)

To do the same thing, but get the containing directory instead, use

mydir=$(dirname "$0")

Easy enough.

Unfortunately, $0 only gives you the path of the script as it was run.  So if we do this:

cd /opt/somedir
./subdir/script.sh

then dirname $0 will only return ./subdir .  That's not useful if we need to change directories during the course of script.sh without losing track of where we started.

Here's the solution:

SCRIPTPATH=$( cd "$(dirname "$0")" && pwd -P )

Spawn a subshell, cd into the directory using the relative or absolute path from dirname $0, and then print the working directory of the subshell.  This version resolves symlinks; omit the -P to preserve them.
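
Putting the pieces together, a script might record both values at the top, before it wanders off anywhere else.  This is only a sketch: the cd to / and the message are made up for illustration, and it uses && so that pwd only runs if the cd succeeded.

```shell
#!/bin/sh
# Record who and where we are before anything changes the working directory.
# The -- guards against a script name that starts with a dash.
myname=$(basename -- "$0")
SCRIPTPATH=$(cd -- "$(dirname -- "$0")" && pwd -P)   # subshell: our own cwd is untouched

cd / || exit 1                                        # hypothetical work elsewhere
echo "$myname: running from $SCRIPTPATH, now in $(pwd)"
```

Even after the cd, $SCRIPTPATH still holds an absolute path, so files that live next to the script can be found again.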

(This post is mostly an excuse to write that last trick down.  Thanks to the folks on this post for coming up with it.)

Wednesday, May 8, 2013

Newbie help: find basics

The find command is a general file-finding utility with a much broader range of talents than locate.  Find starts from a given directory and locates all files below that directory that meet specified constraints.  You can look for files by name (or regex), size, age, ownership, permissions, or any combination.

Find has built-in actions, such as delete, ls, print[ to file], and exec[ute-another-command].  You can also pipe the output into other commands.  I'll cover actions in a later post.

Basic syntax:
find path -constraint value

Add as many constraints as you like.   Any flag can be negated with a bang; see the very last examples for the syntax.

The most common constraints are size (-size) and modification time (-mtime).  Both of these flags can serve as an upper or lower bound: prefix the value with + for "more than" or - for "less than".

Example:  Which files below the current directory are less than five days old and more than 5kb?  Which are more than five days old and less than 5kb?  (A bare number after -size counts 512-byte blocks, so use the k suffix for kilobytes.)

find . -mtime -5 -size +5k
find . -mtime +5 -size -5k
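
The unit matters for -size: find counts a bare number in 512-byte blocks, while the k suffix means kilobytes.  Here is a small throwaway demonstration, assuming a find that understands the k suffix (GNU and BSD find both do) and a mktemp that supports -d.  The file names are made up.

```shell
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/big.bin" bs=1000 count=10 2>/dev/null    # 10000 bytes, modified just now
dd if=/dev/zero of="$demo/small.txt" bs=1000 count=1 2>/dev/null   # 1000 bytes
touch -t 201301010000 "$demo/small.txt"                            # backdate to January 2013

recent_big=$(find "$demo" -type f -mtime -5 -size +5k)   # newer than 5 days, bigger than 5k
old_small=$(find "$demo" -type f -mtime +5 -size -5k)    # older than 5 days, smaller than 5k
in_blocks=$(find "$demo" -type f -size +5000)            # bigger than 5000 blocks (about 2.5 MB)

echo "recent and big: $recent_big"
echo "old and small:  $old_small"
echo "over 5000 blocks: ${in_blocks:-none}"
rm -rf "$demo"
```

Neither file comes anywhere near 5000 blocks, so the last search finds nothing.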

The -type flag allows you to specify whether you're looking for files or directories.  By default, find returns both.  Use "f" for regular files and "d" for directories.  Check man for the other types.

The -name flag can take a full case-sensitive filename, or a shell-style wildcard pattern (a glob, not a regex).  If you use a pattern, enclose it in quotes so the shell doesn't expand it before find sees it.

find /app -name applog
find /app -name "*log"
find . -name "*dunno*"
find . -name "[jJ]ava*"
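
To see how -name patterns match, here is a quick sandbox run.  The file names are invented, and LC_ALL=C is only there to make the sort order predictable.

```shell
demo=$(mktemp -d)
touch "$demo/applog" "$demo/Java.txt" "$demo/java.log" "$demo/notes"

# Quote the pattern so find, not the shell, does the matching.
logs=$(find "$demo" -name "*log" | LC_ALL=C sort)
java=$(find "$demo" -name "[jJ]ava*" | LC_ALL=C sort)   # bracket lists characters, no comma needed

echo "ends in log:"; echo "$logs"
echo "starts with java or Java:"; echo "$java"
rm -rf "$demo"
```

A comma inside the brackets would be treated as one more character to match, not as a separator.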

Find can also identify files owned by a specific user or group.  (This flag is available on OS X, but doesn't appear to work.)

find /tmp -user kexline

Combine flags to ask real-world questions about your files.  Example:  What largish, non-gzipped files did I create in my application directory since this time yesterday?

find /app -type f -mtime -1 -user kexline -size +100000 ! -name "*gz"

As promised, there's a bang negation.  Any flag can be preceded by a bang to reverse its meaning.  What files did all the other non-root users create yesterday, anywhere on the machine?

sudo su 
find / -type f -mtime -1 ! -user kexline ! -user root 
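
If you want to try negation without becoming root, a scratch directory works fine.  This sketch uses made-up file names and passes the numeric user id from id -u to -user, which find accepts in place of a name.

```shell
demo=$(mktemp -d)
touch "$demo/applog" "$demo/applog.1.gz" "$demo/applog.2.gz"
me=$(id -u)

not_gz=$(find "$demo" -type f ! -name "*gz")                    # everything that isn't gzipped
not_mine=$(find "$demo" -type f ! -user "$me")                  # files owned by someone else
mine_not_gz=$(find "$demo" -type f -user "$me" ! -name "*gz")   # plain and negated flags mixed

echo "not gzipped: $not_gz"
echo "not mine: ${not_mine:-none}"
echo "mine and not gzipped: $mine_not_gz"
rm -rf "$demo"
```

Each bang applies only to the flag that immediately follows it, and it needs a space on both sides.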

Newbie help: The "locate" command

Some *nixes, including CentOS and Red Hat, have an indexed finder utility called locate.  It's a very limited tool, but when you know what you're looking for, it can be the quickest way to find a file.

Basic usage:
locate [-i insensitive, -w wholename] a-filename-substring

It will generally return too much info, so grep for other known substrings to pare down the results.

Example:  I know exactly what file I want, but where did I put it?
$locate iCanHasPortlet.war
/app/liferay/iCanHasPortlet.war

Example:  I only know bits of a ridiculously long filename, what was it?
Mac:~ kexline$ locate -i license | grep -i prod | grep -i acme
/Users/kexline/Documents/liferay/licenses/license-production-production-5.6sp3-acmecomputercompany-main.xml

If locate doesn't cough up the goods, then your index is probably older than the desired file.  (Locate is not useful for new files; but then again, if something is very new you should know what or where it is.)  On RH/Centos, become root, run updatedb & , and get some coffee. 

Newbie help: Is this file in use?


You can use "fuser" to check whether any processes have a file open.

$ fuser somefile 
somefile:        20719
$ fuser anotherfile
$

Pid 20719 is using somefile, but nothing is using anotherfile.

Pipe queen bonus:  I want to compress some old stuff in a directory.  Here are some big files:
$du -sk ./* | sort -n | tail -5
71796 ./applog.20120418.gz
349252 ./applog
422952 ./applog.20120613
598672 ./applog.20120627
2816980 ./applog.20130204

Which of the largest five files are in use, if any?
$du -sk ./* | sort -n | tail -5 | awk '{print $2}' | xargs fuser
./applog:      44827

Pid 44827 is using the current applog.  The last three are safe to compress.

(Tangent:  Compression generally writes to the current filesystem.  Therefore, you can't compress in a filesystem that's 100% full.)
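
The same check can gate the compression itself.  This is a rough sketch, assuming fuser and gzip are installed; it relies on fuser printing PIDs to stdout and the file names to stderr.  The log names are borrowed from the example above, and the background sleep is just a stand-in for an application holding its log open.

```shell
demo=$(mktemp -d)
echo "old entries" > "$demo/applog.20130204"
echo "live entries" > "$demo/applog"
sleep 30 < "$demo/applog" &      # stand-in for the app that has applog open
holder=$!
sleep 1                          # give the background process a moment to start

compressed=""
for f in "$demo"/applog*; do
    # Empty output from fuser means no process has the file open.
    if [ -z "$(fuser "$f" 2>/dev/null)" ]; then
        gzip "$f" && compressed="$compressed $f"
    else
        echo "skipping $f (in use)"
    fi
done
echo "compressed:$compressed"

kill "$holder" 2>/dev/null || true
wait "$holder" 2>/dev/null || true
rm -rf "$demo"
```

A file could still be opened between the check and the gzip, so treat this as a convenience, not a lock.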

Newbie help: Why is my filesystem full?

If a filesystem fills up and you have no idea where to start, try the disk usage command. This will at least help you find breathing room, and it may help you find the overgrown or unexpected file that caused the problem.

If you haven't already, use df to make sure you know which filesystem is full.  Some systems only have one filesystem, but it's more common to have a few.  For the purpose of this discussion, you're only concerned about files below the mount point of one overutilized filesystem.


Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vgroot     30G  3.6G   25G  13% /
tmpfs                 1.9G  4.0K  1.9G   1% /dev/shm
/dev/mapper/vgapp      20G  6.7G   13G  96% /app
nfssrv:/vol/devshare   10G  192K   10G   1% /app/dshare
nfssrv:/export/home/fred
                      102G   90G   12G  89% /home/fred


cd to the mount point or a point in the directory tree where you suspect things might have gone awry, then do this:

du -sk ./* | sort -n 

In English, that's "disk usage -summarize -kilobytes, then sort -numeric."  The -s flag prints one total per argument instead of listing every subdirectory.  These flags will be the same on any *nix.
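
To watch the pipeline work without touching a real filesystem, you can build a tiny tree and run it there.  The directory and file names below are invented, and the sketch assumes /dev/urandom exists (random data is used so the files really occupy space).

```shell
demo=$(mktemp -d)
mkdir "$demo/logs" "$demo/deploy"
dd if=/dev/urandom of="$demo/logs/applog" bs=1024 count=2048 2>/dev/null   # about 2 MB
dd if=/dev/urandom of="$demo/deploy/note" bs=1024 count=4 2>/dev/null      # about 4 KB

# Run du from inside the directory, in a subshell so our own cwd is unchanged.
report=$(cd "$demo" && du -sk ./* | sort -n)
# du separates the size and the name with a tab, so cut -f2 picks out the name.
largest=$(printf '%s\n' "$report" | tail -n 1 | cut -f2)

echo "$report"
echo "largest: $largest"
rm -rf "$demo"
```

Because the sort is ascending, the biggest entry always ends up on the last line, right above your prompt.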

Non-root users will receive a lot of "Permission denied" errors when running this command in system or shared directories. This can be a good thing; if you're interested in finding files created by a certain user, the errors show you where not to look. On the other hand, if you're really in the dark and/or troubleshooting a multiuser system, you may need to become root.

You can use du to identify the largest directories, which may help you find your problems. 

An example:

$cd /app
$ls -atlr
total 8167528
dr-xr-xr-x. 30 root   root      4096 Mar 20 03:51 ..
drwxr-xr-x.  7 lruser lrgrp     4096 Mar 26 20:04 common
drwxrwxr-x.  9 lruser lrgrp  1490944 Apr 23 16:24 .
drwxr-xr-x. 16 lruser lrgrp     4096 May  1 16:52 calendar
drwxr-xr-x. 14 lruser lrgrp     4096 May  7 18:16 liferay

Not much help, right?  But du makes it easy to decide where to look first.
$du -sk ./* | sort -n
644116 ./common
1563248 ./calendar
8613876 ./liferay

So the Liferay directory's pretty chubby.  Why is that?
$cd liferay
$du -sk ./* | sort -n
0 ./jdk-liferay
4 ./deploy
520884 ./data
1092408 ./logs
1232588 ./backup
5492700 ./appserver

Unsurprisingly, the JVM accounts for most of Liferay's 8 GB.  I should also look for things to delete from the backup directory, but let's stick with the big fish for now.
$cd appserver
$du -sk ./* | sort -n
...
149408 ./temp
419056 ./web
4759364 ./logs

And again unsurprisingly, the logs directory accounts for most of the JVM's utilization.  Du works on files as well, so you could use it to quickly find the largest file in a directory.
$cd logs
$du -sk ./* | sort -n
54340 ./applog.20121001.gz
71796 ./applog.20120418.gz
349252 ./applog
422952 ./applog.20120613
2816980 ./applog.20130204

Wait, I'm lazy.  How big was that last file?
$ls -lh applog.20130204
-rw-r--r--. 1 lruser lrgrp 2.7G Feb  4 16:55 applog.20130204

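Many commands that show file sizes, such as df, du, and ls, have a -h (-human_readable) flag.  We use du -k because the -h output is not easily sortable with sort -n.  (GNU sort has its own -h flag that understands those suffixes, but it isn't available everywhere.)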

If you haven't freed enough space or found your culprit, pick another place in the tree and du down from there.
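
If you get tired of the cd-and-du routine, the walk can be scripted.  Here is a minimal sketch: drill is a hypothetical helper name, not a standard command, and it simply follows the largest entry at each level until it reaches a file.  The demo tree is invented and assumes /dev/urandom exists.

```shell
# Follow the largest entry downward, printing each step.
drill() {
    dir=$1
    while [ -d "$dir" ]; do
        largest=$(du -sk "$dir"/* 2>/dev/null | sort -n | tail -n 1 | cut -f2)
        [ -n "$largest" ] || break       # empty or unreadable directory: stop here
        echo "$largest"
        dir=$largest
    done
}

# Throwaway tree to try it on.
demo=$(mktemp -d)
mkdir -p "$demo/liferay/logs" "$demo/liferay/deploy" "$demo/common"
dd if=/dev/urandom of="$demo/liferay/logs/applog" bs=1024 count=1024 2>/dev/null
dd if=/dev/urandom of="$demo/liferay/deploy/x" bs=1024 count=4 2>/dev/null
dd if=/dev/urandom of="$demo/common/y" bs=1024 count=4 2>/dev/null

trail=$(drill "$demo")
echo "$trail"
rm -rf "$demo"
```

It only ever chases the single biggest entry, so it will miss a directory full of medium-sized files; the manual approach is still better when the culprit isn't obvious.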