Shell scripting

Rough Notes

  • cat file1 file2 > file3: Send the contents of file1 and file2 to file3. Redirecting to /dev/null instead discards the output.
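  • Backslashes escape special characters, e.g. \* is a literal asterisk rather than a wildcard.
  • tar czf: Create a gzip-compressed archive of a directory. c means create, z means compress with gzip and f names the archive file, e.g. tar czf archive.tar.gz directory.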
  • du: Check disk usage.
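  • Wildcards: The shell expands patterns like *.pdf to the matching filenames, e.g. pdf1.pdf pdf2.pdf and so on, before the command runs.
  • wc: prints the number of lines, words and bytes (in that order) for each file in a list, e.g. wc file1.txt file2.txt.
  • To send a command's output to a file, put > outputfilename at the end. This overwrites the file if it already exists.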
  • cat: prints contents of files one after the other.
  • sort: output sorted contents of file.
  • head: head -n k filename outputs the first k lines of the given file.
  • To use the output of one command as the input of another, use a pipe: command1 | command2.
  • UNIX philosophy is to create simple tools that do one job well and work well with each other.
  • This is the pipes and filters model of shell programming: each filter reads standard input, transforms it and writes standard output.
  • -rwxrwxrwx: The leading - means it is a regular file; d means a directory. The first rwx triple gives the owner's permissions, the second the group's and the last everyone else's. rwx stands for read, write, execute. For directories, r allows listing the names inside, w (together with x) allows creating and deleting entries, and x allows traversing the directory, i.e. accessing files and subdirectories inside it by name. With x but not r, a user can reach an entry whose name they already know but cannot list the directory's contents.
  • Use ; to run multiple commands on the same line.
  • cat > filename reads what the user types and writes it into the file (press ^D, i.e. Control-D, on a new line to finish). Useful for creating quick shell scripts.
  • grep stands for global/regular expression/print. It finds and prints lines in files that match the given regexp pattern. Some flags include -w for whole-word matches only and -n to also print line numbers.
  • find can be used to find files instead of lines.
  • find and grep are commonly used together.
  • ps lists running processes.
  • To run a process in the background, put & at the end. To bring it to the foreground use fg.
  • set shows variables in the shell. The values of shell variables are strings; it is up to programs to convert them to other types like int when necessary. The PATH variable's value lists the directories in which the shell looks for runnable programs.
  • echo prints its arguments. Prefix a variable name with $ to print its value, e.g. to print the value of the variable PATH, use echo $PATH.
  • Assigning a value to a variable only changes it in the current shell; programs started from that shell do not see it. To pass it on to child processes, use the export command, often placed in the .bashrc file, which bash runs when an interactive shell starts. It is also common to define aliases inside this file.
  • ssh, scp. Commands can be run directly via ssh user@remote 'command'. ssh uses public-key cryptography to authenticate the server (and optionally the user) and to agree on session keys, after which each of the two communication channels (server to user and user to server) is encrypted with its own key.
  • 2> redirects stderr, since > alone only redirects stdout. To redirect stdout and stderr to separate files, use e.g. command > stdoutputfile 2> stderrorfile. To redirect both to the same file, use &> in bash (the portable form is command > file 2>&1). Also, 1>> and 2>> append instead of overwriting, creating the file if it does not exist (1> is the same as >).
  • Pipes and filters run a single program on some input; however, we might want to run the same program separately for each input. Loops help here. E.g. to zip every file of a certain type in the directory, the following loop will do: for file in *.pdb; do zip "$file.zip" "$file"; done. If such a loop produces output, it can be piped into further commands.
  • Merge PDFs via

    gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf file1.pdf file2.pdf
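
The per-file loop from the notes above can be sketched as follows. This is only an illustration under assumptions: the .pdb file names are made up, everything runs in a throwaway directory, and gzip stands in for zip because it is more commonly installed.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical input files.
printf 'ATOM 1\n' > a.pdb
printf 'ATOM 2\n' > b.pdb

# Compress each file separately; quoting "$file" keeps names with spaces intact.
for file in *.pdb; do
    gzip -c "$file" > "$file.gz"
done

# The combined output of a loop can itself be piped into another command.
for file in *.pdb; do
    wc -l < "$file"
done | sort -n
```

Quoting the loop variable matters as soon as a filename contains a space; without the quotes the shell would split it into several arguments.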
    
  • du -h -x -d 1 lists the size of each directory in the current directory.
  • du --inodes -h -x -d 1 lists the number of files (inodes) within each directory in the current directory. Add | sort -nr to sort in descending order; note that this is not robust, since it would rank 900 above 2K for example. GNU sort -hr understands the suffixes.
  • ps -U $USER --no-headers -o rss | awk '{ sum+=$1} END {print int(sum/1024) "MB"}' shows the total memory consumption of the current user's processes, which is helpful on shared clusters.
  • Compressing PDFs:

    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
    
-dPDFSETTINGS option     Description
-----------------------  -----------------------------------------------
-dPDFSETTINGS=/screen    Lower quality and smaller size (72 dpi).
-dPDFSETTINGS=/ebook     Better quality, but slightly larger size (150 dpi).
-dPDFSETTINGS=/printer   Printer-quality output (300 dpi).
-dPDFSETTINGS=/default   General-purpose output. Can cause large PDFs.
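
Returning to the du ... | sort caveat noted earlier, the difference between numeric and human-numeric sorting can be seen on a small made-up sample. This sketch assumes GNU sort (for the -h flag); the two du-style lines are hypothetical, not real output.

```shell
# Hypothetical du-style output: a count or size, a tab, then a path.
sample=$(printf '950\t./small\n2K\t./large\n')

# Numeric sort reads "2K" as 2, so ./small is wrongly ranked first.
numeric_first=$(printf '%s\n' "$sample" | sort -nr | head -n 1)

# Human-numeric sort (GNU sort -h) understands the K suffix.
human_first=$(printf '%s\n' "$sample" | sort -hr | head -n 1)

printf 'sort -nr puts first: %s\n' "$numeric_first"
printf 'sort -hr puts first: %s\n' "$human_first"
```

An alternative is to drop -h from du so that the values are plain numbers and sort -nr is safe.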
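  • Multiple patterns can be passed to find to narrow down results; they are combined with a logical AND. E.g. to search for PDF files whose names do not contain 2023 (drop -not to keep only the names that do contain 2023):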

    find . -name "*.pdf" -not -name "*2023*" 
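
To see how combined find tests behave, here is a small sketch that runs both variants against a throwaway directory. The file names are hypothetical, and -not is assumed to be available (it is in GNU and BSD find; ! is the POSIX spelling).

```shell
# Throwaway directory with three hypothetical files.
workdir=$(mktemp -d)
touch "$workdir/report-2022.pdf" "$workdir/report-2023.pdf" "$workdir/notes-2023.txt"

# Tests are ANDed: PDFs whose names contain 2023.
with_2023=$(find "$workdir" -name '*.pdf' -name '*2023*')

# Negating the second test: PDFs whose names do not contain 2023.
without_2023=$(find "$workdir" -name '*.pdf' -not -name '*2023*')

printf 'with 2023:    %s\n' "$with_2023"
printf 'without 2023: %s\n' "$without_2023"
```

Quoting the patterns stops the shell from expanding them before find sees them.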
    
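
Going back to the 2>, 2>&1 and >> redirections from the notes above, the following sketch shows where each stream ends up. The noisy function and the file names are made up for illustration.

```shell
workdir=$(mktemp -d)

# A hypothetical command that writes one line to each stream.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Separate files for stdout and stderr.
noisy > "$workdir/out.txt" 2> "$workdir/err.txt"

# Both streams into one file; "> file 2>&1" is the portable spelling of bash's "&> file".
noisy > "$workdir/both.txt" 2>&1

# ">>" and "2>>" append instead of truncating, creating the file if needed.
noisy >> "$workdir/out.txt" 2>> "$workdir/err.txt"
```

After this runs, out.txt and err.txt each hold two lines (one from each call), and both.txt holds one line from each stream.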
