Introduction to Bash Scripting
Katherine Holcomb
October 18, 2017
www.arcs.virginia.edu
Outline
- Scripting Introduction
- Bash Script Structure
- Hands-on:
  * Executing a Script
  * Variables and Expressions
  * Conditionals
  * Comparisons (Integer Values and Strings)
  * Loops
  * Command Line Arguments
  * String Operations
  * Arrays
  * Dealing with Files
More Tutorials and Resources
- Linux Shell Scripting Tutorial:
- Advanced Scripting Tutorial
- Regex Tutorials
- Sed and (g)awk tutorial
  * http://www.grymoire.com/Unix/Sed.html
  * http://www.grymoire.com/Unix/Awk.html
What is a Script, What Can it be used for?
A Bash script is a plain text file that contains instructions for the computer to execute.
Scripts are interpreted at runtime and executed line-by-line. Scripts are not standalone executables but must be run through an interpreter.
Anything that can be executed or evaluated on the bash command line can be placed into a script.
Frequently used to automate repetitive tasks:
* File handling, data backups
* Scheduling computing jobs, e.g. on UVA’s Rivanna High-Performance Computing Cluster.
How to write a script
1) Bash shell environment to execute scripts
2) Text editor to create and edit script files
  * vi, vim, nano
  * gedit through FastX
  * avoid Windows Notepad and Microsoft Word
Our Workspace Setup
Basic Unix commands:
cd <path> changes current directory to <path>
cd ~/ changes to your home directory
pwd shows full path of current directory
mkdir -p <path> creates the directory <path>, including any missing parent directories
ls lists non-hidden files in current directory
ls *.sh lists non-hidden files that end in .sh
mv <old_filename> <new_filename> move or rename file
cd ~/
pwd
mkdir scripts
cd scripts
pwd
ls
Bash Script Structure: A Simple Example
Create script hello.sh
#!/bin/bash
#This is a simple bash script
echo "Hello world!" #print greeting to screen
echo "I am done now!"
The first line tells the system which interpreter to start to execute the commands. It is sometimes called a “shebang”. An alternative is:
#!/usr/bin/env bash
Scripts can be annotated with comments; characters after the # character are ignored by the interpreter.
The third and fourth lines are commands.
Executing A Script
Suppose the previous script was in a file hello.sh
By default it is just text. You must explicitly make it executable:
chmod u+x hello.sh
Then you can just invoke it by name at the command line:
./hello.sh
Alternative: Run via bash interpreter
bash hello.sh
Comments and Line Continuations
All text from the # to the end of the line is ignored
To continue a statement on the next line, type \ followed by <enter>
#This is a comment
echo "This line is too long I \
want it to be on two lines"
Multiple statements can be placed on the same line when separated by the ;
a=20 ; b=3
Sourcing
- If you execute the script as a standalone script, it runs in a new shell; the shebang tells the system which interpreter to start. If the script has no shebang, you can run it with
- bash myscript.sh
- Sometimes you just want to bundle up some commands that you repeat regularly and execute them within the current shell. Such a script does not need a shebang (it would be treated as an ordinary comment), and you must source the script:
- . <script> or source <script>
Variables: Placeholders to store values
The shell permits you to define variables.
__Scalar Variable:__ Holds a single value
color=blue
echo $color
__Array Variable:__ Holds multiple values
colors=(blue red green)
echo ${colors[@]} # prints all values
echo ${colors[0]} # prints first value
echo ${colors[1]} # prints second value
Assigning a value to a variable: do not put spaces around the = sign.
Referencing a variable: the $ tells the interpreter to look up the value of the variable.
Exercise
Create a new script color.sh
Define a variable and assign it a value
Print the variable’s value to the screen
Executing a command
Characters between a pair of backticks ` are interpreted and executed as commands.
The output of an executed command can be stored in a variable or used as input for another command.
In current shell versions $(COMMAND) is equivalent to `COMMAND`
__Create new script__ command.sh
#!/bin/bash
echo `date` # exec date command, print output
p=`pwd` # assign output to variable first
echo Current directory: $p
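As a minimal sketch of the same idea with the $(COMMAND) form, which nests without extra escaping. The variable names (today, parent, nfiles) are illustrative only and not part of the original script.

```shell
#!/bin/bash
# Store the output of a command in a variable
today=$(date +%Y-%m-%d)
echo "Today is $today"

# $(...) can be nested without extra escaping
parent=$(basename "$(dirname "$PWD")")
echo "Parent directory name: $parent"

# Use the output of one command as input for another
nfiles=$(ls | wc -l)
echo "Number of non-hidden entries here: $nfiles"
```

Quoting the substitution, as in "$(dirname "$PWD")", keeps paths that contain spaces intact.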
Escaping Special Characters
Bash uses several special characters like $, ", ', `, \, and others that are interpreted with special meaning.
At your bash prompt, type
echo Cool. You won $100,000 in the lottery.
# Try with escaping the $ character
echo Cool. You won \$100,000 in the lottery.
The \ character instructs the interpreter to treat the next character as a literal and ignore (escape) any possible special meaning
Curly Braces: Keeping a tight wrap on variable names
Create script name.sh
#!/bin/bash
firstname=Peter
lastname=Miller
fullname="$firstname $lastname"
# simple variable operation
echo Hello $fullname
#Try to echo "Where are the Millers?"
echo Where are the $lastname s?
echo Where are the $lastnames?
echo Where are the ${lastname}s?
Curly braces {} are used to separate a variable name from adjacent characters.
Strings, Single (') and Double (") quotes
Bash is not very good at math, so much of what we do with Bash involves strings, i.e. sequences of text and special characters.
Pair of Single Quotes ''
preserves the literal value of each character within the quotes.
Pair of Double Quotes ""
preserves the literal value of all characters within the quotes with the exception of $, `, and \, which retain their special meaning.
$ Shell expansion, e.g. a variable name follows
\ Escape character
`COMMAND` Executes COMMAND and returns its output stream; equivalent to $(COMMAND) in newer versions
Comparing Single and Double Quotes
__Example:__
Open name.sh and add the following statements
# Compare single and double quotes
echo 'Where is Mr. ${lastname}?'
echo "Where is Mr. ${lastname}?"
Output:
Where is Mr. ${lastname}?
Where is Mr. Miller?
Conditionals
- The shell has an if-else construct to allow execution of distinct command sets based on a predefined condition
if [[ <condition> ]]
then
    commands
elif [[ <condition> ]] #optional
then
    commands
else #optional
    more commands
fi #ends the if condition block
__Note: A space is required after [[ and before ]]__
String Comparison Operators
- The conditions take different forms depending on what you want to compare.
- String comparison operators:
- = equal
- != not equal
- < lexically before (may need to be escaped)
- > lexically after (may need to be escaped)
- -z zero length
- -n not zero length
- == can be used but behaves differently in [] versus [[]]
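A short sketch of these operators inside [[ ]], where < and > do not need to be escaped. The variable names and values are made up for illustration.

```shell
#!/bin/bash
a="apple"
b="banana"
empty=""

# Inside [[ ]] the < and > operators compare strings lexically
if [[ $a < $b ]]; then
    order="$a sorts before $b"
fi
echo "$order"

# -z is true for a zero-length string, -n for a non-empty one
if [[ -z $empty ]]; then
    echo "empty has zero length"
fi
if [[ -n $a ]]; then
    echo "a is not empty"
fi

# Inside [[ ]], == treats an unquoted right-hand side as a pattern
if [[ $b == b* ]]; then
    match="yes"
fi
echo "Pattern match: $match"
```

With single brackets, [ "$a" \< "$b" ] needs the backslash, because otherwise the shell treats < as input redirection.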
Example: String Comparisons
__Create new script__ ifname.sh
#!/bin/bash
if [[ $name == "Tom" ]]; then
    echo ${name}, you are assigned to Room 12
elif [[ $name == "Harry" ]]; then
    echo ${name}, please go to Room 3
elif [[ -z $name ]]; then
    echo You did not tell me your name
else
    echo "${name}, I don't know where \
you are supposed to go"
fi
Exercise
- Insert this line below the shebang:
- name=Tom
- Rerun the script.
- Change the name to an empty string:
- name=""
- Rerun the script.
Numeric Comparisons
Numeric comparisons are possible:
-eq equal
-ne not equal
-gt greater than
-ge greater than or equal
-lt less than
-le less than or equal
a=2
if [[ $a -eq 2 ]] ; then
echo It is two.
fi
Testing File Properties
- This is a very common occurrence in bash scripts so we have a set of operators just for this.
- The most common operators are:
- -e <file> : file exists
- -f <file> : file exists and is a regular file
- -s <file> : file exists and has length > 0
- -d <file> : file exists and is a directory
Example: Testing File Properties
if [[ -d $to_dir ]]; then
    if [[ -f $the_file ]]; then
        cp $the_file $to_dir
    fi
else
    mkdir $to_dir
    echo "Created $to_dir"
fi
Other Conditional Operators
! not; negates what follows
-a logical and for compound conditionals inside [ ]; inside [[ ]] use && instead
-o logical or for compound conditionals inside [ ]; inside [[ ]] use || instead
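A brief sketch of compound conditions with [[ ]]. The values and the path /no/such/file are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
count=5
name="Tom"

# && : both conditions must be true
if [[ $count -gt 0 && $count -lt 10 ]]; then
    range="single digit"
fi
echo "count is $range"

# || : at least one condition must be true
if [[ $name == "Tom" || $name == "Harry" ]]; then
    known="yes"
fi
echo "known name: $known"

# ! negates the test that follows
if [[ ! -e /no/such/file ]]; then
    missing="yes"
fi
echo "file missing: $missing"
```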
Case Statement
case expression in
    pattern1 )
        statements ;;
    pattern2 )
        statements ;;
    pattern3 )
        statements ;;
    *)
        statements ;; # default statement
esac # defines end of case block
Example
case $filename in
    *.c)
        echo "C source file"
        ;;
    *.py)
        echo "Python script file"
        ;;
    *) #optional, indicates default
        echo "I don't know what this file is"
        ;;
esac
For Loops
The bash for loop is a little different from the equivalent in C/C++/Fortran (but is similar to Perl or Python)
for variable in iterator
do
commands
done # defines end of for loop
Examples
__Create new script__ forloop.sh:
for i in 1 2 3 4 5 ; do
    echo "Loop 1: I am in step $i"
done
for i in {1..5} ; do #bash 3.0 and up
    echo "Loop 2: I am in step $i"
done
for i in 0{1..9} {90..100} ; do
    echo "Loop 3: I am in step $i"
done
“Three-Expression” For Loop
for (( EXPR1; EXPR2; EXPR3 ))
do
statements
done
Open __forloop.sh__ and add:
name="file"
IMAX=10
for (( i=0 ; i<${IMAX} ; i=i+2 )); do
    echo "${name}.${i}"
done
While Loop
Iterate through loop as long as <condition> is evaluated as true
while [[ condition ]]
do
    command
    command
    command
done # defines end of while loop
One of the commands in the while loop must update the condition so that it eventually becomes false.
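A minimal concrete instance, assuming we simply want to add up the integers 1 to 5. The counter i is updated inside the loop so the condition eventually fails.

```shell
#!/bin/bash
i=1
total=0
while [[ $i -le 5 ]]; do
    total=$((total + i))
    i=$((i + 1))   # update the counter so the condition eventually becomes false
done
echo "Sum of 1..5 is $total"
```

Leaving out the line that increments i would make the loop run forever.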
break
If you need to exit a loop before its termination condition is reached you can use the break statement.
while [[ condition ]]
do
    command
    command
    if [[ disaster ]]; then
        break
    fi
done
continue
To skip the commands for an iteration use continue
for i in iterator
do
    if [[ condition ]]
    then
        continue # skips to next iteration
    fi
    command_1
    command_2
done
Example: Combine while and case
- while plus case
while [[ $# -gt 0 ]]; do
    case "$1" in
        -v) verbose="on";;
        -*) echo >&2 "USAGE: $0 [-v] [file]"
            exit 1;;
        *) break;; # default
    esac
    shift
done
Bash Arithmetic
- We said earlier that bash is bad at arithmetic, but some basic operations can be performed.
- Expressions to be evaluated numerically must be enclosed in double parentheses .
- x=$((4+20))
- i=$(($x+1))
It works only with integers . If you need more advanced math (even just fractions!) you must use bc.
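A small sketch of what integer-only arithmetic means in practice. The variable names are illustrative.

```shell
#!/bin/bash
x=$((4 + 20))
i=$((x + 1))     # the $ in front of a variable is optional inside (( ))
q=$((7 / 2))     # integer division truncates the fractional part
r=$((7 % 2))     # remainder
p=$((2 ** 10))   # exponentiation
echo "x=$x i=$i q=$q r=$r p=$p"
```

Note that 7 / 2 gives 3, not 3.5, which is why anything involving fractions needs an external tool.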
Math meets Bash: bc
- If you really, really, really must do math in a bash script, most of the time you must use bc
- The syntax is very peculiar
- x=$(echo "3*8+$z" | bc)
Command-Line Arguments
Many bash scripts need to read arguments from the command line. The arguments are indicated by special variables $0, $1, $2, etc.
Command-line arguments are separated by whitespace:
./commandline.sh Hello You
$0 is the name of the command/script itself
The subsequent ones are the command line arguments
If you have a variable number of arguments, use the shift built-in to iterate over them in your script.
The special variable $# is the number of arguments (not counting the command name)
$@ expands to the list of all command line arguments
Example
__Create script__ commandline.sh
#!/bin/bash
USAGE="Usage: $0 arg1 arg2 arg3 ...argN"
if [[ "$#" -eq 0 ]]; then
    # no command line arguments
    echo "$USAGE"
    exit 1 # return to command line
fi
echo "All arguments: $@"
i=1 # counter
while [[ "$#" -gt 0 ]]; do
    echo "Argument ${i}: $1"
    shift # move args 1 position to the left
    ((i++)) # increment counter
done
String Operations
- Bash has a number of built-in string operations.
- Concatenation
- Just write them together (literals should be quoted)
- newstring=${string}".ext"
- String length
- ${#string}
- Extract substring
- Strings are zero-indexed, first character is numbered 0
- ${string:pos} # Extract from pos to the end
- ${string:pos:len} # Extract len characters starting at pos
Clipping Strings
- It is very common in bash scripts to clip off part of a string so it can be remodeled.
- Delete shortest match from front of string
- ${string#substring}
- Delete longest match from front of string
- ${string##substring}
- Delete shortest match from back of string
- ${string%substring}
- Delete longest match from back of string
- ${string%%substring}
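The substring in these expressions may contain shell wildcards such as *, which is what makes the shortest/longest distinction matter. A sketch, assuming a made-up file path, that splits it into its parts:

```shell
#!/bin/bash
path="/home/user/data/results.tar.gz"   # hypothetical path

filename=${path##*/}    # longest match of */ from the front  -> results.tar.gz
dir=${path%/*}          # shortest match of /* from the back  -> /home/user/data
base=${filename%%.*}    # longest match of .* from the back   -> results
ext=${filename#*.}      # shortest match of *. from the front -> tar.gz
last=${filename##*.}    # longest match of *. from the front  -> gz

echo "$filename"
echo "$dir"
echo "$base"
echo "$ext"
echo "$last"
```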
Replacing Portions of a String
- Often a specific portion (substring) of a string needs to be replaced.
- Replace first match of substring in string with replacement string
- ${string/substring/replacement}
- Replace all matches of substring in string with replacement string
- ${string//substring/replacement}
String Manipulation Example
Create script strings.sh
#!/bin/bash
name="Miller"
echo ${name}
# string length
echo "Your name has ${#name} letters."
# clipping string
echo "I turned you into a ${name%"er"}."
# replacing substring
single_repl=${name/"l"/"*"}
echo "${single_repl}" # quoted so the * is not expanded as a wildcard
all_repl=${name//"l"/"*"}
echo "${all_repl}"
Arrays
- Arrays are zero based so the first index is 0.
- Initialize arrays with a list enclosed in parentheses.
- Obtaining the value of an item in an array requires use of ${}:
- arr=(blue red green)
- echo ${arr[@]} # All the items in the array
- ${#arr[@]} # Number of items in the array
- ${arr[0]} # Item zero
- ${arr[1]} # Item one
- ${#arr[0]} # Length of item zero
Array Example
#!/bin/bash
arr=(blue red green)
for (( i=0 ; i<${#arr[@]} ; i++ ))
do
echo ${arr[i]}
done
#!/bin/bash
arr=(blue red green)
for color in ${arr[@]}
do
echo ${color}
done
Common Task: File Archiving
Unix/Linux offers the tar command which can be used to bundle and compress many input files into a single archive file, a .tar file.
Create .tar file
tar -czvf file.tar inputfile1 inputfile2
-c create archive
-f archive is a file
-v verbose output
-z compress
Unpack .tar file
tar -xzvf file.tar
-x extract/unpack archive
Open your terminal window
Change current directory to ~/scripts
Execute tar czvf script_archive.tar *.sh
Execute ls, verify that there is a new file script_archive.tar
Automate Archiving Process
Task: Create a script that creates a single tarball archive file for all files with a certain file extension in a given directory
__Basic Requirements: __
Use command line arguments to allow user to specify
directory with files to be archived
extension of the files to be archived, e.g. “.sh”
destination directory where the archive file is to be saved
Advanced Requirements:
add command line argument to allow user to specify new extension used to rename (homogenize) extensions of selected files
Autogenerate tarball archive filename that contains creation timestamp
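One possible starting point, sketched as a function and not meant as the reference solution. The function name archive_by_ext, the argument order, and the archive naming scheme are all assumptions made for this illustration. It covers the basic requirements plus the timestamped filename, and leaves the renaming of extensions open.

```shell
#!/bin/bash
# Sketch: archive all files with a given extension from one directory.
# Hypothetical usage: archive_by_ext <source_dir> <extension> <dest_dir>
archive_by_ext() {
    if [[ $# -ne 3 ]]; then
        echo >&2 "Usage: archive_by_ext source_dir extension dest_dir"
        return 1
    fi
    local src=$1 ext=$2 dest=$3
    if [[ ! -d $src ]]; then
        echo >&2 "No such directory: $src"
        return 1
    fi
    mkdir -p "$dest" || return 1

    # Collect matching file names relative to the source directory
    local files=()
    local f
    for f in "$src"/*"$ext"; do
        if [[ -f $f ]]; then
            files+=("${f##*/}")
        fi
    done
    if [[ ${#files[@]} -eq 0 ]]; then
        echo >&2 "No files ending in $ext found in $src"
        return 1
    fi

    # Timestamped archive name (naming scheme is an assumption)
    local stamp archive
    stamp=$(date +%Y%m%d_%H%M%S)
    archive="$dest/archive_${stamp}.tar.gz"
    tar -czf "$archive" -C "$src" "${files[@]}" || return 1
    echo "$archive"   # print the archive path so the caller can capture it
}
```

In a script, the function could be called with the command line arguments, for example archive_by_ext "$1" "$2" "$3". The -C option makes tar change into the source directory first, so the archive stores plain file names instead of full paths.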
More Examples
echo `date` # exec date command, print output
echo 'This is worth $2'
echo "This is worth $2"
var=2
echo "This is worth ${var}"
echo "This is worth \$${var}"
Herefiles
- A herefile or here document is a block of text that is dynamically generated when the script is run
CMD << Delimiter
line
line
Delimiter
Example
#!/bin/bash
# 'echo' is fine for printing single line messages,
# but somewhat problematic for message blocks.
# A 'cat' here document overcomes this limitation.
cat <<End-of-message
-------------------------------------
This is line 1 of the message.
This is line 2 of the message.
This is line 3 of the message.
This is line 4 of the message.
This is the last line of the message.
-------------------------------------
End-of-message
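The "dynamically generated" part comes from expansion: with an unquoted delimiter the shell expands variables inside the block, while a quoted delimiter keeps the text literal. A sketch, with a made-up variable user and delimiter EOF:

```shell
#!/bin/bash
user="Tom"

# Unquoted delimiter: variables and $(...) inside the block are expanded
expanded=$(cat <<EOF
Hello $user
EOF
)

# Quoted delimiter: the block is taken literally
literal=$(cat <<'EOF'
Hello $user
EOF
)

echo "$expanded"
echo "$literal"
```

The closing delimiter must stand alone at the start of its line.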
Functions
function name() {
    statement
    statement
    statement
    VALUE=integer
    return $VALUE
}
- The keyword function is optional in newer versions of bash. The parentheses are always left empty.
- Function definitions must precede any invocations.
Function Arguments
- Function arguments are passed by the caller. In the function they are treated like command-line arguments ($1, $2, ...).
#!/bin/bash
function writeout() {
    echo $1
}
writeout "Hello World"
Function Variables
Variables set in the function are global to the script!
#!/bin/bash
myvar="hello"
myfunc() {
    myvar="one two three"
    for x in $myvar; do
        echo $x
    done
}
myfunc # call function
echo $myvar $x
Making Local Variables
We can use the keyword local to avoid clobbering our global variables.
#!/bin/bash
myvar="hello"
myfunc() {
    local x
    local myvar="one two three"
    for x in $myvar ; do
        echo $x
    done
}
myfunc
echo $myvar $x
Return Values
- Strictly speaking, a function returns only its exit status.
- The returned value must be an integer between 0 and 255
- You can get the value with $?
- e.g.
- myfunc $1 $2
- result=$?
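A sketch of the two common ways to get a result out of a function: the exit status for yes/no answers, and printed output captured with $(...) for everything else. The function names is_even and add are invented for this example.

```shell
#!/bin/bash
# Return a status: 0 means success (true), non-zero means failure
is_even() {
    if [[ $(( $1 % 2 )) -eq 0 ]]; then
        return 0
    else
        return 1
    fi
}

is_even 4
status=$?   # read $? immediately, the next command overwrites it
echo "Exit status for 4: $status"

# To hand back a string or a larger number, print it and capture the output
add() {
    echo $(( $1 + $2 ))
}
sum=$(add 300 45)
echo "Sum: $sum"
```

Because the status is a plain success/failure code, a function like is_even can also be used directly in a condition: if is_even 4; then ... fi.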
Regular Expressions
- Regular expressions are generalizations of the wildcards often used for simple file commands.
- A regular expression consists of a pattern that attempts to match text.
- It contains one or more of:
- A character set
- An anchor (to the line position)
- Modifiers
- Without an anchor or repeat it will find the leftmost match and stop.
Regex Character Sets and Modifiers
The character set is the set of characters that must be matched literally.
Modifiers expand the potential matches.
* matches any number of repeats of the pattern before it (note that this is different from its use in the shell) including 0 matches.
? matches 0 or 1 occurrences of the pattern before it (also different from the shell wildcard).
+ matches one or more, but not 0, occurrences.
. matches any single character, except newline
More Modifiers
- \ escapes the following character, meaning it is to be used literally and not as a regex symbol.
- \ can also indicate non-printing characters, e.g. \t for tab.
- () group the pattern into a subexpression
- | pipe is or
- gray|grey is equivalent to gr(a|e)y
Regex: Ranges and Repetition
[] enclose a set of characters to be matched.
- indicates a range of characters (must be a subsequence of the ASCII sequence)
{n} where n is a non-negative integer, indicates exactly n repetitions of the preceding character or range.
{n,m} matches n to m occurrences.
{n,} matches n or more occurrences.
Regex: Anchors and Negation
^ inside square brackets negates what follows it
^ outside square brackets means “beginning of target string”
$ means “end of target string”
. Matches “any character in the indicated position”
Note: the “target string” is usually but not always a line in a file.
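Bash itself can test a string against an extended regular expression with the =~ operator inside [[ ]], which gives a quick way to try out anchors, ranges, and repetition. The pattern and test strings below are invented for illustration.

```shell
#!/bin/bash
# Hypothetical pattern: lowercase letters, a dash, then exactly three digits
pattern='^[a-z]+-[0-9]{3}$'
matched=()

for s in run-042 run-42 run_042 xrun-0421; do
    if [[ $s =~ $pattern ]]; then
        echo "$s matches"
        matched+=("$s")
    else
        echo "$s does not match"
    fi
done
```

Only run-042 matches. Without the ^ and $ anchors, xrun-0421 would match as well, because the pattern would then be allowed to match a part of the string.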
Regex Examples
AABB* matches
AAB
AABB
AABBBB
But not
AB
ABB
ABBBB
Regex Examples (Cont.)
[a-zA-Z] matches any letter
[^a-z] matches anything except lower-case letters
.all matches call, ball, mall, and so forth. It also matches shall (since it contains hall). It does not match all on its own, because . requires one character before it.
Regex patterns are said to be greedy since they find a match with the most generous interpretation of the pattern.
Extensions and Shortcuts
Most shells and languages support some shortcuts:
\w : [A-Za-z0-9_]
\s : [ \t\r\n] some flavors have a few more rare whitespace characters
\d : [0-9]
\D : [^\d]
\W : [^\w]
\S : [^\s]
NB [\D\S] is not the same as [^\d\s]; in fact it matches anything. [^\d\s] matches a but not 1
Grep, Sed and Awk
grep or egrep can be used with regular expressions.
sed is the stream editor . It is used to script editing of files.
awk is a programming language to extract data and print reports.
grep examples
- grep "regex" filename
- The quotes are often needed; they keep the shell from interpreting special characters in the regex
- We assume GNU grep on Linux; egrep is equivalent to grep -E (extended regular expressions)
- grep matches anywhere in the line
- grep '^root' /etc/passwd
- grep ':$' /etc/passwd
sed examples
- sed operates on standard input and writes to stdout unless told otherwise, so you must redirect the output to save it.
- The -i option tells it to overwrite the old file. This is often a mistake (if you do this, be sure to debug the sed command carefully first).
- sed 'command' < old > new
- Note the hard (single) quotes; best practice is to use them
- sed 's/day/night/' < old > new
- Remember, the pattern matches anywhere in the line; s/day/night/ here changes Sunday to Sunnight
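A short sketch of the substitution on a made-up sentence, showing the first-match default, the g flag, and an anchored pattern.

```shell
#!/bin/bash
text="Sunday is a fine day"

# Only the first match on the line is replaced, even inside a longer word
first=$(echo "$text" | sed 's/day/night/')

# The g flag replaces every match on the line
all=$(echo "$text" | sed 's/day/night/g')

# Anchoring with $ restricts the match to the end of the line
last=$(echo "$text" | sed 's/day$/night/')

echo "$first"
echo "$all"
echo "$last"
```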
awk examples
awk 'pattern {actions}' file
Similar to C, awk “action” statements end with ;
For each record (line) awk splits on whitespace. The results are stored as fields and can be referenced as $1, $2, etc. The entire line is $0, NF is the number of fields in the line, and $NF is the last field.
awk 'pattern1 {actions} pattern2 {actions}' file
awk '/Smith/' employees.txt
awk '{print $2, $NF;}' employees.txt
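A self-contained sketch that uses inline sample data instead of employees.txt. The names, columns, and numbers are invented for illustration.

```shell
#!/bin/bash
# Hypothetical sample data: first name, last name, department, salary
data="Ann Smith Sales 52000
Bob Jones IT 61000
Cy Smith IT 58000"

# Print the second and the last field of every line
echo "$data" | awk '{print $2, $NF}'

# Only lines matching /Smith/: print the first field and the field count NF
smiths=$(echo "$data" | awk '/Smith/ {print $1, NF}')
echo "$smiths"

# Sum the last column; the END block runs after all lines have been read
total=$(echo "$data" | awk '{sum += $NF} END {print sum}')
echo "Total: $total"
```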