Beginners Guide to Unix Shell Programming
This is from http://www.umr.edu/~mchilds/
-
- Provide a convenient and consistent way of specifying patterns to be matched
-
Match Any Character: The Period (.)
- a period in a regular expression matches any single character, no matter what it is
-
Matching the Beginning of a Line: ^
- when the caret character (^) is used as the first character in a regular expression, it matches the beginning of the line
-
Matching the End of a Line: $
- the dollar sign ($) is used to match the end of a line
-
Matching Special Characters
- in general, to match any of the characters that have a special meaning in forming regular expressions, you must precede the character with a backslash (\) to remove that special meaning
-
Matching a Choice of Characters: The [...] Construct
- the characters [ and ] can be used in a regular expression to specify that one of the enclosed characters is to be matched
- a range of characters can be specified within the brackets by separating the starting and ending characters with a dash (-)
-
Special Characters in the Replacement String
- in general, regular expression special characters are only meaningful in the search string and have no special meaning when they appear in the replacement string
-
Inverting the Match Within the [...] Construct: ^
- if a caret (^) appears as the first character after the left bracket ([), then the sense of the match is inverted
-
Matching Zero or More Characters: *
- in regular expressions, the asterisk is used to match zero or more occurrences of the preceding character in the regular expression
-
Matching a Precise Number of Characters: \{...\}
- by using the construct \{min,max\}, where min specifies the minimum number of occurrences of the preceding regular expression to be matched and max specifies the maximum, one can specify a precise number of characters to be matched
- if only one number is enclosed then that number specifies that the preceding regular expression must be matched exactly that many times
- if a single number is enclosed in the braces, followed immediately by a comma, then at least that many occurrences of the previous regular expression must be matched
-
Saving Matched Characters: \(...\)
- by enclosing characters inside backslashed parentheses, it is possible to capture those characters that are matched and store them in registers numbered 1 through 9
- to retrieve the characters in a particular register, the construct \n is used, where n is a digit from 1 to 9
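A minimal sketch of the constructs above, using grep and sed with basic regular expressions. The sample lines and variable names are invented for illustration.

```shell
# Sample input, invented for this example
data='apple pie
zebra crossing
cherry 2024
x'

# ^ anchors at the start of a line; [a-c] matches one character in that range
starts=$(printf '%s\n' "$data" | grep '^[a-c]')

# . matches any single character and $ anchors at the end of the line,
# so ^.$ selects lines that are exactly one character long
single=$(printf '%s\n' "$data" | grep '^.$')

# [0-9]\{4\} matches exactly four digits in a row
year=$(printf '%s\n' "$data" | grep '[0-9]\{4\}')

# \(...\) saves what was matched; \1 and \2 retrieve it in the replacement
swapped=$(printf 'hello world\n' | sed 's/\(.*\) \(.*\)/\2 \1/')

printf '%s\n' "$starts" "$single" "$year" "$swapped"
```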
-
- The cut command is useful when you need to extract various fields of data from a data file or the output of a command
-
General Form: cut -c chars file
-
chars specifies what characters you want to extract from each line of a file
- if file is not specified, cut reads its input from standard input
-
The -d and -f Options
-
General Form: cut -d dchar -f fields file
-
dchar is the character that delimits each field of the data
-
fields specifies the fields to be extracted from the file
- if -f is used and -d is not, then the tab character is used as a delimiter
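The two forms of cut can be sketched as follows. The colon-delimited record is made up for illustration.

```shell
# A colon-delimited record, invented for this example
line='steve:x:203:100:Steve Wood:/users/steve:/bin/sh'

# -c extracts character positions: here characters 1 through 5
first5=$(printf '%s\n' "$line" | cut -c1-5)

# -d sets the delimiter and -f picks fields: here fields 1 and 6
fields=$(printf '%s\n' "$line" | cut -d: -f1,6)

# With -f but no -d, the tab character is the delimiter
second=$(printf 'one\ttwo\tthree\n' | cut -f2)

printf '%s\n' "$first5" "$fields" "$second"
```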
-
- The paste command is sort of the inverse of cut; instead of breaking lines apart, it puts them together
-
General Form: paste files
- corresponding lines from each file in files are pasted together to form single lines which are then written to standard output
- the dash character (-) can be used in files to specify that input is from standard input
-
The -d Option
- use if you don't want fields separated by tab characters
-
General Form: paste -d chars files
-
chars is one or more characters that will be used to separate the lines that are pasted together
- the first character in chars will be used to separate lines from the first and second files, the second character in chars to separate the second and third, etc.
- if there are more files than chars, then paste wraps around the list of characters
- it's safest to enclose the delimiter characters in single quotes ('chars')
-
The -s Option
- tells paste to paste together lines from the same file
- if just one file is specified, then the effect is to merge all lines from the file together, separated by tabs, or by the delimiter characters specified by the -d option
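A short sketch of paste with and without its options. The file names (names, numbers) and their contents are assumptions made for the example.

```shell
dir=$(mktemp -d)
printf 'Tony\nEmanuel\nLucy\n' > "$dir/names"
printf '555-0101\n555-0102\n555-0103\n' > "$dir/numbers"

# Default: corresponding lines are joined with a tab
tabbed=$(paste "$dir/names" "$dir/numbers")

# -d ':' joins with a colon instead of a tab
colon=$(paste -d ':' "$dir/names" "$dir/numbers")

# -s pastes together the lines of a single file
serial=$(paste -s -d ',' "$dir/names")

# - stands for standard input
numbered=$(printf '1\n2\n3\n' | paste -d '.' - "$dir/names")

printf '%s\n' "$tabbed" "$colon" "$serial" "$numbered"
rm -rf "$dir"
```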
-
-
sed is a program used for editing data. It stands for stream editor. Unlike ed, it cannot be used interactively.
-
General Form: sed command file
-
command is an ed-style command that is applied to each line of the specified file; if no file is specified, then standard input is assumed
-
The -n Option
- tells sed that you don't want it to print any lines unless explicitly told to do so
-
Deleting Lines
- the d command is used to delete entire lines of text
- by specifying a line number or a range of numbers you can delete specific lines from the input
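The following sketch shows line deletion, the -n option together with the p (print) command, and a substitute command. The input text is invented for the example.

```shell
text=$(printf 'line one\nline two\nline three\nline four\n')

# d with a range: delete lines 1 through 2
rest=$(printf '%s\n' "$text" | sed '1,2d')

# -n suppresses automatic printing; p then prints only line 3
third=$(printf '%s\n' "$text" | sed -n '3p')

# An ed-style substitute command, applied to every line
changed=$(printf '%s\n' "$text" | sed 's/line/row/')

printf '%s\n' "$rest" "$third" "$changed"
```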
-
- The tr filter is used to translate characters from standard input.
-
General Form: tr from-chars to-chars
-
from-chars and to-chars are one or more single characters
- any character in from-chars encountered on the input will be translated into the corresponding character in to-chars
- result of translation is written to standard output
- you can also give the octal representation of a character in the form \nnn, where nnn is the octal value of the character
-
The -s Option
- use to squeeze out multiple occurrences of characters in to-chars
- if more than one consecutive occurrence of a character specified in to-chars occurs after the translation is made, the characters will be replaced by a single character
-
The -d Option
- use to delete single characters from a stream of input
-
General Form: tr -d from-chars
- any single character in from-chars will be deleted from standard input
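A brief sketch of the tr forms described above. The input strings are made up.

```shell
# Translate lowercase letters to uppercase
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# Octal form: \012 is the newline character, so each colon becomes a newline
lines=$(printf 'a:b:c\n' | tr ':' '\012')

# -s squeezes runs of a character in to-chars down to a single occurrence
squeezed=$(printf 'a   b    c\n' | tr -s ' ' ' ')

# -d deletes every occurrence of the listed characters
deleted=$(printf 'hello world\n' | tr -d 'lo')

printf '%s\n' "$upper" "$lines" "$squeezed" "$deleted"
```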
-
- The grep command allows you to search one or more files for particular character patterns.
-
General Form: grep pattern files
- every line of each file that contains pattern is displayed at the terminal
- if more than one file is specified to grep, then each line is also immediately preceded by the name of the file
- it's generally a good idea to enclose your grep pattern inside a pair of single quotes
-
grep takes its input from standard input if no file name is specified
-
grep allows you to specify your pattern using regular expressions
-
The -v Option
- use when you're interested in finding lines that don't contain a specified pattern
-
The -l Option
- use when you're only interested in knowing the names of the files which contain the specified pattern
-
The -n Option
- use to precede each matching line with its line number within the file
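A small sketch of grep and the three options above. The files a.txt and b.txt are hypothetical and created only for this example.

```shell
dir=$(mktemp -d)
printf 'shell scripts\nawk programs\nshell variables\n' > "$dir/a.txt"
printf 'nothing relevant\n' > "$dir/b.txt"

# Lines that contain the pattern
hits=$(grep 'shell' "$dir/a.txt")

# -v: lines that do not contain the pattern
misses=$(grep -v 'shell' "$dir/a.txt")

# -n: each matching line is preceded by its line number
numbered=$(grep -n 'shell' "$dir/a.txt")

# -l: only the names of the files that contain the pattern
files=$(cd "$dir" && grep -l 'shell' a.txt b.txt)

printf '%s\n' "$hits" "$misses" "$numbered" "$files"
rm -rf "$dir"
```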
-
- By default, sort takes each line of the specified input file and sorts it into ascending order
-
The -u Option
- tells sort to eliminate duplicate lines from the output
-
The -r Option
- use to reverse the order of the sort
-
The -o Option
- use to specify an output file
- list the name of the output file right after the -o option
- usually used when you want to sort the lines in a file and have the sorted data replace the original
-
The -n Option
- specifies that the first field on a line is to be considered numeric
-
Skipping Fields
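- you can skip fields by using the +num option, where num is the number of fields to skip (newer versions of sort replace this obsolete form with -k, which names the field to start at, so +1 becomes -k 2)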
-
The -t Option
- use to tell sort that the field delimiter character is something other than the space or tab character
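The sort options can be sketched as below. Note one assumption: the example uses -k 2 rather than the +1 form from these notes, because current versions of sort generally expect -k. The data is invented.

```shell
dir=$(mktemp -d)
printf '10\n9\n100\n9\n' > "$dir/nums"

# Default: lines are compared as character strings
plain=$(sort "$dir/nums")

# -n compares numerically
numeric=$(sort -n "$dir/nums")

# -u drops duplicate lines, -r reverses the order
unique_rev=$(sort -n -u -r "$dir/nums")

# -t: sets the field delimiter; -k 2 starts the sort key at field 2
by_id=$(printf 'root:0\nsteve:203\nbin:3\n' | sort -t: -k 2 -n)

# -o lets the sorted data replace the original file
sort -n -o "$dir/nums" "$dir/nums"
in_place=$(cat "$dir/nums")

printf '%s\n' "$plain" "$numeric" "$unique_rev" "$by_id" "$in_place"
rm -rf "$dir"
```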
-
- The uniq command is useful when you need to find duplicate lines in a file.
-
General Form: uniq in_file out_file
- in this format, uniq will copy in_file to out_file, removing any duplicate lines in the process
-
uniq's definition of duplicated lines is consecutive lines that match exactly
-
The -d Option
- tells uniq to write only the duplicated lines to out_file
-
The -c Option
- removes duplicate lines and precedes each output line with a count of the number of times the line occurred in the input
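Because uniq only compares consecutive lines, it is usually fed sorted input. A minimal sketch with made-up names:

```shell
names=$(printf 'pat\nbob\npat\nbob\nsue\n')

# No two consecutive lines match here, so uniq alone removes nothing
unsorted=$(printf '%s\n' "$names" | uniq)

# Sorting first brings the duplicates together
deduped=$(printf '%s\n' "$names" | sort | uniq)

# -d writes only the lines that were duplicated
dupes=$(printf '%s\n' "$names" | sort | uniq -d)

# -c precedes each line with its count (leading spacing varies by system)
counts=$(printf '%s\n' "$names" | sort | uniq -c)

printf '%s\n' "$unsorted" "$deduped" "$dupes" "$counts"
```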
-
- To execute a script file just type the name of the program at the prompt
- Remember that you must have both read and execute permissions for a file before you can execute it
-
- Whenever the shell encounters the special character # at the start of a word, it takes whatever characters follow the # to the end of the line as comments
-
- A shell variable begins with an alphabetic or underscore character, and is followed by zero or more alphanumeric or underscore characters
- To store a value inside a shell variable, you simply write the name of the variable, followed by the equals sign, followed immediately by the value you want to store in the variable (i.e. variable=value)
- Spaces are not permitted on either side of the equals sign
- The shell has no concept of data types
- Variables are not declared before they're used
-
- The echo command is used to display the value that is stored inside a shell variable
- The dollar sign ($) is a special character in the shell. If a valid variable name follows the dollar sign, then the shell takes this as an indication that the value in the variable is to be used
-
- A variable that contains no value is said to contain the null value
- To set a variable to the null value, assign no value to it, or list an adjacent pair of quotes ('' or "") after the =
-
- The shell does not perform file name substitution when assigning values to variables
-
- Use the ${variable} form when you want something to follow immediately after the variable name
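The variable rules above can be sketched as follows. All variable names and values here are invented for the example.

```shell
# No spaces around the equals sign; no declaration and no data type
count=1
dir=/usr/tmp
echo "$count" "$dir"

# Two ways to assign the null value
empty=
also_empty=''

# No file name substitution on assignment: the variable holds a literal *
star=*

# Braces separate the variable name from the text that follows it;
# without them the shell would look for a variable named file_old
file=report
backup=${file}_old
echo "$backup"
```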
-
-
The Single Quote
- use to keep characters that are otherwise separated by whitespace characters together
- when the shell sees the first single quote, it ignores any otherwise special characters that follow until it sees the closing quote
- the shell removes the quotes from the command line and does not pass them to the program
- quotes are also needed when assigning values containing whitespace or special characters to shell variables
-
The Double Quote
- the shell ignores the special meaning of all enclosed characters except dollar signs ($), back quotes (`), and backslashes (\)
- variable name substitution is done by the shell inside double quotes
- file name substitution is not done inside double quotes
- double quotes can be used to hide single quotes from the shell, and vice versa
-
The Backslash
- the backslash quotes the single character that immediately follows it
-
General Form: \c
- where c is the character you want to quote
- any special meaning normally attached to the character is removed
-
The Back Quote
- purpose is to tell the shell to execute the enclosed command and to insert the standard output from the command at that point on the command line
-
General Form: `command`
- where command is the name of the command to be executed and whose output is to be inserted at that point
- you are not restricted to a single command inside back quotes
- the shell does file name substitution after it substitutes the output from the back-quoted commands
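A compact sketch of the four quoting mechanisms. The variable names and strings are invented.

```shell
name=world

# Single quotes: everything inside is taken literally
single='$name stays literal'

# Double quotes: variable substitution still happens
double="hello $name"

# Double quotes hide a single quote from the shell
apostrophe="it's quoted"

# Backslash: quotes only the single character that follows it
escaped=hello\ \$name

# Back quotes: the command's standard output is inserted at that point
upper=`echo hello | tr 'a-z' 'A-Z'`

# More than one command may appear inside back quotes
both=`echo one; echo two`

printf '%s\n' "$single" "$double" "$apostrophe" "$escaped" "$upper" "$both"
```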
-
- A Unix command called expr evaluates an expression given to it on the command line
- Each operator and operand given to expr must be a separate argument
- The usual arithmetic operators (+, -, *, /, %) are recognized by expr
- Remember to use backslashes to protect characters such as * in the expression from the shell
-
expr only evaluates integer arithmetic expressions
- Use the : operator with expr to match characters in the first operand against a regular expression given as the second argument; by default it returns the number of characters matched
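A few expr calls, as a sketch of the points above. The operands are arbitrary.

```shell
# Each operator and operand is a separate argument
sum=`expr 10 + 20`

# * must be backslashed so the shell does not expand it as a file name pattern
product=`expr 6 \* 7`

# Integer arithmetic only: the fractional part of a division is dropped
quotient=`expr 17 / 5`
remainder=`expr 17 % 5`

# A common idiom: add one to a variable
i=5
i=`expr $i + 1`

# The : operator returns the number of characters matched
matched=`expr abcdef : 'abc'`
length=`expr hello : '.*'`

echo "$sum $product $quotient $remainder $i $matched $length"
```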
-
- Whenever you execute a shell program, the shell automatically stores the first argument in the special shell variable 1, the second argument in the variable 2, and so on; these special variables are known as positional parameters and are assigned after the shell has done its normal command line processing.
-
- Whenever you execute a shell program, the special shell variable $# gets set to the number of arguments that were typed on the command line
-
- The special variable $* references all of the arguments passed to the program
- Often useful in programs that take an indeterminate or variable number of arguments
-
- The shift command allows you to effectively left shift your positional parameters
- When shift is executed, $# is automatically decremented by one
- You can shift more than one place by adding a count after shift (e.g. shift 3)
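To see the positional parameters, $#, $* and shift together, the sketch below runs a small program through sh -c. The word demo fills the program-name slot, and a b c d are the arguments; both are invented for the example.

```shell
out=$(sh -c '
echo "count: $#"
echo "first: $1"
echo "all: $*"
shift
echo "after shift: $# left, first is now $1"
shift 2
echo "after shift 2: $# left"
' demo a b c d)

printf '%s\n' "$out"
```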
-
-
-
- Whenever any program completes execution under the UNIX system, it returns an exit status back to the system
- The exit status is a number that usually indicates whether a program successfully ran or not
- An exit status of zero is used to indicate that a program succeeded and nonzero to indicate failure
- Failures can be caused by invalid arguments passed to the program, or by an error condition that is detected by the program
-
grep returns an exit status of zero if it finds the specified pattern in at least one of the files
-
- The special variable $? is automatically set to the exit status of the last command executed
-
- Send your output to /dev/null if you don't want to see it
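A sketch of checking grep's exit status through $? while discarding its output. The input lines are invented; the || form is used for the failing case only so that the example also works in shells set to stop on the first failing command.

```shell
# grep finds the pattern: exit status zero
printf 'alpha\nbeta\n' | grep 'beta' > /dev/null
found_status=$?

# grep does not find the pattern: nonzero exit status
missing_status=0
printf 'alpha\nbeta\n' | grep 'gamma' > /dev/null || missing_status=$?

echo "found: $found_status, missing: $missing_status"
```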
-
- The test command is often used for testing one or more conditions in an if command
-
General Form: test expression
- where expression represents the condition that you're testing
-
test evaluates expression, and if the result is true, it returns an exit status of zero; otherwise the result is false, and it returns a nonzero exit status
-
Alternate Form: [ expression ]
- spaces must appear before and after the brackets
-
test must see all operands as arguments, meaning that they must be delimited by whitespace
-
Table of test String Operators
----------------------+---------------------------------------------------
 Operator             | Returns TRUE (exit status of zero) if
----------------------+---------------------------------------------------
 string1 = string2    | string1 is identical to string2
 string1 != string2   | string1 is not identical to string2
 string               | string is not null
 -n string            | string is not null
 -z string            | string is null (and string must be seen by test)
----------------------+---------------------------------------------------
-
Table of test Integer Operators
----------------------+---------------------------------------------------
 Operator             | Returns TRUE (exit status of zero) if
----------------------+---------------------------------------------------
 int1 -eq int2        | int1 is equal to int2
 int1 -ge int2        | int1 is greater than or equal to int2
 int1 -gt int2        | int1 is greater than int2
 int1 -le int2        | int1 is less than or equal to int2
 int1 -lt int2        | int1 is less than int2
 int1 -ne int2        | int1 is not equal to int2
----------------------+---------------------------------------------------
-
- Each file operator is unary in nature, meaning it expects a single argument to follow
Table of test File Operators
-----------------------+---------------------------------------------------
 Operator              | Returns TRUE (exit status of zero) if
-----------------------+---------------------------------------------------
 -d file               | file is a directory
 -f file               | file is an ordinary file
 -r file               | file is readable by the process
 -s file               | file has nonzero length
 -w file               | file is writeable by the process
 -x file               | file is executable
-----------------------+---------------------------------------------------
-
- The unary logical negation operator (!) can be placed in front of any other test expression to negate the result of the evaluation of that expression
-
- The operator -a performs a logical AND of two expressions and returns true only if the two joined expressions are both true
- The -a operator has a lower precedence than the integer, file, and string operators
-
- The operator -o forms a logical OR of two expressions
-
- You can use parentheses in a test expression to alter the order of evaluation
- Make sure the parentheses are quoted or backslashed to remove their special meaning
- Spaces must surround the parentheses
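The operators in the tables above can be combined as in this sketch. The variable names, values, and the empty file are assumptions made for the example.

```shell
name=steve
count=5
dir=$(mktemp -d)
: > "$dir/empty"        # create a zero-length ordinary file

# String operator, using the [ expression ] form
if [ "$name" = steve ]; then string_result=same; else string_result=different; fi

# Integer operators joined with -a, using the test form
if test "$count" -ge 0 -a "$count" -lt 10; then range_result=in; else range_result=out; fi

# File operators with ! for negation: an ordinary file that is not of nonzero length
if [ -f "$dir/empty" -a ! -s "$dir/empty" ]; then file_result=empty; else file_result=other; fi

# Backslashed parentheses, surrounded by spaces, group the -o before the -a
if [ \( "$count" -eq 0 -o "$count" -gt 3 \) -a -d "$dir" ]; then group_result=yes; else group_result=no; fi

echo "$string_result $range_result $file_result $group_result"
rm -rf "$dir"
```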
-
-
- The exit command enables you to immediately terminate execution of your shell program
-
General Form: exit n
- where n is the exit status that you want returned
- if n is not specified, then the exit status used is that of the last command executed before the exit
-
-
-
- The purpose of the null command (:) is to do nothing
-
General Form: :
- use to satisfy requirements that a command appear
-
- These constructs enable you to execute a command based on whether or not the previous command succeeds or fails
-
General Form: command1 && command2
- execute command2 if command1 succeeds
-
General Form: command1 || command2
- execute command2 if command1 fails
- Can be used like logical operators in if statements
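A minimal sketch of both constructs. The directory and the file name missing are hypothetical.

```shell
dir=$(mktemp -d)

# && runs the second command only if the first succeeds
dir_status=unknown
[ -d "$dir" ] && dir_status=exists

# || runs the second command only if the first fails
file_status=present
[ -f "$dir/missing" ] || file_status="not found"

# Chained: grep fails on empty input, so the || branch is taken
grep 'pattern' /dev/null > /dev/null && found=yes || found=no

echo "$dir_status, $file_status, $found"
rm -rf "$dir"
```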
-
-
-
-
- To make an immediate exit from a loop, use the break command
- If the break command is used in the form break n, then the n innermost loops are exited immediately
-
- The continue command causes the remaining commands in a loop to be skipped
- If the continue command is used in the form continue n, then the remaining commands in the n innermost loops are skipped
-
- An entire loop can be sent to the background for execution by placing an ampersand (&) after the done
-
- You can redirect the I/O of a loop by placing the redirection after the done
- Input redirected into the loop applies to all commands in the loop that read their data from standard input
- Output redirected from the loop applies to all commands in the loop that write to standard output
- You can override redirection of the entire loop's input or output by explicitly redirecting the input and/or output of commands inside the loop
- To force input or output of a command to come from or go to the terminal, use the fact that /dev/tty always refers to your terminal
- You can also redirect standard error by appending 2>file after the done
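The sketch below combines break, continue, and redirection after the done. The file names in and out and the loop values are invented.

```shell
dir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n' > "$dir/in"

# Redirections after done apply to every command inside the loop
while read n
do
    [ "$n" -eq 2 ] && continue   # skip the rest of this iteration
    [ "$n" -eq 4 ] && break      # leave the loop immediately
    echo "saw $n"
done < "$dir/in" > "$dir/out"
result=$(cat "$dir/out")

# break 2 exits both the inner and the outer loop
pairs=
for i in 1 2 3
do
    for j in a b
    do
        [ "$i$j" = 2b ] && break 2
        pairs="$pairs $i$j"
    done
done

printf '%s\n' "$result" "$pairs"
rm -rf "$dir"
```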
-
- getopts is a built-in shell command that exists for the express purpose of processing command-line arguments
-
General Form: getopts options variable
- designed to be executed inside a loop
- examines the next command line argument and determines if it is a valid option by checking to see if the argument begins with a minus sign and is followed by any single letter contained inside options; if it is a valid option, getopts stores the matching option letter inside variable and returns a zero exit status
- if the letter that follows the minus sign is not listed in options, getopts stores a question mark inside variable, returns a zero exit status, and sends a message to standard error
- if there are no more arguments left on the command line or if the next argument doesn't begin with a minus sign, getopts returns a nonzero exit status
- To indicate to getopts that an option takes a following argument, you write a colon character after the option letter on the getopts command line; if getopts doesn't find an argument after an option that requires one, it will store a question mark inside the specified variable and will write an error message to standard error; otherwise the actual argument is stored in a variable known as OPTARG
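A sketch of a typical getopts loop. The option letters v and o, the usage message, and the simulated command line are all invented. The example also relies on OPTIND, a getopts variable these notes do not cover, which holds the index of the next argument to be processed.

```shell
# Simulate the command line: script -v -o out.txt input.txt
set -- -v -o out.txt input.txt
OPTIND=1

verbose=no
outfile=
# v takes no argument; the colon after o says that -o requires one
while getopts vo: option
do
    case "$option" in
        v)  verbose=yes ;;
        o)  outfile=$OPTARG ;;
        \?) echo "usage: script [-v] [-o file] args" >&2
            exit 1 ;;
    esac
done

# Shift the processed options away so $1 is the first ordinary argument
shift `expr $OPTIND - 1`
echo "verbose=$verbose outfile=$outfile remaining=$# first=$1"
```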
-
-
General Form: read variable(s)
- when this command is executed, the shell reads a line from standard input and assigns the first word read to the first variable listed in variable(s), the second word to the second variable listed in variable(s), and so on
- if there are more words on the line than there are variables listed, then the excess words get assigned to the last variable
-
read always returns an exit status of zero unless an end of file condition is detected in the input
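A short sketch of read, with input supplied from here documents so that it runs without a terminal. The words are arbitrary.

```shell
# Four words, three variables: the excess words go to the last variable
read first second rest <<EOF
alpha beta gamma delta
EOF

# read returns nonzero at end of file, which ends the loop
lines=0
while read line
do
    lines=`expr $lines + 1`
done <<EOF
one
two
EOF

echo "$first / $second / $rest / $lines lines"
```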
-
Characters Specially Interpreted by echo
---------------------------------------------------------------------------
 Character | Prints
---------------------------------------------------------------------------
 \b        | backspace
 \c        | the line without a terminating newline
 \f        | formfeed
 \n        | newline
 \r        | carriage return
 \t        | tab character
 \\        | backslash character
 \0nnn     | the character whose ASCII value is nnn, where nnn is
           | a one to three digit octal number that starts with zero
---------------------------------------------------------------------------
-
-
- A subshell is an entirely new shell that is executed by your login shell in order to run the desired program
- A subshell has no knowledge of local variables that were assigned values by the login shell
- A subshell cannot change the value of a variable in the parent shell
-
- The export command makes the value of a variable known to a subshell
-
General Form: export variable(s)
- where variable(s) is the list of variable names that you want exported
- Once a variable is exported, it remains exported to all subshells that are subsequently executed
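The effect of export can be sketched by running a child shell with sh -c. The variable names are invented and assumed not to be set in the environment already.

```shell
local_var=hidden
exported_var=visible
export exported_var

# The child shell sees only the exported variable
seen=$(sh -c 'echo "local=[$local_var] exported=[$exported_var]"')

# A change made in the child does not reach the parent shell
sh -c 'exported_var=changed'
after=$exported_var

echo "$seen / $after"
```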
-
- PS1 contains your command prompt
- PS2 contains your secondary command prompt
-
-
General Form: . file
- executes file in the current shell
-
- You can use the exec command to replace the current program with a new one
-
General Form: exec program
- where program is the name of the program to be executed
- You can redirect standard input by using exec < file; any commands that subsequently read data from standard input will read from file; the same can be done for standard output
-
- Use (...) and { ...; } to group commands together
- Use (...) to execute the commands in a subshell and { ...; } to execute them in the current shell
- Useful for sending a group of commands to the background to be executed in order
-
- To send the value of a variable to a subshell, you can precede the name of the command with the assignment of as many variables as you want
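A sketch contrasting the two grouping constructs and showing an assignment placed before a command name. The variable names are invented, and color is assumed not to be set beforehand.

```shell
# (...) runs in a subshell: the cd does not affect the current shell
start=$PWD
(cd / && pwd > /dev/null)
after_parens=$PWD

# { ...; } runs in the current shell: the assignments persist
{ x=1; y=2; }

# An assignment before the command name is passed to that command only
greeting=$(color=blue sh -c 'echo "color is $color"')
color_here=${color:-unset}

echo "$x$y / $greeting / $color_here"
```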
-
-
-
${parameter}
- if there's a potential conflict caused by the characters that follow the parameter name, then you can enclose the name inside curly brackets
-
${parameter:-value}
- this construct says to substitute the value of parameter if it's not null, and to substitute value otherwise
-
${parameter:=value}
- substitutes the value of parameter if it's not null; otherwise assigns value to it and substitutes that
-
${parameter:?value}
- if parameter is not null, the shell substitutes its value; otherwise, the shell writes value to standard error and then exits
-
${parameter:+value}
- substitutes value if parameter is not null; otherwise it substitutes nothing
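The four substitution forms can be sketched as follows. The variable names (editor, name, no_such_var) and values are invented; the :? case runs inside parentheses so that only the subshell exits.

```shell
unset editor no_such_var

# :- substitutes the default but leaves the variable itself untouched
before=${editor:-vi}
still_null=$editor

# := substitutes the default and also assigns it
assigned=${editor:=ed}
now=$editor

# :+ substitutes value only when the parameter is not null
name=steve
when_set=${name:+has a value}
when_null=${no_such_var:+has a value}

# :? writes the message to standard error and exits the (sub)shell
check_failed=no
( : ${no_such_var:?must be set} ) 2> /dev/null || check_failed=yes

echo "$before [$still_null] $assigned $now [$when_set] [$when_null] $check_failed"
```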
-
- When you execute a shell program, the shell automatically stores the name of the program inside the special variable $0
-
- The set command is a dual-purpose command used both to set various shell options and to reassign the positional parameters
- The -x Option
- this option turns on trace mode in the shell
- after the set -x command is executed, all subsequently executed commands will be printed by the shell, after file name, variable, and command substitution and I/O redirection have been performed
- you can turn trace off at any time by simply executing set with the +x option
-
set with no arguments
- if you don't give any arguments to set, you'll get an alphabetized list of all of the variables that exist in your environment
- Using set to reassign positional parameters
- the only way that positional parameters can be changed is with the shift or set commands
- if words are given as arguments to set on the command line, then the positional parameters $1, $2, ... will be assigned to these words
- The -- Option
- the -- option tells set not to interpret any subsequent arguments on the command line as options
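A brief sketch of reassigning the positional parameters with set. The words are invented; the second call shows why -- matters when the first word begins with a minus sign.

```shell
# The words become $1, $2, $3
set -- one two three
count=$#
second=$2

# Without --, set would try to interpret -n as one of its own options
line='-n is not an option here'
set -- $line
words=$#
first=$1

echo "$count $second $words $first"
```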
last update: admin, 19.04.2007 17:45