Lesson 7 | Using strings in variables |
Objective | Assign text strings to variables and manipulate them. |
Manipulate Text Strings stored in Variables
Most programming languages have many functions to manipulate text strings stored in variables.
Shell scripts do not. Strings stored in shell script variables are generally used for holding text that does not change and that does not need to be analyzed closely. You can easily do the following things with a string stored in a shell script variable:
- Assign a string to a variable
- Test the value of a string variable
- Join two strings together
- Print a string
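The four easy tasks above can be sketched in a few lines. This is only an illustration: the variable names GREETING, PLACE, RESULT, and COMBINED are made up for this example, and the test syntax shown here is explained properly in a later module.

```shell
# Assign a string to a variable.
GREETING="Good morning"
PLACE="Boston"

# Test the value of a string variable.
if [ "$GREETING" = "Good morning" ]; then
    RESULT="match"
else
    RESULT="no match"
fi

# Join two strings together, with a literal comma and space between them.
COMBINED="$GREETING, $PLACE"

# Print the strings.
echo "$RESULT"
echo "$COMBINED"
```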
Doing the following tasks with strings is possible, but requires additional UNIX commands and is often a challenge. If your program requires many string manipulation tasks like those listed here, a shell script is likely not the best choice as a programming environment. A language like Perl provides functions to complete these tasks more easily than a shell script:
- Extract a portion of text from the middle of a string
- Make a string all uppercase or all lowercase
- Update a pattern within a string
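To give a feel for the extra work involved, here is a minimal sketch of each of these three tasks, done by piping the variable's value through an external command. The sample string and variable names are hypothetical, and tr is an additional standard UNIX command that this lesson does not otherwise discuss.

```shell
# Hypothetical sample string for illustration only.
TITLE="Welcome to our Web site"

# Extract characters 12 through 18 from the middle of the string.
MIDDLE=$(printf '%s\n' "$TITLE" | cut -c12-18)

# Make the string all uppercase.
UPPER=$(printf '%s\n' "$TITLE" | tr '[:lower:]' '[:upper:]')

# Replace the first occurrence of "Web site" with "home page".
UPDATED=$(printf '%s\n' "$TITLE" | sed 's/Web site/home page/')

echo "$MIDDLE"
echo "$UPPER"
echo "$UPDATED"
```

Each task starts a separate process, which is part of the reason heavy string processing is slow and awkward in a shell script.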
Three UNIX commands
Several UNIX commands are particularly useful when using a shell script for complex string manipulations.
This page mentions only three of them. To analyze a string stored in a shell variable, you generally must use embedded command execution (also called command substitution), which is covered in the Unix Shell Programming course.
- The wc command displays the number of lines, words, and characters in a stream of input (here, the variable value). The character count can be used to determine the length of a string.
- The cut command can extract either a delimiter-separated field (the delimiter is a tab by default and can be changed with the -d option) or a range of characters from a string. This can be used to extract a substring or to pick out individual fields from the output of the wc command.
- The sed command can perform complex manipulations on a line of text, searching for patterns and deleting or adding text based on various rules.
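As a rough sketch of how embedded command execution feeds a variable's value to these commands, the example below measures a string with wc and picks out one field with cut. The variable names LENGTH and SECOND are made up for this example. The tr step is an assumption added for portability, because some versions of wc pad their output with spaces.

```shell
WELCOME="Welcome to our Web site,"

# printf '%s' writes the value without a trailing newline,
# so wc -c counts only the bytes in the string itself.
LENGTH=$(printf '%s' "$WELCOME" | wc -c | tr -d ' ')

# Treat a single space as the field delimiter and take the second field.
SECOND=$(printf '%s\n' "$WELCOME" | cut -d' ' -f2)

echo "$LENGTH"
echo "$SECOND"
```

Using echo instead of printf '%s' would add one to the count, because echo appends a newline character.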
wc - word count
wc gives a "word count" on a file or I/O stream:
bash$ wc /usr/share/doc/sed-4.1.2/README
13 70 447 /usr/share/doc/sed-4.1.2/README
[13 lines 70 words 447 characters]
wc -w | gives only the word count. |
wc -l | gives only the line count. |
wc -c | gives only the byte count. |
wc -m | gives only the character count. |
wc -L | gives only the length of the longest line. |
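The options can be tried on a small, known input. The sketch below uses a made-up two-line sample and hypothetical variable names; it leaves out -m and -L, since -m depends on the locale for non-ASCII text and -L is not available in every version of wc.

```shell
# Two lines, five words, 24 bytes including both newline characters.
N_LINES=$(printf 'one two three\nfour five\n' | wc -l | tr -d ' ')
N_WORDS=$(printf 'one two three\nfour five\n' | wc -w | tr -d ' ')
N_BYTES=$(printf 'one two three\nfour five\n' | wc -c | tr -d ' ')

echo "lines: $N_LINES"
echo "words: $N_WORDS"
echo "bytes: $N_BYTES"
```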
Using wc to count how many .txt files are in current working directory:
$ ls *.txt | wc -l
# Will work as long as none of the "*.txt" files
#+ have a linefeed embedded in their name.
# Alternative ways of doing this are:
# find . -maxdepth 1 -name \*.txt -print0 | grep -cz .
# (shopt -s nullglob; set -- *.txt; echo $#)
Using wc to total up the size, in bytes, of all the files whose names begin with letters in the range d through h:
bash$ wc [d-h]* | grep total | awk '{print $3}'
71832
Using wc to count the lines that contain the word "Linux" in the main source file of a book (a line with several matches is counted only once):
bash$ grep Linux book.sgml | wc -l
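The difference between counting matching lines and counting individual matches can be seen on a small temporary file. This is a sketch under some assumptions: the file contents and variable names are invented for the example, and the -o option of grep (print each match on its own line) is supported by GNU and BSD grep but is not guaranteed on every UNIX system.

```shell
# Create a throwaway sample file; its contents are made up for this example.
TMPFILE=$(mktemp)
printf 'Linux and more Linux\nno match here\nLinux again\n' > "$TMPFILE"

# Lines containing "Linux": the pipeline from the lesson.
MATCHING_LINES=$(grep Linux "$TMPFILE" | wc -l | tr -d ' ')

# grep -c reports the same line count without needing wc.
SAME_COUNT=$(grep -c Linux "$TMPFILE")

# Individual occurrences: -o prints each match on a separate line.
OCCURRENCES=$(grep -o Linux "$TMPFILE" | wc -l | tr -d ' ')

rm -f "$TMPFILE"

echo "$MATCHING_LINES $SAME_COUNT $OCCURRENCES"
```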
Assigning a String to a Variable
Strings are assigned to variables using the standard format that you are becoming familiar with.
If you need to assign the string “Welcome to our Web site,” to a variable called WELCOME, use this command:
WELCOME="Welcome to our Web site,"
You do not need to specify a length for the variable, or define it as a string variable, as you would in most programming languages.
Printing a string variable is done with the echo command, as you have already seen in a previous lesson.
If the WELCOME variable is assigned a string value as just shown, the following command:
echo $WELCOME
will send the string value to STDOUT (probably printing it on the screen):
Welcome to our Web site,
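One detail worth illustrating is what happens when the variable is printed without quotation marks. In the sketch below, SPACED is a hypothetical variable whose value contains runs of several spaces, and the shell is assumed to be using its default word-splitting settings.

```shell
# A value with extra spaces between some of the words.
SPACED="Welcome   to our    Web site,"

# Unquoted: the shell splits the value into words and echo
# joins them again with single spaces.
UNQUOTED=$(echo $SPACED)

# Quoted: the value is passed to echo exactly as stored.
QUOTED=$(echo "$SPACED")

echo "$UNQUOTED"
echo "$QUOTED"
```

For a value like the WELCOME string, which has only single spaces, both forms print the same text, but quoting the variable is the safer habit.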
Joining Strings
When you use the dollar sign notation, the value of the string is simply substituted at that point. This means that you can create a combination of strings by using the individual variables. For example, suppose you have defined a variable named WELCOME with the value of “Welcome to our site,” and a variable named NAME with the value of “Nicholas”.
You want to create a third variable whose value combines the two strings and adds a period at the end of the second one.
This would be done with this command:
HELLO_MSG="$WELCOME ${NAME}."
Printing this variable using the echo command shows the following value:
Welcome to our site, Nicholas.
Notice a few things about the assignment of the HELLO_MSG variable:
- A space is included between the WELCOME and NAME variables, so a space appears after the comma in the value of HELLO_MSG.
- The entire variable value is enclosed in quotation marks to define it as a single parameter for the shell, thus avoiding any potential ambiguities.
- The NAME variable is followed immediately by a period, so it is enclosed in braces to mark exactly where the variable name ends. The braces are optional in this particular case, because a period cannot be part of a variable name, but they are required whenever the next character is a letter, digit, or underscore. For example, in $NAME_TAG the shell looks for a variable called NAME_TAG, which is not a variable we have defined.
- The space and period are literal text added to the value of HELLO_MSG with the existing WELCOME and NAME variables. You can also add other text, as long as the variable names can be identified by the shell.
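The points above can be checked with a short sketch. The variables WITH_BRACES and WITHOUT_BRACES and the suffix _backup are made up for this example, and the script assumes that no variable called NAME_backup exists and that the shell is not running with the option that treats unset variables as errors.

```shell
WELCOME="Welcome to our site,"
NAME="Nicholas"

# Join the two values with a literal space and a trailing period.
HELLO_MSG="$WELCOME ${NAME}."

# With braces, the shell expands NAME and then appends "_backup".
WITH_BRACES="${NAME}_backup"

# Without braces, the shell looks for a variable named NAME_backup,
# which is not defined, so the result is an empty string.
WITHOUT_BRACES="$NAME_backup"

echo "$HELLO_MSG"
echo "$WITH_BRACES"
echo "[$WITHOUT_BRACES]"
```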
In a later module, you will learn how to test the values of variables containing strings.
The next lesson describes how to use numbers in your variables.