Laboratory Exercises For Computer Science 151

Strings

Goals: This laboratory builds upon background material on character data and discusses string processing within Scheme. Topics include string literals, zero-based indexing, string procedures, and string predicates.

Literal Strings

A string is a sequence of characters. The external form of a string is the characters enclosed in double-quotes:
   "This is a string!"
Special characters, such as a double quote, can be included in a string by escaping them with a backslash:
   "Type \"stop\" to quit."
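
As a quick check of how escapes are counted, the following sketch (assuming a reader that follows the standard \" escape) shows that the backslashes are not themselves characters of the string:

```scheme
; The backslashes are escape markers, not characters in the string,
; so each escaped double quote counts as a single character.
(string-length "Type \"stop\" to quit.")   ; => 20
(string-ref "Type \"stop\" to quit." 5)    ; => #\"
```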

Zero-based Indexing

While much work with strings does not require access to individual characters within a string, some procedures reference positions in a string. In such cases, Scheme numbers character positions within strings starting at position 0. For example, consider the string:

   "I am very excited by the Scheme programming language!!!"
Scheme regards the first character (I) as being in position 0, followed by a blank or space character in position 1. The letters 'a' and 'm' follow in positions 2 and 3, respectively.
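
The positions described above can be confirmed with string-ref. This is a minimal sketch; the name sample is introduced here only for illustration:

```scheme
; sample is a hypothetical name bound to the string discussed above.
(define sample "I am very excited by the Scheme programming language!!!")

(string-ref sample 0)   ; => #\I
(string-ref sample 1)   ; => #\space
(string-ref sample 2)   ; => #\a
(string-ref sample 3)   ; => #\m

; Because numbering starts at 0, the last character is at position length - 1.
(string-ref sample (- (string-length sample) 1))   ; => #\!
```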

Some String Procedures

The previous lab illustrated the use of the string->list procedure. Other common string procedures are shown in the following table:

Procedure       Sample Call                                    Result of Example              Comment
string          (string #\a #\b #\c)                           "abc"                          make a string of the given characters
string?         (string? "sample string")                      #t (true)                      is argument a string?
string-length   (string-length "sample string")                13                             number of characters in string
string-append   (string-append "Big" "Small")                  "BigSmall"                     concatenate two strings
substring       (substring "sample string" 3 10)               "ple str"                      extract characters from first to before second designated position from string
string-ref      (string-ref "sample string" 4)                 #\l                            return character at given position
string->list    (string->list "example")                       (#\e #\x #\a #\m #\p #\l #\e)  makes a list of the characters in a string
list->string    (list->string '(#\e #\x #\a #\m #\p #\l #\e))  "example"                      makes a string of the characters in a list
symbol->string  (symbol->string 'example)                      "example"                      change a given symbol to a string
string->symbol  (string->symbol "example")                     example                        convert a given string to a symbol
number->string  (number->string 3.141592)                      "3.141592"                     change a given number to a string
string->number  (string->number "3.141592")                    3.141592                       convert a given string to a number
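
These procedures are often combined. The sketch below is illustrative only; the procedure name capitalize is hypothetical and not part of the lab:

```scheme
; Reverse a string by converting it to a list of characters and back.
(list->string (reverse (string->list "example")))   ; => "elpmaxe"

; Join text and a number by first converting the number to a string.
(string-append "lab-" (number->string 151))         ; => "lab-151"

; capitalize (hypothetical name): uppercase the first character of str.
(define capitalize
   (lambda (str)
      (if (= (string-length str) 0)
          str
          (string-append
             (string (char-upcase (string-ref str 0)))
             (substring str 1 (string-length str))))))

(capitalize "scheme")   ; => "Scheme"
```

Note that substring with a second position equal to the string length takes everything through the last character, since the end position is exclusive.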

Some Comparison Predicates for Strings

Just as with the comparison of individual characters, Scheme provides both case-sensitive and case-insensitive versions of various predicates to compare strings:

Predicate     Comment
string=?      Are two strings equal?
string<?      Does the first string come before the second?
string>?      Does the first string come after the second?
string<=?     Is the first string equal to the second, or does it come before the second?
string>=?     Is the first string equal to the second, or does it come after the second?
string-ci=?   Same as string=?, but ignoring case
string-ci<?   Same as string<?, but considering uppercase and lowercase letters to be equivalent
string-ci>?   Same as string>?, but ignoring case
string-ci<=?  Same as string<=?, but ignoring case
string-ci>=?  Same as string>=?, but ignoring case
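
A few sample comparisons may help. This is a sketch; the results shown involve only letters of a single case or the case-insensitive predicates, since the relative order of uppercase and lowercase letters can depend on the implementation's character ordering:

```scheme
(string=? "Scheme" "scheme")       ; => #f  (case matters)
(string-ci=? "Scheme" "scheme")    ; => #t  (case ignored)

(string<? "apple" "banana")        ; => #t  (#\a comes before #\b)
(string<? "abc" "abcd")            ; => #t  (a proper prefix comes first)
(string<=? "abc" "abc")            ; => #t  (equal strings)
(string>? "apple" "banana")        ; => #f

(string-ci<? "Zebra" "apple")      ; => #f  (compared as if both were one case)
```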

Additional Discussion

Reread sections 6.1-6.2 of the textbook. Also, reread the material on strings in the Revised(5) Report on the Algorithmic Language Scheme (R5RS).

Picking Appropriate Passwords

Since many modern computer systems use passwords as a means to provide protection and security for users, a major issue can be the identification of appropriate passwords. The main point should be to choose passwords that are not easily guessed, but which the user has a chance of remembering. For example, passwords related to birthdays, anniversaries, family names, or common words are all easily guessed and should be avoided.

With this in mind, we might consider how to generate appropriate passwords in some random way. One approach begins by noting that it is relatively easy for a user to remember a sequence of three letters, if the first letter is a consonant, the second letter is a vowel, and the third is again a consonant. Such three-letter "words" generally are pronounceable and possible to remember.

To make passwords more secure, it also is desirable to include a mix of characters, including uppercase and lowercase letters, digits, and punctuation. The following password generator incorporates each of these ideas:


(define gen-password
   (lambda ()
      (let* ((consonants "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ")
             (cons-len (string-length consonants))
             (vowels "aeiouAEIOU")
             (vowel-len (string-length vowels))
             (digits "0123456789")
             (punctuation ".,<>?/`~!@#$%^&*()_-+={[}]|")
             (punc-len (string-length punctuation))
            )
          (string  (string-ref consonants (random cons-len))
                   (string-ref vowels (random vowel-len))
                   (string-ref consonants (random cons-len))
                   (string-ref digits (random 10))
                   (string-ref punctuation (random punc-len))
                   (string-ref consonants (random cons-len))
                   (string-ref vowels (random vowel-len))
                   (string-ref consonants (random cons-len))
          )
      )
   )
)
  1. Run this procedure with the command (gen-password) several times to obtain several candidates for passwords.

  2. Explain how (gen-password) produces a password, and what format one can expect for the result. To what extent do these passwords satisfy the common guidelines of being hard to guess, but relatively easy for the user to remember?

  3. Determine the number of different passwords which can be generated by this procedure.
    Hint: Counting upper and lower case letters separately, there are 42 consonants, 10 vowels, 10 digits, and 27 punctuation marks in the above lists.

    It has been estimated that a modern password-cracking program working on a medium-level personal computer can analyze about 70,000 passwords in a day to determine if a user has chosen that password. Using this estimate, about how long might it take a modern personal computer to analyze all the passwords generated by the above (gen-password) procedure?

  4. (gen-password) includes a one-digit number in the resulting password, based upon looking up that digit within a string. Another approach would be to generate a random number directly and convert that number to a string. This may be accomplished with the code: (number->string (random 10)). Remove the variable digits in the above procedure, and replace it (and the corresponding string-ref expression) with this direct computation of a number.

    Note: number->string returns a string of characters, and you can use the string-append procedure to put strings of characters together.

  5. How would you change the procedure in the previous step to include a two-digit number instead of a one-digit number in the resulting password?

  6. The (gen-password) procedure generates two three-character consonant-vowel-consonant sequences, by explicitly choosing individual characters from specific strings of letters. Another approach would be to define a procedure gen-syllable, which would return a consonant-vowel-consonant string, and then call this procedure twice in the body of (gen-password). Rewrite the original (gen-password) to include a procedure gen-syllable which is defined locally.
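
The counting argument in step 3 rests on multiplying the number of choices at each position. As a smaller illustration (a sketch only, using hypothetical names), consider just one consonant-vowel-consonant syllable and the 70,000-per-day estimate given above:

```scheme
; One syllable: 42 choices, then 10 choices, then 42 choices.
; syllable-count and days-to-try-all are hypothetical names.
(define syllable-count (* 42 10 42))                   ; => 17640

; Days needed to try every syllable at 70,000 guesses per day.
(define days-to-try-all (/ syllable-count 70000.0))    ; about 0.25 days
```

The same product rule extends to the full eight-character password by including one factor for each character position.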

Checking For Appropriate Passwords

As already noted, (gen-password) provides a reasonable way to find appropriate passwords. Another common task is to analyze a potential password to see if it might be easily guessed. To begin, observe that characters may be divided into the following categories:

  1. uppercase letters
  2. lowercase letters
  3. digits
  4. punctuation (considered to be anything not a letter or a digit)
With these categories, one guideline for passwords suggests that any password should contain at least 6 characters, including characters from at least three of these categories.
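
The four categories can be expressed with the standard character predicates. The following is a minimal sketch for a single character; the name char-category is hypothetical, and under the definition above a space counts as punctuation:

```scheme
; char-category (hypothetical name): classify one character into
; one of the four categories listed above.
(define char-category
   (lambda (ch)
      (cond ((char-upper-case? ch) 'uppercase)
            ((char-lower-case? ch) 'lowercase)
            ((char-numeric? ch) 'digit)
            (else 'punctuation))))

(char-category #\Q)   ; => uppercase
(char-category #\q)   ; => lowercase
(char-category #\7)   ; => digit
(char-category #\;)   ; => punctuation
```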

The next part of this lab develops a procedure to determine whether a password satisfies this guideline. The first task is to define a procedure that checks whether a string contains an uppercase letter. One approach examines successive characters in the string until either an uppercase letter is found or all characters have been checked. Organizing this work as a husk and kernel yields the following code:


(define contains-upper?
   (lambda (str)
      (upper-kernel str (- (string-length str) 1))
   )
)
(define upper-kernel
   (lambda (str pos)
      (cond ((< pos 0) #f)
            ((char-upper-case? (string-ref str pos)) #t)
            (else (upper-kernel str (- pos 1)))
      )
   )
)

  1. Test these procedures on several strings to check that they work as claimed.

  2. Describe in a paragraph how this code works. In your paragraph, be sure to mention the purpose of each test in the cond. Also, explain why contains-upper? must subtract 1 from the string length when calling upper-kernel.

  3. Rewrite contains-upper?, so that upper-kernel is defined locally, using a named let expression.

  4. Write a similar procedure to check if a string contains a lower case character.

  5. Write a similar procedure to check if a string contains a digit.

  6. Write a similar procedure to check if a string contains a punctuation character.

  7. Use your procedures in the previous parts to write a procedure good-password?, which takes a string as parameter and which returns true if the string would make an appropriate password according to the guidelines given above. Thus, good-password? should return the following results:
    
    (good-password? "aEioU")  ==> #f  (too short; only 2 categories)
    (good-password? "aei12U") ==> #t  
    (good-password? "A;e")    ==> #f  (too short)
    (good-password? "abc-?B") ==> #t 
    (good-password? ";!$XYZ") ==> #f  (only 2 categories)
    (good-password? ";3BB45") ==> #t 
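
Step 3 above asks for a named let. As a reminder of the syntax, here is a minimal sketch on a different problem: counting how often a character occurs in a string. The names count-char and loop are hypothetical; the loop scans from the last position down to 0, as upper-kernel does.

```scheme
; count-char (hypothetical name): number of times ch occurs in str.
(define count-char
   (lambda (str ch)
      ; loop is a local recursive procedure with two parameters,
      ; initialized to the last position and a count of 0.
      (let loop ((pos (- (string-length str) 1))
                 (count 0))
         (cond ((< pos 0) count)
               ((char=? (string-ref str pos) ch)
                (loop (- pos 1) (+ count 1)))
               (else (loop (- pos 1) count))))))

(count-char "sample string" #\s)   ; => 2
(count-char "" #\s)                ; => 0
```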
    


Work To Turn In:


This document is available on the World Wide Web as

http://www.cs.grinnell.edu/~walker/courses/151.sp99/lab-strings.html

created March 5, 1997
last revised March 13, 1999