copyright Steve J. Hodges

CS 19 Spring 2017

Assignment 2 (Letter Pair Frequency)



Program Description

In this assignment, you will read multiple lines of text from STDIN. The number of input lines is not specified; read all input until EOF (ctrl-d for keyboard input). After you have read all of the input, output a table with 26 rows and 26 columns showing, for each ordered pair of adjacent letters ("AA", "AB", "AC" ... "AY", "AZ", "BA", "BB", "BC" ... "ZY", "ZZ"), the number of input lines on which that pair appeared. This is a count of lines, not the total number of occurrences of the pair: any letter pair is counted only once per line. Upper and lower case letters are treated as the same letter. The first letter of a pair determines the row in which its count is printed, and the second letter determines the column. Spaces, punctuation, and any other non-letter characters are not part of a two-letter sequence.

Please use strings (the C++ string class), not character arrays, for this assignment. You are required to have at least one function in addition to main. That function must be passed a line of text to be scanned for letter pairs. You may have more than two functions if you wish.
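The once-per-line rule can be handled with a second table that records which pairs have already been seen on the current line. The following is only a sketch of one possible approach, not a required design: the names LETTERS, letterIndex, and countPairsInLine are made up for illustration, and the code assumes an ASCII-compatible character set.

```cpp
#include <string>

const int LETTERS = 26;  // hypothetical constant name

// Hypothetical helper: returns 0-25 for a letter (either case), or -1 for
// any other character. Assumes an ASCII-compatible character set.
int letterIndex(char c) {
    if (c >= 'a' && c <= 'z') return c - 'a';
    if (c >= 'A' && c <= 'Z') return c - 'A';
    return -1;
}

// Scans one line of text and adds at most 1 to counts[row][col] for each
// letter pair that appears anywhere on that line.
void countPairsInLine(const std::string& line, int counts[LETTERS][LETTERS]) {
    bool seen[LETTERS][LETTERS] = {};  // pairs already found on this line
    for (std::string::size_type i = 0; i + 1 < line.size(); ++i) {
        int row = letterIndex(line[i]);      // first letter picks the row
        int col = letterIndex(line[i + 1]);  // second letter picks the column
        if (row >= 0 && col >= 0 && !seen[row][col]) {
            seen[row][col] = true;
            ++counts[row][col];
        }
    }
}
```

The counts array must be zero-initialized by the caller before the first line is processed, since the function only ever increments it.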

Use arrays for this assignment. Use of the vector class is not allowed.

Output Table

Your output should consist of 26 rows of 26 integer values, with a single space after each integer. The first row output will be a count of the pairs AA, AB, AC, AD ... AZ. The second row will be a count of the pairs BA, BB, BC, BD ... BZ. Each row to follow will start with the next letter in the alphabet. The last row will represent the pairs ZA, ZB, ZC, ZD ... ZZ. Remember that the pair AA means that you saw the letter 'a' or 'A' followed by the letter 'a' or 'A' at least once on a given line of text. AB means that you saw the letter 'a' or 'A' followed by the letter 'b' or 'B'. Your program will have no output other than the table (the result).


The input line "This is a test." has the following letter pairs: TH, HI, IS, IS, TE, ES, ST. Those are all the occurrences of two adjacent letters in that line. Note that since a letter pair is only counted once per line, this line would be recorded as containing TH, HI, IS, TE, ES, and ST.

Sample Input

this file contains an input sample 1

Matching Sample Output

this file contains the correct output for that sample

What to turn in

Leave your .cpp file for this program ("freq.cpp") in your home directory on pengo.


You may find it easiest to complete this program in stages. Begin by writing a program that echoes input lines of text to output until no more input remains. Then add the ability to pass a line of text to a second function to be processed for letter pairs.

1. Paul Krugman, The Conscience of a Liberal, The Opinion Pages, New York Times online, January 21, 2017