copyright Steve J. Hodges   http://steveh.net/cs19/cs19-hw02.html

CS 19 Spring 2018

Assignment 2 (Letter Pair Frequency)

Filename

freq.cpp

Program Description

In this assignment, you will read multiple lines of text from STDIN. The number of lines is not specified in advance; read all input until EOF (Ctrl-D for keyboard input). Process this input to find every pair of alphabetic characters that occur in sequence on a single line of text. Pairs never span a newline, but you should ignore any other whitespace and all punctuation in the input (for example, "aa", "a a", and "a.a" all count as an "aa" pair). Treat all alphabetic text as lowercase, and output the results in lowercase only.
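As a starting point, here is a minimal sketch of how the pairs on one line might be found. The function name pairsInLine is hypothetical, and the sketch assumes that every non-alphabetic character (including digits, which the description does not mention) is simply skipped. It is one possible approach, not the required design.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: returns the lowercase letter pairs found on one line.
// Assumption: any character that is not a letter is skipped entirely.
std::vector<std::string> pairsInLine(const std::string& line) {
    std::vector<std::string> pairs;
    char prev = '\0';  // no previous letter seen yet on this line
    for (unsigned char ch : line) {
        if (!std::isalpha(ch)) continue;  // skip whitespace, punctuation, digits
        char cur = static_cast<char>(std::tolower(ch));
        if (prev != '\0') {
            pairs.push_back(std::string{prev, cur});  // two-character string
        }
        prev = cur;
    }
    return pairs;
}
```

Calling a helper like this once per line read with std::getline keeps pairs from spanning a newline, because prev starts fresh on every call.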

After you have finished reading all of the input, output "Letter Pair Frequency Table". Then, output a table that shows each letter pair (aa through zz) that you observed at least once in the input, along with its count and its frequency expressed as a percentage of all letter pairs. Output a tab character "\t" between entries. Give one letter pair per line, formatted as in the example provided. Give the table in order of decreasing frequency, and list the letter pairs in alphabetical order in case of a tie in the frequency. The last line of the table gives the total number of letter pairs that were input. Your program will have no output other than the table (the result).
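The following sketch shows one way a single table row could be built. The name formatRow is hypothetical, and the one-decimal-place percentage is an assumption inferred from the sample output rather than a stated requirement.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds one row such as "is<TAB>2<TAB>20.0%".
// Assumption: percentages are shown with exactly one digit after the
// decimal point, as in the sample output.
// The caller must pass total > 0; with no pairs there are no rows to print.
std::string formatRow(const std::string& pair, int count, int total) {
    std::ostringstream row;
    double percent = 100.0 * count / total;  // floating-point division
    row << pair << '\t' << count << '\t'
        << std::fixed << std::setprecision(1) << percent << '%';
    return row.str();
}
```

Multiplying by 100.0 before dividing forces floating-point arithmetic, so a count of 1 out of 10 gives 10.0 rather than an integer result of 0.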

Other Requirements

In this project you are required to modularize your code into several functions that handle the primary tasks of the program. Sorting must be performed by a bubble sort or selection sort function that you code yourself (as described in class and in the textbook). You may use arrays or vectors to store your data. Contact me for permission if you want to use another type of container.
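For illustration, a hand-coded selection sort over a vector might look like the sketch below. The struct PairCount and the functions comesBefore and selectionSort are hypothetical names. The ordering rule assumed here is higher count first, then alphabetical order on ties.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type: one letter pair and how often it occurred.
struct PairCount {
    std::string pair;
    int count;
};

// True when a should appear above b in the table:
// larger count first, alphabetical order when the counts are equal.
bool comesBefore(const PairCount& a, const PairCount& b) {
    if (a.count != b.count) return a.count > b.count;
    return a.pair < b.pair;
}

// Selection sort: repeatedly pick the entry that belongs next
// and swap it into position i.
void selectionSort(std::vector<PairCount>& entries) {
    for (std::size_t i = 0; i + 1 < entries.size(); ++i) {
        std::size_t best = i;
        for (std::size_t j = i + 1; j < entries.size(); ++j) {
            if (comesBefore(entries[j], entries[best])) best = j;
        }
        if (best != i) std::swap(entries[i], entries[best]);
    }
}
```

Keeping the ordering rule in its own function means the sort loop itself does not change if the tie-breaking rule does.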

Example

The input text "This is a test." has the following ten letter pairs: TH, HI, IS, SI, IS, SA, AT, TE, ES, ST.

What to turn in

As usual, leave your .cpp file for this program ("freq.cpp") in your home directory on pengo.

Suggestions

This program requires you to handle several operations, and some challenges may arise as you work through the various steps. Please make sure that you allow enough time to complete the project. I recommend planning the program on paper before you begin, and expect to revise that plan before you finish. Work the problem in stages, testing each piece before you begin the next.

Hint

You might want to make at least one array of size 676. (Why 676?)
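One way to use such an array is to give each letter pair its own slot. The sketch below is only a possibility: the names kLetters, kPairs, pairIndex, and pairFromIndex are hypothetical, and the arithmetic assumes that the letters 'a' through 'z' have consecutive character codes (true for ASCII).

```cpp
#include <cassert>
#include <string>

const int kLetters = 26;                  // letters in the alphabet
const int kPairs = kLetters * kLetters;   // 676 possible ordered pairs

// Maps a lowercase pair to a slot in the range 0..675.
// Assumption: 'a'..'z' are contiguous in the character set.
int pairIndex(char first, char second) {
    assert(first >= 'a' && first <= 'z');
    assert(second >= 'a' && second <= 'z');
    return (first - 'a') * kLetters + (second - 'a');
}

// Recovers the two-letter pair from its slot number.
std::string pairFromIndex(int index) {
    assert(index >= 0 && index < kPairs);
    std::string pair;
    pair += static_cast<char>('a' + index / kLetters);
    pair += static_cast<char>('a' + index % kLetters);
    return pair;
}
```

With this mapping, an array such as int counts[kPairs] = {0}; can be updated with counts[pairIndex(prev, cur)]++ as each pair is found.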

Sample Input 1

This is a test.

(matching) Sample Output

Letter Pair Frequency Table
is	2	20.0%
at	1	10.0%
es	1	10.0%
hi	1	10.0%
sa	1	10.0%
si	1	10.0%
st	1	10.0%
te	1	10.0%
th	1	10.0%
10 letter pairs in total.