\documentclass[12pt]{article}
\textheight 9.5in
\textwidth 6.6in
\parskip=0in
\parindent=0.167in
\oddsidemargin -.1in
\evensidemargin -.1in
\topmargin -.4in
\headsep 0in
\topsep 0in
%\input{KWRmacs.tex}

\newcommand{\set}[1]{\{\,#1\,\}}
\newcommand     {\Choose} [2]   {(\,^{#1}_{#2}\,)}
\newcommand{\Nat}{{\rm \bf N}}
\newcommand{\into}{\longrightarrow}

\begin{document}
\pagestyle{empty}
\thispagestyle{empty}
\sloppy
\noindent
{\large\bf
\begin{tabular*}{\textwidth}
{@{}l@{\extracolsep{\fill}}c@{\extracolsep{\fill}}r@{}}
CSE250, Fall 2009& Assignment 3 & Due Fri.\ 10/9, 11:59pm
\end{tabular*}
}

\centerline{\it Programming Exercise: A Phrase Class and Driver\/}
\centerline{\small\tt online submission only---no hardcopy}

\smallskip
\noindent
{\bf Reading:}

{\em Before Friday (10/2) class\/}, read this assignment sheet and the
files it references, namely {\tt StringClient.cpp} (posted PS1 answer key)
and {\tt main} in {\tt stringsorts.cpp}.
Also read Chapter~3, sections 3.3 and 3.5--3.6, skimming 3.4.
[Note that {\tt stringsorts.cpp} will compile only on {\tt timberlake},
and only with a trailing {\tt -lrt} after {\tt g++} and the
filename, because of its high-resolution timing code.  You do not need
to read or use the timing parts.]

{\em For next week\/}, read all of 
sections 4.1--4.4, and read about ``Function Objects'' on
pages 303--306 (which don't depend so much on the section 4.10 context).
Then read the C++ code file {\tt RealFn.h}
and its driver {\tt Newton.cpp}.  
Also glance at {\tt CPUTimer.h} just in terms of section 4.4
but in reverse---how undoing a pointer field removes the need for
any of the ``big 3.''
Read section 4.5, but lectures may postpone its details until Ch.~5.
Finally compare how the
array-style code in {\tt stringsorts.cpp} relates to the 
iterator code in {\tt templatesorts.cpp}, as a lead-in to
sections 4.6--4.10 and Week~7.

\bigskip
\noindent
{\bf Short Task Statement}

We want to check how often an author uses phrases in which the
words get progressively longer---or shorter.
A guy who uses words longer growing exhibits mounting verbosity;
meanwhile writers getting briefer should soon come to a .  

Write in C++ a class {\tt Phrase} that wraps a vector of words
(much like {\tt StringStack} did), and
that provides the following public functionality:

\begin{verbatim}
Phrase();                      //other constructors optional
void push(string nextWord);    //see notes on "punctuation" below
string pop();
string top() const;            //top word, without effect of popping it
bool isAscending() const;
bool isDescending() const;
int size() const;
bool empty() const;
string str() const;            //return whole phrase as one string
\end{verbatim}

\bigskip
\noindent
Then write a client file with {\tt main}
that reads words from a long file
(such as {\tt Hamlet.txt}) and outputs a table of how many maximal
ascending and descending sequences of each length it found.
Also output the longest such sequence (any one in case of ties is fine),
and the total number of words read from the file.  Output to screen is fine.
Exactly what ``maximal'' means is for you to decide and explain---in
comments!  You may assume 3 is the minimum sequence length we
care about, and that no sequence will be longer than 20.

\bigskip
\noindent
{\bf What to Submit}

Files {\tt Phrase.h}, {\tt Phrase.cpp}, {\tt PhraseClientNNN.cpp},
and {\tt PhraseClient.make}, where {\tt NNN} are your initials as before.  
Plus {\em optionally\/} {\tt StringStack.h} under choices described below.


Your {\tt PhraseClient.cpp} must have {\tt int main(int argc, char** argv)}
expecting the same command-line arguments as {\tt stringsorts.cpp},
namely a text file like {\tt Hamlet.txt} and an optional number {\tt n}
of words to read.  (You can use exactly the same first thirty-odd lines
of its {\tt main}.)
Per overall course policy, your code must ``make'' and run with
{\tt g++ -Wall} on
{\tt timberlake}.  {\em Note that we are {\bf not} telling you to zip a
directory on your home machine to submit as you may have done before\/}---at
least for now we want you to use the {\tt\verb+submit_cse250+}
command as described on Assignment (1).

{\em Option:\/}
You are welcome---but not required---to re-use some version of the
C++ {\tt StringStack} class from Assignment (1), yours or the
regular-credit key or as-modified etc., by inheritance {\em or\/}
by {\tt Phrase} having a {\tt StringStack} member.
If you do this, you have a further choice: 
You are allowed to to leave its code ``inlined'' Java-style, 
renaming it as a file {\tt StringStack.h}, and 
``\#include'' it into {\tt Phrase.h}.  You should still mention 
{\tt StringStack.h} as a dependence in the make-file, but you would
{\em not\/} need a target line {\tt StringStack.o:}---recall that owing to
the template-linking issue we will have to do this with template
classes {\em anyway\/}, as in {\tt LinkMain.make}.
Alternatively, you may put the {\tt StringStack} class declaration
and prototypes into the file {\tt Phrase.h},
but then you must move its method bodies into {\tt Phrase.cpp}.
(Or you may reject this option and just move some of its
code into {\tt Phrase}.)  Depending on this option, do:

\medskip
\noindent
{\small
{\tt\verb+submit_cse250 Phrase.h Phrase.cpp PhraseClientNNN.cpp PhraseClient.make [StringStack.h]+}
}

{\em There is no ``project report'' file\/}---instead, you must
explain what you are doing in comments.  You may assume the reader
knows the overall purpose described above, so you should focus on
comments that express logical relationships as on Assignment~2.


\bigskip
\noindent
{\bf Further Details and Grading}

The terms ``ascending'' and ``descending'' do not mean strictly-so here,
as the ``A guy\dots'' example phrases indicate.
Technically they should say ``nondescending'' and ``nonascending'' but
those words are even worse.  Anyway, the notion of progression is
thus similar to problem (3) on Assignment~2.  

The word ``maximal'' in the short-spec intends that you not count
a length-10 phrase such as ``A guy\dots verbosity'' as also counting in
the length-9, length-8 etc.\ categories.  However, 
the word ``meanwhile'' immediately following raises two thorny issues:

\begin{enumerate}
\item
Do we try to stop a phrase at a punctuation symbol such as `;' or period?
\item
If an (ascending) phrase ends in two-or-more equal-length words, do we count
{\em all\/} of them as beginning the next (descending) sequence?
Or do we count just the last word, leaving just a one-word overlap
between sequences?
\end{enumerate}

On the former we say {\em no\/}, which makes your life easier!
Then you only have to read blocks of whitespace-separated strings
from the file, and strip anything
that's not between {\tt A} and {\tt Z} or {\tt a} and {\tt z} from
either end.  If the string read has no letters, so stripping
leaves the empty string, then {\tt push} should do
nothing---you optionally may have it print a message like {\tt pop}
did, but don't throw an exception!  This still leaves the
issue of punctuation in the middle, most notably an apostrophe or hyphen.
Let's say each counts as a letter---but you should mention this
decision in the
{\tt\verb+/**+} header comment for {\tt push} in {\tt Phrase.h}.

This means that when you read {\tt verbosity;} and then {\tt meanwhile}
the {\tt ;} gets stripped out, and hence the ascending example above
actually counts as maximal length 11 not 10!  And depending on
the second issue, it may mean that {\tt verbosity} should
also count as the beginning of the next, descending phrase.
{\em Here the answer is up to you\/}---and your choice and
explanation should go in {\tt main}.  

As for stripping leading and trailing non-alpha chars before
storing a word, the question is whether {\tt main} or
{\tt Phrase::push} should do it.  We vote for the latter---which
means that if you inherit from {\tt StringStack}, you need to
override its {\tt push} method---which especially then
should be {\tt virtual}.  This also implies regardless that
your {\tt Phrase.h} must have at least one {\tt CLASS INV} that is
specific to its job of storing particular strings---i.e., not just
repeating invariant(s) that pertain to stacks in general.


\bigskip
{\em Relevance:\/} This is closer to the kind of stylistic detective
work people do with texts, than
Assignment~2 problem (3).  It is OK to ignore punctuation because
e.g.\ Shakespeare does long soliloquies and we want to focus on
all the words.
{\em Maybe\/} we should also ignore character-names in
all-capital letters so we don't pick
up ``HAMLET'' and ``POLONIUS'' too.  However,
you can rationalize either way whether e.g.\ a long character name 
like ``ROSENCRANTZ'' or ``GUILDENSTERN'' should continue or stop
a progression.  So ignore that suggestion---and on any other similar
policy debates, any reasonable choice {\em justified in comments\/} is fine.
Make the simplest code that complies with the stated requirements and
compiles with the stated {\tt g++} and does something relevant.

\bigskip
{\em Logistics and Grading:\/}
Next week's recitations are intended to help you with the
{\tt Phrase} class and with file I/O.  This is even to the point
of ``giving away'' {\tt Phrase.h}---note that two methods can already
be lifted-or-inherited from {\tt StringStack}.  If you come early in the
week, the recitation may help you write all the code ``in-line,''
leaving you to move method bodies into {\tt Phrase.cpp}.
If Thursday, try to have the code already working inline---just as an
outgrowth of {\tt StringStack}---and then you can get hands-on help
making it two separate files.  

As for {\tt PhraseClient.cpp}, first note that you can already start
testing your code by adding to the {\tt main} already written on
Assignment 1, not yet worrying about file I/O.  Then when you
start thinking about file-I/O, go grab the first 30-odd lines
of {\tt main} in {\tt stringsorts.cpp} as already shown in lecture.
You can forget the ``/O''---we're only outputting to the screen---so
that should get you started on the ``file-I'' part.  Indeed you can
let {\tt main} declare its own {\tt\verb+vector<string> items(n)+}
and read all the strings into that---thus using more lines from
{\tt stringsorts.cpp}---though then you'll have to go back over {\tt items}
with a for-loop or while-loop that searches over sequences.
Either way you'll be
creating many little {\tt Phrase} objects, for which pointers
are optional.
Next week's recitations will help you read from smaller files and
test some of your {\tt StringStack/Phrase} methods on them.

Hence the only parts you can't ``steal'' are the Boolean methods
in {\tt Phrase} and the progression-counting in {\tt main}---and
you already have some of that logic handwritten in Assignment 2.
The grading is 30 points for the class, 30 for {\tt main}, 24
for logic-explaining comments, and 6 for using a make-file, for 90 total.

\bigskip
{\em Comparison:\/}
This roughly parallels ``In-lab Exercises'' 1 \& 3 from
CSE250, Spr'09, at

\medskip
\noindent
{\tt http://www.cse.buffalo.edu/faculty/adrienne/SP2009/cse250/Assignments/}

\medskip
\noindent
except that the Boolean tests are in a class rather than be global functions,
and we're not writing to a file or employing a helper function 
like {\tt subPhrases.\{h,cpp\}} there.  Doing two pieces in 9 days is
justified because a lot of the foundation was already set by our
first two assignments, for you to re-use.  If you have friends from the
Spring 2009 class, you are welcome to discuss with them
those labs they did 
and look at their code---they're just not allowed
to help you with (the problem-solving part of) yours\dots 
More about how academic integrity applies to this assignment will
be said in class on Friday.  We do not intend group work until the
main (final) course project, or
maybe on a short assignment like this before it.


\end{document}