Use a stack to solve a common problem in parsing, and perhaps make use of a map too.
Lexer class that I provided
map, vector, stack, set.
<b>This is bold</b>, <i>this is italic</i>, <b>bold again <i>and italic at the same time</i></b>, and <u>underlined</u>, and back to normal.
Of course, not all HTML files are correctly tagged. For example, the following is not tagged correctly:
<b>Bold text<i> and italic</b>
The World Wide Web Consortium (W3C) has a service allowing us to validate any HTML file, confirming whether the file is well-formed. In this assignment, we will make up our own HTML-like language, and implement a "validator" for the language. Also, we will write a function that renders (displays) a text string (or expression) marked up in our language correctly. Let's call our language "250HTML". The language accepts the following tags:
Foreground color tags: <red>, <green>, <yellow>, <blue>, <magenta>, <cyan>, and their corresponding closing tags </red>, </green>, </yellow>, </blue>, </magenta>, </cyan>.
Text attribute tags: <dim>, <underline>, <bright>, and their corresponding closing tags </dim>, </underline>, </bright>.
<red>Red <dim>dim and red</dim> back to red</red> <yellow>Yellow <underline>underlined yellow <dim>dim</dim> underlined yellow</underline> and <cyan>cyan</cyan> and yellow again</yellow>
An invalid 250HTML expression contains one of the following types of errors:
INVALID TOKEN: Tags are not formatted correctly, such as a tag with < but no closing > (whether or not the tag is valid).
UNKNOWN TAG: An unknown tag is included, such as <some tag>.
EXPRESSION NOT WELL-FORMED: Tags are not properly nested or matched, such as
<blue>Blue <underline>underlined blue <dim>dim</dim> underlined blue</blue> </underline>
Check well-formedness with a stack, as I discuss in class.
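The stack idea can be sketched roughly as follows. This is only a minimal illustration, not the required solution: the function name well_formed is hypothetical, and it assumes the tags have already been extracted in order, with an opening tag given as its name (red) and a closing tag as the name preceded by a slash (/red). It does not produce the error messages the assignment asks for.

```cpp
#include <stack>
#include <string>
#include <vector>

// Hypothetical helper, not part of the provided code base.
// `tags` holds tag values in order of appearance: "red" for <red>, "/red" for </red>.
// Returns true when every closing tag matches the most recent unclosed opening tag
// and nothing is left open at the end.
bool well_formed(const std::vector<std::string>& tags) {
    std::stack<std::string> open;              // currently open tags, innermost on top
    for (const std::string& t : tags) {
        if (t.empty()) return false;           // treat an empty tag value as malformed
        if (t[0] != '/') {
            open.push(t);                      // opening tag: remember it
        } else {
            // closing tag: it must match the innermost open tag
            if (open.empty() || open.top() != t.substr(1)) return false;
            open.pop();
        }
    }
    return open.empty();                       // leftover open tags mean not well-formed
}
```

For the not well-formed example above, the stack holds blue, underline when </blue> arrives; the top is underline, so the check fails at that point.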
validate [250HTML expression]
display [250HTML expression]
exit
For example, the following are good commands:
> validate <red>This is red <blue>and this is blue</blue> and back to red</red>
> display <red>print this out in dimmed red and red</red>
In the above commands, <red>This is red <blue>and this is blue</blue> and back to red</red> and <red>print this out in dimmed red and red</red> are 250HTML expressions.
If the command is exit, then your program just quits.
The validate command reports whether the expression is well-formed according to our language 250HTML described above.
The display command prints the expression where the text attributes (foreground color and 3 other attributes) are displayed correctly (if the expression is well-formed). All tags are stripped off when displaying the expression, of course. If the expression is not well-formed, then an error message is reported. If two foreground color tag pairs are nested inside one another, then the inner text has the color specified by the inner tag pair. For example,
<red>This is red <blue>and this is blue</blue> and back to red</red>
In general, if there is a conflict, the innermost tag pair has the highest priority.
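One way to get the "innermost wins" behavior is to keep the active attributes on a stack and re-apply whatever remains after each closing tag. The sketch below is only an illustration under several assumptions: the Piece struct and render function are hypothetical, the input is assumed to be already validated, and raw ANSI escape sequences stand in for the routines in term_control.h, whose interface is not shown here.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical input unit: either a tag value ("red", "/red") or plain text.
struct Piece { bool is_tag; std::string value; };

// Builds the output string for a well-formed sequence of pieces.
// ANSI escape codes are an assumption; the real code base has its own terminal routines.
std::string render(const std::vector<Piece>& pieces) {
    static const std::map<std::string, std::string> code = {
        {"red", "\033[31m"},  {"green", "\033[32m"},   {"yellow", "\033[33m"},
        {"blue", "\033[34m"}, {"magenta", "\033[35m"}, {"cyan", "\033[36m"},
        {"bright", "\033[1m"}, {"dim", "\033[2m"},     {"underline", "\033[4m"}};
    const std::string reset = "\033[0m";

    std::vector<std::string> active;   // used as a stack: innermost attribute last
    std::string out;
    for (const Piece& p : pieces) {
        if (!p.is_tag) { out += p.value; continue; }   // tags are stripped, text is kept
        if (p.value[0] != '/') {
            const std::string& c = code.at(p.value);   // throws on an unknown tag
            active.push_back(c);
            out += c;                                  // inner tag overrides the outer one
        } else {
            assert(!active.empty());                   // precondition: input is well-formed
            active.pop_back();
            out += reset;                              // clear everything...
            for (const std::string& c : active) out += c;  // ...then restore the outer tags
        }
    }
    return out;
}
```

Re-applying the remaining codes from the bottom of the stack upward means the most recently opened color is emitted last, so it is the one the terminal ends up using.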
The program above might seem to be a daunting task, but I have already written a skeleton of the program for you. You can download the source files with
wget http://www.cse.buffalo.edu/~hungngo/classes/2014/Fall/250/assignments/A3.tar
tar -xvf A3.tar
Please read all the code in the code base, but you can only modify one file: cmd.cpp, to implement the two functions that were left empty there. You can compile the program by typing make. The Makefile is already written for you.
Makefile: typing make produces an executable called browser that you can run.
Lexer.h, Lexer.cpp: interface and implementation of a Lexer class that allows for tokenizing a text string into tokens of various types: TAG, IDENT, BLANK, ERRTOK, or ENDTOK. The TAG token represents a 250HTML tag.
error_handling.h, error_handling.cpp: a few error reporting routines.
term_control.h, term_control.cpp: terminal control routines.
browser.cpp: contains the main() function.
cmd.h: contains the interface to the two main commands.
cmd.cpp: this is the only file you can modify to write your program. You are going to implement 2 functions display() and validate(), which implement the display and validate commands, respectively. Feel free to add more functions, types, variables, etc. into cmd.cpp if necessary.
There is also a .html file marked up with our language for you to test, inside the TESTS sub-directory.
map somewhere, beyond the command map already implemented.
Lexer and how to use it in class.
stack data structure in the context of this problem.
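As one possible use of a map beyond the command map, a lookup table from tag names to tag kinds could help detect unknown tags. The following is a sketch only: TagKind and classify are hypothetical names, and it assumes a closing tag arrives as its name preceded by a slash.

```cpp
#include <map>
#include <string>

// Hypothetical classification of 250HTML tag names.
enum class TagKind { Color, Attribute, Unknown };

// Looks a tag name up in a fixed table; a leading '/' (closing tag) is ignored.
TagKind classify(std::string name) {
    static const std::map<std::string, TagKind> known = {
        {"red", TagKind::Color},     {"green", TagKind::Color},
        {"yellow", TagKind::Color},  {"blue", TagKind::Color},
        {"magenta", TagKind::Color}, {"cyan", TagKind::Color},
        {"dim", TagKind::Attribute}, {"underline", TagKind::Attribute},
        {"bright", TagKind::Attribute}};
    if (!name.empty() && name[0] == '/') name.erase(0, 1);   // strip the closing slash
    auto it = known.find(name);
    return it == known.end() ? TagKind::Unknown : it->second;
}
```

With a table like this, classify("some tag") yields TagKind::Unknown, which corresponds to the UNKNOWN TAG error case.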
timberlake. You can download and run it (in timberlake) to see how it is supposed to work. Remember to make it an executable file before trying to run it:
wget http://www.cse.buffalo.edu/~hungngo/classes/2014/Fall/250/assignments/hqn-browser
chmod 700 hqn-browser
Submit only the cmd.cpp file. We will put your submission into a directory that has all other files in the codebase and compile using make.
submit_cse250 cmd.cpp
Note again that the submission only works if you are logged in to your CSE account and the cpp file is there. All previous things can be done at home, as long as you remember to upload the final file to your CSE account and run the submit script from there.
0 points if the program does not compile (using make).
5 points if the exit command works, i.e. if you don't do anything and just resubmit cmd.cpp, you'll get 5 points.
65 points if the validate command works. It must be able to report error messages for the three types of syntax errors specified above. Please run my program to see the error messages. Please output the exact same messages for automatic grading.
30 points if the display command works.
stack in class.
Read the textbook, the chapter on stacks if you have to.
Read the Terminal control post on the blog.
The following snippet of code is probably helpful too. You will need Lexer.h and Lexer.cpp in the same directory as the following file to test it.
// lexerDriver.cpp
#include <iostream>
#include <string>
#include "Lexer.h"
using namespace std; // BAD PRACTICE

int main()
{
    const string prompt = "Enter an expression to tokenize: \n> ";
    string line;
    Token tok;
    Lexer lexer;

    cout << prompt; // show the prompt before reading the first line
    while (getline(cin, line)) // Ctrl-Z/D to quit!
    {
        lexer.set_input(line);
        while (lexer.has_more_token())
        {
            tok = lexer.next_token();
            switch (tok.type)
            {
                case TAG: // a leading '/' in tok.value marks a closing tag
                    if (tok.value[0] != '/')
                        cout << "OPEN TAG: " << tok.value << endl;
                    else
                        cout << "CLOSE TAG: " << tok.value.substr(1) << endl;
                    break;
                case IDENT:
                    cout << "IDENT: " << tok.value << endl;
                    break;
                case BLANK:
                    cout << "BLANK: " << tok.value << endl;
                    break;
                case ERRTOK:
                    cout << "Syntax error on this line\n";
                    break;
                default:
                    break;
            }
        }
        cout << prompt; // prompt again for the next line
    }
    return 0;
}