Project 1
Last modified: January 29 2008 09:51:07 AM
Introduction
Here is a chance for you to design a data structure on your own that (should be) built upon another data structure. For this assignment, we are not concerned with efficiency (although efficiency is a plus), but rather just the creation of something that meets the requirements.
Objectives
- Modifying pre-defined structures to design your own structure
- Learning more about pointers and other stuff in C++
- Creating a project in C++ that uses multiple files
Lab Description
You will create a program that reads in a text file. The name of this file should be able to be specified by the user. You can assume the file is plain text, but nothing more. Don't be overly concerned with special characters inside the text. Assume the file is basically a plaintext version of a novel or something like that, no fancies. (It may include common punctuation characters, which can be ignored.)
As you are reading in the file, you are to create and keep a list (a linear structure) of all the words contained in the file. If a word appears multiple times in the file, a count should be kept as to how many words are in the file.
The list should be ordered in the following manner:
- The first elements in the list are the words with the greatest frequency, the ones at the end have the least frequency.
- Words that have the same frequency are ordered in the list alphabetically.
- Words that have been searched for more often are at the beginning of the list, less often at the end of the list.
Searching should have a higher priority than frequency. It is up to you to reconcile the two after that.
The underlying structure for your list should NOT be a native C++ array. You may use a vector, list, or any other linear structure from the STL, but not an array.
You should create a text-based user interface that allows the graders to test the functionality of your implementation. That means, they should not only be able to read in a file and search for a word, but also be able to see the contents of your data structure at any time. It is expected that every time the user selects to read a file, the old information is destroyed. (As extra credit, make it so the user can choose to load multiple files, retaining the previous data.)
You should draft a position paper discussing your implementation. The format of this paper should not matter (I think Open Office will read Word docs), but a PDF is preferred. The paper should have three sections:
- discussion of your implementation and implementation decisions you made while working on the project, making special note of how you devised your sorting of the list
- efficiency of your data structure in regards to inserting new elements and finding elements
- ideas of how to make the efficiency better (this could include drastic changes to the design of the data structure - even if you have no idea how to implement them, describe what would be needed)
DO NOT ask any questions about how long the paper should be. It should be long enough to describe what you have completed for this project and discuss in a professional way the things required. As a rough guideline, this paper represents a description of 1/4 of a semester's work and probably tens of hours of your labor. Consider that when writing. The paper should most definitely not be an afterthought.
Minimally, you should separate the definition of the data structure from the parsing routine. There may be other files that you can decompose from this problem, and you should do so.
You should create, use, and submit a makefile with your project.
The program must compile and run to receive any credit.
Due dates
This project is due for all students by 11:59:59pm on Friday, February 15th. You should name your submission Project1.zip
(therefore, you should zip all of your files up before submitting).