CSE250 Final Exam Answer Key, Fall 2010

(1) See pictures. The "ah" cannot be inserted in (c).

(2) (a) No; we can only say the time is O(n\sqrt{n}), because we iterate through \sqrt{n} nodes on each insert.
    (b) Credit given for both "yes" and "no" with appropriate justification, because it depends on whether /vector/ relocates on a call to /insert/ that happens to be at the end. Note that an insert in any other place /will/ always cause the vector to resize/relocate, so it is possibly an implementation-specific matter. My preferred answer is "no": still O(n\sqrt{n}) time.
    (c) Yes, because the "loose" representation here guarantees that the vector will be disturbed only when it hits capacity and the node splits. This costs O(\sqrt{n}) extra work, but only every \sqrt{n} inserts, so the total is still O(n).
    (d) O(n) time here would imply the existence of an o(n log n)-time sorting algorithm based on comparisons, which is impossible.
    (e) Yes: lim_{n->infty} f(n)^2/g(n)^2 = lim_{n->infty} (f(n)/g(n))(f(n)/g(n)) = 0*0 = 0.
    (f) FlexArray was better suited to the word-chains application because that task did involve splicing chains in the middle; as the "Josephus.cpp" example shows, a student-coded FlexArray can outperform the professional /vector/ by factors of up to 10 on it. Project 2, however, involved no modification of data structures at all, just build-and-iterate. The extra overhead of the FlexArray iterator (two movable fields, not one) alone makes it more costly there.

(3) 1. (c) (best answer; (b) and (e) earn 2 pts.) Usually O(1) time. The iterator already references the node, so no "find(key)" call is needed; de-linking the node is O(1) time; and the successor-or-predecessor node to put in its place is /usually/ found in O(1) time. "Amortized" is not quite right because we don't know whether the erasures will be in sequence.
    2. (c) (still the best answer; now 2 pts. for (b) and (d)): no real change, except that the log depth of the tree at least guarantees that finding the successor or predecessor takes O(log n) steps.
    3. (f) Guaranteed O(\sqrt{n}) time, no better, because the pop from the front always causes the vector to re-form.
    4. (b) /if/ the loose representation is used, because a split costing O(\sqrt{n}) work will happen only every \sqrt{n} calls; else (f).
    5. (a)
    6. (b) No different from the BST example in class. 2 pts. for (d).
    7. (d), guaranteed O(log n) time, because /set/ uses a Red-Black Tree.
    8. (a)
    9. (g)
    10. (a), because their nodes can simply be linked together.

(4) (a) The periods mark the completed phrases. [The rest of this answer was a drawing of the word tree; only its node labels ("root", "each", "phrase", "word", "node", "time", "or", "tree", "has", "also", "through", "title") survive in this text version.]
    (b)
        class Node {
          string word;
          set<Node*> nextWords;   // or map<string, Node*> nextWords;
                                  // (a set of Node* needs a compare-by-word functor)
          bool completed;
        public:
          explicit Node(string word)
            : word(word), nextWords(), completed(false) { }
          ...
        };

        class WordTree {
          Node* root;
        public:
          WordTree() : root(new Node("")) { }
          ...
        };
    (c) Loop over the p phrases, nesting a loop over the n words. In cases like "My my" you might have to back up c-1 spaces to try the next place, so the time really is best given as O(npc), not O(pn).
    (d) WordTree: O(log p) to place a new pointer every time, and O(log c) to update the others, of which there are at most c. Repeat n times, so the time is O(n(log(p) + c*log(c))).
    (e) The WordTree data structure basically cuts a quadratic-time algorithm down to O(n log p). It allows doing everything with only one scan through the text file, which is vital to large-scale batch processing. On Project 2 there was similar efficiency in updating all the movies and ratings in one sweep, as opposed to looping once per title (as some did). Since log(p) + c*log(c) << pc, yes: O(n(log(p) + c*log(c))) = o(npc).
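The skeleton in (4)(b) can be fleshed out into a working sketch. This is a hypothetical completion (the /insert/ and /contains/ names are illustrative, not the course's required interface) that uses the map<string, Node*> alternative, so each step down the tree is one ordered-map lookup, matching the per-word O(log p) cost claimed in (4)(d):

```cpp
#include <map>
#include <sstream>
#include <string>

// Sketch of the WordTree from (4)(b): children are kept in an ordered
// map keyed by the next word, and "completed" plays the role of the
// period marker from (4)(a).  Nodes are deliberately leaked here to
// keep the sketch short.
struct Node {
    std::string word;
    std::map<std::string, Node*> nextWords;  // children, keyed by word
    bool completed;                          // a phrase ends at this node
    explicit Node(const std::string& w) : word(w), completed(false) { }
};

class WordTree {
    Node* root;
public:
    WordTree() : root(new Node("")) { }

    // Insert one phrase, one word at a time.  Each step is a single
    // ordered-map find/emplace, so a c-word phrase costs O(c log p).
    void insert(const std::string& phrase) {
        Node* cur = root;
        std::istringstream in(phrase);
        std::string w;
        while (in >> w) {
            auto it = cur->nextWords.find(w);
            if (it == cur->nextWords.end())
                it = cur->nextWords.emplace(w, new Node(w)).first;
            cur = it->second;
        }
        cur->completed = true;               // mark the "period"
    }

    // True only if the whole phrase was inserted (not a mere prefix).
    bool contains(const std::string& phrase) const {
        Node* cur = root;
        std::istringstream in(phrase);
        std::string w;
        while (in >> w) {
            auto it = cur->nextWords.find(w);
            if (it == cur->nextWords.end()) return false;
            cur = it->second;
        }
        return cur->completed;
    }
};
```

A single pass over the text then calls /insert/ once per phrase, which is the one-scan behavior credited in (4)(e).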