CSE 111, Fall 2000

Great Ideas in Computer Science

Lecture Notes #30

MACHINE ARCHITECTURE
AND ASSEMBLY LANGUAGE

1. Machine Architecture

a) A computer consists of a Central Processing Unit
(or "CPU") and memory.

b)    The memory contains "registers" (i.e., storage
        units, or "memory locations", as we've been
        calling them), each of which can store data
        as well as programs, all coded in 0s and 1s

* but it has no ability to compute

        *    there is no way to distinguish between data
            and program in memory, so it is possible to
            write programs that can consider themselves
            (or other programs) as data, and can therefore
            change other programs or even themselves

+ This is one way that programs that "learn"
can be written.

c)    The CPU also contains "registers", but these are
        for arithmetic and string processing on the data
        stored in memory

* the contents of the CPU registers are also
coded in 0s and 1s

d)    For a simple example, think of a calculator,
        which can do arithmetic processing and has
        one or a few (maybe a dozen or so) memory
        locations.

e)    The amount of memory can make one computer
        more efficient than another, but won't increase
        its "power".

        *    All computers capable of executing a Pascal
            program (i.e., that can do simple arithmetic,
            that can store and retrieve information in
            memory locations, and that can do sequences,
            selections, and repetitions of the basic
            instructions) are equally powerful.

        *    Computers differ in speed (time) and
            size of memory (space), and therefore
            in ease of use and efficiency, but not in
            power.

2. A toy computer: The P88

a) The P88's CPU contains 4 registers (plus some
circuitry to do arithmetic):

        *    instruction pointer (IP): contains the
            memory address of the next instruction
            to be executed

* instruction register (IR): contains the
current instruction being executed

* condition flag (CF): I'll explain this one later

        *    accumulator (AX):    this is a computing
            register; all arithmetic operations are done
            here

b) There is a graphical picture of all this in
Biermann, p. 261, which I won't repeat here.

But let me add a bit of explanation of that
picture, in particular of the Memory side:

* the numerals 10, 11, ..., 23 are "addresses"
of memory locations.

        *    the contents stored in them consist both of
            lines of assembly language code (things
            like "COPY AX,X") and of data (things like "7").

        *    actually, what gets stored is this info coded
            into 0s and 1s, so the first line of the program
            might actually look like this:

0010111000010100

            This is called "machine language". When
            a Pascal program is compiled into "assembly
            language", the lines of Pascal code are
            translated into (usually several) lines of
            assembly language. A similar translation
            process occurs when assembly language is
            translated, or "assembled", into machine
            language. (This is not technically what
            actually happens, but is near enough.)

            Assembly language, then, is simply machine
            language translated into something more
            readable to humans than strings of 0s and 1s.

    *    the labels in parentheses ("(X)", etc.) are the
          variable names of the memory locations, as
          they might be declared in a var section of a
          Pascal program.

    *    So, for instance, what is currently stored in
          memory locations 10-23 in the example
        could just as well have been stored in locations
        510-523, or 689-702, or whatever, but in all
         cases, the last 4 locations would be identified
        to the users and in the assembly language as
        "X", "Y", etc.

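The point about relocation can be sketched in a few lines of Python. This is only an illustration: the function name symbol_table, the block size of 14 cells (matching locations 10-23), and the order of the four labels (X, Y, Z, CN1) are assumptions made for the example, not details taken from Biermann's picture.

```python
# Hypothetical sketch: map variable labels to addresses for a block of
# memory cells that can be loaded at any base address.
# Assumed: the block has 14 cells and the labels occupy the last 4 cells
# in the order X, Y, Z, CN1.

def symbol_table(base, size=14, labels=("X", "Y", "Z", "CN1")):
    """Return {label: address}, with the labels in the last cells of the block."""
    first_var = base + size - len(labels)
    return {name: first_var + i for i, name in enumerate(labels)}

# The numeric addresses change with the base; the labels do not.
for base in (10, 510, 689):
    print(base, symbol_table(base))
```

Whatever the base address, the assembly language keeps saying "X"; only the table that turns "X" into a number changes.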
c) The Fetch-Execute Cycle:

* Here's a flowchart, followed by an algorithm
(expressed in Pascal-like pseudocode), describing
how the P88 works:

 ___                       
|   |                      find instr i @ addr in IP  
|   |                     /           |
|   v                    /            v
| FETCH (an instruction)=          IR := i
|   |                    \            |
|   |                     \           v
|   v                      \      update IP
| EXECUTE (the instruction)
|   |
|   |
\___/

program P88;
procedure Fetch;
begin {Fetch}
    find instruction i in Memory at address given by IP;
    IR := i;
    update IP
end;   {Fetch}
procedure Execute;
begin {Execute}
    execute instruction in IR
end;   {Execute}
begin {P88}
    while true do
        begin {while true}
            Fetch;
            Execute
        end    {while true}
end.   {P88}

*    Note that this is an infinite loop!
      In practice, the only way to get the computer
        to stop is to turn it off, or to have a "halt"
        instruction as part of the program.
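
For those who want to run the cycle, here is a minimal Python sketch of the same loop. It is not the book's definition of the P88. It assumes that instructions are stored as tuples, that the data cells are addressed by label rather than by number, and that a HALT instruction sits at address 15 (HALT is an assumption; only COPY and ADD appear in the program discussed in these notes).

```python
# Minimal sketch of the fetch-execute cycle.
# Assumptions: tuple-encoded instructions, label-addressed data cells,
# and a HALT instruction at address 15.

memory = {
    10: ("COPY", "AX", "X"),
    11: ("ADD", "AX", "Y"),
    12: ("COPY", "CN1", "AX"),
    13: ("COPY", "AX", "CN1"),
    14: ("COPY", "Z", "AX"),
    15: ("HALT",),                        # assumed halt instruction
    "X": 7, "Y": 4, "Z": 0, "CN1": 0,     # data cells, addressed by label
}
cpu = {"IP": 10, "IR": None, "AX": 0, "halted": False}

def fetch():
    cpu["IR"] = memory[cpu["IP"]]   # find instruction at address IP; IR := i
    cpu["IP"] += 1                  # update IP

def read(name):
    # AX lives in the CPU; every other name is a memory location.
    return cpu["AX"] if name == "AX" else memory[name]

def execute():
    op, *args = cpu["IR"]
    if op == "HALT":
        cpu["halted"] = True
    elif op == "COPY":
        dst, src = args
        value = read(src)
        if dst == "AX":
            cpu["AX"] = value
        else:
            memory[dst] = value
    elif op == "ADD":
        cpu["AX"] += read(args[1])

# The "while true" loop, made finite by the halt flag.
while not cpu["halted"]:
    fetch()
    execute()

print(memory["Z"], cpu["IP"])   # prints: 11 16
```

Without the HALT at address 15, fetch would run off the end of the program, which is the practical face of the infinite loop noted above.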

d) This is all that a computer does!
It fetches instructions and executes them!

e) The Big Question of Computer Science:

* What problems can be solved this way?
* i.e., what can be computed?

3. Assembly Language Programming

a)    Assembly-language programming is programming
        at the level of the machine's operations.

        *    i.e., in terms of what the machine does,
              not in terms of the problem to be solved
                (as in Pascal)

        *    e.g., our Pascal text editor program talked
                about "strings", "inserting", "deleting", etc.
            -    It solved a problem using language
                  appropriate to the problem

            -    but the assembly language translation of
                    that program would not talk about such
                    things; it would only talk about moving
                   information from one register to another.

b) Here is part of our P88 assembly language,
enough to understand the program stored in memory
in Biermann's picture (p. 261):

    syntax                          semantics
    (what the instruction           (what the instruction means,
    looks like)                     explained using Pascal syntax)
    ====================            ==============================
    COPY AX, memloc                 AX := memloc
    COPY memloc, AX                 memloc := AX
    ADD AX, memloc                  AX := AX + memloc
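
The table can also be read as a tiny interpreter. The Python sketch below takes one line of assembly text and applies the matching semantics to a dictionary of values. The function name apply_instruction and the use of a dictionary keyed by register or label name are assumptions made for illustration.

```python
# Sketch: interpret one line of P88 assembly text according to the table.
# Assumption: "state" maps register/location names (upper case) to integers,
# and memory locations are referred to by label, never by numeric address.

def apply_instruction(line, state):
    opcode, operands = line.split(None, 1)
    dst, src = [name.strip().upper() for name in operands.split(",")]
    opcode = opcode.upper()
    if opcode == "COPY":
        # Covers both COPY AX, memloc and COPY memloc, AX.
        if "AX" not in (dst, src):
            raise ValueError("COPY must involve AX: " + line)
        state[dst] = state[src]
    elif opcode == "ADD":
        if dst != "AX":
            raise ValueError("ADD must target AX: " + line)
        state["AX"] = state["AX"] + state[src]
    else:
        raise ValueError("unknown instruction: " + line)
    return state

state = {"AX": 0, "X": 7, "Y": 4}
apply_instruction("COPY AX, X", state)   # AX := X
apply_instruction("ADD AX, Y", state)    # AX := AX + Y
print(state["AX"])   # prints: 11
```

Note that every instruction goes through AX: there is no way in this part of the language to copy or add directly from one memory location to another.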

c) Here's a trace of the program stored in memory
on p. 261:

We need to keep track of 7 registers:
IP, IR, and AX (all in the CPU), and X, Y, Z, and CN1 (in memory)

        We'll use one convention for recording the
        trace of the execution: If something is shown
        to be in memory, you can assume that it stays
        there until it is changed. That way, I don't have
        to repeat things, or clutter up the chart below.

        And we need to show each step of the
        fetch/execute cycle. I'll code that as follows:
        f = fetch, g = get instruction address,
        p = put in IR, u = update IP, e = execute.
        (The "g" substep is shown only once, at step 1.)

        step     IP    IR             AX           X    Y    Z     CN1
        ======   ==    ===========    =========    =    =    ==    ===
        0                                          7    4    0     0
        1(fg)    10
        2(fp)          copy ax,x
        3(fu)    11
        4(e)                          7
        5(fp)          add ax,y
        6(fu)    12
        7(e)                          11 (=7+4)
        8(fp)          copy cn1,ax
        9(fu)    13
        10(e)                                                      11
        11(fp)         copy ax,cn1
        12(fu)   14
        13(e)                         11
        14(fp)         copy z,ax
        15(fu)   15
        16(e)                                                11

*    Hopefully, the instruction at address 15 is a
        "halt", but the book doesn't tell us!
*    There is a little do-nothing "dance" at steps
        8-13, in which 11 is moved back and forth
        seemingly for no reason. There really is
        a reason, having to do with how a Pascal
        program (in this case, for z := x+y) is translated
        into P88 assembly language; see Ch. 9 for
        a detailed explanation.
*    The net result is that Z ends up holding 11 (= 7+4);
        i.e., the program carries out z := x+y
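
The trace can be regenerated mechanically, which is a handy way to check it. The Python sketch below records one row per substep, using the same f/g/p/u/e codes. It assumes that data cells are addressed by label rather than by number, and that the run simply stops once IP leaves addresses 10-14 (no halt instruction is modeled). The helper name record is hypothetical.

```python
# Sketch: regenerate the trace by instrumenting a small P88 simulation.
# Assumptions: label-addressed data cells; the run stops when IP no longer
# points at one of the five instructions at addresses 10-14.

program = {
    10: "copy ax,x",
    11: "add ax,y",
    12: "copy cn1,ax",
    13: "copy ax,cn1",
    14: "copy z,ax",
}
state = {"IP": None, "IR": None, "AX": None, "x": 7, "y": 4, "z": 0, "cn1": 0}
rows = []   # each row: (step label, register or location changed, new value)

def record(kind, name, value):
    """Store the new value and log it as the next numbered step."""
    state[name] = value
    rows.append(("%d(%s)" % (len(rows) + 1, kind), name, value))

record("fg", "IP", 10)                          # step 1: starting address
while state["IP"] in program:
    record("fp", "IR", program[state["IP"]])    # put the instruction in IR
    record("fu", "IP", state["IP"] + 1)         # update IP
    op, operands = state["IR"].split()
    dst, src = operands.split(",")
    dst_key = "AX" if dst == "ax" else dst
    value = state["AX"] if src == "ax" else state[src]
    if op == "add":
        value = state["AX"] + value
    record("e", dst_key, value)                 # execute

for row in rows:
    print(row)
```

The run produces 16 rows, and the last one is ('16(e)', 'z', 11), in agreement with the table.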