Next: 8. Further Assembly Programming Up: Lecture Notes on Computer Previous: 6. The Central Processing



7. Assembly Language Programming

7.1 Introduction

Chapter 6 showed how digital logic components can be connected together, and, with a bit of glue formed by the controller, can be made to perform useful elementary operations - like adding the contents of a memory location to the accumulator (ADDD).

We now come to constructing sequences of these elementary operations (ADDD, LODD, JUMP, etc.) to do useful tasks, i.e. to write programs. This chapter will concentrate on the basics: sequence, selection, repetition.

In addition we will describe the operation of a two-pass assembler and show how this can be done manually.

7.2 Programs in Assembly Code

We now want to write some programs for the Mac-1a. For completeness, we repeat Figure 6.5 as Figure 7.1. We could write in binary machine code, using Figure 7.1, but that is tedious and error prone because of the distraction of having to translate into a numeric code. It is better to use the mnemonics. The assembly code can then be translated (assembled) into machine code; this is done using a program called an assembler.

[Figure 7.1: Mac-1a Instruction Set (limited version of Mac-1)]

For the purposes of this chapter, assembly language is a language in which there is a one-to-one relationship between symbolic instructions and machine instructions; we will come across pseudo-instructions that are really instructions to the assembler program, but these will be obvious.

7.2.1 Conventions

In the following examples, if we need to mention actual addresses, we will start all code at 0x100 and all data at 0x500. Note again that the largest address is 0xFFF (12 bits). Also, be careful with 0xFFC onwards (4092 to 4095 decimal), which are used for memory-mapped input-output; see Chapter 7.
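As a quick arithmetic check of these address limits, the following Python lines compute the largest 12-bit address and list the four addresses reserved for memory-mapped input-output. This is only an illustration; the names `ADDRESS_BITS`, `largest` and `io_addresses` are made up for this sketch.

```python
ADDRESS_BITS = 12                        # Mac-1a addresses are 12 bits wide
largest = (1 << ADDRESS_BITS) - 1        # highest addressable location
io_addresses = list(range(4092, 4096))   # 4092..4095 decimal, reserved for I/O

print(hex(largest))                      # 0xfff
print([hex(a) for a in io_addresses])    # ['0xffc', '0xffd', '0xffe', '0xfff']
```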

Often, for brevity, we will assume that variables a0, a1, a2, ... (integers) have been declared and are already in locations 0x500, 0x501, 0x502, .... Alternatively, we will use symbols for data. Use these assumptions in the exercises where appropriate.

7.2.2 Simple Program - add two integers

Program. Add the integer in a1 to the integer in a0 and put the result in a2, i.e.

a2 = a0 + a1;

In assembly code:

Start:    LODD a0   /get a0 into accum.
          ADDD a1   /accum. gets a0+a1
          STOD a2   /store result in a2

There are four fields in an assembly instruction:

Label.
Optional, only necessary if you need to jump to that instruction; can also be useful as a comment, e.g. above. Always followed by ':', otherwise the assembler will try to interpret it as an instruction.

Instruction mnemonic.

Operand
I.e. address of thing to suffer the operation. Either:

Comment
Comments are even more important for assembly language programs than they are for high-level language programs. Make your code maintainable by someone else!
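The four fields can be picked apart mechanically. The Python function below is a minimal sketch, assuming (as in the examples of this chapter) that a label ends with ':' and that '/' starts a comment; the name `split_fields` is invented for this illustration and is not part of any real assembler.

```python
def split_fields(line):
    """Split one assembly line into (label, mnemonic, operand, comment)."""
    code, slash, comment = line.partition("/")   # '/' starts the comment
    label = None
    if ":" in code:                              # a label is always followed by ':'
        label, _, code = code.partition(":")
        label = label.strip()
    parts = code.split()
    mnemonic = parts[0] if parts else None
    operand = parts[1] if len(parts) > 1 else None
    return label, mnemonic, operand, (comment.strip() if slash else None)

print(split_fields("Start:    LODD a0   /get a0 into accum."))
# ('Start', 'LODD', 'a0', 'get a0 into accum.')
print(split_fields("          STOD a2"))
# (None, 'STOD', 'a2', None)
```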

7.2.3 Declaring variables

For a real assembler, we would have to declare all variables and tell it where to start the main program. E.g. for the example above,

DW a0 0x10  /a0 is a WORD type, initialised to 10H
DW a1 0x22  /these are not executable instructions
DW a2      /but pseudo-instructions 
           /ie. directives to the assembler to allocate 
           /storage and associate names with that storage
           / just like you declare variables in Pascal

ORG       0x100      /start prog. at 0x100

7.2.4 If-then

7.2.4.0.1 High-level language

      if(a0==a1) a2 = a2 + 1;
      else  next instructions...

7.2.4.0.2 Assembly language

If:       LODD a0
          SUBD a1
          JZER Then
          JUMP Next
Then:     LOCO 1   /load constant 1
          ADDD a2
          STOD a2
EndThen:  JUMP Next  /Unnecessary  
Next:     ...

7.2.5 If-then-else

7.2.5.0.1 High-level language

     if(a0==a1) a2 = a2 + 1;
     else a2 = a2 + 2;
     //next instructions...

7.2.5.0.2 Assembly language

If:       LODD a0
          SUBD a1
          JZER Then
Else:     LOCO 2 
          ADDD a2
          STOD a2
          JUMP Next
Then:     LOCO 1
          ADDD a2
          STOD a2
EndThen:  JUMP Next  /Unnecessary  
Next:     ...

7.2.6 Repetition

7.2.6.0.1 High-level language

   //sum the first n-1 integers
   int sum=0;
   int i=0;
   while(i < n){
     sum= sum + i;
     i++;
   }

7.2.6.0.2 Assembly Language

init:    LOCO 0
         STOD sum
         STOD i
loop:    LODD i
         SUBD n
         JPOS loopend /          if i-n>=0,
                       / same as: if i>=n
                       / same as: if NOT i<n
         /otherwise continue with body of loop
loopbody:
         LODD sum  /load sum into accumulator
         ADDD i    /add i to it
         STOD sum  / and don't forget to store the result!
incr:    LOCO 1
         ADDD i
         STOD i
         JUMP loop
loopend: ...
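To check the loop by machine rather than by hand, here is a minimal Python sketch of an interpreter for the symbolic instructions used so far. It is an illustration under stated assumptions, not a Mac-1a simulator: it ignores the 16-bit word size, keeps variables in a dictionary rather than at numeric addresses, and models the `...` after loopend as "stop". The names `run` and `program` are invented here.

```python
def run(program, memory, max_steps=10_000):
    """Interpret a list of (label, mnemonic, operand) tuples; AC is the accumulator."""
    labels = {label: index for index, (label, _, _) in enumerate(program) if label}
    pc, ac = 0, 0
    for _ in range(max_steps):
        _, op, arg = program[pc]
        if op is None:               # placeholder for '...': stop here
            break
        pc += 1
        if op == "LODD":
            ac = memory[arg]
        elif op == "STOD":
            memory[arg] = ac
        elif op == "ADDD":
            ac += memory[arg]
        elif op == "SUBD":
            ac -= memory[arg]
        elif op == "LOCO":
            ac = int(arg)
        elif op == "JUMP":
            pc = labels[arg]
        elif op == "JZER":
            if ac == 0:
                pc = labels[arg]
        elif op == "JPOS":           # jump if AC >= 0, as in the comments above
            if ac >= 0:
                pc = labels[arg]
        else:
            raise ValueError(f"unknown mnemonic {op!r}")
    return memory

program = [
    ("init", "LOCO", "0"), (None, "STOD", "sum"), (None, "STOD", "i"),
    ("loop", "LODD", "i"), (None, "SUBD", "n"), (None, "JPOS", "loopend"),
    ("loopbody", "LODD", "sum"), (None, "ADDD", "i"), (None, "STOD", "sum"),
    ("incr", "LOCO", "1"), (None, "ADDD", "i"), (None, "STOD", "i"),
    (None, "JUMP", "loop"),
    ("loopend", None, None),
]

result = run(program, {"n": 5, "sum": 0, "i": 0})
print(result["sum"], result["i"])    # 10 5
```

With n = 5 the loop adds 0 + 1 + 2 + 3 + 4, so sum ends at 10 and i at 5.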

Exercise. Draw a flowchart representing the program above.

It is straightforward to define deterministic loops, for example,

          for(i = 0; i<n; i++){
            sum= sum + i;
          }

in terms of the while loop above. Likewise for do-until: the continuation test is merely carried out at the end of the loop, instead of at the beginning as in the while loop.

7.3 Machine Code, Memory Maps

Let us see how the If-then-else from the previous section would assemble into machine code. We use Figure 7.1 to translate.

For ease of understanding, we show the assembly language in the table. Assume that, initially, a0 contains 8, a1 contains 4 and a2 contains 2.

<-- Assembly language --->  <------- Machine code ------------>
                             Address   Instruction/Data
If:       LODD a0            0100      0500
          SUBD a1            0101      3501
          JZER Then          0102      5107
Else:     LOCO 2             0103      7002
          ADDD a2            0104      2502
          STOD a2            0105      1502
          JUMP Next          0106      610B
Then:     LOCO 1             0107      7001
          ADDD a2            0108      2502
          STOD a2            0109      1502
EndThen:  JUMP Next          010A      610B
Next:     ...                010B      ...
                             ...
a0                           0500      0008
a1                           0501      0004
a2                           0502      0002

Thus, we can identify two separate sections of memory: (a) program, (b) data. Of course, there is nothing really separating these, and there is nothing to stop assembly programmers modifying instructions as data, nor, for that matter, runaway programs attempting to execute data!
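The instruction words in the table above follow a simple pattern: a 4-bit op-code in the top hex digit and a 12-bit address or constant in the remaining three. The Python sketch below packs and unpacks such words. The op-code values are read off the machine-code tables in this chapter (JPOS as 4 is inferred from the exercises), and the names `OPCODES`, `encode` and `decode` are invented for this illustration.

```python
# Op-code values as they appear in the machine-code tables of this chapter.
OPCODES = {"LODD": 0x0, "STOD": 0x1, "ADDD": 0x2, "SUBD": 0x3,
           "JPOS": 0x4, "JZER": 0x5, "JUMP": 0x6, "LOCO": 0x7}
MNEMONICS = {code: name for name, code in OPCODES.items()}

def encode(mnemonic, operand):
    """Pack a 4-bit op-code and a 12-bit operand into one 16-bit word."""
    return (OPCODES[mnemonic] << 12) | (operand & 0xFFF)

def decode(word):
    """Unpack a 16-bit word into (mnemonic, operand)."""
    return MNEMONICS[word >> 12], word & 0xFFF

print(f"{encode('SUBD', 0x501):04X}")   # 3501
print(f"{encode('JUMP', 0x10B):04X}")   # 610B
print(decode(0x7002))                   # ('LOCO', 2)
```

Note that `decode` happily treats a data word such as 0008 as an instruction (LODD 8), which is exactly the point made above: nothing in memory distinguishes program from data.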

Actually, as shown in Figure 7.2, in a C or C++ program, we can identify four sections of memory. Exact implementation detail may differ depending on compiler, machine architecture and operating system. In some contexts, Figure 7.2 is called a memory map.

[Figure 7.2: Memory Layout of a Program - Memory Map, drawn from low memory address 0 down to the top of memory]

7.4 The Assembly Process

Assembly is the process of translating the symbolic assembly code into numeric machine code.

7.4.0.0.1 Example

Assemble (translate) the following assembly program into binary machine code. Assume a0, a1, ... at 0x500, 0x501, .... Start code at 0x100.

DW a0  /shown here for completeness
DW a1
   ....
DW a10
DW one 1
If:       LODD a0
          SUBD a1
          JZER Then
          JUMP Next
Then:     LODD a10
          ADDD one
EndThen:  JUMP Next  /Unnecessary  
Next:     .....

7.4.1 Two-Pass Assembler

This section describes how a two-pass assembly program works. We said that one assembly language instruction maps to one machine instruction; thus, it would appear natural to run through the list of assembly instructions and translate them using a table such as Figure 7.1.

This works fine until we get to JZER Then; what address is Then?

You don't know until you have assembled up to Then.

Therefore, we must have a symbol table, which is a list of (symbol, address) pairs. If a symbol has been entered in the table, then you can look up its value. Initially, the table is empty.

Also we must have an operation-code (op-code) table. It is similar to the symbol table, only it is fixed, and it relates each instruction mnemonic (LODD, STOD, ...) to the 4-bit op-code part of the instruction (0000, 0001, ...). In general, this should also contain the length of each instruction, or some means of calculating it. Mac-1a has a single fixed instruction length of one word; most machines have instruction lengths which vary according to the instruction.

7.4.2 Pass One

The main job of pass one is to build the symbol table.

  1. Insert the data variables in the symbol-table, Figure 7.3;

    Figure 7.3: Symbol Table - variables

    Symbol Value Other Information
    ------ ----- -----------------
    a0     500
    a1     501
    ... etc.
    a10    50a
    one    50b

  2. Next go through the program to be translated, and build the ILC (instruction location counter); start at 100 (we assume that all programs start at 100) and cumulatively add each instruction length. See Figure 7.4. Note: I have abbreviated some label names.

    Figure 7.4: Partial Translation, pass one

    Assembly                            Machine
    Label Oper Oper Comment        Instr. ...
    ...
    EndT: JUMP Next /Unnecessary   1   106*  6
    Next: .....                        107*

  3. Also, in this pass we might as well translate from symbolic op-code to numeric; see right-hand column of Figure 7.4;

  4. Finally, add the label symbols - marked `*' in Figure 7.4 - to the symbol table; see Figure 7.5.

Figure 7.5: Symbol Table - at end of pass one - completed

Symbol Value Other Information
------ ----- -----------------
a0     500
a1     501
... etc.
a10    50a
one    50b
If     100
Then   104
EndT   106
Next   107
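Pass one as described in steps 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration, assuming the conventions of this chapter (code from 0x100, data from 0x500, every instruction one word long); the names `pass_one`, `source` and `data_names` are invented here, and the final `Next` entry stands in for the `.....` that follows the program.

```python
source = [
    ("If",      "LODD", "a0"),
    (None,      "SUBD", "a1"),
    (None,      "JZER", "Then"),
    (None,      "JUMP", "Next"),
    ("Then",    "LODD", "a10"),
    (None,      "ADDD", "one"),
    ("EndThen", "JUMP", "Next"),
    ("Next",    None,   None),       # label only: stands for the '.....'
]

def pass_one(source, data_names, code_start=0x100, data_start=0x500):
    """Build the symbol table: data variables first, then code labels."""
    symbols = {}
    for offset, name in enumerate(data_names):
        symbols[name] = data_start + offset
    ilc = code_start                 # instruction location counter
    for label, mnemonic, _ in source:
        if label is not None:
            symbols[label] = ilc     # a label takes the current ILC value
        if mnemonic is not None:
            ilc += 1                 # every Mac-1a instruction is one word long
    return symbols

data_names = [f"a{k}" for k in range(11)] + ["one"]
table = pass_one(source, data_names)
for name in ("a10", "one", "If", "Then", "EndThen", "Next"):
    print(f"{name:8s}{table[name]:x}")
```

For a machine with variable-length instructions, the `ilc += 1` line would instead add the length found in the op-code table.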

7.4.3 Pass Two

Now the symbol table is complete and we can complete the translation by filling in the operand fields; see Figure 7.6. Also, we may require that the data area be initialised, to zero or to some appropriate default values.

Figure 7.6: Completed Assembly, pass two

Assembly                            Machine
Label Oper Oper Comment        Instr. ...
...
EndT: JUMP Next /Unnecessary   1   106   6   107
Next: .....                        107
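Pass two is then a table lookup per instruction. The Python sketch below assumes the symbol table from the end of pass one is already available (only the symbols this program uses are listed) and that the op-code values are those seen in the machine-code tables of this chapter; `pass_two`, `symbols` and `instructions` are names invented for this illustration.

```python
OPCODES = {"LODD": 0x0, "STOD": 0x1, "ADDD": 0x2, "SUBD": 0x3,
           "JPOS": 0x4, "JZER": 0x5, "JUMP": 0x6, "LOCO": 0x7}

# Symbol table as completed at the end of pass one.
symbols = {"a0": 0x500, "a1": 0x501, "a10": 0x50A, "one": 0x50B,
           "If": 0x100, "Then": 0x104, "EndThen": 0x106, "Next": 0x107}

instructions = [("LODD", "a0"), ("SUBD", "a1"), ("JZER", "Then"),
                ("JUMP", "Next"), ("LODD", "a10"), ("ADDD", "one"),
                ("JUMP", "Next")]

def pass_two(instructions, symbols, code_start=0x100):
    """Fill in operand fields: word = op-code (4 bits) + address (12 bits)."""
    words = []
    for offset, (mnemonic, operand) in enumerate(instructions):
        word = (OPCODES[mnemonic] << 12) | symbols[operand]
        words.append((code_start + offset, word))
    return words

machine_code = pass_two(instructions, symbols)
for address, word in machine_code:
    print(f"{address:03X}  {word:04X}")
```

The sketch only handles symbolic operands; an instruction such as LOCO 1, whose operand is a constant, would need a separate case.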

7.5 Manual Assembly

Here, in a slightly less formal way, is a recipe for assembling manually. We do it in a table: first write down the assembly code, then make two columns, one for the address, the other for the instruction or data. Then translate using Figure 7.1.

Do it in two passes. Leave Jump addresses blank first time round - you don't know the Jump destination addresses until after you have coded the Jump destination.

Use Hexadecimal for the machine code. If necessary, separate the binary fields in Figure 7.1 into groups of four - one for each Hex. digit, and add another column to Figure 7.1 to contain the Hex. representation of the operation code (op-code).

LABEL   ASS. INSTR.      ADDR.     BINARY INSTR.
------------------------------------------------
If:       LODD a0        100       0500
          SUBD a1        101       3501
          JZER Then      102       5(Then)
          JUMP Next      103       6(Next)

Then:     LODD a10       104       050A
          ADDD one       105       250B       

EndThen:  JUMP Next      106       6(Next)  

Next:     .....          107

Now, we know that Then = 104, Next = 107. So we can complete the assembly:

LABEL   ASS. INSTR.      ADDR.     BINARY INSTR.
------------------------------------------------
If:       LODD a0        100       0500
          SUBD a1        101       3501
          JZER Then      102       5104
          JUMP Next      103       6107

Then:     LODD a10       104       050A
          ADDD one       105       250B       

EndThen:  JUMP Next      106       6107  

Next:     .....          107

7.6 Linking and Loading

If the program above were able to work on its own, it would be possible to load it into the addresses given (0x100 to 0x107 for program and 0x500 to 0x50b for data), set the PC to 0x100, and the program would run.

Unfortunately, programs are usually made up from more than one module, and the results of each module translation - object files - must be linked together to produce an executable program.

Essentially, at the end of pass two, there will be symbols for which you do not have an address; these will refer to subprograms (see Chapter 7) that are contained in other modules. Also, the ILC will need to contain relative addresses, not absolute addresses as above.
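To illustrate why relative addresses help, here is a minimal Python sketch of relocation. It rests on assumptions that are not part of these notes: the assembler is imagined to emit code addresses relative to 0 and to flag each word whose address field must be adjusted, while data addresses are left absolute. The name `relocate` and the (word, flag) format are invented for this illustration.

```python
def relocate(words, load_address):
    """Add load_address to the 12-bit address field of every flagged word."""
    relocated = []
    for word, needs_relocation in words:
        if needs_relocation:
            opcode = word & 0xF000                 # top 4 bits stay unchanged
            address = (word & 0x0FFF) + load_address
            word = opcode | (address & 0x0FFF)
        relocated.append(word)
    return relocated

# The If-then program with jump targets relative to the start of the module.
module = [(0x0500, False), (0x3501, False), (0x5004, True), (0x6007, True),
          (0x050A, False), (0x250B, False), (0x6007, True)]

loaded = relocate(module, 0x100)
print([f"{w:04X}" for w in loaded])
# ['0500', '3501', '5104', '6107', '050A', '250B', '6107']
```

Loading the same module at a different address only changes the `load_address` argument; the module itself need not be reassembled.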

In linking and loading the main jobs to be done are:

Refer also to Figure 7.2.

We further discuss linking (along with issues surrounding compilation and interpretation) in the course on Operating Systems; see the chapter at the end on miscellaneous topics.

7.7 Exercises

In the following, unless specified, assume the executable code starts at 0x100, and a0, a1, a2, ... are at 0x500, 0x501, 0x502, ...

  1. Translate the following (separate) Java/C/C++ like instructions into Mac-1A.

    1. a10 = a0;

    2.          if( a1==a0 ) a10 = 1;
               else         a10 = 0;
      

              if( a1>a0 ) a10 = 1;
              else        a10 = 0;
      

              if( a1<a0 ) a10 = 1;
              else        a10 = 0;
      

    3. a3 = a2 - a1;

    4. a3 = a2*2;

    5. a3 = a2*3;

    6. a3 = a2*4;

    7. (more difficult) a3 = a2 * a1; (assume a2, a1 both less than 128 ).

    8. a0 = - a1;

    9. a0 = - 1;

  2. Translate the following Java like code into Mac-1A.

         int a,b,c,d,eq;
    
         a = 16;
         b = 32;
         d = 64;
         c = a + b;
         if(c==d) eq = 1;
         else     eq = 0;
    

  3. Assuming start is at 100Hex, and a0, a1, a2 at 500, 501, 502 (Hex), manually assemble the following:

    Start:    LODD a0
              ADDD a1
              STOD a2
    

  4. Manually assemble the following code.

    DW a0  (at 500Hex)
     ...
    DW a10 (at 50AHex)
    DW x       50B
    DW y       50C
    DW z       50d
    
    Start:    LODD a0   
              ADDD a1   
              STOD a2   
              JZER Zero
              LOCO 0
              STOD x
              STOD y
              LOCO 10
              STOD z
              JUMP Lab1
    Zero:     LOCO FF
              STOD x
              LOCO 1
              STOD y
              LOCO 2
              STOD z
    Lab1:     ...
    

  5. What does the code in the previous question do? Translate it into Java.

  6. Manually assemble:

    DW one 1 (one is at 50BHex)
    
    If:       LODD a0
              SUBD a1
              JZER Then
    
    Else:     LODD a10 
              ADDD one
              ADDD one    
              JUMP Next
    
    Then:     LODD a10
              ADDD one
    
    EndThen:  JUMP Next  
    Next:     .....
    

  7. Assuming that Mac-1a uses 16-bit twos complement to store signed integers, why is it impossible to 'load constant' (LOCO) minus 1, or indeed any negative number? Explain how you would, using LOCO and one other instruction, load a negative constant (say -1000 decimal) into the AC register.

  8. Note the restriction on LOCO for positive numbers as well. Explain how, using LOCO and one other instruction, you would cause the number 1000H to be loaded into the AC register.

  9. Write any program that will modify continuously its behaviour by modifying its code - i.e. it treats some of its code as 'data'.

  10. Here is a machine language program; usual layout, the program starts at 100; data variables a0, a1, ..., a10 are at 500, 501, ..., 50a, respectively.

         Addr.     Instruction
         -----     -----------
         100       0500
         101       2500
         102       2500
         103       1501
         104       3502
         105       4109
         106       0505
         107       1503
         108       610c
         109       0504
         10a       1503
         10b       610c
    

    1. Disassemble it, i.e. translate back into assembly code;

    2. What does it do? Ans:

                a1 = 3*a0;
                if(a1>=a2) a3 = a4;
                else      a3 = a5;
      
      Explain.

7.8 Self assessment questions

These are also recall type questions that appear as parts of examination questions.

  1. Referring to Figure 7.1 convert the following into assembly language:

    (a) a2 = a0 + a1;
    (b) a2 = 32;
    (c) a2 = a0 - a1;
    (d) if (a0 == a1) then a2= 1;
    (e) if (a0 == a1) then a2 = 1;
        else a2 = 2;
    (f) if (a0 >= a1) then a2= 1;
    (g) if (a0 < a1) then a2= 34;
    (h) if (a0 >= a1) then a2= 1;
        else a2= 34;
    

  2. Consider the following program.

       loco 29
       stod a1
       loco 31
       stod a2
       loco 33
       stod a3
       lodd a1
       addd a2
       stod a3
       stod a4
    

    When the program is completed: (a) What is in a3? (b) What is in a4? (c) What is in a1? (d) What is in a2?

  3. Consider the following program.

       loco 29
       stod a1
       loco 31
       stod a2
       loco 33
       stod a3
       stod a4
       lodd a2
       subd a1
       stod a3
       stod a4
    

    When the program is completed:

    (a) What is in a3? (b) What is in a4? (c) What is in a1? (d) What is in a2?

  4. Consider the following program.

       loco 29
       stod a1
       loco 31
       stod a2
       loco 0
       stod a3
       stod a4
       lodd a2
       subd a1
       jzer then
       loco 100
       stod a3
       jump end
    then:  loco 200
           stod a4
    end:   ...
    

    When the program is completed:

    (a) What is in a3? (b) What is in a4? (c) What is in a1? (d) What is in a2?

  5. (a) The following example shows an implementation of an IF ... THEN ... ELSE construct in Mac-1 assembly language. Add comments to each line of the assembly code, to explain how the assembly language performs the task, and replace the labels L1, L2, L3 with more explanatory labels (eg. THEN, ELSE, NEXT).

         IF a0>=a1 THEN a2:=0;
                   ELSE a2:=1;
           next instructions...
    
    If:       LODD a0
              SUBD a1
              JPOS L2
              JZER L2
    L1:       LOCO 1
              STOD a2
              JUMP L3
    L2:       LOCO 0
              STOD a2
    L3:     .....
    



9 January 2005