An assembler is a program that accepts a symbolic language program and produces its binary machine language equivalent. The input symbolic program is called the source program and the resulting binary program is called the object program. In other words, the assembler operates on character strings and produces an equivalent binary interpretation.
Representation of Symbolic Program in Memory
- The user types the symbolic program on a terminal. A loader program is used to input the characters of the symbolic program into memory.
- In the basic computer, each character is represented by an 8-bit code. The high-order bit is always 0 and the other seven bits are as specified by ASCII.
Table: ASCII code (in Hexadecimal code) for characters
- The code for CR is produced when the return key is pressed. This causes the “carriage” to return to its initial position to start typing a new line. The assembler recognizes a CR code as the end of a line of code.
- A line of code is stored in consecutive memory locations with two characters in each location since a memory word has a capacity of 16 bits.
- A label symbol is terminated with a comma. Operation and address symbols are terminated with a space and the end of the line is recognized by the CR code.
- For example, the following line of code:
PL3, LDA SUB I
Fig: Computer Representation of the Line of Code: PL3, LDA SUB I
- The label is terminated by the code for a comma (2C). Each remaining symbol is terminated by the code for space (20), except for the last symbol, which is terminated by the code for carriage return (0D).
- If the line of code has a comment, the assembler recognizes it by the code for a slash (2F). The assembler ignores all characters in the comment field and keeps checking for a CR code. When the CR code is encountered, it replaces the space code after the last symbol in the line of code.
- This input is scanned by the assembler twice to produce the equivalent binary program. The binary program constitutes the output generated by the assembler.
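The storage scheme described above (two 8-bit character codes per 16-bit word, CR at the end of the line) can be sketched in a few lines of Python. This is only an illustration, not the loader itself: the function name `pack_line` is hypothetical, and two details are assumptions, namely that the space typed after the label's comma is not stored and that an odd number of codes is padded with a zero byte.

```python
CR = 0x0D  # carriage-return code that marks the end of a line


def pack_line(line: str) -> list[int]:
    """Pack one line of source text into 16-bit words, two characters per word."""
    # Assumption: the comma terminates the label, so the space typed after it
    # is not stored in memory.
    text = line.strip().replace(", ", ",")
    # High-order bit is always 0; the low seven bits are the ASCII code.
    codes = [ord(ch) & 0x7F for ch in text] + [CR]
    if len(codes) % 2:
        codes.append(0x00)  # assumption: pad so the last word is complete
    return [(codes[i] << 8) | codes[i + 1] for i in range(0, len(codes), 2)]


words = pack_line("PL3, LDA SUB I")
print(" ".join(f"{w:04X}" for w in words))
# prints: 504C 332C 4C44 4120 5355 4220 490D
```

The line occupies seven memory words, and the final word ends with 0D, the CR code that the assembler looks for.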
First Pass
- A two-pass assembler scans the entire symbolic program twice. During the first pass, it generates a table that correlates all user-defined address symbols with their binary equivalent values.
- The binary translation is done during the second pass.
- To keep track of the location of instructions, the assembler uses a memory word called a location counter (abbreviated LC). LC holds the address of the memory location assigned to the instruction or operand presently being processed.
- The ORG pseudo-instruction initializes the location counter to the value of the first location. Since instructions are stored in sequential locations, the content of LC is incremented by 1 after processing each line of code. To avoid ambiguity in case ORG is missing, the assembler sets the location counter to 0 initially.
Fig: Flowchart for first pass of assembler.
Table: Address Symbol Table for Above Subtraction Program
- The program defines three label symbols, MIN, SUB, and DIF, each followed by a comma (,).
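The first pass described above can be sketched roughly as follows. The function name `first_pass` is hypothetical, the ORG operand is assumed to be hexadecimal, and the program listing is an assumed subtraction-style example written for illustration (it defines the same three labels, but it is not reproduced from this section).

```python
def first_pass(lines: list[str]) -> dict[str, int]:
    """Build the address symbol table (label -> location) in one scan."""
    symbols: dict[str, int] = {}
    lc = 0  # LC starts at 0 in case ORG is missing
    for line in lines:
        code = line.split("/", 1)[0].strip()  # ignore the comment field
        if not code:
            continue  # assumption: blank lines occupy no memory
        if "," in code:
            # A label is terminated by a comma: record it with the current LC.
            label = code.split(",", 1)[0].strip()
            symbols[label] = lc
            lc += 1
            continue
        fields = code.split()
        if fields[0] == "ORG":
            lc = int(fields[1], 16)  # assumption: ORG operand is hexadecimal
        elif fields[0] == "END":
            break  # end of the first pass
        else:
            lc += 1  # ordinary line: one memory word
    return symbols


# Assumed example program, for illustration only.
program = [
    "ORG 100", "LDA SUB", "CMA", "INC", "ADD MIN", "STA DIF", "HLT",
    "MIN, DEC 83", "SUB, DEC -23", "DIF, HEX 0", "END",
]
for name, address in first_pass(program).items():
    print(f"{name} {address:03X}")
# prints:
# MIN 106
# SUB 107
# DIF 108
```

Only labelled lines add entries to the table; every other line except ORG and END simply advances LC by 1.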
Second Pass
- Machine instructions are translated during the second pass by means of table lookup procedures.
- A table lookup procedure is a search of table entries to determine whether a specific item matches one of the items stored in the table. The assembler uses four tables.
i) Pseudo-instruction table.
The entries of the pseudo-instruction table are the four symbols ORG, END, DEC, and HEX. Each entry refers the assembler to a subroutine that processes the pseudo-instruction when encountered in the program.
ii) MRI table.
The MRI table contains the seven symbols of the memory-reference instructions and their 3-bit operation code equivalents.
iii) Non-MRI table.
The non-MRI table contains the symbols for the 18 register-reference and input-output instructions and their 16-bit binary code equivalents.
iv) Address symbol table.
The address symbol table is generated during the first pass of the assembly process. The assembler searches these tables to find the symbol that it is currently processing in order to determine its binary value.
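As a rough illustration of the table lookups, the sketch below translates a single line into a 16-bit word. The function name `translate` is hypothetical, only a few non-MRI entries are listed, and the word layout (indirect bit in bit 15, operation code in bits 12 to 14, address in bits 0 to 11) is the basic computer's instruction format, which is assumed here rather than stated in this section. ORG and END produce no word and are left out.

```python
# MRI table: symbol -> 3-bit operation code
MRI = {"AND": 0, "ADD": 1, "LDA": 2, "STA": 3, "BUN": 4, "BSA": 5, "ISZ": 6}
# Non-MRI table: symbol -> full 16-bit code (only a few of the 18 entries shown)
NON_MRI = {"CLA": 0x7800, "CMA": 0x7200, "INC": 0x7020, "HLT": 0x7001}


def translate(code: str, symbols: dict[str, int]) -> int:
    """Translate one line of code into its 16-bit binary equivalent."""
    if "," in code:
        code = code.split(",", 1)[1]  # the label was handled in the first pass
    fields = code.split()
    op = fields[0]
    if op in MRI:
        address = symbols[fields[1]]  # lookup in the address symbol table
        indirect = 1 if len(fields) > 2 and fields[2] == "I" else 0
        return (indirect << 15) | (MRI[op] << 12) | (address & 0x0FFF)
    if op in NON_MRI:
        return NON_MRI[op]
    if op == "DEC":
        return int(fields[1]) & 0xFFFF  # two's complement for negative values
    if op == "HEX":
        return int(fields[1], 16) & 0xFFFF
    raise ValueError(f"unrecognized symbol: {op}")


table = {"SUB": 0x107}  # as produced by a first pass
print(f"{translate('LDA SUB', table):04X}")    # prints: 2107
print(f"{translate('LDA SUB I', table):04X}")  # prints: A107
print(f"{translate('SUB, DEC -23', table):04X}")  # prints: FFE9
```

A dictionary lookup stands in for the search of table entries; an assembler running on the basic computer would compare the stored character codes against each table entry in turn.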