What kinds of instructions does assembly language have? Programming: Assembly language


By purpose, the commands fall into the following groups (mnemonic opcodes of an IBM PC assembler are given in parentheses as examples):

● arithmetic operations (ADD and ADC - addition and addition with carry, SUB and SBB - subtraction and subtraction with borrow, MUL and IMUL - unsigned and signed multiplication, DIV and IDIV - unsigned and signed division, CMP - comparison, etc.);

● logical operations (OR, AND, NOT, XOR, TEST, etc.);

● data transfer (MOV - move, XCHG - exchange, IN - input to the microprocessor, OUT - output from the microprocessor, etc.);

● transfer of control (program branches: JMP - unconditional jump, CALL - procedure call, RET - return from a procedure, J* - conditional jumps, LOOP - loop control, etc.);

● character string processing (MOVS - move, CMPS - compare, LODS - load, SCAS - scan; these commands are usually used with the repetition prefix REP);

● program interrupts (INT - software interrupt, INTO - conditional interrupt on overflow, IRET - return from interrupt);

● microprocessor control (ST* and CL* - set and clear flags, HLT - halt, WAIT - wait, NOP - no operation, etc.).

A complete list of assembler commands can be found in the literature.

Data transfer commands

● MOV dst, src - data transfer (move from src to dst).

Transfers one byte (if src and dst are bytes) or one word (if src and dst are words) between registers or between a register and memory, and can also write an immediate value to a register or to memory.

The operands dst and src must have the same size: byte or word.

Src can be of type r (register), m (memory), or i (immediate value). Dst can be of type r or m. The following operand combinations are not allowed in one command: rsegm (segment register) together with i; two operands of type m; two operands of type rsegm. Operand i can also be a simple expression:

mov AX, (152 + 101B) / 15

The expression is evaluated only at translation time. The flags do not change.
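As a rough illustration of translation-time evaluation, the Python sketch below folds the constant expression from the example into a single immediate value, as a translator might before emitting any code. The helper names `parse_literal` and `fold_immediate` are hypothetical. The sketch handles only the `B` and `H` suffixes, the four arithmetic operators, and parentheses, and it assumes non-negative operands so that Python's floor division matches truncating division.

```python
import re

def parse_literal(token: str) -> int:
    """Convert an assembler numeric literal (e.g. 101B, 0FFH, 152) to an int."""
    t = token.upper()
    if t.endswith("B"):
        return int(t[:-1], 2)
    if t.endswith("H"):
        return int(t[:-1], 16)
    return int(t, 10)

def fold_immediate(expr: str) -> int:
    """Evaluate a constant expression before run time (hypothetical helper)."""
    tokens = re.findall(r"[0-9][0-9A-Fa-f]*[BbHh]?|[-+*/()]", expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        tok = take()
        if tok == "(":
            value = expression()
            take()  # consume the closing parenthesis
            return value
        return parse_literal(tok)

    def term():
        value = factor()
        while peek() in ("*", "/"):
            op = take()
            rhs = factor()
            value = value * rhs if op == "*" else value // rhs
        return value

    def expression():
        value = term()
        while peek() in ("+", "-"):
            op = take()
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    return expression() & 0xFFFF  # the immediate must fit the 16-bit AX register

immediate = fold_immediate("(152 + 101B) / 15")
```

Here 101B is binary 5, so the expression reduces to 157 / 15, and only the folded constant would appear in the machine instruction.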

● PUSH src - push a word onto the stack. Places the contents of src on the top of the stack; src can be any 16-bit register (including a segment register) or two memory locations containing a 16-bit word. The flags do not change;

● POP dst - pop a word from the stack. Removes a word from the top of the stack and places it in dst, which can be any 16-bit register (including a segment register) or two memory locations. The flags do not change.
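The behaviour of PUSH and POP can be sketched with a toy model in Python. The class `Stack16` and its field names are hypothetical. The model assumes the x86 convention that the stack grows toward lower addresses, the stack pointer moves by two bytes per word, and the low byte is stored at the lower address.

```python
class Stack16:
    """Toy model of a 16-bit stack that grows toward lower addresses."""

    def __init__(self, size: int = 16):
        self.memory = bytearray(size)
        self.sp = size  # stack pointer starts just past the end (empty stack)

    def push(self, word: int) -> None:
        self.sp -= 2                                    # reserve two bytes
        self.memory[self.sp] = word & 0xFF              # low byte at the lower address
        self.memory[self.sp + 1] = (word >> 8) & 0xFF   # high byte above it

    def pop(self) -> int:
        word = self.memory[self.sp] | (self.memory[self.sp + 1] << 8)
        self.sp += 2                                    # release the two bytes
        return word

stack = Stack16()
stack.push(0x1234)    # like PUSH AX with AX = 1234h
stack.push(0xABCD)    # like PUSH BX with BX = ABCDh
first = stack.pop()   # the word pushed last comes out first
second = stack.pop()
```

Popping in the same order as pushing therefore exchanges the two values, which is a common idiom for swapping registers through the stack.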

Course work

Discipline: "System Programming"

Topic No. 4: "Problem solving with procedures"

Option 2

EAST SIBERIA STATE UNIVERSITY OF TECHNOLOGY AND MANAGEMENT

TECHNOLOGICAL COLLEGE

ASSIGNMENT

for the course work

Discipline:
Topic: Problem solving with procedures
Author: Glavinskaya Arina Alexandrovna
Supervisor: Sesegma Viktorovna Dambaeva
Brief summary of the work: a study of subroutines in assembly language and problem solving using subroutines
1. Theoretical part: basic information about assembly language (command set, etc.), organization of subroutines, ways of passing parameters to subroutines
2. Practical part: develop two subroutines, one of which converts any given letter to uppercase (including Russian letters), and the other converts a letter to lowercase.
Project schedule:
1. Theoretical part - 30% by week 7.
2. Practical part - 70% by week 11.
3. Defense - 100% by week 14.
Design requirements:
1. The explanatory note for the course project must be submitted in electronic and hard copy.
2. The report must be at least 20 typewritten pages long, excluding appendices.
3. The explanatory note is drawn up in accordance with GOST 7.32-91 and signed by the supervisor.

Supervisor __________________

Author __________________

Date of issue: September 26, 2017


Introduction

1.1 Basic information about the assembly language

1.1.1 Command set

1.2 Organization of subroutines in assembly language

1.3 Methods for passing parameters in subroutines

1.3.1 Passing parameters through registers

1.3.2 Passing parameters through the stack

2 PRACTICAL SECTION

2.1 Statement of the problem

2.2 Description of the problem solution

2.3 Testing the program

Conclusion

References


Introduction

Programming in assembly language is known to be difficult. Many high-level languages are now available that let you write programs with much less effort. The question naturally arises: when might a programmer need to use assembly language? Currently, there are two areas where its use is justified, and often necessary.

The first is machine-dependent system programs, which usually control the various devices of a computer (such programs are called drivers). These system programs use special machine instructions that are not needed in ordinary (application) programs. Such instructions are impossible or very difficult to express in a high-level language.

The second area is the optimization of program execution. Compilers for high-level languages can produce inefficient machine code. This usually matters for computational programs in which a very small section of the code (about 3-5%, the main loop) accounts for most of the execution time. To address this, so-called multi-language programming systems can be used, which let you write parts of a program in different languages. Usually, the main part of the program is written in a high-level language (Fortran, Pascal, C, etc.), and the time-critical sections are written in assembly language. The speed of the whole program can then increase significantly. Often this is the only way to make the program produce a result in an acceptable time.

The purpose of this course work is to gain practical skills in programming in assembly language.

Objectives of the work:

1. Study the basic information about assembly language (the structure and components of an assembly program, the command format, the organization of subroutines, etc.);

2. Study the types of bit operations and the format and logic of the assembler's logical commands;

3. Solve an individual problem using subroutines in assembly language;

4. Formulate a conclusion about the work done.

1 THEORETICAL SECTION

Assembly language basics

Assembly language is a low-level programming language: a notation for machine instructions that is convenient for humans to read.

Assembly language commands correspond one-to-one to processor commands and are, in effect, a convenient symbolic notation (mnemonic code) for the commands and their arguments. Assembly language also provides basic programming abstractions: parts of a program and its data are linked through labels with symbolic names and through directives.

Assembler directives let you include blocks of data (described explicitly or read from a file) in the program; repeat a fragment a specified number of times; assemble a fragment conditionally; set the execution address of a fragment; change label values during assembly; use macro definitions with parameters; and so on.

Advantages and disadvantages

Advantages:

· a minimal amount of redundant code (fewer commands and memory accesses) and, as a consequence, greater speed and smaller program size;

· direct access to hardware: input/output ports and special processor registers;

· maximum adaptation to the target platform (use of special instructions and hardware-specific features).

Disadvantages:

· large amounts of code and a large number of additional small tasks;

· poor code readability and difficulty of maintenance (debugging, adding features);

· difficulty implementing programming paradigms and other moderately complex conventions, and difficulty of joint development;

· fewer available libraries and their low compatibility;

· non-portability to other platforms (except binary-compatible ones).

In addition to instructions, a program may contain directives: commands that are not translated directly into machine instructions but control the operation of the assembler. Their set and syntax vary considerably and depend not on the hardware platform but on the assembler used (which gives rise to dialects within the same family of architectures). Directives typically cover:

· definition of data (constants and variables);

· control of the program's layout in memory and of the output file parameters;

· setting the assembler's operating mode;

· various abstractions (that is, elements of high-level languages), from declaring procedures and functions (to simplify the procedural paradigm) to conditional constructs and loops (for the structured programming paradigm);

· macros.

Command set

Typical assembly language commands are:

· data transfer commands (mov, etc.);

· arithmetic commands (add, sub, imul, etc.);

· logical and bitwise operations (or, and, xor, shr, etc.);

· program flow control commands (jmp, loop, ret, etc.);

· interrupt call commands (sometimes classed as control commands): int;

· port input/output commands (in, out).

Microcontrollers and microcomputers also typically have commands that test a condition and branch on it, for example:

· jne - jump if not equal;

· jge - jump if greater than or equal.
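How such conditional jumps decide whether to branch can be sketched in Python. The sketch assumes x86 semantics: a preceding CMP computes the difference of its operands and sets the zero (ZF), sign (SF), and overflow (OF) flags; jne is taken when ZF is clear, and jge is taken when SF equals OF. The function names are hypothetical.

```python
def cmp16(a: int, b: int):
    """Model CMP a, b on 16-bit values: compute a - b and return (ZF, SF, OF)."""
    a &= 0xFFFF
    b &= 0xFFFF
    result = (a - b) & 0xFFFF
    zf = result == 0
    sf = bool(result & 0x8000)
    # Signed overflow: a and b differ in sign, and the result's sign differs from a's.
    of = bool((a ^ b) & (a ^ result) & 0x8000)
    return zf, sf, of

def jne_taken(flags) -> bool:
    """jne branches when the operands were not equal (ZF = 0)."""
    zf, _, _ = flags
    return not zf

def jge_taken(flags) -> bool:
    """jge branches when the signed comparison gave a >= b (SF = OF)."""
    _, sf, of = flags
    return sf == of

equal = cmp16(5, 5)
smaller = cmp16(3, 7)
overflowing = cmp16(0x7FFF, 0xFFFF)  # 32767 compared with -1
```

The last case shows why jge tests SF against OF instead of SF alone: the subtraction overflows, so the sign bit on its own would give the wrong answer.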

For a machine to carry out human commands at the hardware level, a specific sequence of actions must be specified in the language of zeros and ones. An assembler helps with this: it is a utility that translates commands into machine language. Writing a program this way is nevertheless a time-consuming and complex process, and the language is not intended for quick and simple tasks. Its main use today is writing specialized, efficient routines that directly affect how the hardware works, typically small pieces of code. It gives more direct control over the machine than, for example, Pascal or C.

Brief description of assembly languages

Programming languages are divided into two levels: low and high. Any language of the assembler "family" differs from high-level languages in that it gives the programmer full use of the computer system.

A distinctive feature of an assembler is the simplicity of the translation it performs, which sets it apart from translators for high-level languages. A well-written assembly program can run considerably faster than its high-level equivalent, and a small program does not take too long to write.

Briefly about the structure of the language

In general terms, the commands of the language correspond exactly to the commands of the processor. That is, the assembler uses mnemonic codes that are convenient for a person to write.

Unlike machine code, assembly language uses labels instead of numeric addresses to refer to memory cells. Along with the code, a program contains so-called directives. These do not affect the operation of the processor (they are not translated into machine language) but are needed by the translator itself.

Each processor family has its own assembly language, so a program is correct only for the processor it was written for.

Assembly language has several syntaxes, which are discussed in the article.

Advantages of the language

The main advantage of assembly language is that it can be used to write any program for the processor, and that the program will be very compact. If the code is very large, some of its processes are moved to RAM. Programs run quickly and reliably, provided they are written by a qualified programmer.

Drivers, operating systems, BIOS firmware, compilers, interpreters, and similar software are often written at least partly in assembly language.

With a disassembler, which translates machine code back into assembly language, you can work out how a given system routine operates even when no documentation is available. This is practical only for small programs, however; non-trivial code is quite difficult to understand.

Disadvantages of the language

The language is difficult for novice programmers (and often for professionals) to understand. Every operation must be described in detail. Because machine instructions are used directly, the likelihood of errors and the complexity of implementation increase.

Writing even the simplest program requires a qualified programmer with a sufficiently high level of knowledge. An average specialist often writes poor code.

If the platform for which a program was written changes, all commands must be rewritten by hand; this follows from the nature of the language. An assembler does not automatically adapt a program or replace any of its elements.

Language commands

As mentioned above, each processor has its own set of instructions. The simplest commands recognized by any processor type are the following:


Using directives

Programming microcontrollers in a low-level language is successful in most cases, and assembly language handles this task well. It is best suited to processors with limited resources, though it is also a good fit for 32-bit hardware. Directives often appear in the code. What are they, and what are they used for?

First, directives are not translated into machine language. They control how the assembler works. Unlike commands, directives differ not from processor to processor but from one translator to another. The main directives include the following:


Origin of the name

Where does the name "assembler" come from? It refers to a translator that converts the program text into machine code. The English word assembler means one who assembles: the program was not put together by hand; an automatic tool was used. Over time, users and specialists have blurred the distinction between the terms, and the programming language is often called assembler too, although strictly speaking the assembler is just the utility.

Because of this collective name, some people mistakenly assume that there is a single low-level language (or a standard for one). For a programmer to know which language is meant, the platform for which a given assembly language is used must be specified.

Macro facilities

Relatively recent assembly languages have macro facilities. They make a program easier to write. Instead of writing out a large block of commands each time, for example for a conditional choice, it is easier to use a macro, which lets the assembler include the appropriate commands depending on whether a condition is met.

By using the directives of the macro language, the programmer defines assembler macros. A macro may be widely used, or its functionality may amount to a single command. Macros make the code easier to work with, clearer, and more readable. Care is still needed, however: in some cases macros make matters worse.

Introduction.

The language in which the source program is written is called the input language, and the language into which it is translated for execution by the processor is called the output language. The process of converting the input language into the output language is called translation. Since processors execute programs only in binary machine language, which is not used for programming directly, every source program must be translated. Two methods of translation are known: compilation and interpretation.

In compilation, the source program is first translated completely into an equivalent program in the output language, called the object program, and then executed. This process is carried out by a special program called a compiler. A compiler whose input language is a symbolic representation of the binary machine (output) language is called an assembler.

In interpretation, each line of the source program is analyzed (interpreted) and the command it specifies is executed immediately. This is done by an interpreter program. Interpretation takes a long time. To make it more efficient, instead of processing every line separately, the interpreter first converts all command lines into symbols. The resulting sequence of symbols is then used to perform the functions of the source program.

The assembly language discussed below is implemented by compilation.

Features of the language.

The main features of assembly language:

● instead of binary codes, the language uses symbolic names, or mnemonics. For example, the mnemonic ADD is used for the addition command, SUB for subtraction, MUL for multiplication, DIV for division, and so on. Symbolic names are also used to address memory cells. To program in assembly language you need to know only the symbolic names, which the assembler translates into binary codes and addresses;

● each statement corresponds to one machine command (code); that is, there is a one-to-one correspondence between machine commands and the statements of an assembly language program;

● the language gives access to all objects and commands of the machine. High-level languages do not have this ability. For example, assembly language lets you test a bit of the flag register, while a high-level language generally does not. Note that systems programming languages (for example, C) often occupy an intermediate position: in terms of access they are closer to assembly language, but they have the syntax of a high-level language;

● assembly language is not a universal language. Each group of microprocessors has its own assembly language. High-level languages do not have this disadvantage.

Unlike a program in a high-level language, an assembly language program takes a long time to write and debug. Despite this, assembly language is widely used for the following reasons:

● a program written in assembly language is significantly smaller and runs much faster than one written in a high-level language. For some applications these characteristics are of primary importance: for example, many system programs (including compilers), programs in credit cards and cell phones, device drivers, and so on;

● some procedures require full access to the hardware, which is usually impossible in a high-level language. Examples are interrupts and interrupt handlers in operating systems and device controllers in real-time embedded systems.

In most programs, a small percentage of the code accounts for a large percentage of the execution time. Typically, 1% of the program accounts for 50% of the execution time, and 10% of the program accounts for 90%. Therefore, in practice a program is usually written using both assembly language and a high-level language.

Operator format in assembly language.

An assembly language program is a list of commands (statements), each of which occupies a separate line and contains four fields: a label field, an operation field, an operand field, and a comment field. Each field has its own column.

Label field.

Column 1 is reserved for the label field. A label is a symbolic name, or identifier, for a memory address. It is needed in order to:

● make a conditional or unconditional jump to a command;

● access the place where data is stored.

Such statements are given a label. Names are formed from (uppercase) letters of the English alphabet and digits. A name must begin with a letter and end with a colon. A label with its colon can be written on a separate line, with the opcode on the next line in column 2, which simplifies the assembler's work. Without the colon, a label could not be distinguished from an opcode when they are on separate lines.

In some versions of assembly language, colons are placed only after instruction labels, not after data labels, and the label length may be limited to 6 or 8 characters.

The label field must not contain duplicate names, since a label is associated with the address of a command. If a command or data item does not need to be referenced during program execution, the label field is left empty.

Operation code field.

This field contains the mnemonic of a command or pseudo-command (see below). The mnemonics are chosen by the language developers. In one assembly language, one mnemonic is chosen for loading a register from memory and another for storing the contents of a register in memory, while other assembly languages use the same name for both operations. The choice of mnemonic names is arbitrary, but the need for two separate machine instructions is determined by the processor architecture.

Register mnemonics also depend on the assembler version (Table 5.2.1).

Operand field.

This field contains additional information needed to perform the operation. For jump instructions, the operand field gives the address to jump to; for other instructions it gives the addresses and registers that serve as operands of the machine instruction. As an example, here are the operands that can be used for 8-bit processors:

● numeric data, written in different number systems. To indicate the number system, the constant is followed by one of the Latin letters B, Q, H, or D, for the binary, octal, hexadecimal, and decimal systems respectively (D may be omitted). If the first digit of a hexadecimal number is A, B, C, D, E, or F, a leading 0 (zero) is added in front;

● codes of the microprocessor's internal registers and of the memory cell M (sources or destinations of data), written as the letters A, B, C, D, E, H, L, M or as their addresses in any number system (for example, 10B is the address of register D in the binary system);

● identifiers: for the register pairs BC, DE, HL, the first letters B, D, H; for the pair of the accumulator and the flag register, PSW; for the program counter, PC; for the stack pointer, SP;

● labels, which give the addresses of operands or of the next instructions in conditional (when the condition is met) and unconditional jumps. For example, the operand M1 in a jump command means an unconditional jump to the command whose label field contains the identifier M1;

● expressions, which are built by combining the data described above with arithmetic and logical operators. Note that the way data space is reserved depends on the version of the language. The developers of one assembly language chose DW (Define Word) and later introduced an alternative that had been in the language of another processor family from the start. Another language version uses DC (Define Constant).
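The number-system suffixes and the leading-zero rule from the numeric-data item above can be sketched in Python. The suffix letters B, Q, H, and D are assumed here as the conventional ones for Intel-style assemblers, and `classify_operand` is a hypothetical helper. The sketch shows why a hexadecimal constant that starts with a letter needs a leading zero: without it, the token looks like a name.

```python
def classify_operand(token: str):
    """Return ("number", value) or ("name", token) for one operand token."""
    radix_by_suffix = {"B": 2, "Q": 8, "H": 16, "D": 10}  # assumed suffix letters
    text = token.upper()
    if not text[0].isdigit():
        return ("name", text)          # starts with a letter: label or register
    suffix = text[-1]
    if suffix in radix_by_suffix and len(text) > 1:
        digits, radix = text[:-1], radix_by_suffix[suffix]
    else:
        digits, radix = text, 10       # no suffix: decimal by default
    return ("number", int(digits, radix))

with_zero = classify_operand("0AH")    # hexadecimal constant ten
without_zero = classify_operand("AH")  # read as a name, not as a number
```

With the leading zero the assembler can decide from the first character alone whether a token is a number, which keeps the scanner simple.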

Processors handle operands of different lengths. Assembler developers have specified the length in different ways, for example:

● registers of different lengths have different names: EAX for 32-bit operands, AX for 16-bit operands, and AH for 8-bit operands;

● for some processors, a suffix is added to each opcode to indicate the operand type, for example ".B" for byte operands;

● different opcodes are used for operands of different lengths, for example separate opcodes for loading a byte, a halfword, and a word into a 64-bit register.

Comment field.

This field explains what the program does. Comments do not affect the operation of the program and are intended for people. They may be needed when modifying a program that, without them, can be incomprehensible even to experienced programmers. A comment begins with a start character, which may be:

● a semicolon (;) in the languages for some processors;

● an exclamation point (!) in the languages for others.

Each separate line reserved for a comment also begins with the start character.

Pseudo-commands (directives).

Two main types of commands can be distinguished in assembly language:

● basic instructions, which are equivalent to the machine code of the processor. These commands perform all the processing specified by the program;

● pseudo-commands, or directives, which serve the process of translating the program into machine code. As an example, Table 5.2.2 shows some pseudo-commands of one assembler.

In programming, the same sequence of commands often has to be repeated many times. There are several ways to handle this:

● write out the sequence of commands wherever it is needed. This approach increases the size of the program;

● package the sequence as a procedure (subroutine) and call it when needed. This has its own drawback: each use requires a procedure call instruction and a return instruction, which, for a short and frequently used sequence, can greatly reduce the speed of the program.

The simplest and most effective way to repeat a sequence of commands is to use a macro, which can be thought of as a pseudo-command that causes a group of commands frequently encountered in the program to be translated again.

A macro, or macro instruction, has three aspects: macro definition, macro call, and macro expansion.

Macro definition

A macro definition gives a name to a frequently repeated sequence of program commands; the name is then used to refer to the sequence in the program text.

A macro definition consists of three parts:

● the header, which contains the name of the macro, the MACRO pseudo-command, and the set of parameters;

● the body of the macro (a list of expressions);

● the command that ends the macro definition.

The parameter set of a macro lists all the parameters given in the operand field of the selected group of instructions. If these parameters are defined earlier in the program, they can be omitted from the macro definition header.

To assemble the selected group of instructions again, a call is used that consists of the macro name and a list of parameters with other values.

When the assembler encounters a macro definition during assembly, it stores it in the macro definition table. On each subsequent appearance of the macro's name in the program, the assembler replaces it with the body of the macro.

Using a macro name as an opcode is called a macro call, and replacing it with the body of the macro is called macro expansion.

If the program is viewed as a sequence of characters (letters, digits, spaces, punctuation, and carriage returns that start a new line), then macro expansion consists of replacing certain substrings of this sequence with other substrings.

Macro expansion occurs during assembly, not during program execution. Manipulating strings of characters in this way is the job of the macro facilities.

Assembly is carried out in two passes:

● on the first pass, all macro definitions are saved and macro calls are expanded. The source program is read and converted into a program from which all macro definitions have been removed and in which every macro call has been replaced by the macro body;

● on the second pass, the resulting macro-free program is processed.

Macros with parameters.

To handle repeated sequences of commands whose parameters can take different values, macro definitions are provided:

● with actual parameters, which are placed in the operand field of the macro call;

● with formal parameters. During macro expansion, each formal parameter that appears in the body of the macro is replaced by the corresponding actual parameter.

Using macros with parameters.

Program 1 contains two similar sequences of commands that differ only in the variables they swap: the first swaps P and Q, and the second swaps another pair. Program 2 includes a macro with two formal parameters, P1 and P2. During macro expansion, each occurrence of P1 in the macro body is replaced by the first actual parameter and each occurrence of P2 by the second actual parameter. In the macro call for the first sequence, P is the first actual parameter and Q is the second.

Program 1 (first sequence)

MOV EAX,P
MOV EBX,Q
MOV Q,EAX
MOV P,EBX

Program 2 (macro body)

MOV EAX,P1
MOV EBX,P2
MOV P2,EAX
MOV P1,EBX

Advanced features.

Consider some advanced features of the macro language.

If a macro that contains a conditional jump instruction and the label to jump to is called two or more times, the label is duplicated (the label duplication problem), which causes an error. One solution is for the programmer to pass a separate label as a parameter in each call. Alternatively, the label can be declared local, in which case the assembler automatically generates a different label each time the macro is expanded.

Macros can also be defined inside other macros. This feature is very useful in combination with conditional assembly. Consider

IF WORDSIZE GT 16 M2 MACRO

Macro M2 can be defined in both branches of the conditional statement. However, which definition is used depends on whether the program is assembled for a 16-bit or a 32-bit processor. If M1 is not called, macro M2 is not defined at all.

Another advanced feature is that macros can call other macros, including themselves (a recursive call). In the latter case, to avoid an infinite loop, the macro must pass itself a parameter that changes with each expansion, and must test this parameter and end the recursion when it reaches a certain value.

On the use of macros in assembler.

When macros are used, the assembler must perform two functions: save macro definitions and expand macro calls.

Saving macro definitions.

All macro names are stored in a table. Each name is accompanied by a pointer to the corresponding macro so that it can be retrieved when needed. Some assemblers have a separate table for macro names; others have a common table that holds the macro names together with all machine commands and directives.

When a macro definition is encountered during assembly, the assembler creates:

● a new table entry with the name of the macro, the number of parameters, and a pointer to another table, the macro definition table, where the macro body will be stored;

● a list of formal parameters.

The body of the macro, which is simply a string of characters, is then read and stored in the macro definition table. Formal parameters occurring in the macro body are marked with a special character.

The internal representation of the macro from the above example for program 2 is:

MOV EAX,&P1;MOV EBX,&P2;MOV &P2,EAX;MOV &P1,EBX;

where the semicolon is used as the carriage return character and the ampersand & marks a formal parameter.

Macro call expansion.

Whenever a macro definition is encountered during assembly, it is stored in the macro table. When a macro is called, the assembler temporarily stops reading input from the input device and starts reading the saved macro body instead. The formal parameters found in the macro body are replaced by the actual parameters supplied in the call. The ampersand & in front of the formal parameters lets the assembler recognize them.
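The storing and expanding steps described above can be sketched in Python. This is a minimal illustration, not how any particular assembler is implemented: the function names `store_macro` and `expand_macro` are hypothetical, the body is a four-instruction swap with formal parameters P1 and P2, and operands are assumed to be separated by commas with no spaces.

```python
def store_macro(formals, body_lines):
    """Encode a macro body as one string: ';' ends a line, '&' marks a formal."""
    encoded = []
    for line in body_lines:
        mnemonic, _, operands = line.partition(" ")
        marked = [("&" + op) if op in formals else op
                  for op in operands.split(",")]
        encoded.append(mnemonic + " " + ",".join(marked) + ";")
    return "".join(encoded)

def expand_macro(stored, formals, actuals):
    """Replace each &formal with its actual parameter and split into lines."""
    text = stored
    # Substitute longer names first so one formal is never a prefix of a pending one.
    for formal, actual in sorted(zip(formals, actuals),
                                 key=lambda pair: -len(pair[0])):
        text = text.replace("&" + formal, actual)
    return [line for line in text.split(";") if line]

formals = ["P1", "P2"]
body = ["MOV EAX,P1", "MOV EBX,P2", "MOV P2,EAX", "MOV P1,EBX"]
stored = store_macro(formals, body)                   # internal representation
expanded = expand_macro(stored, formals, ["P", "Q"])  # a call with actuals P, Q
```

Marking the formals once, when the definition is stored, means each later expansion is a plain string substitution and the body never has to be parsed again.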

Although there are many versions of assembler, assembly processes have common features and are similar in many ways. The work of a two-pass assembler is considered below.

Two-pass assembler.

A program consists of a number of statements. Therefore, it would seem that the following sequence of actions can be used during assembly:

● read a statement;

● translate it into machine language;

● transfer the resulting machine code to a file, and the corresponding part of the listing to another file;

● repeat the above steps until the entire program has been translated.

However, this approach does not work in general. An example is the so-called forward reference problem. If the first statement is a jump to statement P at the very end of the program, the assembler cannot translate it: it must first determine the address of statement P, and for this it has to read the entire program. Each complete reading of the source program is called a pass. The forward reference problem can be solved with two passes in one of two ways:

● on the first pass, collect and store all symbol definitions (including labels) in a table, and on the second pass, read and assemble each statement. This method is relatively simple, but the second pass through the source program requires additional I/O time;

● on the first pass, convert the program into an intermediate form and save it in a table, and perform the second pass over the table rather than over the source program. This method saves time, since no I/O operations are performed on the second pass.
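The first of these two approaches can be sketched in Python on a three-statement program with a forward jump. The sketch makes simplifying assumptions: labels end with a colon, every statement occupies exactly one address unit, and only JMP takes a symbolic operand. The function names are hypothetical.

```python
def first_pass(lines):
    """Collect label -> address for every labelled statement."""
    symbols = {}
    for address, line in enumerate(lines):
        if ":" in line:
            label = line.split(":", 1)[0].strip()
            symbols[label] = address
    return symbols


def second_pass(lines, symbols):
    """Re-read the statements and replace symbolic jump targets with addresses."""
    output = []
    for line in lines:
        statement = line.split(":", 1)[-1].strip()
        mnemonic, _, operand = statement.partition(" ")
        operand = operand.strip()
        if mnemonic == "JMP":
            operand = str(symbols[operand])   # known only after the first pass
        output.append((mnemonic + " " + operand).strip())
    return output


program = ["JMP P", "NOP", "P: HLT"]
symbols = first_pass(program)
resolved = second_pass(program, symbols)
```

When the first statement is read, P has not been seen yet; only after the whole program has been scanned does the table contain its address, which is why the second pass can translate the jump.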

First pass.

The purpose of the first pass is to build a symbol table. As noted above, another goal of the first pass is to save all macro definitions and expand the calls as they appear. Therefore, both symbol definition and macro expansion occur in the same pass. A symbol can be either a label or a value that is assigned a specific name by a directive (for example, EQU in MASM-style assemblers):

;Value - buffer size

By assigning values to the symbolic names in the label field of instructions, the assembler essentially determines the address that each instruction will have during program execution. To do this, the assembler maintains an instruction address counter as a special variable during assembly. At the beginning of the first pass, this variable is set to 0, and after each command is processed it is incremented by the length of that command. As an example, Table 5.2.3 shows a fragment of a program with the command lengths and counter values. During the first pass, tables of symbol names, directives, and operation codes are generated and, if necessary, a literal table. A literal is a constant for which the assembler automatically reserves memory. Note that modern processors have instructions with immediate operands, so their assemblers do not support literals.
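How the instruction address counter assigns values to labels can be sketched in Python. The statements and their lengths below are assumed for illustration and are not taken from Table 5.2.3; assign_addresses is a hypothetical name.

```python
def assign_addresses(statements):
    """statements: list of (label or None, mnemonic, length in bytes)."""
    counter = 0                     # set to 0 at the start of the first pass
    symbol_table = {}
    listing = []
    for label, mnemonic, length in statements:
        if label is not None:
            symbol_table[label] = counter   # the label gets the current counter value
        listing.append((counter, mnemonic, length))
        counter += length           # advance by the length of this command
    return symbol_table, listing


fragment = [("START", "MOV", 5), (None, "ADD", 2), ("DONE", "JMP", 2)]
symbol_table, listing = assign_addresses(fragment)
```

Each label ends up with the sum of the lengths of all commands that precede it.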

Symbol table.

The symbol table contains one element for each name (Table 5.2.4). Each element of the symbol table contains the name itself (or a pointer to it), its numerical value, and sometimes additional information, which may include:

● the length of the data field associated with the symbol;

● relocation bits (which indicate whether the value of a symbol changes if the program is loaded at a different address than the assembler assumed);

● information about whether the symbol can be accessed from outside the procedure.
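One possible shape for such a symbol table element is sketched below in Python. The field names and the relocated_values helper are assumptions chosen to mirror the list above, not the layout of any particular assembler.

```python
from dataclasses import dataclass


@dataclass
class SymbolEntry:
    name: str
    value: int
    length: int = 0            # length of the data field associated with the symbol
    relocatable: bool = False  # does the value change with the load address?
    is_global: bool = False    # can the symbol be accessed from outside the procedure?


symbol_table = {}


def add_symbol(entry):
    symbol_table[entry.name] = entry


def relocated_values(load_address):
    """Symbol values as they would be if the program were loaded at load_address."""
    return {
        name: (entry.value + load_address) if entry.relocatable else entry.value
        for name, entry in symbol_table.items()
    }


add_symbol(SymbolEntry("BUFFER", 16, length=64, relocatable=True))
add_symbol(SymbolEntry("BUFSIZE", 64))
values = relocated_values(0x1000)
```

Here BUFFER is an address and moves with the program, while BUFSIZE is a constant and keeps its value.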

Symbolic names are labels. They can be specified using operators (for example,

Table of directives.

This table lists all the directives, or pseudo-commands, that occur when assembling a program.

Operation code table.

For each opcode, the table has separate columns: opcode designation, operand 1, operand 2, hexadecimal value of the opcode, instruction length and instruction type (Table 5.2.5). Operation codes are divided into groups depending on the number and type of operands. The command type determines the group number and specifies the procedure that is called to process all commands in that group.
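A table of this kind, with the command type selecting the processing procedure, can be sketched in Python. The opcode values and lengths are the standard x86 encodings of these four instructions; the group numbers, handler names, and the "immed32" operand tag are assumptions for illustration and are not taken from Table 5.2.5.

```python
# (mnemonic, operand 1, operand 2) -> (opcode, instruction length, group)
OPCODE_TABLE = {
    ("AAA", None, None): (0x37, 1, 1),
    ("NOP", None, None): (0x90, 1, 1),
    ("ADD", "EAX", "immed32"): (0x05, 5, 2),
    ("AND", "EAX", "immed32"): (0x25, 5, 2),
}


def encode_no_operands(opcode, immediate):
    return bytes([opcode])


def encode_eax_immediate(opcode, immediate):
    # Opcode byte followed by the 32-bit immediate, least significant byte first.
    return bytes([opcode]) + (immediate & 0xFFFFFFFF).to_bytes(4, "little")


# The group number selects the procedure that handles all commands of that group.
GROUP_HANDLERS = {1: encode_no_operands, 2: encode_eax_immediate}


def encode(mnemonic, operand1=None, operand2=None):
    kind2 = "immed32" if isinstance(operand2, int) else operand2
    opcode, length, group = OPCODE_TABLE[(mnemonic, operand1, kind2)]
    code = GROUP_HANDLERS[group](opcode, operand2)
    if len(code) != length:
        raise ValueError("encoded length does not match the table")
    return code
```

For example, encode("ADD", "EAX", 1) looks up the ADD EAX, immed32 row and passes it to the handler for group 2.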

Second pass.

The purpose of the second pass is to create an object program and, if necessary, print an assembly listing, as well as to output the information the linker needs to link procedures that were assembled at different times into one executable file.

In the second pass (as in the first), the lines containing the statements are read and processed one after the other. The source statement and the object code derived from it (in hexadecimal) can be printed or buffered for later printing. After the instruction address counter is updated, the next statement is read.

The original program may contain errors, for example:

● the given symbol is not defined or is defined more than once;

● the opcode is represented by an invalid name (due to a typo), is given too few operands, or has too many operands;

● no operator

Some assemblers may detect an undefined symbol and replace it. However, in most cases, when a statement with an error is found, the assembler displays an error message on the screen and tries to continue the assembly process.
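Checks of this kind, with assembly continuing after each error, can be sketched in Python. The statement format, the register list, and the function name check_program are assumptions made for illustration.

```python
REGISTERS = {"EAX", "EBX", "ECX", "EDX", "ESI", "EDI"}  # assumed subset


def check_program(statements, known_opcodes):
    """statements: list of (label or None, mnemonic, operand list).
    known_opcodes: mnemonic -> expected number of operands.
    Returns all error messages instead of stopping at the first one."""
    errors = []
    defined = {}
    for number, (label, _, _) in enumerate(statements, 1):
        if label is None:
            continue
        if label in defined:
            errors.append(f"line {number}: symbol {label} defined more than once")
        else:
            defined[label] = number
    for number, (_, mnemonic, operands) in enumerate(statements, 1):
        if mnemonic not in known_opcodes:
            errors.append(f"line {number}: unknown opcode {mnemonic}")
            continue
        expected = known_opcodes[mnemonic]
        if len(operands) < expected:
            errors.append(f"line {number}: {mnemonic} has too few operands")
        elif len(operands) > expected:
            errors.append(f"line {number}: {mnemonic} has too many operands")
        for operand in operands:
            # Anything that starts with a letter and is not a register is a symbol.
            is_symbol = operand[:1].isalpha() and operand.upper() not in REGISTERS
            if is_symbol and operand not in defined:
                errors.append(f"line {number}: undefined symbol {operand}")
    return errors


source = [
    ("L1", "MOV", ["EAX", "1"]),
    ("L1", "JMP", ["L2"]),
    (None, "MOVE", ["EAX", "EBX"]),
    (None, "ADD", ["EAX"]),
]
errors = check_program(source, {"MOV": 2, "JMP": 1, "ADD": 2})
```

Collecting the messages in a list models an assembler that reports an error and then tries to continue.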

Articles dedicated to the assembly language.

General information about assembly language

The symbolic assembly language makes it possible to largely eliminate the shortcomings of machine language programming.

Its main advantage is that in assembly language all program elements are represented in symbolic form. The transformation of symbolic command names into their binary codes is the responsibility of a special program, the assembler, which frees the programmer from laborious work and eliminates the errors that are otherwise inevitable.

Symbolic names introduced when programming in assembly language, as a rule, reflect the semantics of the program, and command mnemonics reflect the main function of each command. For example: PARAM - parameter, TABLE - table, MASK - mask, ADD - addition, SUB - subtraction, etc. Such names are easily remembered by the programmer.

Programming in assembly language requires more complex tools than programming in machine language: you need computer systems based on microcomputers or PCs with a set of peripherals (alphanumeric keyboard, character display, floppy disk drive, and printer), as well as resident or cross-programming systems for the required types of microprocessors. Assembly language allows you to efficiently write and debug much more complex programs (up to 1 - 4 KB) than machine language does.

Assembly languages are machine-oriented, that is, dependent on the machine language and structure of the corresponding microprocessor, since they assign a specific symbolic name to each microprocessor instruction.

Assembly languages provide a significant increase in the productivity of programmers compared to machine languages and at the same time retain the ability to use all the software-accessible hardware resources of the microprocessor. This enables skilled programmers to write programs that run in a shorter time and take up less memory than programs written in a high-level language.

In this regard, almost all I/O device control programs (drivers) are written in assembly language, despite the presence of a fairly large range of high-level languages.

Using assembly language, the programmer can set the following parameters:

· the mnemonic (symbolic name) of each command of the microprocessor's machine language;

· the standard format for lines of a program written in assembly language;

· the format for specifying the various addressing modes and command variants;

· the format for specifying character constants and integer constants in various number systems;

· pseudo-commands that control the process of assembly (translation) of the program.

In assembly language, the program is written line by line, i.e., one line is allocated for each instruction.

For microcomputers built on the most common types of microprocessors, there may be several variants of assembly language; however, usually only one of them is widely used in practice: the so-called standard assembly language.

Programming at the level of machine instructions is the minimum level at which programming is possible. The system of machine instructions must be sufficient to implement the required actions by issuing instructions to the computer hardware.

Each machine instruction consists of two parts:

· the operation part, which determines "what to do";

· the operand part, which defines the objects to be processed, "what to do it with".

The machine instruction of the microprocessor, written in assembly language, is a single line with the following syntactic form:

label command/directive operand(s) ;comments

The only required field in a line is the command or directive.

The label, command/directive, and operands (if any) are separated by at least one space or tab character.

If a command or directive needs to be continued on the next line, then the backslash character is used: \.

By default, assembly language does not distinguish between uppercase and lowercase letters in commands or directives.
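Splitting a source line into these fields can be sketched in Python. The sketch assumes a small set of known commands to tell a label from a command, ignores line continuation with the backslash, and does not handle semicolons inside character constants; parse_line and KNOWN_COMMANDS are hypothetical names.

```python
KNOWN_COMMANDS = {"MOV", "ADD", "SUB", "JMP", "DB", "DW", "DD"}  # assumed subset


def parse_line(line):
    """Split a source line into (label, command, operands, comment)."""
    code, _, comment = line.partition(";")    # everything after ';' is a comment
    words = code.split()                      # fields are separated by spaces or tabs
    label = None
    if words and words[0].rstrip(":").upper() not in KNOWN_COMMANDS:
        label = words.pop(0).rstrip(":")      # first field is not a command: a label
    # Commands are compared in upper case, so mov and MOV are the same command.
    command = words.pop(0).upper() if words else None
    operands = [op.strip() for op in " ".join(words).split(",") if op.strip()]
    return label, command, operands, comment.strip()


parsed = parse_line("start: mov eax, sum ; eax = sum")
```

A line without a label, such as "mov eax, [esi]", yields None in the label position.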

Direct addressing: the effective address is determined directly by the offset field of the machine instruction, which can be 8, 16, or 32 bits in size.

mov eax, sum ; eax = sum

The assembler replaces sum with the corresponding address stored in the data segment (by default, addressed by register ds) and places the value stored at address sum in register eax.

Indirect addressing, in turn, has the following types:

· indirect base (register) addressing;

· indirect base (register) addressing with offset;

· indirect index addressing;

· indirect base-index addressing.

Indirect base (register) addressing. With this addressing mode, the effective address of the operand can be in any of the general-purpose registers except sp / esp and bp / ebp (these registers are dedicated to working with the stack segment). Syntactically, this addressing mode is expressed by enclosing the register name in square brackets.

mov eax, [esi] ; eax = *esi, where *esi is the value at the address held in esi