
Introduction to AVR assembler programming for beginners

Controlling sequential execution of the program

Here we discuss all commands that control the sequential execution of a program. We start with the sequence on power-up of the processor and go on with jumps, interrupts, etc.

What happens during a reset?

When the power supply of an AVR rises and the processor starts its work, the hardware triggers a reset sequence. The program counter, which counts the program steps, is set to zero. Execution always starts at this address, so the first word of our code has to be located here. But this address is not only activated during power-up:
  1. During an external reset on the reset pin a restart is executed.
  2. If the watchdog counter reaches its maximum count, a reset is initiated. A watchdog timer is an internal clock that must be reset from time to time by the program, otherwise it restarts the processor.
  3. You can call reset by a direct jump to that address (see the jump section below).
The third case is not a real reset, because registers and ports are not automatically set back to their well-defined default values. So forget that for now.

The second option, the watchdog reset, must first be enabled by the program. It is disabled by default. Enabling requires write commands to the watchdog's port. Setting the watchdog counter back to zero requires the execution of the command

    WDR

to avoid a reset.
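
The enabling step could look like the following minimal sketch. It assumes a classic AT90S-type device whose def.inc file defines the watchdog control port WDTCR and the bit names WDE and WDP0 to WDP2; other types use a different port name or a timed write sequence, so check the datasheet of your device. The label Loop and the use of R16 are only for illustration.

```asm
; Assumption: the def.inc file of an AT90S-type device is included,
; it defines WDTCR, WDE and WDP0..WDP2
    LDI R16,(1<<WDE)|(1<<WDP2)|(1<<WDP1)|(1<<WDP0) ; enable bit plus longest time-out
    OUT WDTCR,R16 ; write to the watchdog's control port
Loop:
    ; [...] the regular work of the program goes here
    WDR ; set the watchdog counter back to zero
    RJMP Loop ; and repeat forever
```

If the program ever hangs and no longer reaches the WDR, the watchdog counter runs to its maximum and restarts the processor.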

After a reset has set registers and ports to their default values, the code at address 0000 is read wordwise into the execution part of the processor and is executed. During that execution the program counter is already incremented by one and the next word of code is already read into the code fetch buffer (fetch during execution). If the executed command does not require a jump to another location in the program, the next command is executed immediately. That is why the AVRs execute extremely fast: each clock cycle executes one command (as long as no jumps occur).

The first command of an executable is always located at address 0000. To tell the compiler (assembler program) that our source code starts now and here, special directives can be placed at the beginning, before the first code in the source is written:

.CSEG
.ORG 0000

The first directive lets the compiler switch to the code section. Everything that follows is translated as code and is written to the program memory section of the processor. Another target segment would be the EEPROM section of the chip, where you can also write bytes or words to:

.ESEG

The third segment is the SRAM section of the chip:

.DSEG

The ORG directive above stands for origin and sets the address within the code segment where the assembled words go to. As our program always starts at 0x0000, the CSEG/ORG directives are trivial here; you can skip them without causing an error. We could start at 0x0100, but that makes no real sense. If you want to place a table at a certain location of the code segment, you can use ORG. If you want to set a clear sign within your code, after first defining a lot of other things with .DEF and .EQU directives, use the CSEG/ORG sequence, even though it might not be necessary.

As the first code word is always at address zero, this location is also called the reset vector. Following the reset vector, the next positions in the program space, addresses 0x0001, 0x0002 etc., are interrupt vectors. These are the positions that execution jumps to if an external or internal interrupt has been enabled and occurs. The positions of these vectors are specific for each processor type and depend on the internal hardware available (see below). The commands that react to such an interrupt have to be placed at the proper vector location. If you use interrupts, the first code, at the reset vector, must be a jump command that jumps over the other vectors. Each interrupt vector must hold a jump command to the respective interrupt service routine. The typical program sequence at the beginning looks like this:

.ORG 0000
    RJMP Start
    RJMP IntServRout1

[...] here we place the other interrupt vector commands
[...] and here is a good place for the interrupt service routines themselves
Start: ; This here is the program start
[...] Here we place our main program

The command RJMP results in a jump to the label Start:, located some lines below. Labels always start in column 1 of the source code and end with a colon (:). Labels that do not fulfil these conditions are not recognized as labels by the compiler. Missing labels result in an error message ("Undefined label"), and compilation is interrupted.


Linear program execution and branches

Program execution is always linear, if nothing changes the sequential execution. These changes are the execution of an interrupt or of branching instructions.

Branching very often depends on some condition; this is called conditional branching. As an example we assume we want to construct a 32-bit counter using registers R1 to R4. The least significant byte in R1 is incremented by one. If the register overflows during that operation (255 + 1 = 0), we have to increment R2 similarly. If R2 overflows, we have to increment R3, and so on.

Incrementing by one is done with the instruction INC. If an overflow occurs during the execution of INC R1, the zero bit in the status register is set to one (the result of the operation is zero). The carry bit in the status register, usually set by overflows, is not changed by an INC. This is not meant to confuse the beginner; the carry is simply left untouched so that it can be used for other purposes. The zero bit or zero flag is enough to detect an overflow. If no overflow occurs we can just leave the counting sequence.

If the zero bit is set, we must go on and increment the other registers. To confuse the beginner, the branching command that we have to use is not named BRNZ but BRNE (BRanch if Not Equal). A matter of taste ...

The whole count sequence of the 32-bit-counter should look like this:

    INC R1
    BRNE GoOn32
    INC R2
    BRNE GoOn32
    INC R3
    BRNE GoOn32
    INC R4
GoOn32:

So that's about it. An easy thing. The opposite condition to BRNE is BREQ or BRanch if EQual.

Which of the status bits, also called processor flags, are changed during execution of a command is listed in instruction code tables, see the List of Instructions. Similarly to the zero bit, you can use the other status bits like that:

    BRCC/BRCS ; Carry flag 0 or 1
    BRSH ; Same or higher (unsigned)
    BRLO ; Lower (unsigned)
    BRMI ; Minus
    BRPL ; Plus
    BRGE ; Greater or equal (with sign bit)
    BRLT ; Smaller (with sign bit)
    BRHC/BRHS ; Half carry flag 0 or 1
    BRTC/BRTS ; T bit 0 or 1
    BRVC/BRVS ; Two's complement overflow flag 0 or 1
    BRIE/BRID ; Interrupt enabled or disabled

to react to the different conditions. Branching always occurs if the condition is met. Don't be afraid, most of these commands are rarely used. For the beginner only Zero and Carry are relevant.
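
As a small illustration of a carry-based branch, the following sketch compares an unsigned value in R16 with the constant 10. The compare instruction CPI sets the carry flag if R16 is smaller, and BRLO (which tests the same flag as BRCS) branches in that case. The labels Below10 and Done and the choice of R16 are only assumptions for this example.

```asm
    CPI R16,10 ; compare R16 with 10, carry is set if R16 < 10 (unsigned)
    BRLO Below10 ; carry set: R16 holds 0..9
    ; [...] here R16 is 10 or above
    RJMP Done ; jump over the other case
Below10:
    ; [...] here R16 is below 10
Done:
```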


Timing during program execution

As mentioned above, the time required to execute one instruction is equal to one cycle of the processor's clock. If the processor runs at a clock frequency of 4 MHz, one instruction requires 1/4 µs or 250 ns, at 10 MHz only 100 ns. The timing is as exact as the xtal clock. If you need exact timing, an AVR is a very good solution for your problem. Note that there are a few commands that require two or more cycles, e.g. the branching instructions (if branching occurs) or the SRAM read/write sequence. See the instructions table for details.

To define exact timing there must be a way to do nothing else than delay program execution. You might use other instructions that do nothing, but it is more clever to use the No OPeration command NOP. This is the most useless instruction:

    NOP

This instruction does nothing but waste processor time. At 4 MHz clock we need just four of these instructions to waste 1 µs. There are no other hidden meanings in the NOP instruction. For a signal generator with 1 kHz we don't need to add 4000 such instructions to our source code; instead we use a software counter and some branching instructions. With these we construct a loop that is executed a certain number of times and delays execution exactly. A counter could be an 8-bit register that is decremented with the DEC instruction, e.g. like this:

    CLR R1
Count:
    DEC R1
    BRNE Count

16-bit counting can also be used to delay exactly, like this:

    LDI ZH,HIGH(65535)
    LDI ZL,LOW(65535)
Count:
    SBIW ZL,1
    BRNE Count

If you use more registers to construct nested counters you can reach any delay. And the delay is absolutely exact, even without a hardware timer.
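
A nested counter could be sketched as follows. The registers R16 and R17, the labels and the counter values 200 and 250 are assumptions chosen for illustration; the cycle counts in the comments are those of the instructions table (BRNE needs 2 cycles if it branches and 1 if it does not).

```asm
    LDI R17,200 ; outer counter, 1 cycle
Outer:
    LDI R16,250 ; inner counter, 1 cycle
Inner:
    NOP ; 1 cycle
    DEC R16 ; 1 cycle
    BRNE Inner ; 2 cycles if branching, 1 cycle on the last pass
    DEC R17 ; 1 cycle
    BRNE Outer ; 2 cycles if branching, 1 cycle on the last pass
```

The inner loop takes 249 * 4 + 3 = 999 cycles. One pass of the outer loop takes 1 + 999 + 1 + 2 = 1003 cycles, the last pass one cycle less. Together with the first LDI this adds up to 1 + 199 * 1003 + 1002 = 200,600 cycles, which is 50.15 ms at 4 MHz clock. The count only holds as long as no interrupt occurs during the loop.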



Macros and program execution

Very often you have to write identical or similar code sequences on different occasions in your source code. If you don't want to write it once and jump to it via a subroutine call, you can use a macro to avoid getting tired of writing the same sequence several times. Macros are code sequences, designed and tested once, that are inserted into the code by their macro name. As an example we assume we need to delay program execution several times by 1 µs at 4 MHz clock. Then we define a macro somewhere in the source:

.MACRO Delay1
    NOP
    NOP
    NOP
    NOP
.ENDMACRO

This definition of the macro does not yet produce any code, it is silent. Code is produced if you call that macro by its name:

[...] somewhere in the source code
    Delay1
[...] code goes on here

This results in four NOP instructions being inserted into the code at that location. An additional Delay1 inserts four additional NOP instructions.

By calling a macro by its name you can add some parameters to manipulate the produced code. But this is more than a beginner has to know about macros.
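
For the curious, a parameter could be used as in the following sketch. In the Atmel assembler the parameters of a macro are referred to as @0, @1 and so on within the macro body. The macro name LoadZ and the value 1000 are made up for this example, and ZH and ZL are assumed to be defined by the def.inc file of the device.

```asm
.MACRO LoadZ ; hypothetical macro, @0 stands for the first parameter
    LDI ZH,HIGH(@0) ; upper byte of the parameter to ZH
    LDI ZL,LOW(@0) ; lower byte of the parameter to ZL
.ENDMACRO

    LoadZ 1000 ; inserts LDI ZH,HIGH(1000) and LDI ZL,LOW(1000)
```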

If your macro has longer code sequences, or if you are short on code storage space, you should avoid the use of macros and use subroutines instead.



Subroutines

In contrast to macros, a subroutine does save program storage space. The respective sequence is stored only once in the code and can be called from any part of the code. To ensure continued execution of the sequence following the subroutine call, you need to return to the caller. For a delay of 10 cycles you need to write this subroutine:

Delay10:
    NOP
    NOP
    NOP
    RET

Subroutines always start with a label, here Delay10:, otherwise you would not be able to jump to them. Three NOPs follow and a RET instruction. If you count the necessary cycles you find just 7 cycles (3 for the NOPs, 4 for the RET). The missing 3 are for calling that routine:

[...] somewhere in the source code:
    RCALL Delay10
[...] further on with the source code

RCALL is a relative call. The call is coded as a relative jump; the relative distance from the calling routine to the subroutine is calculated by the compiler. The RET instruction jumps back to the calling routine. Note that before you use subroutine calls you must set the stack pointer (see Stack), because the return address is pushed onto the stack by the RCALL instruction.
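
Setting the stack pointer is usually done once at the start of the main program. The following is a minimal sketch, assuming that the def.inc file of the device is included and defines RAMEND (the last SRAM address) and the stack pointer ports SPH and SPL; R16 is just an arbitrary helper register.

```asm
; Assumption: the def.inc file of the device defines RAMEND, SPH and SPL
    LDI R16,HIGH(RAMEND) ; upper byte of the last SRAM address
    OUT SPH,R16 ; to the upper stack pointer (skip these two lines on types without SPH)
    LDI R16,LOW(RAMEND) ; lower byte of the last SRAM address
    OUT SPL,R16 ; to the lower stack pointer
```

With the stack starting at the end of the SRAM, it grows downwards towards lower addresses with every call or push.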

If you want to jump directly to somewhere else in the code you have to use the jump instruction:

[...] somewhere in the source code
    RJMP Delay10

[...] further on with source code

The routine that you jumped to cannot use the RET command in that case. To return to the calling location, you have to add another label there and let the called routine jump back to this label. Jumping like this is not like calling a subroutine, because you can't call this routine from different locations in the code.

RCALL and RJMP are unconditional branches. To jump to another location depending on some condition, you have to combine these with branching instructions. Conditional calling of a subroutine can best be done with the following commands. If you want to call a subroutine depending on a certain bit in a register, use the following sequence:

    SBRC R1,7 ; Skip the next instruction if bit 7 is 0
    RCALL UpLabel ; Call that subroutine

SBRC reads Skip next instruction if Bit in Register is Clear. The RCALL instruction to UpLabel: is only executed if bit 7 in register R1 is 1, because the next instruction is skipped if it is 0. If you like to call the subroutine in case this bit is 0, use the corresponding instruction SBRS. The instruction following SBRS/SBRC can be a single-word or double-word instruction; the processor knows how far it has to jump over it. Note that execution times are different then. These commands cannot be used to jump over more than one following instruction.

If you have to skip an instruction if two registers have the same value you can use the following exotic instruction:

    CPSE R1,R2 ; Compare R1 and R2, skip if equal
    RCALL SomeSubroutine ; Call SomeSubroutine

A rarely used command, forget it for the beginning.

If you like to skip the following instruction depending on a certain bit in a port, use the instructions SBIC and SBIS. That reads Skip if the Bit in I/O space is Clear (or Set), like this:

    SBIC PINB,0 ; Skip if Bit 0 on port B is 0
    RJMP ATarget ; Jump to the label ATarget

The RJMP instruction is only executed if bit 0 in port B is high. This is somewhat confusing for the beginner. Access to port bits with these instructions is limited to the lower half of the ports (port addresses 0 to 31); the upper 32 ports are not usable here.

Now another exotic application for the expert. Skip this if you are a beginner. Assume we have a bit switch with 4 switches connected to port B. Depending on the state of these 4 bits we would like to jump to 16 different locations in the code. Now we can read the port and use several branching instructions to find out where we have to jump to today. As an alternative you can write a table holding the 16 addresses, like this:

MyTab:
    RJMP Routine1
    RJMP Routine2
    [...]
    RJMP Routine16

In our code we copy the address of the table to the Z pointer register:

    LDI ZH,HIGH(MyTab)
    LDI ZL,LOW(MyTab)

and add the current state of the port B (in R16) to this address.

    ADD ZL,R16
    BRCC NoOverflow
    INC ZH
NoOverflow:

Now we can jump to this location in the table, either for calling a subroutine:

    ICALL

or as a jump with no way back:

    IJMP

The processor loads the content of the Z register pair into its program counter and continues operation there. More clever than branching over and over?
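
Put together, the whole sequence could look like the following sketch. It assumes that the four switches are connected to the four lower pins of port B, that the def.inc file of the device defines PINB, ZH and ZL, and that the routines Routine1 to Routine16 exist somewhere else in the code.

```asm
    IN R16,PINB ; read the state of the switches
    ANDI R16,0x0F ; keep only the four lower bits, result is 0..15
    LDI ZH,HIGH(MyTab) ; address of the table to Z
    LDI ZL,LOW(MyTab)
    ADD ZL,R16 ; add the switch state to the table address
    BRCC NoOverflow ; no carry, ZH stays as it is
    INC ZH ; carry to the upper byte
NoOverflow:
    IJMP ; jump to the table entry, its RJMP leads on to the routine

MyTab:
    RJMP Routine1 ; switch state 0
    RJMP Routine2 ; switch state 1
    ; [...] entries for switch states 2 to 14
    RJMP Routine16 ; switch state 15
```

Each RJMP occupies exactly one word in program memory, so adding the switch state to the table address selects the right entry directly.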


Interrupts and program execution

Very often we have to react to hardware conditions or other events. An example is a change on an input pin. You can program such a reaction by writing a loop that asks whether a change on the pin has occurred. This method is called polling; it's like running around in circles searching for new flowers. If there are no other things to do and reaction time does not matter, you can do this with the processor. If you have to detect short pulses of less than a µs duration, this method is useless. In that case you need to program an interrupt.

An interrupt is triggered by some hardware condition. The condition has to be enabled first; all hardware interrupts are disabled by default at reset time. First, the respective port bits that enable the component's interrupt ability are set. In addition, the processor has a bit in its status register that enables it to respond to the interrupts of all components, the Interrupt Enable Flag. Enabling the general response to interrupts requires the following command:

    SEI ; Set Int Enable
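
Both steps together could look like the following sketch for the external interrupt INT0. It assumes an AT90S-type device whose def.inc file defines the ports GIMSK and MCUCR and the bit names INT0 and ISC01; other types use different port names, so check the datasheet. Note that the OUT to MCUCR also clears all other bits in that port.

```asm
; Assumption: AT90S-type device, def.inc defines GIMSK, MCUCR, INT0 and ISC01
    LDI R16,(1<<ISC01) ; INT0 reacts to a falling edge on its pin
    OUT MCUCR,R16 ; write to the control port
    LDI R16,(1<<INT0) ; enable bit of the external interrupt INT0
    OUT GIMSK,R16 ; write to the interrupt mask port
    SEI ; now let the processor respond to interrupts
```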

If the interrupting condition occurs, e.g. a change on the port bit, the processor pushes the current program counter to the stack (which must be set up first! See initialization of the stack pointer in Stack). Without that the processor wouldn't be able to return to the location where the interrupt occurred (which could be any time and anywhere). After that, processing jumps to the predefined location, the interrupt vector, and executes the instructions there. Usually this is a jump instruction to the interrupt service routine somewhere in the code. The interrupt vector is a processor-specific location and depends on the hardware component and the condition that leads to the interrupt. The more hardware components and the more conditions, the more vectors. The different vectors for some of the AVR types are listed in the following table. (The first vector isn't an interrupt but the reset vector, performing no stack operation!)

Name            Int Vector Address      triggered by ...
RESET           0000    0000    0000    Hardware Reset, Power-On Reset, Watchdog Reset
INT0            0001    0001    0001    Level change on the external INT0-Pin
INT1            0002    -       0002    Level change on the external INT1-Pin
TIMER1 CAPT     0003    -       0003    Capture event on Timer 1
TIMER1 COMPA    -       -       0004    Timer1 = Compare A
TIMER1 COMPB    -       -       0005    Timer1 = Compare B
TIMER1 COMP1    0004    -       -       Timer1 = Compare 1
TIMER1 OVF      0005    -       0006    Timer1 Overflow
TIMER0 OVF      0006    0002    0007    Timer0 Overflow
SPI STC         -       -       0008    Serial transmit complete
UART RX         0007    -       0009    UART char in receive buffer available
UART UDRE       0008    -       000A    UART transmitter ran empty
UART TX         0009    -       000B    UART All sent
ANA_COMP        -       -       000C    Analog Comparator

Note that the capability to react to events is very different for the different types. The addresses are sequential, but not identical for different types. The higher a vector is in the list, the higher is its priority. If two components have an interrupt condition at the same time, the vector with the lower address wins. The lower-priority int has to wait until the higher-priority int has been served. To keep other ints from interrupting a running service routine, the processor clears the I-flag when it starts to execute an int. The service routine must re-enable this flag when it is done with its job.

There are two ways to set the I status bit again. The service routine can end with the command:

    RETI

This return from the int routine restores the I-bit after the return address has been loaded to the program counter.

The second way is to enable the I-bit by the instructions

    SEI ; Set Interrupt Enabled
    RET ; Return

This is not the same as RETI, because interrupts are enabled by an instruction of their own before the return address is loaded into the program counter. The AVR always executes the one instruction that follows SEI before it serves a pending int, so with RET placed directly behind SEI the return still completes first. But as soon as any other instruction stands between SEI and RET, a pending int starts executing before the return address is popped from the stack, and two or more nested return addresses remain on the stack. That does not necessarily cause a bug, but it is an unnecessary risk. So just use the RETI instruction to avoid this unnecessary load on the stack.

An int vector can only hold a relative jump instruction to the service routine. If a certain int is not used or undefined, we can just put a RETI instruction there, in case a spurious int occurs.

Note that larger devices have a two-word organization of the vector table. In this case the JMP instruction has to be used instead of RJMP, and a RETI instruction in the table must be followed by a NOP so that the next entry starts at the next vector address.

As further execution of lower-priority ints is blocked, all int service routines should be short. If you need to have a longer routine to serve the int, use one of the two following methods. The first is to allow ints by SEI within the service routine, whenever you're done with the most urgent tasks. Not very clever. More convenient is to perform the urgent tasks, setting a flag somewhere in a register for the slower reactions and return from the int immediately.
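
The second method could be sketched as follows. The register names rSreg and rFlag, the labels and the use of bit 0 as the event flag are assumptions for this example; both registers have to be reserved for this purpose, and SREG is assumed to be defined by the def.inc file of the device.

```asm
.DEF rSreg = R15 ; hypothetical: used only inside the service routine
.DEF rFlag = R17 ; hypothetical: event flags, shared with the main program

FlagService: ; the int vector jumps to this routine
    IN rSreg,SREG ; save the status register
    SBR rFlag,1 ; set bit 0 of the flag register: event has occurred
    OUT SREG,rSreg ; restore the status register
    RETI ; and return immediately

Loop: ; part of the main program
    SBRC rFlag,0 ; skip the call if no event is flagged
    RCALL HandleEvent ; otherwise react to the event
    RJMP Loop

HandleEvent:
    CBR rFlag,1 ; clear the flag, a single instruction that cannot be interrupted halfway
    ; [...] the slower reaction to the event goes here
    RET
```

The service routine needs only a few cycles, so other ints are blocked for a very short time, while the time-consuming part runs in the main program with ints enabled.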

A very strict rule for int service routines is: the first thing to do is to save the status register on the stack, before you use instructions that might change flags in the status register. The interrupted main program might just be in a state where it uses a flag for a branch decision, and the int would change that flag to another state. Funny things would happen from time to time. One of the last steps before the RETI therefore is to pop the status register content from the stack and restore the content it had prior to the int.

For the same reason all registers used in a service routine should either be exclusively reserved for that purpose or be saved on the stack and restored at the end of the service routine. Never change the content of a register within an int service routine that is used somewhere else in the normal program without restoring it.

Because of these basic requirements, here is a more sophisticated example of an interrupt service routine.

.CSEG ; Code-Segment starts here
.ORG 0000 ; Address is zero
    RJMP Start ; The reset-vector on Address 0000
    RJMP IService ; 0001: first Int-Vector, INT0 service routine
[...] here other vectors

Start: ; Here the main program starts
[...] here is enough space for defining the stack and other things

IService: ; Here we start with the Interrupt-Service-Routine
    PUSH R16 ; save a register to stack
    IN R16,SREG ; read status register
    PUSH R16 ; and put on stack
[...] Here the Int-Service-Routine does something and uses R16
    POP R16 ; get previous flag register from stack
    OUT SREG,R16 ; restore old status
    POP R16 ; get previous content of R16 from the stack
    RETI ; and return from int

Looks a little bit complicated, but is a prerequisite for using ints without producing serious bugs. Skip PUSH R16 and POP R16 if you can afford reserving R16 for exclusive use in that service routine.

That's it for the beginner. There are some other things with ints, but this is enough to start with, and not to confuse you.


©2002 by