- Simple decimal, binary and hexadecimal numbers
- Larger numbers: multiple bytes
- Packed BCD
- Signed integers
- Adding
- Subtracting
- Multiply/Divide
- Binary AND, OR and EXOR

In the binary world the digits are already exhausted after the 1. So the same "add a digit to the left" method has to be applied, and binary 10 is decimal two. The next digits to the left stand for decimal four, eight, 16, etc. Every digit to the left is larger than the previous one by a factor of two, instead of by ten as in the decimal world.

That demonstrates how a binary number such as 1010,1010 can be converted to its decimal value: if the binary digit is a one, the respective decimal value is added to the result, if it is a zero nothing is added. Here the ones stand for 128, 32, 8 and 2, so the result is 128 + 32 + 8 + 2 = 170.

The 1010,1010 is a special binary number: when it is sent in serial mode, that is bit by bit one after the other, the maximum number of level changes occurs on the signal line.

The same number of level changes occurs if decimal 85 is sent. The single bits of 85 are the same as in the 170 case, but all bits are inverted (a zero becomes a one and vice versa). Another aspect can be seen here: if you shift the bits of 170 one digit to the right, insert a 0 into the now freed leftmost bit and ignore the 0 that is shifted out to the right, exactly half of 170 results, which is 85. In the decimal world shifting a number to the right (170 ==> 17) is a division by ten, in the binary world it is a division by two.

Shifting to the left is similar. In the decimal world shifting 170 to the left yields 1700, binary shifting of 1010,1010 yields 1,0101,0100, twice the value (decimal 340, as it should be). With that we have already learned multiplication and division in binary, but only by powers of two.
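
The two shifts can be tried out in AVR assembler. The following is only a sketch: the registers R16 and R17 are an arbitrary choice, and the ninth bit of the left shift does not fit into the register but lands in the carry flag C, which is explained in the adding section.

```
ldi R16,0b10101010 ; decimal 170
lsr R16 ; shift right: R16 = 0b01010101 = 85, the 0 shifted out goes to carry
ldi R17,0b10101010 ; decimal 170 again
lsl R17 ; shift left: R17 = 0b01010100 = 84, the 1 shifted out goes to carry
; carry (worth 256) plus 84 is the expected 340
```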

Simple numbers | Larger numbers | Packed BCD | Signed | Adding | Subtracting | Multiply/Divide | AND/OR/EXOR |
---|

The next larger storage unit is a byte, into which eight single bits fit. AVRs have plenty of those storage units, namely 32 registers (PICs have only one register), making up space for 256 single bits (in packages of eight, which would form a very large number if all bits were combined into one number). As the byte is the most used bit package in AVRs, we used two eight-bit numbers in the above examples.

Because writing 8 bits already is a painful exercise, the writing of binary numbers has been simplified by using hexadecimal numbers instead. For this, four bits are combined and assigned to one hexadecimal digit. The hexadecimal digits are the same as the decimal digits for 0 to 9. But because four bits can, at maximum, be 1111, six additional digits named A to F are defined: 1010 is A, 1111 is F. The above mentioned binaries 1010,1010 and 0101,0101 therefore are 0xAA and 0x55. The "0x" says that the following number is in hexadecimal format, because 55 can be in decimal or in hexadecimal format, and both are very different. 55 in hexadecimal format is (5 * 16 + 5) = 85, not decimal 55. So the 0x is really decisive.

At maximum 0xFF or 0b11111111 fits into eight bits, which is equivalent to 128+64+32+16+8+4+2+1 = 255 in decimal format.

If larger numbers have to be handled, we have to acquire additional storage space. We can store those in two (16 bits), three (24 bits), four (32 bits) or even five (40 bits) registers. With that, we would be able to handle numbers of up to about 1.1 trillion (2^40 = 1,099,511,627,776), enough for 99.9% of all non-academic applications.
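
As a small illustration, the assembler accepts all three notations for the same constant. The registers used here are chosen arbitrarily for this sketch.

```
ldi R16,0xAA ; hexadecimal notation
ldi R17,0b10101010 ; the same value in binary notation
ldi R18,170 ; the same value in decimal notation
ldi R19,0x55 ; hexadecimal 55 is decimal 85 ...
ldi R20,55 ; ... while decimal 55 is 0x37
```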

Into which of the 32 registers we store those bytes, in which order, and whether we choose an order at all, is up to the assembler coder (high-level language programmers do not have a choice here, their compiler decides). To keep track of such large numbers, we can write for a 32-bit binary:

```
.equ Number = 4294000000 ; slightly above four billion
ldi R16,Byte1(Number) ; the lowest 8 bits into register 16
ldi R17,Byte2(Number) ; Bits 9 to 16 into register 17
ldi R18,Byte3(Number) ; Bits 17 to 24 into register 18
ldi R19,Byte4(Number) ; Bits 25 to 32 into register 19
```

The four bytes now form a row, from the least significant up to the most significant one. But we could arrange those bytes in any desired manner (as long as only 8-bit instructions are concerned). Such rows help the human memory; for the controller they are of no significance: it does what it is told and cannot complain about a wrong order (a compiler can!). Such a row is often written as R19:R18:R17:R16. With that, all math with all numbers can be performed.

The second most common storage method is to store one ASCII-coded decimal digit per byte. This occurs if binaries are converted to decimal and are to be displayed on an LCD. The decimal digits 0 to 9 are the ASCII characters 48 to 57, or 0x30 to 0x39. Each digit needs one byte, so the four billion in the above example would need ten bytes (in binary form only four).

A similar method stores not the ASCII code but only the binary representation of the decimal digit in one byte. This is called Binary Coded Decimal (BCD). The upper four bits in each register are wasted and are always zero. Of the lower four bits only ten of the possible sixteen combinations are valid BCD.
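
Because the ASCII codes 0x30 to 0x39 differ from the BCD digits only in the upper nibble, converting between the two is a matter of setting or clearing those bits. A minimal sketch, assuming the digit sits in R16 (an arbitrary choice):

```
ldi R16,7 ; BCD digit 7
ori R16,0x30 ; set bits 4 and 5: R16 = 0x37, the ASCII character '7'
andi R16,0x0F ; clear the upper nibble: R16 = 7, BCD again
```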

The fact that the four upper bits are zero can be used to store an additional digit in the upper four bits.

This "packed BCD" requires half the space of BCD, in the above case five registers, and wastes less space.

Applications for BCD and packed BCD are very rare. They provide no advantages, and performing calculations with them is definitely not simpler than with binaries. So simply knowing that this is possible and exists is sufficient.
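
Packing two BCD digits into one byte could look like the following sketch. It uses the SWAP instruction, which exchanges the two nibbles of a register; the registers and the digits 8 and 9 are chosen only for illustration.

```
ldi R16,0x08 ; BCD digit 8, the tens
ldi R17,0x09 ; BCD digit 9, the ones
swap R16 ; exchange the nibbles: R16 = 0x80
or R16,R17 ; combine both digits: R16 = 0x89, packed BCD for 89
```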

If a number in a byte is a signed integer, its most significant bit (bit 7 in a single-byte number, bit 15 in a two-byte number) is the sign bit. If the number is positive, this bit is zero. If it is negative, the sign bit is one, and the number is stored differently: its absolute value is subtracted from 256 (single byte) or 65,536 (two-byte number). -1 in a single byte is therefore 0xFF, in two-byte form 0xFFFF; -2 is either 0xFE or 0xFFFE. The range of numbers that fit into 8 bits is from 0 to +127 and from -1 to -128. This different encoding of negative numbers has some advantages, as will be shown in a later chapter.
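
The AVR instruction NEG performs exactly this conversion: it replaces the content of a register by its negative in the encoding described here. A short sketch with arbitrarily chosen registers and values:

```
ldi R16,94 ; +94 = 0x5E
neg R16 ; R16 = 0xA2 = 162 = 256 - 94, which is -94
ldi R17,1 ; +1
neg R17 ; R17 = 0xFF, which is -1
```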

| | Decimal | Binary |
|---|---:|---:|
| First number | 158 | 10101010 |
| Second number | + 48 | + 110110 |
| Carry | 110 | 01111100 |
| Result | = 206 | = 11100000 |

If in decimal adding the sum of two digits reaches ten, a carry occurs to the next higher digit. 8 plus 8 yields 16: the digit result is 6 and the carry is one, to be added to the next digit sum.

The same with binary adding: if more than one of the added digits is a 1, a carry occurs. This is very often the case in the example.

```
LDI R16,0b10101010 ; the first number to register R16
LDI R17,0b110110 ; the number to be added to register R17
ADD R16,R17 ; add the second to the first and store result in register R16
```

With these three instructions we are already done. "0b" stands for a binary number. With the ADD instruction the two numbers have been added, the CPU has taken care of any carries that occurred and has written the result to R16. If the result had been larger than 255 (not the case in our example), adding the two bits 7 would have resulted in a carry. This additional information is stored in a flag register called Status REGister (SREG), in its C bit. Where exactly this bit is located in SREG is irrelevant, as we will learn below (the CPU knows where).

Using the C flag following the ADD instruction, we can decide whether a red LED shall signal the overflow to the outside world (assuming that the LED is connected to plus and is on if the port bit is low). In assembler:

```
LDI R16,0b10101010 ; the first number to register R16
LDI R17,0b110110 ; the second number to register R17
ADD R16,R17 ; add the second to the first, result into R16
BRCS RedLedOn ; Branch to the label if carry flag has been set
SBI LedPort,RedLed ; Set the portbit with the red LED high
RJMP JumpAfter ; Jump relative over the next instruction
RedLedOn:
CBI LedPort,RedLed ; Clear the portbit with the red LED, low
JumpAfter:
; In both cases the program execution is continued here
```

Here, the carry flag is used as a switch to decide between two different routes. Much more often the carry flag is used if numbers of two or more bytes are added. Each addition can end with the carry flag set, and that carry has to be added to the next higher byte.

Assume we have to add a two-byte number in the two registers R16 and R17 (written as R17:R16, the higher 8 bits in R17) and another two-byte number in R19:R18. The lower eight bits are added with ADD R16,R18. Instead of adding the upper eight bits in the same way, we use ADC R17,R19. This adds these two registers and simultaneously adds the carry flag.
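
With concrete numbers the two-byte addition described above can be sketched as follows. The values 0x01FF and 0x0001 are picked for this example because their low bytes produce a carry.

```
ldi R16,0xFF ; R17:R16 = 0x01FF (511)
ldi R17,0x01
ldi R18,0x01 ; R19:R18 = 0x0001 (1)
ldi R19,0x00
add R16,R18 ; 0xFF + 0x01 = 0x00, carry flag set
adc R17,R19 ; 0x01 + 0x00 + carry = 0x02
; R17:R16 = 0x0200 (512)
```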

No jumping and branching around, it is as simple as this. This works with numbers of any size. Here we add a two-byte number in R21:R20 to a four-byte number in R19:R18:R17:R16:

```
ADD R16,R20 ; Add the lowest byte, result to R16
ADC R17,R21 ; Add the next higher byte and carry, result to R17
LDI R20,0 ; Write a zero to R20 (LDI changes no flags, so the carry survives)
ADC R18,R20 ; Add zero and carry, result to R18
ADC R19,R20 ; Add zero and carry, result to R19
```

That was it: adding a 32-bit number and a 16-bit number is simply a matter of five instructions, or 5 µs at 1 MHz. There is no need for a 16- or 32-bit controller family just to be able to add such large numbers. It can be done with an 8-bit AVR, too.

Now we add two packed BCD numbers, 89 and 32, in the same binary way:

| | Decimal | Binary (packed BCD) |
|---|---:|---:|
| First number | 89 | 1000,1001 |
| Second number | + 32 | + 0011,0010 |
| Carry | 110 | 0,0000,0000 |
| Result | = 121 | 1011,1011 |

This is rather disappointing with packed BCD, because we do not see a carry to the next digit here, as there should be. Instead the lower and the upper nibble are illegal numbers (1011). Illegal because packed BCD digits are only valid between 0 and 9, and hexadecimal B is out of range.

We can correct this by adding packed BCD 66 (0x66) to the result:

```
LDI R18,0x66 ; Packed BCD with 66
ADD R16,R18 ; add to result
```

In binary form this results in:

| | Binary |
|---|---:|
| Result so far | 1011,1011 |
| Correction | + 0110,0110 |
| Sum | = 1,0010,0001 |
| Packed BCD digits | 1 2 1 |

Perfect, the carry and both packed BCD digits are correct now. But only in our case. If adding 6 to the lower nibble had not resulted in a carry to the upper nibble, we would have to subtract the 6 again, because the lower digit was fine before adding the 6. The same applies to the upper nibble: if adding 6 had not caused a carry, the 6 has to be subtracted again.

While the carry is signalled by the C bit in SREG, a carry from the lower four bits to the higher four bits is signalled by the H flag (half carry). If H is clear, 6 has to be subtracted. If C is clear, 0x60 has to be subtracted.

By the way: the carry flag can already be set with the initial byte adding (e.g. if 99 is added to 99). A nice branching and jumping orgy follows.
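
To give an impression of that orgy, here is one possible sketch. It is not the author's routine but a variant that adds the 0x66 before the binary addition instead of after it, so that H and C directly report the decimal carries, including the 99 plus 99 case. The register choice and the label UpperDigit are assumptions of this example, and 0x3F is the I/O address of SREG on classic AVRs.

```
ldi R16,0x89 ; first packed BCD number: 89
ldi R17,0x32 ; second packed BCD number: 32
ldi R18,0x66
add R16,R18 ; add 6 to both digits first (0x99 + 0x66 = 0xFF, cannot overflow)
add R16,R17 ; binary add, H and C now act as decimal digit carries
in R19,0x3F ; keep a copy of SREG, because SUBI changes the flags
brhs UpperDigit ; H set: the lower digit carried, its 6 stays
subi R16,0x06 ; H clear: take the 6 back from the lower digit
UpperDigit:
sbrs R19,0 ; bit 0 of the SREG copy is C, skip next instruction if set
subi R16,0x60 ; C clear: take the 6 back from the upper digit
; R16 = 0x21, the hundreds carry is in bit 0 of R19: result 121
```

With 12 plus 34 neither H nor C is set, both corrections are taken back and R16 ends up as 0x46.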

After all that it should be clear that packed BCD math is anything but simpler than binary math. Only pure software engineers are happy about that, the rest of mankind is not amused. So we forget the H flag immediately and turn to more relevant math.

Adding signed integers works with the same ADD instruction, here with +94 and -1:

| | Decimal | Binary | Hex |
|---|---:|---:|---:|
| First number | +94 | 0101,1110 | 5E |
| Second number | + -1 | 1111,1111 | FF |
| Result | +93 | 1,0101,1101 | 15D |

Perfect, nearly everything is correct. Bit 7 of the result is zero, so the result is positive. And 93 results (64+16+8+4+1). Only the carry bit is misleading, because no overflow should have occurred.

Now we add +94 and -94, which should result in +0. -94 is (256 - 94) = 162, in which the sign bit is set (>=128).

| | Decimal | Binary | Hex |
|---|---:|---:|---:|
| First number | +94 | 0101,1110 | 5E |
| Second number | + -94 | 1010,0010 | A2 |
| Result | 0 | 1,0000,0000 | 100 |

Exactly correct, if you ignore the carry flag again.

When it comes to numbers with more than one byte, SUB is replaced from the second byte upwards by the SBC instruction, which also subtracts the carry flag. Similar to adding, any underflow sets the carry bit, which then has to be subtracted from the next higher byte.

Here the decimal and the binary modes of multiplication are demonstrated. The difference between the two is not too large, except that binary multiplication requires slightly more single steps (just because the binary number has more digits). Unlike in the decimal variant, the single steps in binary are much simpler.
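
A two-byte subtraction could be sketched as follows. The registers and the values 0x0200 and 0x0001 are chosen for this example so that the low bytes produce an underflow.

```
ldi R16,0x00 ; R17:R16 = 0x0200 (512)
ldi R17,0x02
ldi R18,0x01 ; R19:R18 = 0x0001 (1)
ldi R19,0x00
sub R16,R18 ; 0x00 - 0x01 = 0xFF, carry flag set (underflow)
sbc R17,R19 ; 0x02 - 0x00 - carry = 0x01
; R17:R16 = 0x01FF (511)
```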

Decimal: 9,748 * 143

| Multiplier digit | Partial product |
|---|---:|
| 3 | 29,244 |
| 4 | + 389,920 |
| 1 | + 974,800 |
| Result | = 1,393,964 |

Binary: 10,0110,0001,0100 * 1000,1111

| Multiplier bit | Partial product |
|---|---:|
| Bit 0 = 1 | 10,0110,0001,0100 |
| Bit 1 = 1 | + 100,1100,0010,1000 |
| Bit 2 = 1 | + 1001,1000,0101,0000 |
| Bit 3 = 1 | + 1,0011,0000,1010,0000 |
| Bit 4 = 0 | + 00,0000,0000,0000,0000 |
| Bit 5 = 0 | + 000,0000,0000,0000,0000 |
| Bit 6 = 0 | + 0000,0000,0000,0000,0000 |
| Bit 7 = 1 | + 1,0011,0000,1010,0000,0000 |
| Result | = 1,0101,0100,0101,0010,1100 |

Decimal multiplication goes as follows: beginning with the last digit of the multiplier (3), the number is multiplied by this digit and added to the result. Before proceeding with the next digit (4), the number is multiplied by 10 (shifted one digit to the left). Adding all this up yields the result.

Binary is simpler. The last digit is either a one or a zero. If it is a one, the number is added as it is; if not, nothing is added. Before proceeding with the next digit, the number is multiplied by two (a left shift).

The algorithm for that is so simple that it is nearly embarrassing to write it down:

```
; Numbers in R1:R0 and R3
ldi R16,0x26 ; 0x2614 to R1:R0
mov R1,R16
ldi R16,0x14
mov R0,R16
clr R2 ; R2 is needed to shift the number left, zero at start
ldi R16,0x8F ; 0x8F to R3
mov R3,R16
; Clear result in R6:R5:R4
clr R6
clr R5
clr R4
MultLoop:
lsr R3 ; shift least significant bit to carry
brcc MultWoAdd ; if clear skip adding
add R4,R0 ; Add number to result, lowest byte
adc R5,R1 ; Add number with carry to result, second byte
adc R6,R2 ; Add number with carry to result, third byte
MultWoAdd:
lsl R0 ; Shift number left, bit 0 = 0, bit 7 to carry
rol R1 ; Roll left, bit 0 = carry, bit 7 to carry
rol R2 ; Roll left, bit 0 = carry, bit 7 to carry
tst R3 ; Multiplication complete?
brne MultLoop ; No, repeat
; Result is in R6:R5:R4, done
```

That is it. Just 20 instructions for the whole multiplication of a 16-bit by an 8-bit number. And, as the Studio says after simulation, it is done within 90 µs at 1 MHz clock. Changing to a mega or to a 16-bit controller would simply be overkill.

The Studio also says that the result (0x15452C) is correct. Once understood, it should be easy to expand the algorithm to larger or smaller sizes of numbers. It does not sound too complicated. Starting a C compiler to perform this simple task is not worth it.

Decimal: 1,393,964 : 143

| Partial division | Result digit | Subtracted | Remainder |
|---|---:|---:|---:|
| 139 : 143 | 0 | | 1,393,964 |
| 1393 : 143 | 9 | 143 * 9,000 = 1,287,000 | 106,964 |
| 1069 : 143 | 7 | 143 * 700 = 100,100 | 6,864 |
| 686 : 143 | 4 | 143 * 40 = 5,720 | 1,144 |
| 1144 : 143 | 8 | 143 * 8 = 1,144 | 0 |
| Result | = 9,748 | | |

Binary: 0001,0101,0100,0101,0010,1100 : 1000,1111

| Step | Number after shift left (<==) | Result bit | Number after subtracting 1000,1111 from the upper byte |
|---:|---|---:|---|
| 1 | 0010,1010,1000,1010,0101,1000 | 0 | smaller, not subtracted |
| 2 | 0101,0101,0001,0100,1011,0000 | 0 | smaller, not subtracted |
| 3 | 1010,1010,0010,1001,0110,0000 | 1 | 0001,1011,0010,1001,0110,0000 |
| 4 | 0011,0110,0101,0010,1100,0000 | 0 | smaller, not subtracted |
| 5 | 0110,1100,1010,0101,1000,0000 | 0 | smaller, not subtracted |
| 6 | 1101,1001,0100,1011,0000,0000 | 1 | 0100,1010,0100,1011,0000,0000 |
| 7 | 1001,0100,1001,0110,0000,0000 | 1 | 0000,0101,1001,0110,0000,0000 |
| 8 | 0000,1011,0010,1100,0000,0000 | 0 | smaller, not subtracted |
| 9 | 0001,0110,0101,1000,0000,0000 | 0 | smaller, not subtracted |
| 10 | 0010,1100,1011,0000,0000,0000 | 0 | smaller, not subtracted |
| 11 | 0101,1001,0110,0000,0000,0000 | 0 | smaller, not subtracted |
| 12 | 1011,0010,1100,0000,0000,0000 | 1 | 0010,0011,1100,0000,0000,0000 |
| 13 | 0100,0111,1000,0000,0000,0000 | 0 | smaller, not subtracted |
| 14 | 1000,1111,0000,0000,0000,0000 | 1 | 0000,0000,0000,0000,0000,0000 |
| 15 | 0000,0000,0000,0000,0000,0000 | 0 | smaller, not subtracted |
| 16 | 0000,0000,0000,0000,0000,0000 | 0 | smaller, not subtracted |

The result bits, read from step 1 to step 16, are 0010,0110,0001,0100.

As can be seen here, the number of steps to be performed is 16, but each of them is simpler in binary: just a shift left (<==), a 16-bit comparison and a 16-bit subtraction if the compared upper bytes are higher than or equal to the divisor. The result of the comparison, 0 or 1, is shifted into the 24-bit result registers. When this has been repeated 16 times, the result is complete.

To translate this into program code, here are the single operations to be performed. The first is the 32-bit left shift.

This is done with the instruction LSL (Logical Shift Left) for the lowest byte and with ROL (ROtate Left) for the higher bytes, in order to transfer the carry bit to bit 0 of the next byte. The 16-bit comparison follows.

Only the upper two bytes need to be compared. It would be sufficient to compare only the eight bits of the divisor, but under certain circumstances a one might roll into R3 (e.g. if the divisor comes near 255). So for safety reasons we perform a 16-bit comparison with zeroes in the upper byte of the divisor. The least significant byte is compared using the CP instruction, the upper 8 bits with CPC (including the carry from the lower byte comparison).

If the carry flag is set following the comparison, the divisor is larger than the upper two bytes of the number. In that case a zero has to be shifted into the result registers. If the comparison resulted in a cleared carry flag, a one is to be shifted into the result. In that case the divisor also has to be subtracted.

Again, only the upper two bytes need to be subtracted, and, again, the most significant byte together with the carry flag (SBC, SuBtract with Carry). Finally, the ones and zeroes are shifted into the result registers from right to left.

Here, in all cases the instruction ROL is used, which shifts the carry into bit 0.

This is the whole art of division. Put in order, the flow is as follows.

It all starts with setting the number and the divisor to the desired values, clearing the result registers and setting the counter to 16.

The division loop starts with left shifting the number. The comparison follows. If the carry bit is not set, subtraction has to be performed and a one goes into the result register. If it is set, subtraction is skipped and a zero goes into the result. Both cases end in result shifting.

To determine if all bits have been treated, the counter is decreased. If there are further bits to be handled, the division loop restarts. If not, the result is complete.

An interesting case arises if we increase the division steps from 16 to 24. Then the lower 8 bits of the result are a binary fraction. But this is higher math.

A further interesting case occurs if we apply a real four-byte integer (R3 not zero), or if the third byte of the number is not smaller than the divisor. In those cases the result does not fit into 16 bits and will be all ones. Smart coders catch those cases right at the beginning.

So we can start coding now. It is crucial to keep track of the used registers and their content. Therefore the code should comment what is done and why.
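
Catching those cases could look like the following sketch. It assumes the register use of the division listing (number in R3:R2:R1:R0, divisor in R4); the labels DivOverflow and DivDone and the error handling are hypothetical additions of this example.

```
tst R3 ; is the most significant byte of the number zero?
brne DivOverflow ; no: the result would not fit into 16 bits
cp R2,R4 ; compare the third byte with the divisor
brsh DivOverflow ; same or higher: the result would not fit either
; ... set up the counter and the result, then run the division loop ...
rjmp DivDone
DivOverflow:
; hypothetical error handling, e.g. report the overflow to the caller
DivDone:
```

As a side effect, a divisor of zero ends in DivOverflow as well, because no third byte is smaller than zero.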

```
; The number to be divided into R3:R2:R1:R0 (R3 should be 0)
.equ Number1 = 1393964
ldi R16,Byte1(Number1)
mov R0,R16
ldi R16,Byte2(Number1)
mov R1,R16
ldi R16,Byte3(Number1)
mov R2,R16
ldi R16,Byte4(Number1)
mov R3,R16
; The divisor into R5:R4 (R5 is zero)
.equ Number2 = 143
ldi R16,Number2
mov R4,R16
clr R5
; Number of bits to divide into R16
ldi R16,16
; Result will be in R8:R7:R6, clear
clr R8
clr R7
clr R6
DivLoop: ; The division loop
lsl R0 ; Shift number to be divided once left
rol R1
rol R2
rol R3
cp R2,R4 ; Compare LSB
cpc R3,R5 ; Compare MSB and carry
brcs DivNull ; Carry is set, result is 0, do not subtract
sub R2,R4 ; Subtract LSB
sbc R3,R5 ; Subtract MSB and carry
sec ; Result is one, into carry
rjmp DivShift ; jump over next instruction
DivNull:
clc ; Result is 0, into carry
DivShift:
rol R6 ; Rotate carry into result registers
rol R7
rol R8
dec R16 ; Still bits to divide?
brne DivLoop ; Repeat loop 15 times
```

The Studio reports an execution time of 268 µs at 1 MHz clock. This is a little longer than multiplication, but not dramatically so. And the result in R8:R7:R6 is absolutely correct.
```
ANDI R16,0x0F ; Binary AND with the lowest four bits set
```

Whatever was previously in the upper four bits, after ANDI those are all zero. If they were zero, they remain zero. If they were one, the AND with a zero always yields zero.

What is that needed for? If we have a byte with 8 bits on port A, B, C or D, and we want to clear the upper four port pins, we can apply CBI (Clear Bit in I/O) four times. With ANDI we can clear those four bits simultaneously.

```
IN R16,Portoutput ; Read current state of port output
ANDI R16,0x0F ; Clear the upper four bits
OUT Portoutput,R16 ; and write those to the port
```

Instead of 0x0F we can, of course, ANDI the register with any other bit combination.

With AND we can clear the unwanted nibble. We can combine this with the instruction SWAP, which exchanges the two nibbles of a byte in a register.
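
One use of that combination is splitting a packed BCD byte into its two digits. The following is a sketch only, with the value 0x89 and the registers chosen for illustration:

```
ldi R16,0x89 ; packed BCD 89
mov R17,R16 ; work on a copy for the upper digit
andi R16,0x0F ; clear the upper nibble: R16 = 9, the lower digit
swap R17 ; exchange the nibbles: R17 = 0x98
andi R17,0x0F ; clear the upper nibble: R17 = 8, the upper digit
```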

```
; Binary OR sets bits: a one in the mask forces the bit to one
IN R16,Portoutput ; Read the current state of the port output
ORI R16,0xF0 ; Set the upper four bits
OUT Portoutput,R16 ; and write this to the port
```

Whatever was in the upper four bits, following ORI R16,0xF0 all four are definitely one.

By combining AND and OR, every part of a byte can be set to a certain combination, leaving the other bits intact as they were.
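
A sketch of such a combination, which sets the upper four port bits to 1010 and leaves the lower four untouched. Portoutput stands for the port as in the listings of this section, and the bit pattern is chosen only for illustration.

```
IN R16,Portoutput ; Read the current state of the port output
ANDI R16,0x0F ; Clear the upper four bits, keep the lower four
ORI R16,0xA0 ; Set the upper four bits to 1010
OUT Portoutput,R16 ; and write the result to the port
```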

```
; Binary exclusive OR inverts bits: a one in the mask flips the bit
IN R16,Portoutput ; Read bits in port output
LDI R17,0x55 ; Load inversion mask, 0101,0101
EOR R16,R17 ; Invert all bits that are one in R17
OUT Portoutput,R16 ; Write inverted content to port output
```

With that, e.g., the polarity of selected port pins can be inverted, leaving the other port pins as they were before.

- This all was so highly complex and complicated that you should not bother high-level language programmers with it. They will make fun of you and throw words like "damned bit shifter" at you.
- Those who did not understand adding, subtracting, multiplication and division in school should consider trying to understand them in binary math, as they are much simpler there than in decimal. After understanding binary, it might even be possible to understand decimal.
- Anyone who is disappointed that their compiler does not accept 48-bit integer numbers should immediately change to AVR assembler. They will enjoy the extended freedom of doing binary math with 48, 56 or 64 bit long integers on an 8-bit machine, by just investing a few lines of source code.

©2017 by http://www.avr-asm-tutorial.net