
Text ticker with 16*8 LEDs and an ATtiny- or an ATmega-Controller

6 Assembler software algorithms for the ticker

I will not publish complete assembler software, because there are too many possible versions. Instead, I describe here all the elements that are necessary to compose the software:
  1. Running texts stored in flash and their selection,
  2. organizing the text display in SRAM,
  3. sending the text patterns to the LEDs,
  4. regulating the display speed,
  5. regulating the brightness of the display,
  6. for clocks: the second pulse.

6.1 Running texts in flash and their selection

[Figure: An A as running text]

Texts stored in flash memory consist of a sequence of bytes that determine the state of the 128 LEDs. The "A" in the figure serves as an example.

The first byte of the A has six bits set and two bits cleared. Whether we assign the bits starting with the topmost LED or with the lowest LED is insignificant: the hardware design determines that.

Here we start with the lowest LED in the column as the least significant bit. That means we have to shift the most significant bit into the carry flag first; it is destined for bit 7 of the 4094 shift register (that is, Q8 of the chip). If our hardware were wired the other way round, we would have to shift the least significant bit first. The only difference would be an "LSR" instead of an "LSL" instruction.

After shifting the eight bits into the shift register 4094-1 the bit order is correct: the most significant bit of the A ends up at Q8 of the shift register, the least significant one at Q1.

The first byte of the A in the chosen bit order is binary 0b00111111 or decimal 63. In the second byte of the A only the 4-bit and the 64-bit are set, which is decimal 68. The complete A in the flash memory looks like this:

  .db 63,68,132,68,63

The assembler issues a warning for that line: the number of bytes in the line is odd, so the assembler appends a zero byte. That zero is a blank column and serves as the space between the A and the next character to be displayed. With a 5-by-8 font, the zero is fine. With a 6-by-8 font, place two characters in one .db line to avoid assembler-added zeroes.

If the text ticker runs from right to left, we have to send the bytes of the A in the order listed above.

Detecting that the text has reached its end, so that the output has to restart from the beginning, is not that easy. We cannot use zero as an end marker because we need the zero for blank columns in between the characters. So either we use a pointer to the end of the text line, or we use a unique pattern of bytes to detect the end. That pattern could be, e. g., five times 0xAA in a row. So whenever we see a 0xAA we have to read the next four bytes, and if those are 0xAA as well we end the text output and restart from the beginning.
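
The look-ahead described above could be sketched as follows. This is only a minimal sketch, not the author's routine: the device include file, the register names rmp and rCnt, the labels NextColumn and Text and the use of X as scratch storage are all assumptions made for illustration. Z is expected to hold the byte address of the next column (initially 2*Text).

```asm
.nolist
.include "tn24def.inc"   ; assumed device, adjust to your controller
.list

.def rmp  = r16          ; multi-purpose register (name assumed)
.def rCnt = r17          ; look-ahead counter (name assumed)

.cseg
; Z holds the byte address of the next column in flash.
; Returns that column in rmp, or the first column of the text
; if five 0xAA bytes start at the current position.
NextColumn:
  movw XH:XL, ZH:ZL      ; remember the current position
  ldi rCnt, 5            ; the end pattern is five bytes long
ChkMark:
  lpm rmp, Z+            ; read ahead
  cpi rmp, 0xAA
  brne NoMark            ; not the end pattern
  dec rCnt
  brne ChkMark
  ldi ZL, low(2*Text)    ; five times 0xAA found: restart the text
  ldi ZH, high(2*Text)
  lpm rmp, Z+
  ret
NoMark:
  movw ZH:ZL, XH:XL      ; go back and read the column normally
  lpm rmp, Z+
  ret

Text:
  .db 63,68,132,68,63,0            ; "A" plus one blank column
  .db 0xAA,0xAA,0xAA,0xAA,0xAA,0   ; end pattern, padded to an even length
```

A single 0xAA column inside the text is still displayed, because the routine returns to the remembered position whenever fewer than five 0xAA bytes follow each other.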

Because writing the byte rows by hand would be rather tedious, I have written a piece of software that does it more conveniently (see the design software description).

6.1.1 Selecting text with a potentiometer voltage

In all versions that have a text selection potentiometer attached to one of the ADC input pins you can select from different stored text sequences. To enable that feature you'll have to do the following:

6.2 Organizing the text display in SRAM

If you want to display not only static texts but also some that change with time (e. g. a time and date display), you'll need to place the display information in SRAM and compose your display byte sequences there. Such a display can be "Mo, 22/02/28, 12:34:56". That would be too complicated to program with constant byte sequences in flash.

A buffer in SRAM that is convenient to compose in needs the following elements. In the data segment this looks like this:

  .byte 2 ; The address of the next output position
  .byte 2 ; The last address plus one
  .byte 2 ; The restart address if the end has been reached
; The maximum length of the buffer
.equ cMaxDsply = RAMEND - sCurrent - 10 ; Leave some space for the stack
  .byte cMaxDsply

Depending on the size of the SRAM (ATtiny24: 128 bytes, ATtiny44: 256 bytes, ATtiny84: 512 bytes, ATmega48: 256 bytes, ATmega88: 512 bytes, ATmega324PA: 2 kBytes), the buffer's length is adjusted. When composing, make sure not to exceed the (sDsplyEnd - 1) limit.

You can use this buffer to compose any sequence of mixed variable and constant content.

Organizing the output in this manner has many advantages, so it can be recommended even in cases where the text to be displayed is constant.
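
As an illustration, initializing such a buffer and appending bytes with a limit check might look like the following sketch. The labels sAct, sEnd, sRestart and sDsplyEnd are taken from the names used in this section; the buffer label sDsply, the fixed length of 64 bytes, the register names and the routine names are assumptions for this example only, as is the low-byte-first storage of the pointers.

```asm
.nolist
.include "tn24def.inc"   ; assumed device, adjust to your controller
.list

.def rmp  = r16          ; byte to append (name assumed)
.def rTmp = r17          ; scratch register (name assumed)
.equ cMaxDsply = 64      ; hypothetical fixed length for this sketch

.dseg
.org SRAM_START
sAct:     .byte 2        ; address of the next output position
sEnd:     .byte 2        ; last address plus one
sRestart: .byte 2        ; restart address if the end has been reached
sDsply:   .byte cMaxDsply ; the display buffer (label assumed)
sDsplyEnd:

.cseg
; Empty buffer: all three pointers point to its start, X is the write position.
InitBuffer:
  ldi XL, low(sDsply)
  ldi XH, high(sDsply)
  sts sAct, XL
  sts sAct+1, XH
  sts sRestart, XL
  sts sRestart+1, XH
  sts sEnd, XL
  sts sEnd+1, XH
  ret

; Append the byte in rmp at X and advance sEnd.
; Returns with carry set if the buffer is full (the byte is dropped).
AppendByte:
  cpi XL, low(sDsplyEnd)
  ldi rTmp, high(sDsplyEnd) ; LDI does not change the flags
  cpc XH, rTmp
  brcc AppendFull        ; X >= sDsplyEnd: no room left
  st X+, rmp
  sts sEnd, XL
  sts sEnd+1, XH
  clc
  ret
AppendFull:
  sec
  ret
```

At least one column has to be appended before the output routine is started, otherwise sAct never meets sEnd.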

6.3 Sending the ticker bits to the shift registers

Sending the ticker bits in flash or SRAM to the shift registers differs slightly by version. Therefore the description is structured by the version.

6.3.1 ATtiny24/44/84-1 with eight transmit bits

[Figure: Transmit 8 bits at a time]

In this version eight single bits are transmitted and shifted into the shift register 4094-1. All further shifts (to 4094-2, ... 4094-16) are performed by the shift registers.

The first bit to be shifted into the shift register 4094-1 is bit 7 of the currently transmitted byte. This bit 7 can be shifted into the carry flag with the instruction LSL. Depending on the carry flag, the output pin that drives the DATA pin is either cleared or set.

After setting the DATA input pin, one CLK pulse is sent to all shift registers. In order to work correctly at all controller clock frequencies, the pulse duration shall be increased by adding NOPs if the controller clock exceeds 1 MHz: add one NOP at 2 MHz, two at 3 MHz, etc.

After all eight bits have been shifted into the 4094s, a pulse is generated on the STB input pin of all 4094s. Add NOPs to this pulse duration, too, if your clock is higher than 1 MHz.

Finally, increase the pointer in sAct and check whether it has reached sEnd. If that is the case, restart the output sequence by writing sRestart to sAct.

The whole routine consumes 2 + 8 * (2 + 1 + 1 + 1) + 2 + 2 = 46 clock cycles, roughly 46 µs at a controller clock of 1 MHz.
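
The steps above can be sketched as follows. This is a minimal sketch and not the author's cycle-optimized routine, so its cycle count differs from the figure given above. The port and pin assignment, the register names and the routine name are assumptions; sAct, sEnd and sRestart are assumed to be two-byte SRAM pointers stored low byte first.

```asm
.nolist
.include "tn24def.inc"   ; assumed device, adjust to your controller
.list

; Assumed pin assignment (pins must be configured as outputs elsewhere)
.equ pOut  = PORTA
.equ bData = 0           ; DATA of 4094-1
.equ bClk  = 1           ; CLK of all 4094s
.equ bStb  = 2           ; STB of all 4094s

.def rmp  = r16          ; multi-purpose register (name assumed)
.def rCnt = r17          ; bit counter (name assumed)

.dseg
.org SRAM_START
sAct:     .byte 2
sEnd:     .byte 2
sRestart: .byte 2

.cseg
ShiftColumn:
  lds XL, sAct           ; X = address of the next column byte
  lds XH, sAct+1
  ld rmp, X+
  ldi rCnt, 8
ShiftBit:
  cbi pOut, bData        ; assume a zero bit
  lsl rmp                ; bit 7 goes to the carry flag
  brcc ShiftClk
  sbi pOut, bData        ; carry set: send a one
ShiftClk:
  sbi pOut, bClk         ; CLK pulse (insert NOPs here above 1 MHz)
  cbi pOut, bClk
  dec rCnt
  brne ShiftBit
  sbi pOut, bStb         ; STB pulse copies the bits to the latches
  cbi pOut, bStb
  lds rmp, sEnd          ; end of the text reached?
  cp XL, rmp
  lds rmp, sEnd+1        ; LDS does not change the flags
  cpc XH, rmp
  brne ShiftStore
  lds XL, sRestart       ; yes: restart the sequence
  lds XH, sRestart+1
ShiftStore:
  sts sAct, XL
  sts sAct+1, XH
  ret
```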

6.3.2 ATtiny24/44/84-2 with 16 times eight transmit bits

[Figure: 16 times 8 bits]

In this version 16 bits are placed on port outputs and are sent to all 16 shift registers with two CLK pulses. The first eight bits collected are for the even-numbered shift registers (4094-2, 4094-4, ...), the second eight bits are for the odd-numbered shift registers (4094-1, 4094-3, ...). The odd-numbered shift registers are connected to the controller's port, the even-numbered ones are fed by the QS pins of the odd-numbered shift registers.

Because all 128 bits of the active display window are written, and because the bit order is not that simple (first the eight odd bits, then the eight even bits), the algorithm first copies all 128 bits to registers. If you do not have enough registers available, you can split this into two packages and then need only eight registers, at the cost of a slightly more complex copy algorithm.

First all bits for the even-numbered 4094s are prepared and sent to the DATA pins. The first CLK pulse shifts those into the odd-numbered shift registers.

Then all bits for the odd-numbered 4094s are collected. Those stem from registers R14, R12, etc., are rotated left into the multi-purpose register, written to the 8-bit port output and shifted in with the second CLK pulse.

The whole sequence is repeated eight times; after that all shift registers are completely rewritten. A STB pulse then copies the shift registers to the latches of the 4094s.

The whole procedure takes slightly longer than shifting only one column as in the ATtiny24-1 case. The copy routine consumes 288 clock cycles, the shift and send routine an additional 400 clock cycles. But even at a clock rate of 32.768 kHz, more than 20 of those display routines can be executed per second, which yields a very fast ticker speed.

6.3.3 ATmega48 or ATmega324PA with 8 times 16 send-bits

[Figure: 8 times 16 bits]

With an ATmega48 or an ATmega324PA two complete 8-bit ports are available to send 16 bits at once to all shift registers. The algorithm is similar to the one above, but only one CLK pulse is needed for every 16 bits shifted. And we definitely need all 128 bits in 16 registers here to compose the two bytes that are transmitted at once.

The register order for LSL and ROL is simpler here: from R7 to R0 and then from R15 to R8.

Regarding the execution time, the same applies as above: the two SBI/CBI instructions that are not necessary here make a difference of only 4 clock cycles.
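
A possible shape of this shift-and-send loop is sketched below. The 16 column bytes are assumed to be already copied to R0 to R15. Which port drives which group of shift registers, the control pins, the register names and the macro GetBit are assumptions for this sketch and have to be adapted to the actual PCB.

```asm
.nolist
.include "m324PAdef.inc" ; assumed device, adjust to your controller
.list

; Assumed wiring (all pins must be configured as outputs elsewhere)
.equ pDataLo = PORTA     ; DATA pins fed from R7..R0
.equ pDataHi = PORTC     ; DATA pins fed from R15..R8
.equ pCtrl   = PORTD
.equ bClk    = 0         ; CLK of all 4094s
.equ bStb    = 1         ; STB of all 4094s

.def rmp  = r16          ; multi-purpose register (name assumed)
.def rCnt = r17          ; bit counter (name assumed)

; Move bit 7 of register @0 into bit 0 of rmp
.macro GetBit
  lsl @0
  rol rmp
.endmacro

.cseg
SendAll:
  ldi rCnt, 8            ; eight bits per shift register
SendLoop:
  GetBit r7              ; after eight ROLs rmp is completely replaced
  GetBit r6
  GetBit r5
  GetBit r4
  GetBit r3
  GetBit r2
  GetBit r1
  GetBit r0
  out pDataLo, rmp
  GetBit r15
  GetBit r14
  GetBit r13
  GetBit r12
  GetBit r11
  GetBit r10
  GetBit r9
  GetBit r8
  out pDataHi, rmp
  sbi pCtrl, bClk        ; one CLK pulse shifts all 16 bits
  cbi pCtrl, bClk
  dec rCnt
  brne SendLoop
  sbi pCtrl, bStb        ; STB pulse copies the bits to the latches
  cbi pCtrl, bStb
  ret
```

The loop destroys the content of R0 to R15, which is why the 128 bits are copied to the registers again before every call.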

6.4 Adjusting the speed of the display

Speed control is performed by the following mechanism:
  1. The 16 bit timer is in CTC mode with Compare A as top value.
  2. When the top value is reached, a shift event is triggered.
  3. If the value in Compare A is large, e. g. 30,000, the event's repetition rate is low, here 1.09 Hz. If the value is smaller, e. g. 1,000, then the frequency is higher, here 32.8 Hz.
  4. To get a display frequency that is linear in the potentiometer value, some calculation is necessary to convert the ADC value to a 1/x delay value for the timer.
In an assembler program we can do that with the following algorithm:
  1. The ADC value ranges between 0 and 255 (with ADC's ADLAR bit set). If not: divide the 10-bit result by four.
  2. This 8-bit value is multiplied with an 8-bit slope constant and an intercept value is added. The result is a linear frequency, like in the graph here.

    Frequency vs. 8-bit ADC value

    As the frequencies involved here are at a few Hz or even below, we calculate the 256-fold of this frequency, 256 * f. The slope constant at 32.768 kHz is e. g. 19, the intercept value to be added is 256. The resulting 16-bit value is 256 for ADC=0 and 5,101 for ADC=255, which corresponds to a frequency of 19.9 Hz.
  3. The value of 256 * clock / prescaler (256 * c / p), at c=32,768 Hz and with a prescaler p of 1, is 8,388,608 or hex 0x00800000. This value is divided by 256 * f from above (a 32-by-16 bit division) and yields 32,768 for ADC=0 and 1,645 for ADC=255 (with rounding up).
  4. The result of the division (16 bit) is then written to the CompareA port register of TC1 and determines the time delay for the next shift event.
[Figure: TC1 compare value vs. potentiometer degrees]

This indeed yields a linear frequency of the display shift. With an ATmega with a hardware multiplier the multiplication plus division routine takes approximately 9 ms at 32.768 kHz. If an ATtiny24 without a hardware multiplier is used, the multiplication takes a few ms longer. In no case will this result in flickering delays.

All calculations are available in detail in the LibreOffice Calc file here. This also allows you to use different clock rates. In the three calculation sheets "crystals", "crystals_tc2" (use this if you want to use TC2 instead of TC0 for seconds timing) and "displayshift":
  1. All potential (crystal) frequencies from 32.768 kHz to 20 MHz can be selected. If devices without a crystal shall be used, insert the desired operating frequency.
  2. The upper and the lower limit for display shift events have to be selected by changing the two times in seconds (above the data for TC1). The calculation sheet determines the lower and the upper frequency divider rates from this. The default times given here have been tested with a 32.768 kHz clock rate and leave the controller a sleep share of at least 20%. Higher clock rates can be faster.
  3. All relevant parameters are copied to the cell area for exporting those to the assembler source code.
This is what the exported figures look like:

; 16-Bit-TC 1
.equ clock16 = 32768 ; Clock in Hz
.equ cPresc16 = 1 ; Prescaler 16-bit-TC
.equ cPrescBits16 = (1<<CS10) ; Prescaler bits for 16-bit-TC
.equ cCtc16Slow = 32768 ; CTC divider 16-bit-TC, slow
.equ cCtc16Fast = 1638 ; CTC divider 16-bit-TC, fast
.equ cfMultA = 19 ; Frequency multiplicator slope constant
.equ cfMultB = 256 ; Frequency multiplicator intercept value
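
With these constants, the multiplication and the 32-by-16 bit division from the algorithm above could be sketched as follows for a controller with a hardware multiplier. This is an illustrative sketch, not the author's routine: the device, the register names and the routine name are assumptions, and the constants are repeated only so that the fragment stands alone (drop them if the exported block is already part of your source). The division assumes that the divisor stays below 0x8000, which holds for the values given here.

```asm
.nolist
.include "m324PAdef.inc" ; assumed device with hardware multiplier
.list

.equ clock16  = 32768    ; constants as exported above
.equ cPresc16 = 1
.equ cfMultA  = 19
.equ cfMultB  = 256
.equ cDividend = 256 * clock16 / cPresc16 ; = 8,388,608

.def rQ0  = r16          ; dividend, later quotient, byte 0 (LSB)
.def rQ1  = r17
.def rQ2  = r18
.def rQ3  = r19          ; byte 3 (MSB)
.def rR0  = r20          ; remainder, low byte
.def rR1  = r21          ; remainder, high byte
.def rV0  = r22          ; divisor 256*f, low byte
.def rV1  = r23          ; divisor 256*f, high byte
.def rCnt = r24          ; loop counter
.def rAdc = r25          ; 8-bit ADC value (input)

.cseg
CalcCompare:
  ldi rV0, cfMultA
  mul rAdc, rV0          ; R1:R0 = ADC * slope
  mov rV0, r0
  mov rV1, r1
  subi rV0, low(-cfMultB)  ; add the intercept by subtracting its negative
  sbci rV1, high(-cfMultB)
  ldi rQ0, low(cDividend)
  ldi rQ1, byte2(cDividend)
  ldi rQ2, byte3(cDividend)
  ldi rQ3, byte4(cDividend)
  clr rR0
  clr rR1
  ldi rCnt, 32
DivLoop:
  lsl rQ0                ; shift the dividend's MSB into the remainder
  rol rQ1
  rol rQ2
  rol rQ3
  rol rR0
  rol rR1
  cp rR0, rV0            ; remainder >= divisor?
  cpc rR1, rV1
  brcs DivNext           ; no: quotient bit stays 0
  sub rR0, rV0
  sbc rR1, rV1
  inc rQ0                ; yes: set quotient bit
DivNext:
  dec rCnt
  brne DivLoop
  lsl rR0                ; rounding: 2 * remainder >= divisor?
  rol rR1
  cp rR0, rV0
  cpc rR1, rV1
  brcs DivDone
  subi rQ0, low(-1)      ; round up
  sbci rQ1, high(-1)
DivDone:
  sts OCR1AH, rQ1        ; write the high byte first
  sts OCR1AL, rQ0
  ret
```

For ADC=255 the divisor is 19 * 255 + 256 = 5,101; the division yields 1,644 with a remainder of 2,564, which is rounded up to 1,645. For ADC=0 the divisor is 256 and the result is 32,768.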

If you want to avoid the 32-by-16 bit division, the sheet "DsplyShftLoopkUp" holds a lookup table. Just mark it and copy the table "TableDelay:" to your source code. But be aware that the table takes 256 words of your flash memory, while the division routine is much shorter (42 words with a hardware multiplier).

6.5 Controlling the brightness of the display


6.6 For clocks: the seconds timing

With the ticker hardware and an ATmega324PA it is possible to program a ticker-style date-and-time clock. You just need to attach a crystal (e. g. 32.768 kHz) and write some lines of code. Of course you'll either need a DCF77 module or two additional buttons to adjust date and time manually.

6.6.1 Seconds generator

For such a clock you'll need a timer, either TC0 or TC2. Both are a bit different, so the LibreOffice calculation file has two different sheets.

[Figure: Clock rates and dividing for seconds pulses]

This list provides all usual clock rates of AVR controllers and crystals and lists the prescaler and CTC divider values for a pulse that is 1 second long. With one exception an additional software down-counter is necessary. This can be either a one-byte or a two-byte counter (in that case preferably in R25:R24).

The first column in the list is the clock frequency in MHz. The second column either lists the available crystal cases or gives some hints on the source of this frequency. The third column gives the optimal prescaler (as large as possible). The fourth column gives the CTC divider (the Compare-A value of the CTC plus one), also as large as possible. In the fifth column a software divider is listed. The three digits behind the decimal point demonstrate that the division does not lead to inexact seconds pulses. Finally, the sixth column has the number of bytes for the software counter.

With the sheets "crystals" and "crystals_tc2" one of the listed frequencies can be selected from the drop-down-list and the related parameters appear in a copy field. The 8-bit-TC0 area can be marked, copied and inserted into the assembler source code:

; 8-Bit-TC 0
.equ clock = 32768 ; Clock in Hz
.equ cPresc8 = 256 ; Prescaler for 8-Bit-TC
.equ cPrescBits8 = (1<<CS02) ; Prescaler bits for 8-bit-TC
.equ cCtcDiv8 = 128 ; CTC divider 8-Bit-TC
.equ cCompA8 = 127 ; Compare-A value
.equ cSoftCnt8 = 1 ; Software counter value

The further procedure in assembler is simple:
  1. Define one or two divider registers.
  2. Write an interrupt service routine for the OC0A interrupt that saves SREG, decrements the software counter (either with DEC or with SBIW) and restores SREG at its end. If the Z flag is set after decrementing, restart the software counter and set a seconds flag.
  3. On init, set the software counter to its initial value. Write the compare value cCompA8 (which is cCtcDiv8 minus one) to OCR0A of the timer/counter TC0, select CTC mode with the appropriate WGM bits and start the timer by writing cPrescBits8 to the TCCR0B port register. Enable OCIE0A in the timer interrupt mask.
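
These three steps might look like the following sketch for an ATmega324PA with a one-byte software counter. The register names, the flag bit bSec and the routine names are assumptions; the constants are repeated from the exported block so that the fragment stands alone. The OC0A interrupt vector has to jump to Oc0aIsr, and interrupts have to be enabled with SEI after the init.

```asm
.nolist
.include "m324PAdef.inc" ; assumed device, adjust to your controller
.list

.equ cPrescBits8 = (1<<CS02) ; constants as exported above
.equ cCompA8     = 127
.equ cSoftCnt8   = 1

.def rSreg = r15         ; saves SREG inside the ISR (name assumed)
.def rmp   = r16         ; multi-purpose register (name assumed)
.def rFlag = r17         ; flag register (name assumed)
.def rSoft = r18         ; one-byte software counter (name assumed)
.equ bSec  = 0           ; seconds flag bit in rFlag (assumed)

.cseg
; Step 2: interrupt service routine for the OC0A interrupt
Oc0aIsr:
  in rSreg, SREG         ; save the flags
  dec rSoft
  brne Oc0aIsrEnd        ; software counter not yet zero
  ldi rSoft, cSoftCnt8   ; restart the software counter
  sbr rFlag, 1<<bSec     ; one second is over
Oc0aIsrEnd:
  out SREG, rSreg        ; restore the flags
  reti

; Step 3: init of the software counter and of TC0
InitTc0:
  ldi rSoft, cSoftCnt8
  ldi rmp, cCompA8
  out OCR0A, rmp         ; compare value = CTC divider - 1
  ldi rmp, 1<<WGM01
  out TCCR0A, rmp        ; CTC mode with OCR0A as top value
  ldi rmp, cPrescBits8
  out TCCR0B, rmp        ; start the timer with the prescaler
  ldi rmp, 1<<OCIE0A
  sts TIMSK0, rmp        ; enable the compare match A interrupt
  ret
```

The main loop can then poll bSec in rFlag, clear it and update the time. With a two-byte software counter, DEC would be replaced by SBIW on a register pair such as R25:R24.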

6.6.2 Date and time as ticker bytes


©2022 by