
A PIC24 Assembly Language Program

At this point, we have more than enough instructions to write a simple PIC24 assembly language program. In this book, programs are first written in C and then translated (compiled) to assembly language. This is done for the instructional purpose of illustrating the linkage between high-level programming language statements and their implementation in assembly language. Furthermore, writing programs in a high-level language such as C improves the clarity of the program’s functionality, as assembly language can be obscure, especially for readers new to assembly language programming. This also prepares you for the hardware interfacing chapters of this book, which use the C language for all their example programs. If you are new to the C language, do not worry, as C language statements are introduced gradually and are fully explained. Previous exposure to any modern programming language is all that is necessary to understand the C program examples used in this book. Example C programs use only those C language statements necessary to demonstrate PIC24 µC capabilities and do not attempt to cover the entire C language. A program called a compiler is generally used to perform the translation from C to assembly language. This book uses the MPLAB PIC24 compiler for this task when C programs illustrating microcontroller hardware interfacing are covered beginning in Chapter 8.

Table 3.5 shows the standard C data types of char, short, int, long, and long long in their unsigned and signed varieties and how these types are referred to in this book. The standard C types represent different sized variables, with a char variable always requiring 8 bits (1 byte). A problem with the standard C data types is that the sizes of the int and long types are compiler-dependent. For example, in the C compiler used in this book for the PIC24 µC, the int and long types are 16 bits and 32 bits. However, these same types may be 32 bits and 64 bits for a different processor/compiler combination. It has become common practice for C programmers, especially when writing code for microcontrollers, to create user-defined types that expose the variable size because variable size affects both data and program storage requirements, along with program execution time. So, referring to Table 3.5 for an example, an unsigned int variable is declared as a uint16 variable in any C program used in this book.

Table 3.5. C data types.
Standard              This Book    PIC24 Size (in bits)
unsigned char         uint8        8
unsigned short        uint16       16
unsigned int          uint16       16
unsigned long         uint32       32
unsigned long long    uint64       64
signed char           int8         8
signed short          int16        16
signed int            int16        16
signed long           int32        32
signed long long      int64        64

The type definitions of Table 3.5 are declared in a header file that is included by the compiler during the compilation process for the C examples in this book; these type definitions are:

typedef unsigned char           uint8;      //8 bits
typedef unsigned short          uint16;     //16 bits
typedef unsigned long           uint32;     //32 bits
typedef unsigned long long      uint64;     //64 bits
typedef signed char             int8;       //8 bits
typedef signed short            int16;      //16 bits
typedef signed long             int32;      //32 bits
typedef signed long long        int64;      //64 bits

Be sure to include these type definitions when using any of the C examples found in this book.
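Because the sizes of int and long depend on the compiler, it can be useful to confirm that the type names really have the widths listed in Table 3.5 on whatever toolchain is in use. The following is a minimal sketch of such a check, not the book's header file. As an assumption made for portability, it maps the book's type names onto the C99 <stdint.h> fixed-width types rather than onto unsigned char, unsigned short, and so on; the BITS_OF macro and the uint8_max_value function are hypothetical helpers introduced only for this illustration.

```c
#include <limits.h>   /* CHAR_BIT: number of bits in a char */
#include <stdint.h>   /* C99 fixed-width integer types */

/* Assumption: the book's names are mapped onto <stdint.h> types so that
   the widths hold on any host compiler, not only the PIC24 compiler. */
typedef uint8_t  uint8;
typedef uint16_t uint16;
typedef uint32_t uint32;
typedef uint64_t uint64;
typedef int8_t   int8;
typedef int16_t  int16;
typedef int32_t  int32;
typedef int64_t  int64;

/* Hypothetical helper: width of a type in bits. */
#define BITS_OF(type) ((unsigned)(sizeof(type) * CHAR_BIT))

/* Compile-time checks (C11): the build fails if a width is wrong. */
_Static_assert(BITS_OF(uint8)  == 8,  "uint8 must be 8 bits");
_Static_assert(BITS_OF(uint16) == 16, "uint16 must be 16 bits");
_Static_assert(BITS_OF(uint32) == 32, "uint32 must be 32 bits");
_Static_assert(BITS_OF(uint64) == 64, "uint64 must be 64 bits");

/* Largest value that fits in a uint8: all eight bits set. */
unsigned uint8_max_value(void) {
    return (uint8)~0u;
}
```

With checks like these in a shared header, a width mismatch shows up as a compile error instead of a subtle run-time bug.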

C-to-PIC24 Assembly Language

A C program that uses the data transfer and arithmetic operations discussed so far is shown in Listing 3.1. Line numbers have been added for clarity, but would not be part of the actual C program source code.

Listing 3.1. A “simple” C program.

(1) #define avalue 100
(2) uint8 i,j,k;
(3) main(void) {
(4)   i = avalue;        // avalue is 100 (0x64)
(5)   i = i + 1;         // i++, i = 101 (0x65)
(6)   j = i;             // j is 101 (0x65)
(7)   j = j - 1;         // j--, j is 100 (0x64)
(8)   k = j + i;         // k = 201 (0xC9)
(9) }

The C language is case sensitive, with all reserved key words, such as void, being lowercase. Comments begin with two // characters and can start anywhere on a line (this is actually a C++ language comment but is accepted by modern C compilers). Comments can also be delimited by a starting /* and an ending */ with the delimiters spanning multiple lines. Simple C statements are terminated by a semicolon (“;”). Compound statements, which are composed of multiple simple C statements, are bracketed by {}. Line 1 contains a define statement, which is a method for assigning a symbolic name to a value. Use of defines for constant values usually improves code clarity. Line 2 defines three variables of type uint8 (this code assumes that uint8 has been declared earlier, typically in an included header file), which from Table 3.5 means that each variable is 8 bits, or 1 byte, and represents unsigned data (unsigned char). The unsigned modifier tag combined with the 8-bit data size gives a value range of 0 to 255 for each variable. Chapter 5 discusses the difference between unsigned and signed data types and the effect this has on arithmetic operations. Line 3 defines the entry point for the C program, which must be named main. The (void) after the main label indicates that main receives no parameters, which will always be the case for C programs in this book. The body of the main code is a compound C statement, enclosed by {}. Line 4 assigns the constant value 100 to the variable i. Line 5 increments i by 1; i contains the value 101 after execution of this statement. The C statement i++, where ++ is the C increment operator, could be used instead of i=i+1. Line 6 copies the value of i to j. Line 7 decrements j, so j contains the value 100 after execution of this statement. The statement j-- could be used instead of j=j-1. Line 8 adds j and i and assigns the sum, 201, to k.
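The walk-through above can be checked on any desktop C compiler. The sketch below repeats the statements of Listing 3.1 using the ++ and -- operators mentioned in the text. Two things are assumptions made for this illustration: uint8 is defined from the <stdint.h> type uint8_t, and the statements are placed in a hypothetical function named run_simple (instead of main) so that the final value of k can be returned and inspected.

```c
#include <stdint.h>

typedef uint8_t uint8;   /* assumption: stand-in for the book's uint8 type */

#define avalue 100

uint8 i, j, k;

/* Hypothetical wrapper around the body of Listing 3.1; returns k. */
uint8 run_simple(void) {
    i = avalue;     /* i is 100 (0x64) */
    i++;            /* same effect as i = i + 1; i is 101 (0x65) */
    j = i;          /* j is 101 (0x65) */
    j--;            /* same effect as j = j - 1; j is 100 (0x64) */
    k = j + i;      /* k is 201 (0xC9), which still fits in 8 bits */
    return k;
}
```

Calling run_simple() leaves i at 101 and j at 100 and returns 201, matching the comments in Listing 3.1.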

The first step in translating (i.e., compiling) the program in Listing 3.1 to PIC24 assembly language is to choose locations for the variables i, j, and k. This can be any data RAM location that is not assigned to a special function register. We could also simply use working registers to represent variables, and we do this when C functions are covered in Chapter 6. In this example, we will use data RAM, and for simplicity, we use the first available data RAM locations, which are 0x0800 for i, 0x0801 for j, and 0x0802 for k (recall that data RAM locations 0x0000 through 0x07FF are reserved for special function registers). Figure 3.20 shows the program of Listing 3.1 translated to PIC24 assembly language.

Figure 3.20. The contents of main in the “simple” C program compiled to PIC24 assembly language.

The compilation is straightforward when only one line is considered at a time. Optimizing C compilers (and expert assembly language programmers) consider multiple C language statements at a time during compilation in an effort to reduce the total number of instructions, and it may be difficult to correlate the final assembly code with the original C language statements. This book does not expect you to become an expert assembly language programmer; this only occurs after a considerable amount of time is spent crafting assembly language programs. Instead, this book strives for clarity and understanding, and will always perform C-to-PIC24 assembly language translation in the most straightforward manner possible. Some comments on the assembly code in Figure 3.20 are:

  • Byte operations (.b) are used because i, j, k are uint8 (1 byte) variables.

  • The reader may wonder why both W0 and WREG are used in the assembly language for the first C statement i = 100. Only WREG can be used in a mov instruction that is a byte operation and specifies a file register address, so the destination for the constant value 100 must be specified as WREG. However, the mov instruction with an immediate operand requires one of the working registers, W0 through W15, as its destination. It is a syntax error to write mov.b #100,WREG and also a syntax error to write mov.b W0,i even though W0 and WREG are physically the same register.

  • The reader may wonder why the assembly statement mov.b 0x800,0x801 is not used for the j = i operation. The answer is simple—there is no PIC24 mov instruction that allows file register addresses to be specified for both the source and destination operands. Instead, i must be copied to a working register first, and then copied from the working register to j.
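The role of the working register in these byte moves can be pictured with a small C model. This is only an illustrative sketch, not a PIC24 simulator: the ram array, the wreg_low variable, and the function names are all hypothetical, and only the low byte of W0 is modeled. It mirrors the two-step copy described above, in which a value travels from one file register to another by way of WREG.

```c
#include <stdint.h>

#define RAM_BASE 0x0800u      /* first data RAM location after the SFRs */

static uint8_t ram[16];       /* hypothetical model of a few data RAM bytes */
static uint8_t wreg_low;      /* low byte of W0 (WREG is the same register) */

/* Models "mov.b #lit,W0": load an 8-bit literal into the working register. */
static void movb_lit_w0(uint8_t lit)  { wreg_low = lit; }

/* Models "mov.b WREG,f": store the working register to file register f. */
static void movb_wreg_f(uint16_t f)   { ram[f - RAM_BASE] = wreg_low; }

/* Models "mov.b f,WREG": load the working register from file register f. */
static void movb_f_wreg(uint16_t f)   { wreg_low = ram[f - RAM_BASE]; }

/* Sets i (0x0800) to a value, then performs j = i through WREG; returns j. */
uint8_t copy_i_to_j(uint8_t value) {
    movb_lit_w0(value);       /* W0 = value */
    movb_wreg_f(0x0800);      /* i = WREG   */
    movb_f_wreg(0x0800);      /* WREG = i   */
    movb_wreg_f(0x0801);      /* j = WREG   */
    return ram[0x0801 - RAM_BASE];
}
```

Note that no function in the model moves a byte directly from one ram location to another, which reflects the absence of a file-register-to-file-register mov instruction.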

Of the C language statements in Figure 3.20, the statement k = j + i is the most difficult, and requires three PIC24 instructions to implement. In the resulting three-instruction sequence in Figure 3.20, observe that the destination of add.b 0x0801,WREG is WREG so that the value of 0x0801 (j) is left undisturbed. The mov.b WREG,0x0802 instruction copies the result of the addition into the k variable location (0x0802). This three-instruction sequence could be replaced by the instructions in Listing 3.2.

Listing 3.2. Alternate implementation of k = j + i.

mov.b 0x0801,WREG       ;WREG = j
add.b 0x0800,WREG       ;WREG = i + WREG = i + j
mov.b WREG,0x0802       ;k = WREG = i + j (result same as j + i)

This works because addition is a commutative operation; either i + j or j + i produces the same result. However, when performing subtraction one must be careful of the operand order because j – i is not equal to i – j. Listing 3.3 shows one way to implement the C language statement k = j – i.

Listing 3.3. Implementation of k = j – i.

mov.b 0x0800,WREG       ;WREG = i
sub.b 0x0801,WREG       ;WREG = j - i
mov.b WREG,0x0802       ;k = WREG = j - i
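The importance of operand order can be seen in a short C sketch of the two instructions. The function names below are hypothetical; each one models the data flow used in Listings 3.2 and 3.3, where the file register value is the first operand and the previous WREG contents the second, with the 8-bit result returned as the new WREG value.

```c
#include <stdint.h>

/* Models "add.b f,WREG": new WREG = f + WREG, truncated to 8 bits. */
uint8_t addb_f_wreg(uint8_t f, uint8_t wreg) {
    return (uint8_t)(f + wreg);
}

/* Models "sub.b f,WREG": new WREG = f - WREG, truncated to 8 bits. */
uint8_t subb_f_wreg(uint8_t f, uint8_t wreg) {
    return (uint8_t)(f - wreg);
}
```

With i = 101 and j = 100, the addition gives 201 whichever variable is loaded into WREG first. The subtraction does not: loading i into WREG and subtracting from j yields j - i, which wraps to 255 in 8-bit unsigned arithmetic, while the reverse order yields i - j = 1.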

The code in Figure 3.20 makes heavy use of WREG because of the need for byte operations to data memory and the use of file register addressing. There are many other possible assembly language implementations using addressing modes such as indirect addressing, but we reserve that discussion for a later chapter.

The PIC24 assembly language of Figure 3.20 is somewhat obscure because memory addresses (0x0800, 0x0801, 0x0802) are used instead of the variable names i, j, k. Also, there is still the problem of translating the PIC24 instruction mnemonics to machine code, a process that is interesting the first time, boring the second time, and painful thereafter. A program called an assembler automatically converts instruction mnemonics to machine code. Microchip Technology Inc. provides the MPLAB Integrated Development Environment (IDE), which contains an assembler and simulator for most Microchip microprocessors. Listing 3.4 gives the assembly language of Figure 3.20 written in a more readable form, and in a format compatible with the MPLAB PIC24 assembler (the line numbers are not part of the source file).

Listing 3.4. MPLAB-compatible assembly source code for “simple” C example.

(1) .include ""
(2) .global __reset           ;The label for the first line of code.
(3)    .bss                   ;uninitialized data section
(4) ;;These start at location 0x0800 because 0–0x07FF are reserved for SFRs
(5) i:  .space 1              ;Allocating space (in bytes) to variable.
(6) j:  .space 1              ;Allocating space (in bytes) to variable.
(7) k:  .space 1              ;Allocating space (in bytes) to variable.
(8) ;Code Section in Program Memory
(9)    .text                  ;Start of Code section
(10) __reset:                 ;first instruction located at __reset label
(11)    mov #__SP_init, W15   ;Initialize the Stack Pointer
(12)    mov #__SPLIM_init,W0
(13)    mov W0,SPLIM          ;Initialize the Stack Limit register
(14) ;User Code starts here.
(15)    .equ avalue, 100
(16) ;i = avalue; // avalue = 100
(17)    mov.b #avalue, W0     ;W0 = 100
(18)    mov.b WREG,i          ;i = 100
(20) ;i = i + 1;
(21)    inc.b i               ;i = i + 1
(23) ;j = i
(24)    mov.b i,WREG          ;W0 = i
(25)    mov.b WREG,j          ;j = W0
(27) ;j = j - 1;
(28)    dec.b j               ;j = j - 1
(30) ; k = j + i
(31)   mov.b i,WREG           ;W0 = i
(32)   add.b j,WREG           ;W0 = W0+j (WREG is W0)
(33)   mov.b WREG,k           ;k = W0
(35) done:
(36)   goto done              ;Placeholder for last line of executed code
(38) .end                     ;End of program code in this file


The .include statement in line 1 is called an assembler directive, which is an instruction to the assembler and not a PIC24 assembly language statement. Lines 1–3, 5–7, 9, 15, and 38 are all assembler directives. The .include statement causes the source file to be included (read) during assembly. When assembling a PIC24 program, the assembler must be told the target device, in this case the 24HJ32GP202, by setting this device name within the MPLAB environment (see Appendix B). This device name is used by the generic include file to select a device specific include file (i.e., the file) that defines symbolic names for all SFRs and named bits within SFRs.

The .global __reset assembler directive at line 2 declares the label __reset as a global label, that is, a label that is accessible outside of the scope of this file. Labels are used as symbolic names for the instruction addresses, generally to be used as the target of a change of control instruction, such as goto __reset. In this case, the label __reset is a special label required by the PIC24 assembler that must be used to label the first executable instruction in our program. Labels are case sensitive, while assembler directives and instruction mnemonics are case insensitive. Labels, assembler directives, and instruction mnemonics may start in any column that is desired. A semicolon is used to start a comment.

The .bss assembler directive at line 3 indicates the start of a section that contains uninitialized data to be placed in data RAM. The three .space assembler directives that follow allocate one byte of space for each of the i, j, and k variables. Observe that each .space directive is labeled with a variable name. The .space directives use the first available data memory location, and thus i is assigned the address 0x0800, j the address 0x0801, and k the address 0x0802. Using .space directives and labeling in this manner allows the i, j, k variables to be referenced by name within PIC24 instructions instead of using absolute memory locations, which greatly improves code clarity. The .space directive can also be used to insert bytes between declarations of different sized variables (8-bit, 16-bit, and 32-bit) to ensure that 16-bit and 32-bit variables start on a word boundary.
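The address assignment described above is simple arithmetic, sketched here in C. The helper names are hypothetical, and the sketch assumes that consecutive variables of equal size are packed one after another starting at 0x0800. The align_word function shows where a multi-byte variable would have to start if the next free address were odd, which is the situation that calls for a padding byte.

```c
#include <stdint.h>

#define DATA_RAM_START 0x0800u   /* first data RAM location after the SFRs */

/* Address of the variable at position `index` (0-based) when every
   variable occupies `size_bytes` bytes and allocation starts at 0x0800. */
uint16_t var_address(uint16_t size_bytes, uint16_t index) {
    return (uint16_t)(DATA_RAM_START + size_bytes * index);
}

/* Rounds an address up to the next even (word-aligned) address;
   an address that is already even is returned unchanged. */
uint16_t align_word(uint16_t addr) {
    return (uint16_t)((addr + 1u) & ~1u);
}
```

For example, three 1-byte variables land at 0x0800, 0x0801, and 0x0802, so the next free address is 0x0803. A 16-bit variable declared next would need one padding byte so that it starts at align_word(0x0803), which is 0x0804.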

The .text assembler directive at line 9 indicates the start of a section that contains instructions to be placed in program memory. There must be separate assembler directives, .bss and .text, for data memory and program memory because these are two different physical memories in the PIC24 architecture. Locations 0x0 through 0x01FF of program memory are reserved for reset handling, and trap and interrupt vectors (discussed in Chapter 9), so the first instruction of our program is placed at 0x0200. At power up, the program counter is reset to 0x0. Thus, the first PIC24 instruction is fetched from program memory location 0x0, which is called the reset handler. The assembler automatically generates a goto __reset instruction that occupies the two words at program memory locations 0x0 and 0x2, which is the reason why the first executable instruction in our program must be labeled as __reset. In other words, the contents of locations 0x0 and 0x2 hold a goto statement to the start of the user program.

The mov #__SP_init, W15 instruction loads the constant value #__SP_init into W15, where __SP_init is an automatically generated value that represents the first free location in data RAM after the variable declarations. This free space is reserved for stack space storage, which is discussed in detail in Chapter 6. Register W15 is reserved for a special function known as the stack pointer. The two-instruction sequence in lines 12 and 13 initializes the stack limit register to the value __SPLIM_init, which is an automatically generated value that represents the maximum data memory address that is safely available for stack pointer usage. At this point, it is not necessary for the reader to understand stack operations to write simple assembly language programs. Lines 1 through 13 form a template that should be used for any assembly language program, modified with appropriate .space directives that match the variable needs of that program. Technically, this program does not need to initialize either the stack pointer or stack limit registers as no instructions that use the stack are executed, but it is a good idea to include this in your standard assembly language template since it will be needed in future examples.

The .equ assembler directive at line 15 assigns the value 100 to the label avalue. For those readers familiar with C, the assembler’s .equ directive is very similar to C’s #define directive, which is discussed in more detail in Chapter 8. The instructions in lines 16–33 duplicate the instructions of Figure 3.20, except that the names i, j, and k are used instead of absolute addresses. Lines 35 and 36 contain the infinite loop goto done, where the target address done is the location of the goto instruction. A microcontroller program never really ends; it must always be doing something, as there is no place for the program to go when it finishes! When a program exits on a personal computer, control is returned to the operating system, which is in an infinite loop waiting for input from the keyboard, mouse, or some other input device. A microcontroller program is also typically an infinite loop that is waiting on input from some external device such as a car engine, sensor array, and so forth. In this simple example, the program execution is trapped when it falls into the goto done infinite loop. Another method to halt program execution is to stop the processor clock; this is discussed in Chapter 8. The .end assembler directive at line 38 is used to mark the end of the source code in this file; it is not strictly necessary and can be omitted if desired.

Listing 3.5 gives the machine code listing produced by the MPLAB assembler for the assembly language program of Listing 3.4. The address column gives the program memory location in hex and the opcode column the machine code for the mnemonic to the right. Not counting the reset vector, this program takes 13 PIC24 instructions to implement (locations 0x200 through 0x21A). The number of required instruction words is 14 (13+1), because the goto done instruction at location 0x218 requires two instruction words.
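The word count and address range quoted above follow from two facts visible in the listing: program memory addresses advance by 2 for each instruction word, and a goto occupies two words. The C sketch below reproduces that arithmetic; the function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Total instruction words for a program made of single-word
   instructions plus two-word instructions (such as goto). */
unsigned total_words(unsigned one_word_instrs, unsigned two_word_instrs) {
    return one_word_instrs + 2u * two_word_instrs;
}

/* Program memory address of the last word, given the start address and
   the number of words; addresses step by 2 per instruction word. */
uint32_t last_word_address(uint32_t start, unsigned words) {
    return start + 2u * (words - 1u);
}
```

Here the 13 instructions consist of 12 single-word instructions and one goto, so total_words(12, 1) gives 14, and last_word_address(0x0200, 14) gives 0x021A, the final address shown in the listing.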

Listing 3.5. Machine code listing for simple C program.

0000              040200                           goto _reset
0002              000000                           nop
0200              20804F           _reset          mov.w #0x804,W15
0202              20FF80                           mov.w #0xff8,W0
0204              880100                           mov.w W0,0x0020
0206              B3C640                           mov.b #0x64,W0
0208              B7E800                           mov.b W0,i
020A              EC6800                           inc.b i
020C              BFC800                           mov.b i,W0
020E              B7E801                           mov.b W0,j
0210              ED6801                           dec.b j
0212              BFC800                           mov.b i,W0
0214              B44801                           add.b j,W0
0216              B7E802                           mov.b W0,k
0218              040214           done            goto done
021A              000000                           nop

The nop Instruction

The second word of the goto instructions in Listing 3.5 displays as a nop instruction, which stands for “NO oPeration”. A nop simply causes the instruction word to be fetched and the PC to be incremented to the next instruction word. One machine code encoding of a nop has the upper 8 bits as “0” and the remaining bits as don’t-cares. All instructions that require two words have the second instruction word encoded such that the upper 8 bits are “0”, as seen at locations 0x0002 and 0x021A in Listing 3.5. A second nop encoding has the upper 8 bits as “1”, and the remaining bits as don’t-cares. This second nop encoding was chosen because program memory contains all “1”s when the program memory is in a blank or erased state. This means that the values of 0xFFFFFF and 0x000000 are both treated as nop instructions. These nop encodings were selected because any erased location contains 0xFFFFFF, and any memory location in the 4 Mi instruction word range that is not physically implemented returns a 0x000000 when read. In this way, if a program error causes a jump to a portion of program memory that is erased, continuous nop instructions (0xFFFFFF) are fetched until the program counter exceeds physical memory. Then, 0x000000 values (nop instructions again) are read until the PC wraps back to the reset location of 0x0, simulating a device reset. An internal register of the PIC24 µC can be checked by the startup code to determine whether a physical reset actually occurred; if not, an error indicator can be displayed to signal that an anomalous reset condition occurred.
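The two nop encodings can be expressed as a simple test on the upper 8 bits of a 24-bit instruction word. The C sketch below is a hypothetical helper written for illustration, based only on the encoding rule stated above; it is not taken from any Microchip tool.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a 24-bit instruction word decodes as a nop: the
   upper 8 bits (bits 23..16) are all 0s or all 1s, and the lower
   16 bits are don't-cares. */
bool is_nop(uint32_t word24) {
    uint32_t upper = (word24 >> 16) & 0xFFu;
    return upper == 0x00u || upper == 0xFFu;
}
```

Both the erased-memory value 0xFFFFFF and the unimplemented-memory value 0x000000 satisfy the test, while the first word of a goto, such as 0x040200, does not.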

16-Bit (Word) Operations

Assume the C program of Listing 3.1 is changed to use uint16 (16-bit) variables for i, j, and k instead of uint8 variables. This means that all operations using these variables now require 16-bit operations instead of 8-bit operations. Listing 3.6 gives the new assembly language implementation assuming uint16 (16-bit) variables (some lines that have not changed between Listing 3.4 and Listing 3.6 have been deleted for reasons of brevity).

Listing 3.6. Simple C program with 16-bit operations.

(1) i: .space 2                   ;Allocating space (in bytes) to variable.
(2) j: .space 2                   ;Allocating space (in bytes) to variable.
(3) k: .space 2                   ;Allocating space (in bytes) to variable.
(4) ;Code Section in Program Memory
(5)     .text                     ;Start of Code section
(6) __reset:                      ;first instruction located at __reset label
(7)     mov #__SP_init, W15       ;Initialize the Stack Pointer
(8)     mov #__SPLIM_init,W0
(9)     mov W0,SPLIM              ;Initialize the Stack Limit register
(10) ;User Code starts here.
(11)    .equ avalue, 2047
(12) ;i = avalue;                 // avalue = 2047
(13)    mov #avalue, W0           ;W0 = 2047
(14)    mov WREG,i                ;i = 2047
(16) ;i = i + 1;
(17)    inc i                     ;i = i + 1
(19) ;j = i
(20)    mov i,WREG                ;W0 = i
(21)    mov WREG,j                ;j = W0
(23) ;j = j - 1;
(24)    dec j                     ;j = j – 1
(26) ;k = j + i
(27)    mov i,WREG                ;W0 = i
(28)    add j,WREG                ;W0 = W0+j (WREG is W0)
(29)    mov WREG,k                ;k = W0
(31) done:
(32)    goto done                 ;Placeholder for last line of executed code


The changes in Listing 3.6 to accommodate 16-bit operations are:

  • The .space directives in lines 1–3 now reserve two bytes (16 bits) instead of 1 byte for each variable. This means that the addresses of the i, j, k variables are now 0x0800, 0x0802, and 0x0804.

  • All of the byte (.b) operations in Listing 3.4 are removed so that 16-bit operations are performed.

  • The value of the constant avalue has also been changed from 100 to 2047 to illustrate that a larger number range is available with 16-bit variables.

Assembling the program of Listing 3.6 reveals that it has the same number of instruction words as Listing 3.4, and thus requires the same execution time. This is the advantage of a 16-bit CPU architecture in that 16-bit operations are as efficient as 8-bit operations in terms of program memory size and execution time. If the C data types of Listing 3.1 are changed to uint32 (32-bit) variables, then the assembly code of Listing 3.6 would change significantly, requiring approximately double the number of instructions and execution time. Operations on 32-bit data are examined in detail in Chapter 5.
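The effect of the wider variables can be confirmed with a desktop C compiler. In the sketch below, uint16 is defined from the <stdint.h> type uint16_t as an assumption, and the function names run_simple16 and truncate_to_uint8 are hypothetical. The second function shows what would happen to the constant 2047 if it were stored in an 8-bit variable instead.

```c
#include <stdint.h>

typedef uint16_t uint16;   /* assumption: stand-in for the book's uint16 type */

#define avalue 2047

/* The statements of the "simple" C program with 16-bit variables. */
uint16 run_simple16(void) {
    uint16 i, j, k;
    i = avalue;     /* i is 2047 (0x07FF) */
    i = i + 1;      /* i is 2048 (0x0800) */
    j = i;          /* j is 2048 */
    j = j - 1;      /* j is 2047 */
    k = j + i;      /* k is 4095 (0x0FFF) */
    return k;
}

/* Storing a value in 8 bits keeps only the low byte. */
uint8_t truncate_to_uint8(unsigned value) {
    return (uint8_t)value;
}
```

The 16-bit program produces 4095, whereas truncate_to_uint8(2047) yields 255 because only the low byte 0xFF survives, which is why avalue could not be 2047 in the 8-bit version.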

Sample Question: Write a PIC24 assembly language fragment that implements the C statement “k = i + j + 20;” where k, i, j are all uint16 variables.

Answer: One solution is:

mov j,W0           ;W0 = j
add #20,W0         ;W0 = W0 + 20 = j + 20
add i,WREG         ;W0 = i + W0 = i + j + 20
mov W0,k           ;k = W0 = i + j + 20

Observe that a single C statement may require several PIC24 assembly language statements, as several operations can be written in one C statement. Translating the C statement to PIC24 statements requires that you decompose the C statement into steps that the PIC24 can accomplish. The above solution also requires that the variables be in near RAM. If this assumption is not true, then the use of indexed addressing covered in Chapter 6 is required. This book assumes that all variables are in near RAM unless explicitly stated otherwise.
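The decomposition can be traced in C by treating W0 as a local variable and writing one C statement per PIC24 instruction. This is a sketch for illustration only; the function name sum_plus_20 is hypothetical, and the result is returned in place of the final store to k.

```c
#include <stdint.h>

/* Step-by-step model of the four-instruction solution. */
uint16_t sum_plus_20(uint16_t i, uint16_t j) {
    uint16_t w0;
    w0 = j;                       /* mov j,W0    ; W0 = j          */
    w0 = (uint16_t)(w0 + 20u);    /* add #20,W0  ; W0 = j + 20     */
    w0 = (uint16_t)(i + w0);      /* add i,WREG  ; W0 = i + j + 20 */
    return w0;                    /* mov W0,k    ; k = W0          */
}
```

For i = 5 and j = 7 the function returns 32. Because the variables are 16 bits wide, a sum that exceeds 65535 wraps around: i = 65535 and j = 1 give 20.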

Sample Question: A neophyte assembly language programmer translated the C code fragment:

uint16 k, j;
k = j + 1;

to the assembly language sequence of:

inc j
mov j,W1
mov W1,k

What is wrong with this?

Answer: The inc j instruction modifies the variable j. The C statement k = j + 1 only modifies k; the variable j is not modified. One correct solution is:

inc j,WREG
mov W0,k

The inc j,WREG instruction places j + 1 in WREG and leaves the memory location j unmodified.
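The difference between the two sequences can be made concrete with a small C model that returns the values of both j and k after each sequence runs. The struct and function names are hypothetical and serve only this illustration.

```c
#include <stdint.h>

/* Final values of the two variables after a sequence executes. */
typedef struct {
    uint16_t j;
    uint16_t k;
} vars_t;

/* Models the incorrect sequence: inc j ; mov j,W1 ; mov W1,k */
vars_t neophyte(uint16_t j) {
    vars_t v;
    uint16_t w1;
    j = (uint16_t)(j + 1u);   /* inc j       ; j itself is modified */
    w1 = j;                   /* mov j,W1 */
    v.j = j;
    v.k = w1;                 /* mov W1,k */
    return v;
}

/* Models the corrected sequence: inc j,WREG ; mov W0,k */
vars_t corrected(uint16_t j) {
    vars_t v;
    uint16_t w0 = (uint16_t)(j + 1u);   /* inc j,WREG ; j is untouched */
    v.j = j;
    v.k = w0;                           /* mov W0,k */
    return v;
}
```

Starting from j = 10, both versions leave k at 11, but only the corrected version leaves j at 10, as the C statement k = j + 1 requires.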
