Monday, 2 February 2015

Volatile Article



Introduction

volatile is a qualifier that is applied to a variable when it is declared. It tells the compiler that the value of the variable may change at any time, without any action being taken by the nearby code. The implications of this are quite serious, so let's look at when the qualifier is needed before turning to the syntax.

A variable should be declared volatile whenever its value could change unexpectedly. In practice, only three types of variables could change:

            * Memory-mapped peripheral registers
            * Global variables modified by an interrupt service routine
            * Global variables within a multi-threaded application

If we do not use the volatile qualifier, the following problems may arise:

            * Code that works fine until you turn optimization on
            * Code that works fine as long as interrupts are disabled
            * Flaky hardware drivers
            * Tasks that work fine in isolation, yet crash when another task is enabled

Example:

        static int var;
        void test(void)
        {
            var = 0;
            while (var != 255)
                continue;
        }

The above code sets the value in var to 0. It then starts to poll that value in a loop until the value of var becomes 255.

An optimizing compiler will notice that no other code can possibly change the value stored in 'var', and therefore assume that it will remain equal to 0 at all times. The compiler will then replace the function body with an infinite loop, similar to this:

        void test_opt(void)
        {
            var = 0;
            while (TRUE)
                continue;
        }

Declaration of Volatile variable
Place the keyword volatile before or after the data type in the variable declaration. The two forms below are equivalent:
volatile int var;
int volatile var;

Pointer to a volatile variable

volatile int * var;

int volatile * var;

Both statements declare 'var' as a pointer to a volatile integer.

Volatile pointers to non-volatile variables

int * volatile var; -> Here var is a volatile pointer to a non-volatile variable/object. Pointers of this type are very rarely used in embedded programming.

Volatile pointers to volatile variables

int volatile * volatile var;

If we qualify a struct or union with the volatile qualifier, then the entire contents of the struct/union become volatile. We can also apply the volatile qualifier to individual members of the struct/union.
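As a sketch of the struct case, the hypothetical register layout below shows both options: qualifying a single member, and qualifying a whole object. The names `uart_regs_t`, `status`, `config` and `uart_shadow` are assumptions made up for illustration and do not describe a real device.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block; the layout is illustrative only. */
typedef struct {
    volatile uint8_t status;  /* only this member is volatile by itself */
    uint8_t          config;  /* ordinary member */
} uart_regs_t;

/* Qualifying the object makes every member volatile, config included. */
static volatile uart_regs_t uart_shadow;

/* Each access through 'regs' is a real load or store. */
uint8_t read_status(volatile uart_regs_t *regs)
{
    assert(regs != 0);
    return regs->status;
}

void write_config(volatile uart_regs_t *regs, uint8_t value)
{
    assert(regs != 0);
    regs->config = value;
}
```

Passing `&uart_shadow` to either function is valid because the parameter type carries the same volatile qualification as the object.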

Usages of volatile qualifier

    * Peripheral registers

Most embedded systems contain a handful of peripheral devices. The values of the registers of these devices may change asynchronously. Let's say there is an 8-bit status register at address 0x1234 in a hypothetical device, and we need to poll this status register until it becomes non-zero. The following code snippet is an incorrect implementation of this requirement:

        UINT1 * ptr = (UINT1 *) 0x1234;
        // Wait for register to become non-zero.
        while (*ptr == 0);
        // Do something else.

No nearby code attempts to change the value in the register whose address (0x1234) is held in the 'ptr' pointer. With optimization turned on, a typical compiler may therefore generate something like this:

         mov    ptr, #0x1234  -> move address 0x1234 to ptr  
         mov    a, @ptr   -> move whatever stored at 'ptr' to accumulator
         loop   bz  loop  -> go into infinite loop

What the compiler assumes while optimizing is easy to see. It loads the value stored at address 0x1234 (held in 'ptr') into the accumulator once and never reloads it, because no nearby code appears to change the value at that address. The loop therefore keeps comparing that first value with zero. If the register reads zero on that first load, the loop runs forever: the code beyond this point never executes and the system hangs.

So what we essentially need to do here is to force the compiler to update the value stored at the address 0x1234 whenever it does the comparison operation. The volatile qualifier does the trick for us. Look at the code snippet below:

            UINT1 volatile * ptr = (UINT1 volatile *) 0x1234;

The assembly for the above code should be:

     mov     ptr, #0x1234 -> move the address 0x1234 to ptr
     loop    mov  a, @ptr -> move whatever stored @address to accumulator      
     bz      loop         -> branch to loop if accumulator is zero

So now at every loop the actual value stored at the address 0x1234(which is stored in the 'ptr') is fetched from the peripheral memory and checked whether it's zero or non-zero; as soon as the code finds the value to be non-zero the loop breaks. And that's what we wanted.

Subtler problems tend to arise with registers that have special properties. For instance, a lot of peripherals contain registers that are cleared simply by reading them. Extra (or fewer) reads than you are intending can cause quite unexpected results in these cases.
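A minimal sketch of a polling helper that keeps these points in mind: the register is read exactly once per iteration into a local variable (which matters for read-to-clear registers), and the loop gives up after a bounded number of reads. The function name `wait_for_status` and the `max_polls` parameter are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Poll a status register until it reads non-zero or until max_polls
 * reads have been made. Returns the last value read; 0 means timeout. */
uint8_t wait_for_status(volatile uint8_t *reg, unsigned long max_polls)
{
    uint8_t value = 0;

    assert(reg != 0);
    while (max_polls-- > 0) {
        value = *reg;          /* exactly one read per iteration */
        if (value != 0)
            break;
    }
    return value;
}
```

On the hypothetical device from the text the call would look like `wait_for_status((volatile uint8_t *)0x1234, 100000UL)`. The caller then works with the returned value instead of reading the register a second time.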

    * ISR(Interrupt Service Routine)

Sometimes the main code checks a global variable that is changed only by an interrupt service routine. Let's say a serial port interrupt tests each received character to see if it is an ETX character (presumably signifying the end of a message). If the character is an ETX, the serial port ISR sets a particular variable, say 'etx_rcvd'. Elsewhere, the main code checks 'etx_rcvd' in a loop and waits there until it becomes TRUE. Now let's check the code snippet below:

        int etx_rcvd = FALSE;
        void main()
        {
            ...
            while (!etx_rcvd)
            {
                // Wait
            }
            ...
        }
        interrupt void rx_isr(void)
        {
            ...
            if (ETX == rx_char)
            {
                etx_rcvd = TRUE;
            }
            ...
        }

This code may work with optimization turned off, but almost any optimizing compiler will turn it into something that was not intended. The compiler has no hint that etx_rcvd can be changed outside the visible flow of the code (here, within the serial port ISR). It therefore assumes that the expression !etx_rcvd is always true and replaces the loop with an infinite loop, so the system never exits the while loop. The code after the loop may even be removed by the optimizer as unreachable. Some compilers issue a warning and some do not; that depends entirely on the particular compiler.

The solution is to declare the variable etx_rcvd to be volatile. Then all of your problems (well, some of them anyway) will disappear.
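Real ISR syntax is compiler specific, so the sketch below uses a standard C signal handler as a stand-in for the serial port ISR on a hosted system. The handler name `rx_handler`, the function `wait_for_etx` and the choice of SIGINT are assumptions for illustration. The flag has type `volatile sig_atomic_t`, the type standard C provides for flags written from a signal handler.

```c
#include <assert.h>
#include <signal.h>

/* Flag shared between the handler and the main code. */
static volatile sig_atomic_t etx_rcvd = 0;

/* Stand-in for the serial port ISR. */
static void rx_handler(int signo)
{
    (void)signo;
    etx_rcvd = 1;
}

/* Installs the handler, triggers it, then waits on the flag.
 * Returns the final flag value, or -1 if the handler could not be set. */
int wait_for_etx(void)
{
    int rc;

    etx_rcvd = 0;
    if (signal(SIGINT, rx_handler) == SIG_ERR)
        return -1;

    rc = raise(SIGINT);        /* simulates the interrupt firing */
    assert(rc == 0);
    (void)rc;

    while (!etx_rcvd) {
        /* wait; volatile forces a fresh read on every pass */
    }

    signal(SIGINT, SIG_DFL);   /* restore the default action */
    return (int)etx_rcvd;
}
```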

    * Multi-threaded applications

Tasks/threads in a multi-threaded application often communicate through a shared memory location, i.e. a global variable. A compiler has no idea about preemptive scheduling or context switching, so this is the same kind of problem as an interrupt service routine changing a global variable. The embedded systems programmer has to take care that all shared global variables in a multi-threaded environment are declared volatile. For example:

        int cntr;
        void task1(void)
        {
            cntr = 0;
            while (cntr == 0)
            {
                sleep(1);
            }
            ...
        }
        void task2(void)
        {
            ...
            cntr++;
            sleep(10);
            ...
        }



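This code will likely fail once the compiler's optimizer is enabled. Declaring 'cntr' to be volatile makes the compiler re-read it on every pass through the loop. Keep in mind that volatile by itself does not make 'cntr++' atomic and does not order memory accesses between threads, so on multi-core targets the shared variable usually also needs a lock or an atomic type.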
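Where a C11 toolchain is available, one alternative to a plain volatile counter is an atomic type from <stdatomic.h>. This is a swapped-in technique, not something the example above uses, and the function names below are assumptions for illustration. The sketch shows only the shared-counter operations; creating the tasks is left to whatever scheduler or RTOS is in use.

```c
#include <stdatomic.h>

/* Counter shared between task1 and task2. */
static atomic_int cntr;

/* task1's side: start waiting from a known state. */
void reset_counter(void)
{
    atomic_store(&cntr, 0);
}

/* task2's side: an atomic read-modify-write, unlike a plain cntr++. */
void signal_task1(void)
{
    atomic_fetch_add(&cntr, 1);
}

/* task1's side: returns 1 once task2 has incremented the counter. */
int counter_changed(void)
{
    return atomic_load(&cntr) != 0;
}
```

task1 would call `reset_counter()` and then loop on `counter_changed()`, sleeping between checks as in the example above.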

Some compilers allow you to implicitly declare all variables as volatile. Resist this temptation, since it is essentially a substitute for thought. It also leads to potentially less efficient code.

    Note:

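    Can you have a constant volatile variable?

    Yes. const and volatile can be combined: a const volatile object cannot be written by the program, but its value may still change outside the program's control, as with a read-only hardware status register. You can also have a constant pointer to a volatile variable.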
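C permits the const and volatile qualifiers on the same object. A short sketch of how that is typically used for a read-only status register; the function name `read_ro_status` is an assumption for illustration.

```c
#include <stdint.h>

/* const:    the program may not write through this pointer.
 * volatile: every read is a real load, because hardware may change it. */
uint8_t read_ro_status(const volatile uint8_t *status_reg)
{
    return *status_reg;
    /* '*status_reg = 0;' here would be rejected by the compiler. */
}
```

A constant pointer to a volatile object is written with const after the asterisk, for example `volatile uint8_t * const reg = (volatile uint8_t *)0x1234;`, which fixes the address while leaving the pointed-to value volatile.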

One more example

Consider the following two blocks of a program, where the second block is the same as the first but with the volatile keyword. The assembly lines interleaved with the C code show the i386/AMD64 instructions compiled from it.

            {
                BOOL flag = TRUE;

                while( flag );
            repeat:
                jmp repeat
            }

            {
                volatile BOOL flag = TRUE;
                mov        dword ptr [flag], 1

                while( flag );
            repeat:
                mov        eax, dword ptr [flag]
                test       eax, eax
                jne        repeat
            }

In the first block the variable 'flag' can be cached by the compiler in a CPU register, because it does not have the volatile qualifier. Because nothing changes the value in that register, the program hangs in an infinite loop (all code below this block is unreachable, and a compiler such as Microsoft Visual C++ knows it). The loop was in fact optimized into an equivalent infinite loop that neither initializes nor fetches the variable. 'jmp label' means the same as 'goto label' in C code.
    The second block has the volatile qualifier and produces more complex assembler output: 'flag' is initialized with a 'mov' instruction; inside the loop the flag is fetched into the CPU register 'eax' with another 'mov', compared with zero by the 'test' instruction, and 'jne' ('goto if not equal') jumps back to the start of the loop if 'flag' was not zero. All of this happens because the volatile keyword prohibits the compiler from caching the variable's value in a CPU register, so it is fetched on every loop iteration. Such code is not necessarily an infinite loop, because another thread in the same program could change the value of 'flag', and the first thread would then exit the loop.
    It is important to understand that the volatile keyword is only a directive to the compiler and acts only at compile time. Interlocked operations are different: they make the compiler emit special machine instructions, so they act as directives to the hardware and take effect at run time.


Saturday, 17 January 2015

Callback Function

Function pointers are among the most powerful tools in C, but are a bit of a pain during the initial stages of learning. This article demonstrates the basics of function pointers, and how to use them to implement function callbacks in C. C++ takes a slightly different route for callbacks, which is another journey altogether.




A pointer is a special kind of variable that holds the address of another variable. The same concept applies to function pointers, except that instead of pointing to variables, they point to functions. If you declare an array, say, int a[10]; then the array name a will in most contexts (in an expression or passed as a function parameter) “decay” to a non-modifiable pointer to its first element (even though pointers and arrays are not equivalent while declaring/defining them, or when used as operands of the sizeof operator). In the same way, for int func(); the name func decays to a non-modifiable pointer to a function. You can think of func as a const pointer for the time being.
But can we declare a non-constant pointer to a function? Yes, we can — just like we declare a non-constant pointer to a variable:
int (*ptrFunc) ();
Here, ptrFunc is a pointer to a function that takes no arguments and returns an integer. DO NOT forget the parentheses around *ptrFunc; without them the compiler will treat ptrFunc as an ordinary function name, for a function that takes nothing and returns a pointer to an integer.
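The difference the parentheses make can be seen side by side in the small sketch below. All names other than ptrFunc are assumptions for illustration.

```c
static int value = 42;

/* Without the parentheses: a function that returns a pointer to int. */
int *returns_pointer(void)
{
    return &value;
}

/* An ordinary function that returns an int. */
int returns_int(void)
{
    return value;
}

int call_through_pointers(void)
{
    /* With the parentheses: a pointer to a function returning int. */
    int (*ptrFunc)(void) = returns_int;

    /* A pointer to a function returning a pointer to int. */
    int *(*ptrFunc2)(void) = returns_pointer;

    return ptrFunc() + *ptrFunc2();
}
```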



The call from main() to f() is bound statically.
There is no direct (static or dynamic) binding between main() and sim()/mul().
Those functions are called because main() passes them as function arguments.

------------------------------------------------------------------------------------------------------------
Let’s try some code. Check out the following simple program:
#include<stdio.h>
/* function prototype */
int func(int, int);
int main(void)
{
    int result;
    /* calling a function named func */
    result = func(10,20);       
    printf("result = %d\n",result);
    return 0;
}
/* func definition goes here */
int func(int x, int y)             
{
    return x+y;
}
As expected, when we compile it with gcc -g -o example1 example1.c and invoke it with ./example1, the output is as follows:
result = 30
The above program calls func() the simple way. Let’s modify the program to call using a pointer to a function. Here’s the changed main() function:
#include<stdio.h>
int func(int, int);
int main(void)
{
    int result1,result2;
    /* declaring a pointer to a function which takes
       two int arguments and returns an integer as result */
    int (*ptrFunc)(int,int);
    /* assigning ptrFunc to func's address */                     
    ptrFunc=func;
    /* calling func() through explicit dereference */
    result1 = (*ptrFunc)(10,20);
    /* calling func() through implicit dereference */         
    result2 = ptrFunc(10,20);               
    printf("result1 = %d result2 = %d\n",result1,result2);
    return 0;
}
int func(int x, int y)
{
    return x+y;
}
The output has no surprises:
result1 = 30 result2 = 30

A simple callback function

At this stage, we have enough knowledge to deal with function callbacks. According to Wikipedia, “In computer programming, a callback is a reference to executable code, or a piece of executable code, that is passed as an argument to other code. This allows a lower-level software layer to call a subroutine (or function) defined in a higher-level layer.”
Let’s try one simple program to demonstrate this. The complete program has three files: callback.c, reg_callback.h and reg_callback.c.
/* callback.c */
#include<stdio.h>
#include"reg_callback.h"
/* callback function definition goes here */
void my_callback(void)
{
    printf("inside my_callback\n");
}
int main(void)
{
    /* initialize function pointer to
    my_callback */
    callback ptr_my_callback=my_callback;                           
    printf("This is a program demonstrating function callback\n");
    /* register our callback function */
    register_callback(ptr_my_callback);                             
    printf("back inside main program\n");
    return 0;
}
/* reg_callback.h */
typedef void (*callback)(void);
void register_callback(callback ptr_reg_callback);
/* reg_callback.c */
#include<stdio.h>
#include"reg_callback.h"
/* registration goes here */
void register_callback(callback ptr_reg_callback)
{
    printf("inside register_callback\n");
    /* calling our callback function my_callback */
    (*ptr_reg_callback)();                                  
}
Compile, link and run the program with gcc -Wall -o callback callback.c reg_callback.c and ./callback:
This is a program demonstrating function callback
inside register_callback
inside my_callback
back inside main program
The code needs a little explanation. Assume that we have to call a callback function that does some useful work (error handling, last-minute clean-up before exiting, etc.), after an event occurs in another part of the program. The first step is to register the callback function, which is just passing a function pointer as an argument to some other function (e.g., register_callback) where the callback function needs to be called.
We could have written the above code in a single file, but register_callback() is defined in a separate file to simulate real-life cases, where the callback function is in the top layer and the function that invokes it is in a different, lower layer. The program flow is shown in Figure 1.
Figure 1: Program flow
The higher layer function calls a lower layer function as a normal call and the callback mechanism allows the lower layer function to call the higher layer function through a pointer to a callback function.
This is exactly what the Wikipedia definition states.
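In the program above, register_callback() invokes the callback immediately. In many real designs the lower layer only stores the pointer at registration time and calls it later, when the event actually happens. A minimal sketch of that variation follows. The names `register_callback_deferred`, `fire_event` and `count_calls` are assumptions for illustration and are not part of the program above.

```c
#include <stddef.h>

typedef void (*callback)(void);

/* Lower layer: remembers the callback instead of calling it at once. */
static callback registered = NULL;

void register_callback_deferred(callback cb)
{
    registered = cb;
}

/* Lower layer: called when the event occurs.
 * Returns 1 if a callback was invoked, 0 if none is registered. */
int fire_event(void)
{
    if (registered == NULL)
        return 0;
    registered();
    return 1;
}

/* Higher layer: an example callback that counts its invocations. */
static int calls = 0;

void count_calls(void)
{
    calls++;
}
```

The higher layer calls `register_callback_deferred(count_calls)` once, and every later `fire_event()` in the lower layer reaches back up into `count_calls()`.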

Use of callback functions

One use of callback mechanisms can be seen here:
/* This code catches the alarm signal generated from the kernel
   asynchronously */
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
struct sigaction act;
/* signal handler definition goes here */
void sig_handler(int signo, siginfo_t *si, void *ucontext)
{
   printf("Got alarm signal %d\n",signo);
   /* do the required stuff here */
}
int main(void)
{
    act.sa_sigaction = sig_handler;
    act.sa_flags = SA_SIGINFO;
    /* register signal handler */
    sigaction(SIGALRM, &act, NULL);  
    /* set the alarm for 10 sec */       
    alarm(10);   
    /* wait for any signal from kernel */                                        
    pause();  
    /* after signal handler execution */                                             
    printf("back to main\n");                     
    return 0;
}

Signals are a kind of software interrupt delivered by the kernel, and are very useful for handling asynchronous events. A signal-handling function is registered with the kernel, and can be invoked asynchronously from the rest of the program when the signal is delivered to the user process. Figure 2 represents this flow.
Figure 2: Kernel callback
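One caveat with the listing above is that printf() is not on the POSIX list of async-signal-safe functions, so calling it from a handler is risky in production code. A common pattern, sketched here under that assumption, is for the handler to set only a volatile sig_atomic_t flag and leave the real work to the main code. The names `on_alarm` and `alarm_flag_demo` are hypothetical, and raise() is used instead of alarm() so that the sketch does not have to wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, read by the main code. */
static volatile sig_atomic_t got_alarm = 0;

static void on_alarm(int signo)
{
    (void)signo;
    got_alarm = 1;             /* set a flag only; no printf() here */
}

/* Registers the handler, sends SIGALRM to this process and returns
 * the flag value seen afterwards. */
int alarm_flag_demo(void)
{
    struct sigaction act;
    int rc;

    memset(&act, 0, sizeof act);
    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);

    rc = sigaction(SIGALRM, &act, NULL);
    assert(rc == 0);
    (void)rc;

    got_alarm = 0;
    raise(SIGALRM);            /* handler runs before raise() returns */
    return (int)got_alarm;
}
```

The main code can then test the flag and print its message outside the handler.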
Callback functions can also be used to create a library that will be called from an upper-layer program, and in turn, the library will call user-defined code on the occurrence of some event. The following source code (insertion_main.c, insertion_sort.c and insertion_sort.h) shows this mechanism used to implement a trivial insertion sort library. The flexibility lets users call any comparison function they want.
/* insertion_sort.h */
typedef int (*callback)(int, int);
void insertion_sort(int *array, int n, callback comparison);
/* insertion_main.c */
#include<stdio.h>
#include<stdlib.h>
#include"insertion_sort.h"
int ascending(int a, int b)
{
    return a > b;
}
int descending(int a, int b)
{
    return a < b;
}
int even_first(int a, int b)
{
    /* code goes here */
}
int odd_first(int a, int b)
{
    /* code goes here */
}
int main(void)
{
    int i;
    int choice;
    int array[10] = {22,66,55,11,99,33,44,77,88,0};
    printf("ascending 1: descending 2: even_first 3: odd_first 4: quit 5\n");
    printf("enter your choice = ");
    scanf("%d",&choice);
    switch(choice)
    {
        case 1:
            insertion_sort(array,10, ascending);
            break;
        case 2:
            insertion_sort(array,10, descending);
            break;
        case 3:
            insertion_sort(array,10, even_first);
            break;
        case 4:
            insertion_sort(array,10, odd_first);
            break;
        case 5:
            exit(0);
        default:
            printf("no such option\n");
    }
    printf("after insertion_sort\n");
    for(i=0;i<10;i++)
        printf("%d\t", array[i]);
    printf("\n");
    return 0;
}
/* insertion_sort.c */
#include"insertion_sort.h"
void insertion_sort(int *array, int n, callback comparison)
{
    int i, j, key;
    for(j=1; j<=n-1;j++)
    {
        key=array[j];
        i=j-1;
        while(i >=0 && comparison(array[i], key))
        {
            array[i+1]=array[i];
            i=i-1;
        }
        array[i+1]=key;
    }
}
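The even_first and odd_first comparators are left as placeholders in insertion_main.c. One possible way to fill them in is sketched below. It assumes that "even first" simply means all even values ahead of all odd values, with the original relative order otherwise preserved; that reading is an assumption, not something the listing specifies. The sort routine is repeated so that the sketch is self-contained.

```c
typedef int (*callback)(int, int);

/* Same algorithm as insertion_sort.c: an element is shifted right
 * while comparison(element, key) is non-zero. */
void insertion_sort(int *array, int n, callback comparison)
{
    int i, j, key;

    for (j = 1; j <= n - 1; j++) {
        key = array[j];
        i = j - 1;
        while (i >= 0 && comparison(array[i], key)) {
            array[i + 1] = array[i];
            i = i - 1;
        }
        array[i + 1] = key;
    }
}

/* Non-zero when a must move behind b: a is odd and b is even. */
int even_first(int a, int b)
{
    return (a % 2 != 0) && (b % 2 == 0);
}

/* Non-zero when a must move behind b: a is even and b is odd. */
int odd_first(int a, int b)
{
    return (a % 2 == 0) && (b % 2 != 0);
}
```

With these definitions, sorting {1, 2, 3, 4} with even_first gives 2 4 1 3, and sorting it with odd_first gives 1 3 2 4.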