The Migration on Request tool must be maintained in the future for as yet unknown systems and architectures. Important choices have to be made about how the program should be implemented in order to ensure that it remains portable.
C is a fairly small and simple language. It is well established and support exists on practically every platform. It seems likely that such support would be continued on future platforms due to the popularity of the language and the amount of legacy code that already exists.
Not all compilers are the same. One C compiler may encounter problems with source code that another compiler handles quite happily. This is partly because compiler writers add extra features or are lenient when checking programs against the language standards. The language is also still evolving - C99 allows variables to be declared anywhere in a block, whereas earlier standards require variables to be declared at the start of a block.
If there is a need to convert a C program to another language, this would be a fairly easy job were it not for a few parts of the language which are now accepted as bad practice and do not appear in modern language design. A restricted version, named C--, has been suggested by David Holdsworth of the CAMiLEON group. C-- removes the "unhygienic" aspects of C.
The vector graphic Migration on Request tool has been written in accordance with the suggestions for C--, along with some extra restrictions. A summary of the rules and restrictions follows - those wishing to develop the Migration on Request tool further should read these.
Do not use macros.
The C macro preprocessor is widely regarded as a route to confusing code, although it originally allowed efficient implementation of multiple variants from a single source file. It is now accepted that ordinary if tests on values known at compile time enable modern optimising compilers to achieve the same level of efficiency - which, in any case, is not our main concern.
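As an illustration of the point above, the following minimal sketch selects a program variant with an ordinary if test on a compile-time constant instead of #if or #define. The names OUTPUT_SCALE_ENABLED, OUTPUT_SCALE_FACTOR and scale_coordinate are hypothetical and are not taken from the Migration on Request tool.

```c
/* Hypothetical configuration values, fixed at compile time.
   An enum constant stands in for what might otherwise be a #define. */
enum { OUTPUT_SCALE_ENABLED = 1, OUTPUT_SCALE_FACTOR = 4 };

/* Scales a coordinate only when the scaling variant is enabled.
   The condition is a constant, so an optimising compiler is free to
   discard the branch that can never run. */
int scale_coordinate(int value)
{
    int result;

    if (OUTPUT_SCALE_ENABLED != 0) {
        result = value * OUTPUT_SCALE_FACTOR;
    } else {
        result = value;
    }
    return result;
}
```

Both branches remain visible to the compiler and are type-checked, which is not the case for code excluded by the preprocessor.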
Do not use unions.
The particular style of the union does not survive to other languages. Object orientation techniques render the idea obsolete. In some respects unions have their origin in FORTRAN's EQUIVALENCE statement, which was a notorious cause of portability problems in the past.
Unions are often used to save memory. This is not a main concern.
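One possible way to do without a union is sketched below: a plain struct carries a tag plus the fields of every variant side by side. This costs some memory, which is accepted here. The shape record and the function name are hypothetical examples only.

```c
/* Hypothetical record that might otherwise have been written as a union. */
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECTANGLE };

struct shape {
    enum shape_kind kind;
    int radius;   /* meaningful only when kind == SHAPE_CIRCLE */
    int width;    /* meaningful only when kind == SHAPE_RECTANGLE */
    int height;   /* meaningful only when kind == SHAPE_RECTANGLE */
};

/* Returns the horizontal extent of the shape, chosen by its tag. */
int shape_bounding_width(const struct shape *s)
{
    int result;

    if (s->kind == SHAPE_CIRCLE) {
        result = 2 * s->radius;
    } else {
        result = s->width;
    }
    return result;
}
```

In an object-oriented target language the tag would typically become a class hierarchy, with each variant holding only its own fields.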
Do not use address arithmetic.
Many typical C programs are filled with address arithmetic. This is partly historical, because the array facilities were not present in the earliest versions of C. C-- should force the use of array subscripting.
Also see below - memory allocation.
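A small sketch of the subscripting style follows. The function name sum_values is hypothetical. The commented-out line shows the pointer-stepping form that this rule is meant to exclude.

```c
/* Sums the first count elements using subscripts.
   The address-arithmetic form to avoid would be:
       total = total + *values++;                    */
long sum_values(const int values[], int count)
{
    long total = 0;
    int i;

    for (i = 0; i < count; i = i + 1) {
        total = total + values[i];
    }
    return total;
}
```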
The condition in if, while statements, etc, must be a boolean relational expression. The use of a relational expression gives correct code that delivers a boolean test when translated into other languages.
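The sketch below shows the rule applied to two common C idioms: testing a pointer and testing a character. The function text_length is a hypothetical example, and the shorter forms in the comments are the ones to avoid.

```c
#include <stddef.h>

/* Counts the characters before the terminating '\0'.
   Returns 0 when no text is supplied. */
int text_length(const char text[])
{
    int length = 0;

    /* Avoid: if (text)             */
    if (text != NULL) {
        /* Avoid: while (text[length]) */
        while (text[length] != '\0') {
            length = length + 1;
        }
    }
    return length;
}
```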
The result of an assignment must be voided. This means that code such as x = y = 2; is not allowed. Although the facility does carry forward to Java it is absent from Pascal and many other languages.
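As a brief illustration, a chained assignment can be split into separate statements so that the value of each assignment is discarded. The struct and function names below are hypothetical.

```c
struct extent {
    int width;
    int height;
};

/* Sets both dimensions to the same value.
   Avoid: e->width = e->height = side;   */
void extent_make_square(struct extent *e, int side)
{
    e->height = side;
    e->width = side;
}
```

The same reasoning applies to assignments hidden inside conditions, which should be moved to a statement of their own before the test.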
Variadic functions are not allowed. There are some common library routines that are variadic - e.g. printf - but the facility does not carry forward to many other languages.
The use of essential routines such as printf is permitted within reason. It is recommended to wrap these calls in separate functions. This minimises the occurrences of variadic functions and should allow manual modification to take place when moving the program to another language. All calls to variadic functions should be well documented.
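One way to follow this recommendation is sketched below: a single fixed-argument wrapper holds the printf call, and the rest of the program calls only the wrapper. The name print_int_line and its output format are assumptions made for the example.

```c
#include <stdio.h>

/* VARIADIC CALL: this is the only place that calls printf for
   labelled integer output. When porting, replace the body with the
   target language's own output routine.
   Returns the number of characters written, as reported by printf. */
int print_int_line(const char label[], int value)
{
    return printf("%s: %d\n", label, value);
}
```

A call such as print_int_line("width", 42) then contains nothing that depends on variadic arguments at the call site.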
The sizes of the various types are not fixed by the C standards, which only guarantee minimum ranges. Although most current C implementations use 32 bit integers, this cannot be assumed. Some languages such as Java are more explicit about the size of each type. Likewise, a short int cannot be assumed to be exactly 16 bits (or half the size of an int).
There is also the issue of byte order: whether numbers are stored in little-endian format (least-significant byte first) or big-endian format (most-significant byte first).
These problems are not easily resolved. The approach taken when developing the vector graphic Migration on Request tool is to assume that ints are at least 32 bits long, and that short ints are at least 16 bits long.
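The stated assumptions can be checked at start-up, and byte order can be kept out of the picture by assembling values from individual bytes. The following is only a sketch of that idea, not the tool's actual code, and both function names are hypothetical. INT_MAX and SHRT_MAX are constants supplied by the standard header limits.h.

```c
#include <limits.h>

/* Returns 1 when int holds at least 32 bits and short at least 16 bits,
   otherwise 0. */
int type_assumptions_hold(void)
{
    int ok = 1;

    if (INT_MAX < 2147483647L) {
        ok = 0;
    }
    if (SHRT_MAX < 32767) {
        ok = 0;
    }
    return ok;
}

/* Builds a 32 bit value from four bytes stored most-significant first.
   The result does not depend on the byte order of the host machine. */
unsigned long read_u32_big_endian(const unsigned char bytes[])
{
    unsigned long value = 0;
    int i;

    for (i = 0; i < 4; i = i + 1) {
        value = (value * 256UL) + bytes[i];
    }
    return value;
}
```

The return type is unsigned long because the C standards guarantee that it can hold at least 32 bits.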
The C language does not have very advanced memory management functions. The malloc function is used to claim a number of bytes of memory. The sizeof operator can be used to calculate the amount of memory required. Arrays can be allocated by multiplying the element size passed to malloc by the number of items, or by passing the number of items and the element size to calloc.
These basic facilities can be dangerous as it is very easy to allocate the wrong amount of memory. It is also possible to access array elements that lie outside the claimed memory, which can lead to unexpected results. Other languages (e.g. Java) may prevent this from happening, perhaps by throwing an exception. Great care must be taken to avoid these sorts of problems.
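A cautious allocation style might look like the sketch below. It passes the item count and sizeof to calloc so that the size is calculated in one place, and it checks every index before use. The function names and the fallback convention are assumptions made for the example.

```c
#include <stdlib.h>

/* Allocates count zero-initialised ints.
   Returns NULL when count is not positive or when memory is unavailable. */
int *allocate_counts(int count)
{
    int *counts = NULL;

    if (count > 0) {
        counts = calloc((size_t)count, sizeof *counts);
    }
    return counts;
}

/* Returns counts[index], or fallback when the index is out of range.
   C performs no such check itself, so it is made explicitly here. */
int read_count(const int counts[], int count, int index, int fallback)
{
    int result = fallback;

    if (counts != NULL && index >= 0 && index < count) {
        result = counts[index];
    }
    return result;
}
```

The caller remains responsible for passing the returned pointer to free when the memory is no longer needed.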
In Java every object variable can be thought of as a pointer, and all objects are allocated memory dynamically using the new operator. This doesn't apply to the basic data types such as int, which cannot be pointed to and instead have to be put in a 'wrapper class' such as Integer.
For these reasons, all structs should be created dynamically using malloc. Pointers to basic data types such as int should be avoided.
Because in C there is no distinction between a pointer to an individual object and a pointer to the start of an array, it is recommended that arrays are not used - linked lists (or similar concepts) should be used instead. This also reduces the temptation to use address arithmetic.
C does not use garbage collection, so always remember to free any claimed memory when no longer needed.
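The recommendations in this section can be brought together in a short sketch: each struct is created with malloc, items are held in a linked list rather than an array, and every node is released with free. The point_node type and the three functions are hypothetical and are not taken from the tool's source.

```c
#include <stdlib.h>

struct point_node {
    int x;
    int y;
    struct point_node *next;
};

/* Adds a new point at the front of the list and returns the new head.
   Returns NULL if memory cannot be claimed; the old list is unchanged. */
struct point_node *point_list_push(struct point_node *head, int x, int y)
{
    struct point_node *node = malloc(sizeof *node);

    if (node != NULL) {
        node->x = x;
        node->y = y;
        node->next = head;
    }
    return node;
}

/* Counts the nodes by following the next links. */
int point_list_length(const struct point_node *head)
{
    int length = 0;

    while (head != NULL) {
        length = length + 1;
        head = head->next;
    }
    return length;
}

/* Frees every node. The next link is saved before each node is freed. */
void point_list_free(struct point_node *head)
{
    struct point_node *next;

    while (head != NULL) {
        next = head->next;
        free(head);
        head = next;
    }
}
```

Because point_list_push returns NULL on failure, a caller should keep its old head pointer until the result has been checked.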