 www.icosaedro.it 

 M2 - Reference

M2 version: 1.5-20110205



M2 is a procedural, high-level programming language with a garbage collector. Data structures are allocated automatically and arrays are expanded dynamically. The syntax of M2 is similar to that of Modula-2: all the keywords are written in uppercase letters and all the identifiers are case-sensitive, so file, FILE and File are different names.


Hello, world!

An M2 program is a text file with the extension .mod (.def and .imp are explained later). For example, open your preferred text editor and type in (or copy and paste) this source:

MODULE hello
IMPORT m2
BEGIN
    print("Hello, world!\n")
END

then save this source into the file hello.mod, and compile and execute it with the commands

$ m2 hello.mod
$ ./hello
Hello, world!

m2 is actually a script that first calls the m2c M2-to-C cross compiler, which generates the C source hello.c, and then calls the C compiler to generate the executable program hello (hello.exe on Windows). If all the compilation stages succeed, the intermediate source hello.c is deleted.

Check m2 -h for other useful options of the front-end script.

Comments

All the text following a # character up to the end of the line is ignored: this is a single line comment.

All the text enclosed between (* and *) is ignored: this is a multi-line comment. Multi-line comments can be nested.

The BOOLEAN data type

The type BOOLEAN has only the values FALSE and TRUE. Boolean expressions are built with the logical operators AND, OR and NOT. The following tables summarize the results of these operators:
a     b      |  a OR b     a AND b             a      |  NOT a
-------------+--------------------             -------+-------
FALSE FALSE  |  FALSE      FALSE               FALSE  |  TRUE
FALSE TRUE   |  TRUE       FALSE               TRUE   |  FALSE
TRUE  FALSE  |  TRUE       FALSE
TRUE  TRUE   |  TRUE       TRUE
The NOT operator has the highest priority, followed by AND and then OR. Parentheses can be used to alter the order of evaluation:
NOT TRUE AND FALSE OR TRUE    gives   TRUE
NOT TRUE AND (FALSE OR TRUE)  gives   FALSE
Boolean expressions are evaluated from left to right, and the evaluation of a sub-expression stops at the first partial value that determines the result. That means that the first TRUE term of the expression a OR b OR c makes the whole expression TRUE and the remaining terms are not evaluated; the first FALSE factor of the expression a AND b AND c makes the whole expression FALSE and the remaining factors are not evaluated.
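
A common use of this rule is to guard an operation that would otherwise be a fatal error. The following is only a sketch: the module name guard and the function starts_with_dash() are made up for this example. It relies on the substring operator s[0], described later in this reference, which cannot be applied to a NIL string.

```m2
MODULE guard
IMPORT m2

FUNCTION starts_with_dash(s: STRING): BOOLEAN
BEGIN
    # s[0] is evaluated only when s is neither NIL nor empty
    RETURN (s <> NIL) AND (s <> "") AND (s[0] = "-")
END

BEGIN
    IF starts_with_dash(NIL) THEN
        print("yes\n")
    ELSE
        print("no\n")    # this branch is taken: the first factor is FALSE
    END
END
```

Since the first factor is already FALSE for a NIL argument, the remaining factors, including s[0], are never evaluated.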

The INTEGER data type

An INTEGER number is a two's-complement integer number, i.e. the int data type of the underlying C compiler, typically 32 bits long. There are only two levels of priority for the operators involving integer numbers:
first:  * DIV MOD & << >>
next:   + - | ^

i DIV j gives the quotient, i MOD j gives the remainder of the division.

i << n is a bitwise left shift of n bits, i >> n is a bitwise right shift of n bits.

i|j gives the bitwise OR, i^j gives the bitwise XOR, i&j gives the bitwise AND.

Integer numbers can be compared through the operators < <= = >= > <>, which give a BOOLEAN result.
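
Note that the two priority levels differ from those of C: the shift operators bind as tightly as the multiplication, and & binds more tightly than |. The short program below is a sketch that illustrates this; the module name intops is arbitrary, and the values in the comments follow from the priority rules given above.

```m2
MODULE intops
IMPORT m2
BEGIN
    print("17 DIV 5 = " + (17 DIV 5) + "\n")            # 3
    print("17 MOD 5 = " + (17 MOD 5) + "\n")            # 2
    print("1 + 2 << 3 = " + (1 + 2 << 3) + "\n")        # 1 + 16 = 17
    print("(1 + 2) << 3 = " + ((1 + 2) << 3) + "\n")    # 3 << 3 = 24
    print("6 & 3 | 8 = " + (6 & 3 | 8) + "\n")          # 2 | 8 = 10
END
```

When in doubt, parentheses make the intended grouping explicit.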

The REAL data type

A REAL number is a floating-point number, i.e. the double data type of the underlying C compiler, typically 64 bits long. A literal REAL number requires at least the decimal point or the scale part:
3.141592
1.6E-27
-1.234
The allowed operators are (in order of priority):
first:  * /
next:   + -
REAL numbers can be compared through the operators < <= = >= > <>, which give a BOOLEAN result.

The STRING data type

A STRING is a sequence of zero or more bytes, dynamically allocated. A STRING can have the special value NIL, which means that the string is not allocated at all. Only the printable ASCII characters are allowed in literal strings, with the exception of the " (double quote) and the \ (back-slash), which have a special meaning. Examples of literal strings:
"abcdef"
"A line.\nAnother line."
"\x00"
Two or more strings can be concatenated through the + operator. If all the strings concatenated are NIL, the result will be NIL. INTEGER and REAL numbers can be concatenated to a string, but they cannot appear as the first term. The conversion from number to string is automatically done by the functions m2runtime.itos() and m2runtime.rtos(). Examples:
"abc" + "def" + NIL + ""   gives   "abcdef"
"Pi = " + 3.141592         gives   "Pi = 3.14159"
"Record no. " + (1+2)      gives   "Record no. 3"
Strings can be compared through the operators < <= = >= > <>, which give a BOOLEAN result. The NIL string is equal to NIL and less than any other string. The empty string "" is greater than NIL and less than any string containing one or more characters:
NIL < "" < "abc"
Two non-NIL, non-empty strings are compared byte-by-byte, from left to right. Examples:
NIL < ""            gives TRUE
NIL = NIL           gives TRUE
NIL = ""            gives FALSE
"a" < "b"           gives TRUE
"abc" < "abcd"      gives TRUE

Only the printable ASCII characters are allowed in strings. Any other byte value can be entered using the escape sequence \xHH where HH is the hexadecimal value of the byte. Some commonly used special escape sequences are allowed:


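Escape   Meaning
----------------------------------------
\xHH     byte with hexadecimal code HH
\\       \ (back-slash)
\"       " (double quote)
\a       \x07 (ASCII BEL)
\b       \x08 (ASCII BS)
\n       \x0A (ASCII LF)
\r       \x0D (ASCII CR)
\t       \x09 (ASCII HT)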

The content of a string cannot be changed, but a variable of the type STRING can be re-assigned.

The substring operator [] can be used to explore a string and return substrings. s[i,j] gives a string (j-i) bytes long containing the characters ranging from s[i] up to s[j-1]. s[i] is a short form of s[i,i+1] and gives a one-byte string containing the byte at offset i. For example:

s = "hello"
s[0,2] gives "he"
s[0]   gives "h"
s[2,2] gives "" (empty string)
s[0,length(s)]  gives "hello"

The substring operator cannot be applied to a NIL string. Use the m2runtime.length(s) function to get the current length of a string. To summarize, the range [i,j] is applicable to a string only if the string is not NIL and 0<=i<=j<=length(s); otherwise it is a fatal run-time error. For the short form s[i], which stands for s[i,i+1], this means 0<=i<length(s).
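
As an illustration of these rules, the sketch below scans a string one byte at a time with s[i]. The module name countbyte and the function count_byte() are invented for this example; it assumes, as in the other examples of this reference, that IMPORT m2 makes print() and length() available without qualification.

```m2
MODULE countbyte
IMPORT m2

FUNCTION count_byte(s: STRING, c: STRING): INTEGER
(* Returns how many bytes of s are equal to the one-byte string c. *)
VAR i, n: INTEGER
BEGIN
    IF s = NIL THEN
        RETURN 0             # s[i] is a fatal error on a NIL string
    END
    FOR i=0 TO length(s)-1 DO
        IF s[i] = c THEN
            n = n + 1        # n starts from its default value 0
        END
    END
    RETURN n
END

BEGIN
    print("found " + count_byte("hello", "l") + "\n")   # prints "found 2"
END
```

The index i never leaves the range 0 ... length(s)-1, so every s[i] is valid; for an empty string the FOR loop is not executed at all.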

The standard module str provides a set of basic string handling functions.

The ARRAY data type

An ARRAY is an ordered list of variables, all of the same type, that can be selected through an integer index ranging from 0 up to the last element assigned. The general syntax of this type is as follows:

ARRAY OF T

where T is the type of the elements. A variable of the type ARRAY has the initial value NIL, i.e. it is not allocated. To allocate an array, simply assign an element to it with any non-negative integer index; if this index is greater than 0, the elements not assigned take the initial value of a variable of their type (FALSE for BOOLEAN, zero for numbers, NIL for strings and for RECORD and ARRAY). For example

VAR a: ARRAY OF STRING
...
a[123] = "hello"

In this example the elements from a[0] up to a[122] are set to NIL. Arrays are expanded as necessary to hold the new element. The function m2runtime.count(a) gives the number of elements actually stored. For example, to print a list of strings:

...
VAR names: ARRAY OF STRING
    i: INTEGER
BEGIN
    FOR i=0 TO count(names)-1 DO
        print("element no. " + i + " = " + names[i] + "\n")
    END
END

The ARRAY constructor can be used to assign the whole array; the elements specified occupy the offsets 0, 1, 2, etc.:

a = {123, 456, 789}

To assign a new element to an array just after the last one:

a[ count(a) ] = 456

Adding an element to an array is so frequent that M2 allows the index to be omitted, giving the following short form:

a[] = 456
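
The short form is handy to build an array one element at a time. The following sketch (the module name squares is arbitrary) first allocates an empty array with the empty constructor {} and then appends the squares of the numbers from 0 to 4; it assumes that IMPORT m2 makes print() and count() available without qualification, as in the other examples.

```m2
MODULE squares
IMPORT m2

VAR
    sq: ARRAY OF INTEGER
    i: INTEGER
BEGIN
    sq = {}                  # allocated, with zero elements
    FOR i=0 TO 4 DO
        sq[] = i * i         # same as sq[count(sq)] = i * i
    END
    print("count = " + count(sq) + "\n")           # prints "count = 5"
    print("last = " + sq[count(sq)-1] + "\n")      # prints "last = 16"
END
```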

A matrix is an ARRAY OF ARRAY. This example sets a 3x3 identity matrix with 1.0 in all the diagonal elements; remember that the elements not assigned take the value 0.0:

VAR m: ARRAY OF ARRAY OF REAL
    i: INTEGER
BEGIN
    FOR i=0 TO 2 DO
        m[i][2] = 0.0 # ensures that row i is 3 elements wide
        m[i][i] = 1.0
    END

    (* or simply:

        m = {
                {1.0, 0.0, 0.0},
                {0.0, 1.0, 0.0},
                {0.0, 0.0, 1.0}
            }
    *)

END

ARRAYs can be compared through the operators = and <>. What is actually compared are the addresses where these ARRAYs are allocated in memory, NOT their contents. An ARRAY can also be compared with the special value NIL. Note that an array can be allocated with zero elements in it: this is done with an empty array constructor, for example a={}. The following chunk of code prints the current status of the array a:

IF a = NIL THEN
    print("not allocated")
ELSIF count(a) = 0 THEN
    print("allocated, but empty")
ELSE
    print("allocated with one or more elements")
END

The RECORD data type

A RECORD is a list of variables (aka "fields"). Every field has a name that can be used as selector of the field. The general syntax of a RECORD type is as follows:

RECORD
    fieldname1: type1
    fieldname2: type2
    fieldname3, fieldname4, fieldname5: type3
    ...
END

where fieldname1,... are the names of the fields, and type1,... are their types. A variable of the type RECORD is initially unallocated and its value is NIL. To allocate a RECORD simply assign one of its fields; the fields not assigned take the initial value of their type (FALSE for BOOLEAN, zero for numbers, NIL for strings and for RECORD and ARRAY). For example

...
VAR point: RECORD  x, y: REAL  END
BEGIN
    point[x] = 1.0  # here the RECORD is allocated
    point[y] = 2.0
END

Note that the selector [fieldname] is the name of the field. The number of fields of a RECORD is fixed and cannot be changed at run-time. The fields can be of any valid type, including ARRAY and RECORD. If you need a RECORD containing a variable amount of data, use a field of the type ARRAY.

The RECORD constructor can be used to assign all the fields of a RECORD; note that all the fields must be specified, in the order of their declaration:

point = {1.0, 0.0}

Array constructors and record constructors can be combined to build complex structures.
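
For instance, an array of records can be written as an array constructor whose elements are record constructors. The sketch below is only an illustration: the module name polygon and the variable triangle are made up, and it assumes that IMPORT m2 makes print() and count() available without qualification.

```m2
MODULE polygon
IMPORT m2

TYPE
    Point = RECORD x, y: INTEGER END

VAR
    triangle: ARRAY OF Point
BEGIN
    # An array constructor whose elements are record constructors
    triangle = { {0, 0}, {4, 0}, {0, 3} }
    print("vertices: " + count(triangle) + "\n")    # prints "vertices: 3"
    print("second x: " + triangle[1][x] + "\n")     # prints "second x: 4"
END
```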

RECORD types typically have a name declared inside a TYPE section. For example, a typical single-linked list of strings can be declared in this way:

TYPE
    List = RECORD
        next: List
        key: STRING
    END

The following program first adds some strings to a list, then prints all the strings of this list:

MODULE lists
IMPORT m2

TYPE
    List = RECORD
        next: List
        key: STRING
    END

FUNCTION add_elem(VAR l: List, s: STRING)
VAR m: List
BEGIN
    m[key] = s
    m[next] = l
    l = m

    (* or simply:

        l = {l, s}

    *)
END

FUNCTION print_elems(l: List)
BEGIN
    WHILE l <> NIL DO
        print(l[key] + "\n")
        l = l[next]
    END
END

VAR list: List
BEGIN
    add_elem(list, "Sunday")
    add_elem(list, "Monday")
    (* ...and so on *)
    print_elems(list)
END

RECORDs can be compared through the operators = and <>. What will be actually compared are the addresses where these RECORDs are allocated in memory. The special value NIL can be compared with a RECORD. For example:

    VAR p,q: Point
    ...
    IF p = NIL THEN
        print("not allocated")
    END
    IF p = q THEN
        print("same point")
    END

Type compatibility

A variable of a given type is assignment-compatible only with a variable of the same type. For dynamically allocated data types (STRING, ARRAY, RECORD) the assignment simply copies the pointer to the allocated data. Two ARRAYs are compatible if and only if their elements are compatible. Two RECORDs are compatible if and only if they have the same number of fields and their fields are compatible, in order. Data types with different names can still be assignment-compatible.
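
The following sketch shows both rules at work; the module name compat and the types Point and Size are invented for this example. The two RECORD types have different names and different field names, but the same structure, so the assignment is allowed; since only the pointer is copied, both variables then refer to the same record.

```m2
MODULE compat
IMPORT m2

TYPE
    Point = RECORD x, y: INTEGER END
    Size = RECORD width, height: INTEGER END

VAR
    p: Point
    s: Size
BEGIN
    s = {640, 480}
    p = s              # allowed: both are RECORDs of two INTEGER fields
    p[x] = 800         # p and s point to the same allocated record
    print("width = " + s[width] + "\n")    # prints "width = 800"
END
```

To obtain an independent copy of a record, a new record has to be built, for example with a constructor such as p = {s[width], s[height]}.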

The typical structure of a MODULE

Every module can have several sections, listed in arbitrary order. Only the IMPORT sections must appear before any other section. The general layout of the module is as follows:

MODULE name

IMPORT list of modules to be imported

CONST constants

TYPE types

VAR variables

FUNCTION name(arguments): return_type
BEGIN ... END

BEGIN
    here the main body of the program
END

The IMPORT section

This section declares the modules to be imported. You can specify several modules separated by commas, or use several IMPORT sections. Example:
IMPORT m2, io, win, MyModule
IMPORT AnotherModule

The required modules are searched first inside the same directory of the requiring module, then inside the list of directories given by the configuration of the compiler.

If an imported module depends on other modules (either in its DEFINITION or in its IMPLEMENTATION), these modules are imported automatically and included in the final source.

The items imported from a module (constants, types, variables and functions) are immediately available to the program. If two or more modules export the same item name, this item must be qualified by its module name. This is the case, for example, of the io.Open() function and the win.Open() function.

The CONST section

A constant is a name given to a literal value. The type of a constant can only be BOOLEAN, INTEGER, REAL or STRING. Examples:
CONST
    DEBUG = FALSE
    MAX_FILES = 128
    PI = 3.141592
    ROOT_PATH = "/usr/local/bin/"

The TYPE section

A named type is a name given to a type. Named types are useful for structured data types (ARRAY and RECORD), the enumeration type and the FORWARD type (see below). A named type can be used in place of the full declaration. Examples:
TYPE
    Point = RECORD x, y: INTEGER END
    Poly = ARRAY OF Point
    ProcessStatus = (ready, running, waiting)
Two variables are of the same type if they have the same structure in terms of the basic simple types (BOOLEAN, INTEGER, REAL, STRING) and the same structure builders (ARRAY, RECORD). The names of the fields of a RECORD may differ. For example, any RECORD containing two INTEGER fields, whatever their names, is equivalent to the Point record declared above; the type ARRAY OF RECORD a, b: INTEGER END is equivalent to the type Poly declared above.

The VAR section

A variable has a name, a type and a corresponding value. Variables declared inside a function are automatically allocated every time the function is called, and automatically released on exit from that function. Examples:
VAR
    i, j: INTEGER
    in_fn, out_fn: STRING
    in, out: io.FILE
    origin: Point
    drawing: ARRAY OF Poly
    proc1, proc2: ProcessStatus
All the variables have a default value assigned:
Type           Default value
----------------------------
BOOLEAN        FALSE
INTEGER        0
REAL           0.0
STRING         NIL
ARRAY          NIL
RECORD         NIL

Variables that are local to a function can have the STATIC attribute, which makes them statically allocated, i.e. they are allocated once and for all when the program starts and are never released. For example, the function seq() below prints the sequence of numbers 0, 1, 2, etc.:

MODULE test_static

IMPORT m2

FUNCTION seq()
VAR STATIC i: INTEGER
BEGIN
    print("i=" + i + "\n")
    i = i + 1
END

BEGIN
    seq()  # prints "i=0"
    seq()  # prints "i=1"
    seq()  # prints "i=2"
END

Note that the initial value of the static variable is zero: even the static variables are initialized to their default value.

The FUNCTION section

A function is a chunk of code with a name, some formal arguments and a resulting return type. The formal arguments are local variables whose value is assigned by the caller. The return value can be of any type. If the function does not return a value, the return type must be omitted. Example:
FUNCTION min(a: INTEGER, b: INTEGER): INTEGER
(*
    Returns the minimum value between a,b.
    Note: this function is already available as m2.min().
*)
BEGIN
    IF a <= b THEN
        RETURN a
    ELSE
        RETURN b
    END
END


FUNCTION InArray(s: STRING, a: ARRAY OF STRING): BOOLEAN
(*
    Returns TRUE if the string "s" is contained inside the array "a".
*)
VAR i: INTEGER
BEGIN
    FOR i=0 TO count(a)-1 DO
        IF a[i] = s THEN
            RETURN TRUE
        END
    END
    RETURN FALSE
END


FUNCTION CharList(s: STRING): ARRAY OF STRING
(*
    Returns the list of the characters the given string is composed of.
    M2 has no CHAR type: each character is a STRING one byte long.
*)
VAR  i: INTEGER  a: ARRAY OF STRING
BEGIN
    FOR i=0 TO length(s)-1 DO
        a[i] = s[i]
    END
    RETURN a
END
The types BOOLEAN, INTEGER and REAL are passed as a copy of the actual argument. STRING, ARRAY and RECORD are passed as the address of the allocated area, or NIL if not allocated. A formal argument can be passed by "reference" by specifying the VAR attribute: in this case the actual argument must be a variable, whose address is passed to the function, and any change to the formal argument modifies the original actual argument. Example:
MODULE solve

IMPORT m2, math

FUNCTION solve(a: REAL, b: REAL, c: REAL,
    VAR r1: REAL, VAR r2: REAL)
(*
    Returns the roots of the equation a*x*x + b*x + c = 0
*)
VAR
    delta, r: REAL
BEGIN
    IF a = 0.0 THEN
        HALT("a = 0")
    END
    delta = b*b - 4.0 * a * c
    IF delta < 0.0 THEN
        HALT("no real solutions")
    END
    r = math.sqrt(delta)
    r1 = (-b - r) / (2.0 * a)
    r2 = (-b + r) / (2.0 * a)
END

VAR s1, s2: REAL

BEGIN
    solve(1.0, 2.0, -3.0,  s1, s2)
    print("s1=" + s1 + ", s2=" + s2 + "\n")
END

A function can declare any number of local constants, types, variables and nested functions. All the items declared inside a function are local to the function, apart from the STATIC variables. Local items are visible only inside the function.

A function can access any item declared in global scope, including itself.

Functions can be nested inside another function. The inner functions can have their own local items and have access to the items of the parent function, including its variables and formal arguments.
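
A minimal sketch of a nested function follows; the names nested, sum_squares() and add_square() are made up for this example. The inner function updates the variable total, which belongs to the enclosing function.

```m2
MODULE nested
IMPORT m2

FUNCTION sum_squares(n: INTEGER): INTEGER
VAR total, i: INTEGER

    FUNCTION add_square(k: INTEGER)
    BEGIN
        total = total + k * k    # total is a variable of sum_squares()
    END

BEGIN
    FOR i=1 TO n DO
        add_square(i)
    END
    RETURN total
END

BEGIN
    print("sum = " + sum_squares(3) + "\n")    # 1 + 4 + 9: prints "sum = 14"
END
```

The function add_square() is visible only inside sum_squares(), so it does not clash with items of the same name declared elsewhere.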

The IF statement

The simplest form of the IF statement controls a block of statements that is executed only if a boolean expression evaluates to TRUE. One or more ELSIF branches are allowed. An ELSE branch can catch any other case. Examples:
IF DEBUG THEN
    print("still alive!")
END

IF (s = NIL) OR (s = "") THEN
    print("missing value for the string")
ELSE
    print("s=" + s)
END

print("the number is ")
IF i > 0 THEN
    print("positive")
ELSIF i = 0 THEN
    print("zero")
ELSE
    print("negative")
END

The SWITCH statement

A block of instructions is executed depending on the value of a given controlling INTEGER expression.
SWITCH i DO

CASE 0:
    print("zero")

CASE 1, 2, 3:
    print("one or two or three")

ELSE
    print("invalid value: expected 0, 1, 2 or 3")

END
The ELSE branch is optional. It is a fatal, run-time error if the result of the controlling expression is a value that does not match any of the CASE branches and the ELSE branch is not provided.

The LOOP statement

The sequence of instructions controlled by the LOOP...END statement is executed repeatedly, indefinitely. The EXIT statement can be used to exit the LOOP and continue with the following statement. This chunk of code will print all the numbers from 0 to 9:
i = 0
LOOP
    IF i >= 10 THEN
        EXIT
    END
    print("i=" + i + "\n")
    i = i + 1
END
There may be several EXIT statements inside a LOOP statement. Another typical way to leave a LOOP cycle is through the RETURN statement.

The WHILE statement

A block of statements is executed repeatedly while a controlling BOOLEAN expression evaluates to TRUE. This chunk of code will print all the numbers from 0 to 9:
i = 0
WHILE i < 10 DO
    print("i=" + i + "\n")
    i = i + 1
END
The block of controlled statements might not be executed if the BOOLEAN expression evaluates to FALSE at the first execution of the WHILE statement.

The REPEAT statement

A block of statements is executed repeatedly until a controlling BOOLEAN expression evaluates to TRUE. Note that the controlling expression is evaluated after the controlled statements, so that these latter are always executed at least one time. Note too that, contrary to the WHILE loop, the controlling expression gives the condition to exit the loop. This chunk of code will print all the numbers from 0 to 9:
i = 0
REPEAT
    print("i=" + i + "\n")
    i = i + 1
UNTIL i >= 10

The FOR statement

A block of statements is executed several times while an INTEGER variable takes all the values inside a given range. This chunk of code will print all the numbers from 0 to 9:
FOR i=0 TO 9 DO
    print("i=" + i + "\n")
END
The expressions giving the initial and the final value of the range are evaluated only once, when the FOR statement is entered, so changing inside the loop the values they depend on does not affect the range. If the final value is less than the initial value, the controlled statements are not executed. The range of values is scanned from the initial value to the final value with an increment of 1. A different increment, possibly negative, can be provided:
FOR i=9 TO 0 BY -1 DO
    print("i=" + i + "\n")
END

The RETURN statement

The RETURN statement can be used inside the main body of a MODULE or inside the body of a FUNCTION. When inside a MODULE, the RETURN statement causes the termination of the process and must specify the exit code of the process, typically 0 for success or 1 for failure:
RETURN 0
The exit code from a process must be an INTEGER number between 0 and 255 (the value resulting from the expression is masked with 0xFF).

Inside a FUNCTION the RETURN statement causes the termination of the FUNCTION. If the FUNCTION returns a value, the RETURN statement must specify the value returned through an expression of the proper type.

The HALT statement

This statement causes the abnormal termination of the program suitable for debugging. For example:
IF i < 0 THEN
    HALT("unexpected negative value")
END
will terminate the process sending to standard error a message similar to this:
MyModule:342: HALT: unexpected negative value
Abort

The RAISE ERROR statement

This statement sets the error condition:

RAISE ERROR code message

where code is an integer expression (typically a constant or a literal value) and message is a string giving the human readable description of the error occurred.

The error code and its description are available as m2runtime.ERROR_CODE and m2runtime.ERROR_MESSAGE respectively. If the error is not caught inside a TRY...END section (see below), the RAISE ERROR instruction immediately HALTs the program, sending to standard error a message with the structure

module.function(), line n, code code: message

and the error status on the exit of the program will be 1.

Conversely, if the RAISE ERROR is protected inside a TRY...END instruction, it has no effect other than setting the error status. It is the responsibility of the program to return to the caller and take any other countermeasure to handle the error condition.

Functions that want to raise errors MUST have the "RAISE ERROR" qualifier in their declaration:

FUNCTION FuncName()
RAISE ERROR
BEGIN
    # ...
END

The TRY statement

It has the general structure

TRY
    assignment or function call
CATCH
    code1:
        code that handles the error code1

    code2:
        code that handles the error code2

    (* ...more branches here *)

    ELSE
        code that handles any other error
END

A typical use of the TRY...END statement is the handling of file access. For example, this program tries to open its own source file for reading and reports whether it succeeded:

MODULE TryTest
IMPORT m2, io
VAR
    f: FILE
BEGIN

    print("Trying to read myself: ")

    TRY
        io.Open(f, "TryTest.mod", "r")

    CATCH
        ENOENT: print("I do not exist, strange...\n")

        EACCESS: print("access denied, very strange...\n")

    ELSE
        print(ERROR_MESSAGE + "\n")

    END

    IF ERROR_CODE = 0 THEN
        print("success!\n")
        TRY io.Close(f) ELSE END  # ignore any error
    END 
END

The ELSE branch is optional: if it is omitted and none of the CATCH branches caught the error, the error is raised.
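
RAISE ERROR and TRY can also be combined in user code. The following is a sketch under several assumptions: the module raisedemo, the function safe_div() and the error code EDIVZERO are all invented for this example, and it assumes that a CATCH branch can be labelled with a user-defined integer constant just like the codes ENOENT and EACCESS above.

```m2
MODULE raisedemo
IMPORT m2

CONST
    EDIVZERO = 1    # hypothetical error code, chosen for this example

FUNCTION safe_div(a: INTEGER, b: INTEGER): INTEGER
RAISE ERROR
BEGIN
    IF b = 0 THEN
        RAISE ERROR EDIVZERO "division by zero"
        RETURN 0    # inside a TRY the RAISE does not stop the function
    END
    RETURN a DIV b
END

VAR q: INTEGER
BEGIN
    TRY
        q = safe_div(10, 0)
    CATCH
        EDIVZERO: print("caught: " + ERROR_MESSAGE + "\n")
    ELSE
        print("unexpected error\n")
    END
END
```

Note the explicit RETURN after the RAISE ERROR: as explained above, when the call is protected by a TRY the function itself is responsible for returning to its caller.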

Library modules

A library module consists of a DEFINITION MODULE and an IMPLEMENTATION MODULE. The DEFINITION MODULE gives the interface to the module, that is the list of constants, types, variables and functions accessible to any client module. A DEFINITION MODULE cannot contain executable code.

The IMPLEMENTATION MODULE gives the implementation of the exported functions. The IMPLEMENTATION MODULE can also contain private constants, private types, private variables and private functions, and it MUST contain the implementation of the exported functions. Example:

DEFINITION MODULE example
(* File: example.def *)

IMPORT io

CONST
    (* My PC: *)
    CPU = "i686"
    RAM = 256 (* MB *)
    HD  =  80 (* GB *)

TYPE
    Point2D = RECORD x, y: REAL END

VAR
    afile: io.FILE  # actually not used in this example

FUNCTION Add2D(a: Point2D, b: Point2D): Point2D

END

This file defines the interface to the example.imp implementation module that follows. The DEFINITION MODULE example.def does not need to be compiled, but it can be checked with M2.

IMPLEMENTATION MODULE example
(* File: example.imp *)

(*
    IMPORT, CONST, TYPE, VAR and FUNCTION sections private
    to the module go here.
*)

FUNCTION Add2D(a: Point2D, b: Point2D): Point2D
VAR p: Point2D
BEGIN
    p[x] = a[x] + b[x]
    p[y] = a[y] + b[y]
    RETURN p
END

END

Compiling this file gives example.c containing the C source of the module, and example.lnk containing possible options for the C compiler or the linker (see "Mixing C code" below for details). To summarize, these files are involved:

example.def

example.imp

example.c

example.lnk

A client program might look like this:

MODULE tryExample
IMPORT m2, example
VAR one, img, sum: Point2D
BEGIN
    one = {1.0, 0.0}
    img = {0.0, 1.0}
    sum = Add2D(one, Add2D(one, img))
    print("it works!\n")
END

Compiling and executing complex applications

Use a Makefile to define the dependencies of the various elements. For example, suppose the main module M.mod requires the modules A.def,A.imp that in turn require B.def,B.imp. The Makefile can be the following:

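# Makefile for the M project
# M is the first target, so that a plain "make" builds the program.
# Command lines must begin with a TAB character.

M: M.mod A.c B.c
	m2 M.mod

A.c: A.def A.imp B.c
	m2 A.imp

B.c: B.def B.imp
	m2 B.imp

test: M
	./M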

The program can be compiled simply by giving the command "make", and it can be compiled and executed by giving the command "make test".

Mixing C code

It is really simple to mix C code with M2 code. Lines beginning with a $ in the first column are passed verbatim to the resulting C source. In the C source, all the M2 identifiers take the form MODULE_ITEM, where MODULE is the name of the module and ITEM is the name of a variable or function.

WARNING! This simple naming scheme MODULE_ITEM was just intended to simplify the integration between C code and M2 code, but it does not protect from possible collisions. For example, a module M exporting the item A_B and a module M_A exporting the item B both will produce the name M_A_B. To avoid these situations the names of the modules should not contain the underscore character.

For example, here is the implementation of a function that returns the current process identifier number (PID):

FUNCTION getpid(): INTEGER
BEGIN
$   return getpid();
END

The following function returns TRUE if the file exists, FALSE otherwise:

MODULE mymod

$ #include <sys/types.h>
$ #include <sys/stat.h>
$ #include <unistd.h>

FUNCTION file_exists(fn: STRING): BOOLEAN
VAR
$   char *s;
$   struct stat buf;
BEGIN
    IF InvalidZString(fn) THEN
        # The file name contains a NUL byte --> not a valid file name
        RETURN FALSE
    END

# Build a valid zero-terminated C string:
$   MK_ZSTRING(s, mymod_fn);

$   return stat(s, &buf) == 0;
END

BEGIN
    IF file_exists("mymod.mod") THEN
        print("I exist!\n")
    END
END

InvalidZString(fn) returns TRUE if the passed string contains a zero byte, which is forbidden in the zero-terminated strings (zstrings) usually handled by the C standard library. MK_ZSTRING(s, mymod_fn) dynamically allocates on the stack a zero-terminated copy of the string mymod_fn and sets s to point to this copy; the copy is released on exit from the function.

Any line beginning with "$$ linker options:" can contain a list of options for the C compiler and linker. See for example the module math.def that requires the linker option -lm.

Reserved keywords

AND
ARRAY
BEGIN
BOOLEAN
BY
CASE
CATCH
CONST
DEFINITION
DIV
DO
ELSE
ELSIF
END
ERROR
EXIT
FALSE
FOR
FORWARD
FUNCTION
HALT
IF
IMPLEMENTATION
IMPORT
INTEGER
LOOP
MOD
MODULE
NIL
NOT
OF
OR
RAISE
REAL
RECORD
REPEAT
RETURN
STATIC
STRING
SWITCH
THEN
TO
TRUE
TRY
TYPE
UNTIL
VAR
VOID
WHILE

Data structures in memory

A BOOLEAN is represented as a 32 bit word, whose value is 0 for FALSE, TRUE otherwise.

An INTEGER is a 32 bit word, two's complement.

A REAL is a 64 bit word.

A STRING is a pointer to a dynamically allocated area containing the actual string and its length. Several pointers can share the same string: since strings, once created, can never be changed (only pointers can), copying a string to another or passing a string as an argument of a function simply involves the copy of a pointer. The NIL string is a pointer whose value is zero. The empty string "" is pre-allocated by the m2runtime module.

M2 does not have a "CHAR" data type; single characters are simply strings one byte long. For efficiency reasons, the strings whose length is exactly 1 byte are allocated only once in a special internal table, so that programs that scan a file byte by byte do not really cause the dynamic memory to be used.

The ARRAYs are dynamically allocated. The picture below illustrates the structure used. Some elements of the array are preallocated, so adding elements to an array does not produce too many re-allocations of the memory block.

The RECORDs are dynamically allocated. The picture below illustrates the structure used.



For more information about the internal representation of data, read the file lib/m2runtime.c.

Run-time error messages

Errors detected at run-time cause the termination of the program. The message displayed on standard error has the format below:

MODULE.FUNCTION(), line LINE: MESSAGE

The messages that can be produced by the run-time module m2runtime are as follows:

Substring of a NIL string
Can't apply the substring operator s[*] to a NIL string.

Invalid substring index
Range s[i] invalid because i<0 or i>=length(s).

Invalid substring range
Range s[i,j] invalid because i<0 or j>length(s) or i>j.

Cannot dereference NIL array
Can't access elements of an unallocated array.

Array index is negative
The expression a[EXPR] gives an invalid negative index.

Array index too large
Can't read the element a[EXPR] because EXPR >= count(a).

Cannot dereference NIL record
Can't read the fields of an unallocated RECORD.

Unexpected case in SWITCH
The expression SWITCH EXPR gives a value not found among the expected CASEs and the ELSE branch is not present.

Missing RETURN <expr>
Missing RETURN EXPR in a function returning a value. Since this issue is difficult to detect from the analysis of the source, this control is made at run-time.

EBNF Syntax

The following EBNF declarations describe the syntax of the M2 language in a concise but precise form. To learn more about this formalism, see the page www.icosaedro.it/bnf_chk/index.html.

1. compilation_unit = definition_module | implementation_module | module ;
2. definition_module = "DEFINITION" "MODULE" module_name import { const_decl | var_decl | function_decl } "END" ;
3. implementation_module = "IMPLEMENTATION" "MODULE" module_name import { const_decl | type_decl | var_decl | function_decl function_body } "END" ;
4. module = "MODULE" module_name import function_body ;
5. module_name = name ;
6. import = { "IMPORT" [ module_name { "," module_name } ] } ;
7. const_decl = "CONST" { name "=" const_value } ;
8. const_value = boolean | [ "+" | "-" ] number | [ "+" | "-" ] const_name | string ;
9. const_name = qualified_name ;
10. type_decl = "TYPE" { name "=" ( "FORWARD" | type ) } ;
11. var_decl = "VAR" [ "STATIC" ] { name { "," name } ":" type } ;
12. type = simple_type | array_type | record_type | function_type | type_name ;
13. type_name = qualified_name ;
14. simple_type = "VOID" | "BOOLEAN" | "INTEGER" | enum_type | "REAL" | "STRING" ;
15. enum_type = "(" enum_elem { "," enum_elem } ")" ;
16. enum_elem = name [ "=" const_integer ] ;
17. const_integer = int_number | const_name ;
18. array_type = "ARRAY" "OF" type ;
19. record_type = "RECORD" { field_decl } "END" ;
20. field_decl = field_name { "," field_name } ":" type ;
21. field_name = name ;
22. function_type = function_decl ;
23. function_decl = "FUNCTION" name "(" [ formal_arg { "," formal_arg } ] ")" [ ":" type ] [ "RAISE" "ERROR" ] ;
24. formal_arg = [ "VAR" ] name ":" type ;
25. function_body = { const_decl | type_decl | var_decl | function_decl function_body } "BEGIN" { instruction } "END" ;
26. instruction = assignment | function_call | if | switch | while | repeat | for | loop | exit | try | raise | return ;
27. assignment = var_name { selector } "=" expr ;
28. function_call = prefix_function_call | postfix_function_call ;
29. prefix_function_call = function_name "(" { actual_arg } ")" ;
30. postfix_function_call = ( var_name | function_name "(" { actual_arg } ")" ) { "->" qualified_name "(" { actual_arg } ")" } ;
31. function_name = qualified_name ;
32. actual_arg = expr | var_name { selector } ;
33. if = "IF" expr "THEN" { instruction } { "ELSIF" expr "THEN" { instruction } } [ "ELSE" { instruction } ] "END" ;
34. switch = "SWITCH" expr "DO" { "CASE" const_integer { "," const_integer } ":" { instruction } } [ "ELSE" { instruction } ] "END" ;
35. while = "WHILE" expr "DO" { instruction } "END" ;
36. repeat = "REPEAT" { instruction } "UNTIL" expr ;
37. for = "FOR" qualified_name "=" expr "TO" expr [ "BY" const_integer ] "DO" { instruction } "END" ;
38. loop = "LOOP" { instruction } "END" ;
39. exit = "EXIT" ;
40. try = "TRY" ( assignment | function_call ) { "CATCH" const_integer { "," const_integer } ":" { instruction } } [ "ELSE" { instruction } ] "END" ;
41. raise = "RAISE" "ERROR" expr expr ;
42. return = "RETURN" [ expr ] ;
43. expr = simple_expr [ relation simple_expr ] ;
44. simple_expr = [ "+" | "-" ] term { add_operator term } ;
45. term = factor { mult_operator factor } ;
46. factor = "NIL" | boolean | number | string | const_name | var_name { selector } [ substr_selector ] | function_call | "(" expr ")" | "NOT" factor | "~" factor ;
47. relation = "<" | "<=" | "=" | ">=" | ">" | "<>" ;
48. boolean = "FALSE" | "TRUE" ;
49. number = int_number | real_number ;
50. add_operator = "+" | "-" | "OR" | "^" | "|" ;
51. mult_operator = "*" | "/" | "DIV" | "MOD" | "AND" | "&" ;
52. int_number = integer | hex_integer ;
53. integer = digit { digit } ;
54. digit = "0".."9" ;
55. hex_integer = "0x" hex_digit { hex_digit } ;
56. hex_digit = digit | "a".."f" | "A".."F" ;
57. real_number = integer ( decimals [ exponent ] | [ decimals ] exponent ) ;
58. decimals = "." integer ;
59. exponent = ( "e" | "E" ) [ "+" | "-" ] integer ;
60. string = "\"" { str_char | str_escape } "\"" ;
61. str_char = " " | "!" | "#".."[" | "]".."~" ;
62. str_escape = "\\" ( "\\" | "\"" | "a" | "b" | "n" | "r" | "t" | "x" hex_digit hex_digit ) ;
63. substr_selector = "[" expr [ "," expr ] "]" ;
64. var_name = qualified_name ;
65. selector = "[" ( index | field_name ) "]" | next_elem_in_array ;
66. next_elem_in_array = "[" "]" ;
67. index = expr ;
68. name = ( letter | "_" ) { letter | digit | "_" } ;
69. letter = "a".."z" | "A".."Z" ;
70. qualified_name = [ module_name "." ] name ;
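
To show how these rules can be read, the C sketch below hand-translates two of the lexical rules, name (rule 68) and int_number (rule 52), into recognizer functions. The function names are hypothetical and the code is not part of the M2 compiler; it assumes an ASCII character set and makes no attempt to reject reserved keywords, which the grammar above does not distinguish from names.

```c
#include <stddef.h>

/* letter = "a".."z" | "A".."Z" ;  (ASCII assumed) */
static int is_letter(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* digit = "0".."9" ; */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* hex_digit = digit | "a".."f" | "A".."F" ; */
static int is_hex_digit(char c)
{
    return is_digit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
}

/* name = ( letter | "_" ) { letter | digit | "_" } ;
   Returns 1 if the whole string s matches the rule, 0 otherwise. */
int is_name(const char *s)
{
    if (!(is_letter(*s) || *s == '_'))
        return 0;
    for (s++; *s != '\0'; s++)
        if (!(is_letter(*s) || is_digit(*s) || *s == '_'))
            return 0;
    return 1;
}

/* int_number = integer | hex_integer ;
   integer = digit { digit } ;
   hex_integer = "0x" hex_digit { hex_digit } ;
   Returns 1 if the whole string s matches the rule, 0 otherwise. */
int is_int_number(const char *s)
{
    if (s[0] == '0' && s[1] == 'x') {
        s += 2;
        if (!is_hex_digit(*s))
            return 0;           /* "0x" alone is not a number */
        while (is_hex_digit(*s))
            s++;
        return *s == '\0';
    }
    if (!is_digit(*s))
        return 0;
    while (is_digit(*s))
        s++;
    return *s == '\0';
}
```

Each EBNF repetition `{ ... }` becomes a loop and each alternative `|` becomes a branch, which is the usual way to turn such rules into a hand-written scanner.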

References

www.icosaedro.it/m2 is the official web site of the M2 language. Check there for new versions of the language and of the compiler.

www.icosaedro.it/m2/applications.html contains a list of applications developed with M2.


Umberto Salsi