BNF Syntax Validator and Formatter

Last update: 2024-03-12

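bnf_chk is a validator and formatter for the EBNF (Extended Backus-Naur Form) syntax. Two implementations are available: the original one, written in the M2 language and no longer maintained, and a new one written in JavaScript.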




The program presented here lets you check the formal validity of a set of EBNF declarations. Given a text file containing the EBNF declarations, the program bnf_chk can:

The program parses a common EBNF syntax such as the following, which describes integer and floating-point numbers:

digit = "0".."9";
integer = ["+"|"-"] digit {digit};
real = integer "." digit {digit} [ "E" integer ];

These rules say that a digit is any character between 0 and 9, an integer number is a sequence of one or more digits with possibly a leading sign, and a real number is just like an integer with a fractional part and possibly a scale factor. These declarations can be reformatted with the command

$ bnf_chk --print-html --print-index numbers.txt

giving as output the HTML version that follows, where the superscript after each identifier is the index of the rule that defines it:

1. digit = "0".."9" ;
2. integer = [ "+" | "-" ] digit¹ { digit¹ } ;
3. real = integer² "." digit¹ { digit¹ } [ "E" integer² ] ;
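As an informal cross-check of the rules for digit, integer and real, they can be transcribed one by one into Python regular expressions. This is only an illustrative sketch and not part of bnf_chk; the names DIGIT, INTEGER, REAL, is_integer and is_real are made up for this example.

```python
import re

# Each pattern mirrors one EBNF rule; the names are illustrative only.
DIGIT = r"[0-9]"                                       # digit = "0".."9";
INTEGER = rf"[+-]?{DIGIT}{DIGIT}*"                     # integer = ["+"|"-"] digit {digit};
REAL = rf"{INTEGER}\.{DIGIT}{DIGIT}*(?:E{INTEGER})?"   # real = integer "." digit {digit} ["E" integer];


def is_integer(text: str) -> bool:
    """True if the whole text is an integer according to the rules above."""
    return re.fullmatch(INTEGER, text) is not None


def is_real(text: str) -> bool:
    """True if the whole text is a real number according to the rules above."""
    return re.fullmatch(REAL, text) is not None
```

Under these rules "-1.5E+10" is a real number, while "1." is not (at least one digit must follow the point) and neither is "1.5e3" (the scale factor must use an uppercase "E").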

The program is released as freeware.

EBNF Syntax

A file that BNF_CHK can parse must be a sequence of zero or more rules. The special character "#" starts a comment and all the characters up to the end of the line are ignored.

Rules. Every rule begins with its rule identifier, followed by an equal sign =, followed by an expression, and is terminated by a semicolon:

identifier = expression ;

Rule identifiers. Any sequence of letters, digits and underscore starting with a letter or an underscore is a rule identifier (non-terminal symbol).
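The definition of a rule identifier can be sketched as a one-line check in Python. The function name is_rule_identifier is hypothetical, and "letter" is assumed to mean the ASCII letters a-z and A-Z.

```python
import re

# A letter or underscore, followed by any number of letters, digits or underscores.
# Assumption: "letter" means the ASCII letters a-z and A-Z.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_rule_identifier(name: str) -> bool:
    """True if name is a valid rule identifier (non-terminal symbol)."""
    return IDENTIFIER.fullmatch(name) is not None
```

For instance, real and _tmp1 are accepted, while 9lives (starts with a digit) and plain-char (contains a hyphen) are rejected.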

Expressions. An expression is one or more terms separated by a vertical bar:

term1 | term2 | term3

Terms. A term is a product of factors. Factors are simply written one after the other; no "product" symbol is required. Spaces can be used to separate contiguous identifiers.

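Factors. A factor can be a rule identifier, a literal, a range of characters such as "0".."9", or an expression enclosed in round brackets ( ) for grouping, in curly brackets { } for zero or more repetitions, or in square brackets [ ] for an optional part.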

Literal words of the language must be enclosed within double quotes. Only ASCII printable characters are allowed. The double quote character itself can be written with the special sequence \" (backslash, double quote) and the backslash character as \\ (backslash, backslash). The commonly used escape sequences \a, \b, \n, \r and \t are also allowed. Any other byte can be written in hexadecimal form as \xHH, where HH are two hexadecimal digits.
The new JavaScript implementation also supports UTF-16 characters encoded as \uHHHH, where HHHH are four hexadecimal digits.
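The escape sequences for literals can be illustrated with a small decoder. This is a sketch under stated assumptions, not the code used by bnf_chk: the function name decode_literal is hypothetical, its argument is the text between the two double quotes, and each \xHH byte is mapped to the character with the same code (an assumption made only to keep the example short).

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")
SIMPLE_ESCAPES = {'"': '"', "\\": "\\", "a": "\a", "b": "\b",
                  "n": "\n", "r": "\r", "t": "\t"}


def decode_literal(body: str) -> str:
    """Decode the text found between the double quotes of a literal."""
    out = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])              # plain character, copied as is
            i += 1
            continue
        kind = body[i + 1:i + 2]             # empty if the backslash is the last char
        if kind in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[kind])
            i += 2
        elif kind in ("x", "u"):
            width = 2 if kind == "x" else 4  # \xHH or \uHHHH
            digits = body[i + 2:i + 2 + width]
            if len(digits) != width or not set(digits) <= HEX_DIGITS:
                raise ValueError(f"bad hexadecimal escape at offset {i}")
            # Assumption: the numeric value is used directly as a character code.
            out.append(chr(int(digits, 16)))
            i += 2 + width
        else:
            raise ValueError(f"unknown escape sequence at offset {i}")
    return "".join(out)
```

With this sketch, the literal body \x41\n decodes to the letter A followed by a newline, and an unknown sequence such as \q raises ValueError.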

The figure below illustrates some features of the EBNF syntax through syntax diagrams: the square boxes are the rules, while the arrows indicate the allowed paths between rules.

[Figure: EBNF syntax diagrams. Algebraic syntax and syntax diagrams compared.]

The EBNF syntax allowed by the program can be expressed in terms of the EBNF syntax itself as follows:

1. bnf_file = { rule² } ;
2. rule = identifier³ "=" expression⁶ ";" ;
3. identifier = ( letter⁴ | "_" ) { letter⁴ | digit⁵ | "_" } ;
4. letter = "a".."z" | "A".."Z" ;
5. digit = "0".."9" ;
6. expression = term⁷ { "|" term⁷ } ;
7. term = factor⁸ { factor⁸ } ;
8. factor = identifier³ | literal⁹ | range¹⁰ | "(" expression⁶ ")" | "{" expression⁶ "}" | "[" expression⁶ "]" ;
9. literal = "\"" { char¹¹ } "\"" ;
10. range = "\"" char¹¹ "\"" ".." "\"" char¹¹ "\"" ;
11. char = plain_char¹² | escaped_char¹³ | hex_char¹⁴ ;
12. plain_char = " ".."\x7f" | "\x80".."\xff" | utf_16_char¹⁵ ;
13. escaped_char = "\\" ( "\"" | "\\" | "a" | "b" | "n" | "r" | "t" ) ;
14. hex_char = "\\x" hex¹⁶ hex¹⁶ ;
# UTF-16 chars are supported only by the new JavaScript implementation:
15. utf_16_char = "\\u" hex¹⁶ hex¹⁶ hex¹⁶ hex¹⁶ ;
16. hex = "0".."9" | "a".."f" | "A".."F" ;
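To show how the grammar above maps onto a program, here is a minimal recursive-descent syntax checker written in Python, with one method per rule. It is a sketch and not the bnf_chk source: the names tokenize, Parser and check are hypothetical, a plain character is simplified to "anything except a double quote, a backslash or a newline", and only the syntax is checked, with no further analysis of the rules.

```python
import re

# One character inside a literal: plain, simple escape, \xHH or \uHHHH.
CHAR = r'(?:[^"\\\n]|\\["\\abnrt]|\\x[0-9A-Fa-f]{2}|\\u[0-9A-Fa-f]{4})'
TOKEN = re.compile(
    r'(?P<skip>\s+|#[^\n]*)'                      # white space and # comments
    r'|(?P<ident>[A-Za-z_][A-Za-z0-9_]*)'
    rf'|(?P<literal>"{CHAR}*")'
    r'|(?P<op>\.\.|[=;|(){}\[\]])'
)
SINGLE = re.compile(rf'"{CHAR}"')                 # literal holding exactly one char
CLOSER = {"(": ")", "{": "}", "[": "]"}


def tokenize(text):
    """Split text into (kind, value) pairs, dropping spaces and comments."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if m.lastgroup != "skip":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def take(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise ValueError(f"expected {value or kind}, found {tok[1] or 'end of file'}")
        self.pos += 1
        return tok[1]

    def bnf_file(self):                  # bnf_file = { rule } ;
        names = []
        while self.peek()[0] != "eof":
            names.append(self.rule())
        return names

    def rule(self):                      # rule = identifier "=" expression ";" ;
        name = self.take("ident")
        self.take("op", "=")
        self.expression()
        self.take("op", ";")
        return name

    def expression(self):                # expression = term { "|" term } ;
        self.term()
        while self.peek() == ("op", "|"):
            self.pos += 1
            self.term()

    def starts_factor(self):
        kind, value = self.peek()
        return kind in ("ident", "literal") or (kind == "op" and value in CLOSER)

    def term(self):                      # term = factor { factor } ;
        self.factor()
        while self.starts_factor():
            self.factor()

    def factor(self):                    # identifier | literal | range | ( ) | { } | [ ]
        kind, value = self.peek()
        if kind == "ident":
            self.pos += 1
        elif kind == "literal":
            self.pos += 1
            if self.peek() == ("op", ".."):          # range = "c" ".." "c"
                self.pos += 1
                upper = self.take("literal")
                if not (SINGLE.fullmatch(value) and SINGLE.fullmatch(upper)):
                    raise ValueError("range bounds must be single characters")
        elif kind == "op" and value in CLOSER:
            self.pos += 1
            self.expression()
            self.take("op", CLOSER[value])
        else:
            raise ValueError(f"expected a factor, found {value or 'end of file'}")


def check(text):
    """Return the names of the rules defined in text, or raise ValueError."""
    return Parser(tokenize(text)).bnf_file()
```

Applied to the three rules for digit, integer and real shown at the top of this page, check returns the list ['digit', 'integer', 'real']. A declaration such as bb = 3 * aa, "B"; is rejected, because neither the number 3 nor the symbols * and , belong to the syntax described here.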

Try BNF_CHK on-line

Two implementations are available:

Here are some examples of BNF definitions:


This old implementation of the validator, based on the M2 language, is no longer maintained and is kept here only for future reference. The new, maintained implementation is written in JavaScript and is available here.
File | Size | Description
-----|------|------------
bnf_chk-windows-1.4_20061107.zip | 68 KB | Executable program tested on Microsoft Windows Vista. Windows users who simply want to use the program should download only this package.
bnf_chk-pure-c-1.4_20061107.tar.gz | 27 KB | This version of the program is the output of the M2-to-C cross compiler. To generate the executable program you only need a C compiler such as gcc under Linux or Unix.
bnf_chk-1.4_20061107.tar.gz | 30 KB | Original M2 source of BNF_CHK. Requires the M2 development system, available at www.icosaedro.it/m2.

The source program can also be browsed through the CVS server, from which you can also download the latest version of the program under development.

Umberto Salsi
An abstract of the latest comments from the visitors of this page follows.

2011-09-19 by Guest
Comment sections (* comment *) gave a fatal error.

2009-12-21 by Umberto Salsi
Anonymous wrote: [...] I would like to remind you all to read the specification of my little program (click on the "Section index" link above), where the exact syntax the program accepts is clearly explained. As for the repetition factor, the syntax 3*A can easily be converted to A A A, without commas.

2009-12-11 by Guest
Anonymous wrote: [...] The syntax checker also does not understand the repetition syntax above, such as bb = 3 * aa, "B";

2009-03-05 by Guest
It seems that your EBNF syntax checker does not understand ','. In EBNF it is called the 'concatenate-symbol'. Example from ISO/IEC 14977: bb = 3 * aa, "B"; cc = 3 * [aa], "C"; Your checker does not understand the comma here.