 www.icosaedro.it 

 The architecture of PHPLint

Last updated: 2014-02-20

The following brief notes illustrate the architecture of PHPLint.

How PHPLint starts

In the root of the PHPLint project there is a simple script, php.bat (php under Unix/Linux), that starts the currently installed PHP interpreter. This script is in turn used by phpl.bat (phpl under Unix/Linux), which finally starts PHPLint itself, a PHP script located in stdlib/it/icosaedro/lint/PHPLint.php. So, to generate the PHPLint report about your program, all you have to do is type a command like this:

phpl MyProgram.php

A simple GUI named phplint.tcl is also available. It displays the report about a given source file and automatically updates it whenever the file timestamp changes, so that you can see the effect of your changes while editing and fixing your program.

The PHPLint source

Under the stdlib/ directory there are several PHP source files: some are actually used by the PHPLint program itself, and others are provided for general use in the hope that they may be useful.

Generally, every source file defines a single class in its namespace, and the namespace exactly matches the path of the file. So, for example, the file stdlib/it/icosaedro/bignumbers/BigInt.php contains the class it\icosaedro\bignumbers\BigInt. Every source file loads the stdlib/autoload.php package that implements class autoloading: it is this package that maps class namespaces into file paths.
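The mapping from namespaces to file paths can be illustrated with a minimal sketch. This is not the content of stdlib/autoload.php: the function name classNameToPath() and the base directory passed to it are assumptions made only for this example, while spl_autoload_register() is the standard PHP hook for class autoloading.

```php
<?php
// Hypothetical helper (not part of PHPLint): maps a fully qualified class
// name to a file path under a base directory, following the rule that the
// namespace matches the path of the file.
function classNameToPath(string $baseDir, string $className): string
{
    // Drop a leading backslash, then turn namespace separators into "/".
    $relative = str_replace('\\', '/', ltrim($className, '\\'));
    return rtrim($baseDir, '/') . '/' . $relative . '.php';
}

// Register a loader that includes the file only if it actually exists.
// The "stdlib" directory next to this script is an assumption.
spl_autoload_register(function (string $className): void {
    $path = classNameToPath(__DIR__ . '/stdlib', $className);
    if (is_file($path)) {
        require_once $path;
    }
});

echo classNameToPath('stdlib', 'it\\icosaedro\\bignumbers\\BigInt'), "\n";
// prints: stdlib/it/icosaedro/bignumbers/BigInt.php
```

With a loader of this kind registered, the first use of a class name is enough to load its source file, so no explicit list of includes has to be maintained.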

The whole PHPLint program proper lives under the it\icosaedro\lint namespace, with PHPLint.php as the main program. The task of the main program is to parse the command line parameters and to initialize all the classes of the PHPLint parser. A description of the main classes follows.

Main classes

The Logger is responsible for logging the diagnostic messages: errors, warnings and notices. Every message may refer to a specific source location, so a concept of "where" something happened is required.

The Where class stores the concept of where, in the source being parsed, something happened. An object of this class contains: the file, the line of the source, the number of that line and the exact position in the line. Every object of this class stores a single position, and can be used to report messages related to that position. The current position in the current source being scanned is returned by the here() method of the Scanner class.
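A position object of this kind, and the way a logger might use it, can be sketched as follows. The field names, the context() method, the formatDiagnostic() function and the message layout are all assumptions made for this example; the real Where and Logger classes may look quite different.

```php
<?php
// Hypothetical, simplified stand-in for the Where class: a record of one
// source position (file, line number, column, text of the line).
final class Where
{
    private $file;
    private $lineNo;
    private $column;
    private $lineText;

    public function __construct(string $file, int $lineNo, int $column, string $lineText)
    {
        $this->file = $file;
        $this->lineNo = $lineNo;
        $this->column = $column;
        $this->lineText = $lineText;
    }

    // Formats the position as "file:line:column" (columns are 1-based).
    public function __toString(): string
    {
        return $this->file . ':' . $this->lineNo . ':' . $this->column;
    }

    // Returns the source line followed by a caret under the position.
    public function context(): string
    {
        return $this->lineText . "\n" . str_repeat(' ', $this->column - 1) . '^';
    }
}

// Hypothetical logger helper: prefixes a message with its position.
function formatDiagnostic(Where $where, string $severity, string $message): string
{
    return $where . ': ' . $severity . ': ' . $message;
}

$w = new Where('MyProgram.php', 12, 10, '    $x = $y + 1;');
echo formatDiagnostic($w, 'ERROR', 'variable $y has not been assigned'), "\n";
echo $w->context(), "\n";
```

Keeping the position in its own object means that a message can be reported long after the scanner has moved past the point it refers to.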

The Globals class holds the context of the parser, including the list of the globally existing entities: packages, constants, variables, functions and classes. A single instance of this class is created when the program starts, and it is passed to every parser method.

Scanner is the PHPLint lexical scanner. PHPLint does not use the internal PHP scanner, but implements its own independent version with some specific features not available from the built-in PHP scanner: support for meta-code keywords, character encoding checking, detection of unclosed strings and invalid escape sequences, user-friendly error reporting, an abstract input source, and so on. It is also quite fast, as its scan rate is about 100 KB/s of PHP source on a 2.6 GHz Pentium E5300 processor.

expressions\Expression is the parser of the expressions. All the classes under the same namespace are related to this task. The most complicated part is the correct handling of variables, which is resolved by splitting the problem into three cases: parsing of non-existent variables (see UnknownVar), existing but not definitely assigned variables (see UnassignedVar), and existing and assigned variables (see AssignedVar). Each of these classes implements a parse() method that performs its own task. The distinction between existing but not assigned variables and existing and assigned variables comes from the static flow analysis that the program performs statement by statement (see Flow for more about how PHPLint performs the static analysis).
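The three cases can be made concrete with a small sketch of definite-assignment tracking across the two branches of an if/else. This is a common textbook approach and is not taken from the PHPLint source: the constants, the functions classifyVariable() and mergeBranches(), and the array-based variable table are all assumptions made for this example.

```php
<?php
// Hypothetical states mirroring the three cases named above.
const VAR_UNKNOWN = 'unknown';       // never seen before
const VAR_UNASSIGNED = 'unassigned'; // exists, not definitely assigned
const VAR_ASSIGNED = 'assigned';     // exists and definitely assigned

// Classifies a variable given a table of known variables, each mapped to
// a boolean "definitely assigned" flag.
function classifyVariable(array $known, string $name): string
{
    if (!array_key_exists($name, $known)) {
        return VAR_UNKNOWN;
    }
    return $known[$name] ? VAR_ASSIGNED : VAR_UNASSIGNED;
}

// Merges the tables left by the two branches of an if/else: a variable
// exists afterwards if either branch knows it, but it is definitely
// assigned only if both branches assigned it.
function mergeBranches(array $thenVars, array $elseVars): array
{
    $merged = [];
    foreach (array_keys($thenVars + $elseVars) as $name) {
        $merged[$name] = ($thenVars[$name] ?? false) && ($elseVars[$name] ?? false);
    }
    return $merged;
}

// Models: if ($c) { $a = 1; $b = 2; } else { $a = 3; }
$after = mergeBranches(['a' => true, 'b' => true], ['a' => true]);
echo classifyVariable($after, 'a'), "\n"; // prints: assigned
echo classifyVariable($after, 'b'), "\n"; // prints: unassigned
echo classifyVariable($after, 'z'), "\n"; // prints: unknown
```

In this model a parser would pick a different handler depending on the state returned, which is the role the text assigns to UnknownVar, UnassignedVar and AssignedVar.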

types\Type is the abstract base class that represents a type of data. Some derived classes implement the well-known PHP types (boolean, int, float, string, ...). The UnknownType class is special, as it represents something whose type cannot be determined by PHPLint. Generally, when PHPLint cannot determine the exact type of an expression, an error is logged in the report and a value of this type is set as the result of the evaluation of the expression. Later, PHPLint avoids reporting further errors whenever that same unknown value is found again in the source.
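The way an unknown type keeps one mistake from producing a cascade of messages can be sketched as follows. Types are plain strings here, and the typeOfSum() function and its strict rule for "+" are assumptions made only for this example; PHPLint itself uses the Type class hierarchy described above.

```php
<?php
// Hypothetical type checker for "+" over type names given as strings.
// The string 'unknown' plays the role of UnknownType.
function typeOfSum(string $left, string $right, array &$errors): string
{
    // An unknown operand was already reported once: propagate silently.
    if ($left === 'unknown' || $right === 'unknown') {
        return 'unknown';
    }
    $numeric = ['int', 'float'];
    if (!in_array($left, $numeric, true) || !in_array($right, $numeric, true)) {
        // First detection: log the error and yield the unknown type.
        $errors[] = "cannot add $left and $right";
        return 'unknown';
    }
    return ($left === 'float' || $right === 'float') ? 'float' : 'int';
}

$errors = [];
$t1 = typeOfSum('int', 'string', $errors); // reported once, yields 'unknown'
$t2 = typeOfSum($t1, 'int', $errors);      // no second error
echo $t2, ' ', count($errors), "\n";       // prints: unknown 1
```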

The Result class stores a type/value pair, such as may result from the evaluation of an expression. Several of its methods implement the PHP operators, so that some expressions, at least those involving only literal values and constants, can be calculated statically.
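A type/value pair with one operator can be sketched as follows. The constructor, the public fields, the plus() method and the use of null for "value not known" are assumptions made for this example and do not reproduce the real Result class.

```php
<?php
// Hypothetical type/value pair; a null value means "not known statically".
final class Result
{
    public $type;
    public $value;

    public function __construct(string $type, $value = null)
    {
        $this->type = $type;
        $this->value = $value;
    }

    // Sketch of the "+" operator, restricted to int operands.
    public function plus(Result $other): Result
    {
        if ($this->type !== 'int' || $other->type !== 'int') {
            return new Result('unknown'); // type cannot be determined here
        }
        if ($this->value === null || $other->value === null) {
            return new Result('int'); // type known, value not
        }
        return new Result('int', $this->value + $other->value);
    }
}

$sum = (new Result('int', 2))->plus(new Result('int', 3));
echo $sum->type, ' ', $sum->value, "\n"; // prints: int 5
```

When both operands carry a value, the result carries one too, which is how an expression made only of literals and constants can be evaluated without running the program.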

Finally, statements\Statement and all the classes under this namespace perform the parsing of the actual PHP source code. Some of these classes can be easily recognized, like IfStatement, WhileStatement, ForStatement, and so on. Each of these classes provides a parse() method that parses the corresponding statement.

Conclusions

At the time I'm writing these notes, PHPLint 2 is nearly complete. Many aspects of the old program required a complete redesign, taking advantage of the modern features of an OOP language like PHP. Moreover, parts of the already existing PHPLint standard library (notably, containers and file I/O) found their natural application in the PHPLint 2 program, making the source more readable, reusable and maintainable.

PHPLint 2, with more than 170 classes and hundreds of methods, is quite a large PHP project. While developing PHPLint 2, PHPLint 1 played an important role in helping to master such a big, complex project. In fact, experience has shown that once the source passes validation, most of the time it is also ready to work perfectly. This demonstrates how important validation is, and how much it contributes to the quality of a program.


Umberto Salsi