example

Développement d'un compilateur pour VSPL

Rappel - la syntaxe de VSPL:

<programme> --> begin <liste d'instructions> end
<liste d'instructions> --> <instruction> ; <liste d'instructions>
<liste d'instructions> --> <instruction>
<instruction> --> <id> := <expression>
<expression> --> <id> + <id>
<expression> --> <id> - <id>
<expression> --> <id>

Spécification du compilateur Java:

Le programme Java doit lire dans un fichier un programme écrit en VSPL et après l'avoir analysé, le programme imprime:

un message d'erreur si une erreur syntaxique ou lexicale est découverte
les valeurs des expressions d'affectation du programme

Solution

Svp., lisez de nouveau les sections suivantes dans les notes de cours: "Diagrammes syntaxiques" et "Exemple: les expressions - différentes grammaires et règles pour l'évaluation des attributs sémantiques" dans la section sur le "Attributs sémantiques".

Nous utilisons l'analyse récursive descendante. Cela veut dire que le compilateur contient une procédure récursive pour chaque non-terminal de la grammaire: <programme>, <liste d'instructions>, <instruction> et <expression>. Mais le non-terminal <liste d'instructions> peut être éliminé comme suit: On note que les règles (2) et (3) peuvent être remplacées par la règle (en BNR étendu avec l'étoile de Kleene): <liste d'instructions> --> ( <instruction> ; )* <instruction> . Ensuite on peut substituer la partie droite pour <liste d'instructions> dans la règle (1) qui devient alors: <programme> --> begin ( <instruction> ; )* <instruction> end .

Puis, on peut écrire des automates (récursive), un pour chaque non-terminal, comme suit (nous avons déjà vu de tels automates récursive dans la discussion de la grammaire des expressions régulières au début du chapître sur l'analyse syntaxique): - note: on utilise ici les noms anglais: <program> pour <programme> et <statement> pour <instruction> -

program statement expression

Ceci peut aussi être transformé en graphes syntaxique (voir ici). Ces automates (ou les graphes syntaxiques correspondants) peuvent être facilement traduits en code Java, comme montré par la classe Syner dans le programme Java appellé SimpleCompiler. Similairement, on peut écrire des graphes syntaxique pour l'analyse lexical et construire un programme Java correspondant (voir la classe Lexer dans le SimpleCompiler). Et voici un programme exemple et la sortie produite par le compilateur en analysant ce programme.

Commentaires sur le programme du SimpleCompiler (en anglais):

Besides a simple class SimpleCompiler, which represents the "main" program, there are essentially two classes, one representing the lexical analyzer and one representing the syntax analyzer.
The lexical analyzer class exports the definition of the tokens for this language (constant definitions) and has two exported variables (attributes): token which always contains the next token from the input file, and idName which contains the name of the identifier if the token is an identifier. It also exports the procedure getNextToken() which analyses the next token in the input file and puts it into the token variable. When an end of file has been encountered the token EOF is returned. The lexical analyzer also has a hidden variable c which always contains the next character (that is, the last character read from the input file). The procedure nextChar is used to get the next character from the input file. It also prints the character on the system output file if the option makeCopyOnOutput is selected.
The syntax analyzer works as explained in Sebesta's book. There is an analysis procedure for each nonterminal shown in the syntax graph. The procedure startAnalysis, which is called by the main program, will call the syntax analysis procedure for the <program> nonterminal. It also handles the end of the program. When the input file has been successfully analyzed, a subsequent call on the getNextToken of the lexical analyzer checks whether there are more non-space characters in the input file. Normally, this call should result in the return of the EOF token. If this is not the case, then there is some additional input available and an appropriate error message is prepared.
Please, note that there is a very simple symbol table in the syntax analyzer, which stores the names and values of variables. The name of the variable is used as the key for accessing the information.
The procedure parseExpression returns an integer number which represents the value of the expression. This corresponds to a semantic attribute of the expression which is the value of the expression in this case. If a nonterminal has several semantic attributes, then it is more convenient to let the parsing procedure return an instance of a object that has an attribute for each semantic attribute of the nonterminal.