This assignment has a total value of 250 points. Electronic copies should be submited via the usual Computer Science Department electronic assignment submission sustem. For each question, include program documentation and a set of test inputs and outputs in what you hand in.
(a) Write a regular expression for all integer constants in C. Make sure to take into account different bases and the different suffixes the constants can ahve.
(b) Write a regular grammar for (a).
(c) Write a deterministic finite automaton for (a).
(a) Construct non-deterministic finite automata for the following regular expressions. Show the sequence of moves made by each in processing the input string ababbab.
(a.1) (a|b)*
(a.2) (a*|b*)*
(a.3) ((epsilon | a)b*)*
(a.4) (a|b)*abb(a|b)*
(b) Convert the NFAs of part (a) to DFAs using the subset construction. Show the sequence of moves made by each in processing the input string ababbab.
Consider the grammar
S -> ( L ) | a L -> L , S | S
(a) What are the terminals, nonterminals and start symbol?
(b) Find parse trees for the following sentences:
(c) Construct a leftmost derivation for each of the sentences in (b).
(d) Construct a rightmost derivation for each of the sentences in (b).
(e) What language does this grammar generate?
Construct LR Parsing tables for the grammar of Question 3.
Using the tool JFLex, write a scanner for C. A complete example of a scanner for C using lex was posted to net.sources 20 years ago. See the file scan.l in the directory http://www.csd.uwo.ca/~watt/home/courses/2010-11/cs4447a/parts/ansi-c-grammar.
The output of your scanner should be a stream of C tokens, containing enough information to be used in Question 3. You may use your own class to provide the token class numbers, or use the CToken class provided in the directory http://www.csd.uwo.ca/~watt/home/courses/2010-11/cs4447a/parts/classes.
Write a small driver program to test your parser on C Code. The driver program should scan and print the tokens, but (of course) not form parse trees.
Using the tool CUP, write a parser for C. A complete example of a parser for C using yacc was part of the same net.sources posting. See the file gram.y.
If you wish, you may simplify the grammar by assuming all types are given as int. This should cut the size of your grammar to about half.
The output of your parser should be a parse tree object. You may use your own parse tree class, or you may use the CCode class in the directory http://www.csd.uwo.ca/~watt/home/courses/2010-11/cs4447a/parts/classes.