CSE340 Spring 2023 Project 2
Due: March 21st 2023 by 11:59pm MST on GradeScope
Use this information only for good ; never for evil. Do not expose to fire . Do n ot o perate h eavy equipment after reading, may cause drowsiness . . . The standard is written in English . If you have trouble understanding a particular section, read it again and again and again . . . Sit up straight. Eat your vegatables. Do not mumble. – Pascal Standard ISO 7185:1990
1 General Advice
You should read the description carefully. Multiple readings are recommended. You will not be able to understand everything on a first reading. Give yourself time by starting early and taking breaks between readings. You will digest the requirements better.
- The answers to many of your questions can be found in this document.
- Do no start coding until you have a complete understanding of the requirements.
- Ask for help early. I said it before and I say it again: I and the TAs can save you a lot of time if you ask for help early. You can get help with how to approach the project to make the solution easier and have an easier time implementing it. If you spent too many hours on project 1, you should have asked for help early on. That said, when you ask for help, you should be prepared and you should have done your part.
- Have fun!
2 Overview
In this project, you are asked to write a C++ program that reads a description of a context free grammar, then, depending on the command line argument passed to the program, performs one of the following tasks (see Section 6.3 for more details on how to run the program with command line arguments):
- print the list of non-terminals followed by the list of terminals in the order in which they appear in the grammar rules,
- find useless symbols in the grammar and remove rules with useless symbols,
- calculate FIRST sets,
- calculate FOLLOW sets , or
- determine if the grammar has a predictive parser. We provide you with code to read the command line argument into an integer variable. Depending on the value of the variable, your program will invoke the appropriate functionality. The rest of the document is organized as follows:
- Section 3 describes the input format
- Section 4 describes what the input represents
- Section 5 describes what the output of your program should be for each task. This is the largest section of the document.
- Section 6 discusses command line arguments and how you should run and test your program.
- Section 7 describes the grading scheme.
- Section 8 addresses some submission concerns.
3 Input Format
The following context-free grammar specifies the input format:
Rule-list ! Rule Rule-list | Rule
Id-list ! ID Id-list | ID
Rule ! ID ARROW Right-hand-side STAR
Right-hand-side ! Id-list |
The input consists of a rule list. Each rule has a lefthand side which is an ID, which is followed by an arrow and is followed by a sequence of zero more IDs and terminated with a STAR. The meaning of the input is explained in the Semantics section below. The tokens used in the above grammar description are defined by the following regular expressions:
ID = letter (letter + digit)*
STAR = '*'
HASH = #
ARROW = ->
Where digit is the digits from 0 through 9 and letter is the upper and lower case letters a through z and A through Z. Tokens are case-sensitive. Tokens are space separated and there is at least one whitespace character between any two successive tokens. We provide a lexer with a geToken() function to recognize these tokens. You should use the provided lexer in you solution.
4 Semantics
...