UNIVERSITY OF NORTHERN BRITISH COLUMBIA

Winter 2002
Computer Science


            Scanner

                                                        

1. General points

  • You must follow the lexical conventions provided in Appendix A.1 of your text.
  • It is up to you to decide how you will handle errors, but ensure that any conventions you introduce are included in your report.
  • Also note that the grammar does not impose a static limit on the length of a token string, or a static limit on the length of a line.
  • Comments should simply be ignored by the scanner, and no tokens should be produced for them.
  • Your scanner should accept files with at least the following extensions: .c-, .cminus, .mns
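
The extension requirement in the last point can be checked before the file is opened. The following is only a minimal sketch in Python; the helper name has_accepted_extension is hypothetical, and treating extensions as case-sensitive is an assumption, not something the handout specifies.

```python
# Extensions listed in the handout. The tuple name is an assumption.
ACCEPTED_EXTENSIONS = (".c-", ".cminus", ".mns")


def has_accepted_extension(filename: str) -> bool:
    """Return True if filename ends with one of the accepted extensions.

    The comparison is case-sensitive (an assumption of this sketch).
    """
    return filename.endswith(ACCEPTED_EXTENSIONS)
```

For example, has_accepted_extension("nfact.mns") is true, while a plain "nfact.c" is rejected. Whether a rejected file produces an error or a warning is one of the conventions to document in the report.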

 

      
2. Suggested Readings

  1. Chapter 1, Sections 1.7-1.8
  2. Chapter 2

You will also find your CPSC340 textbook useful for this stage.
 

            

3. Input to build your scanner

Tokens:

  Special symbols : + - * / < <= > >= == != = ; , ( ) [ ] { } /* */

  Keywords        : else if int return void while

  Identifiers     :
                      ID = letter letter*
                      letter = a|...|z|A|...|Z

  Numerals        :
                      NUM = digit digit*
                      digit = 0|1|...|9

Note :

  • Lower and upper case letters are distinct.
  • White space consists of blanks, newlines, and tabs. White space is ignored except that it must separate IDs, NUMs, and keywords.
  • Comments are surrounded by the usual C notation /* ... */. Comments can be placed anywhere white space can appear (that is, comments cannot be placed within tokens) and may span more than one line. Comments may not be nested.
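
The token definitions and notes above can be turned into a small scanner loop. What follows is a rough sketch in Python using regular expressions rather than a hand-written DFA. The names TOKEN_RE and scan, and the kind labels (KEYWORD, SYM, ERROR), are all assumptions made for illustration. Error handling is deliberately simplistic: any character that fits no token is reported as ERROR, and an unterminated comment is not detected specially.

```python
import re

KEYWORDS = {"else", "if", "int", "return", "void", "while"}

# Alternatives are tried left to right, so comments come before "/" and
# two-character symbols come before their one-character prefixes.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>/\*.*?\*/)                    # non-nested, may span lines
  | (?P<WS>[ \t\n]+)                          # blanks, tabs, newlines
  | (?P<ID>[A-Za-z]+)                         # letter letter*
  | (?P<NUM>[0-9]+)                           # digit digit*
  | (?P<SYM><=|>=|==|!=|[-+*/<>=;,()\[\]{}])  # special symbols
  | (?P<ERROR>.)                              # anything else
""", re.VERBOSE | re.DOTALL)


def scan(source):
    """Yield (line_number, kind, lexeme) for each token in source."""
    line = 1
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind not in ("COMMENT", "WS"):
            # Keywords are case-sensitive, so "If" stays an ID.
            if kind == "ID" and text in KEYWORDS:
                kind = "KEYWORD"
            yield line, kind, text
        # Comments and white space still advance the line counter.
        line += text.count("\n")
```

Because the source is matched as a single string, this sketch places no static limit on token length or line length. A multi-line comment produces no token but still advances the line number, which keeps the numbers reported for later tokens correct.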

4. Token standard


This token format MUST BE FOLLOWED when implementing the Cminus scanner.

Special Symbol    Token
+                 PLUS
-                 MINUS
*                 TIMES
/                 DIV
<                 LT
<=                LTEQ
>                 GT
>=                GTEQ
==                EQ
!=                NEQ
=                 ASSIGN
;                 SEMI
,                 COMMA
(                 LPAREN
)                 RPAREN
[                 LSQR
]                 RSQR
{                 LCRLY
}                 RCRLY


Table 1: special symbols and the corresponding tokens generated by the scanner.


Keyword           Token
else              ELSE
if                IF
int               INT
return            RETURN
void              VOID
while             WHILE


Table 2: keywords and the corresponding tokens generated by the scanner.
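
One possible way to encode Tables 1 and 2 is as lookup tables keyed by lexeme. The sketch below is in Python; the token names come from the tables, while the dictionary and function names (SYMBOL_TOKENS, KEYWORD_TOKENS, token_name, match_symbol) are assumptions made for illustration.

```python
# Table 1: special symbols and their token names.
SYMBOL_TOKENS = {
    "+": "PLUS", "-": "MINUS", "*": "TIMES", "/": "DIV",
    "<": "LT", "<=": "LTEQ", ">": "GT", ">=": "GTEQ",
    "==": "EQ", "!=": "NEQ", "=": "ASSIGN",
    ";": "SEMI", ",": "COMMA",
    "(": "LPAREN", ")": "RPAREN",
    "[": "LSQR", "]": "RSQR",
    "{": "LCRLY", "}": "RCRLY",
}

# Table 2: keywords and their token names.
KEYWORD_TOKENS = {
    "else": "ELSE", "if": "IF", "int": "INT",
    "return": "RETURN", "void": "VOID", "while": "WHILE",
}


def token_name(lexeme):
    """Return the token name for a lexeme, or None if it is in neither table."""
    return SYMBOL_TOKENS.get(lexeme) or KEYWORD_TOKENS.get(lexeme)


def match_symbol(text, pos):
    """Longest match at pos: try a two-character symbol before a one-character one.

    Returns (lexeme, token_name), or None if no special symbol starts at pos.
    """
    for length in (2, 1):
        lexeme = text[pos:pos + length]
        if lexeme in SYMBOL_TOKENS:
            return lexeme, SYMBOL_TOKENS[lexeme]
    return None
```

Trying the longer lexeme first is what makes "<=" come out as LTEQ rather than LT followed by ASSIGN. A lone "!" matches nothing here, so how it is reported is left to your own error conventions.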

 

5. Scanner I/O structure

 

   
6. Sample Output


The scanner output must include at least the information shown below. Feel free to make any additions that you deem necessary. Two examples of scanner output for the program nfact.mns are shown.


Program nfact.mns
[EFS]>cat nfact.mns
/**************
  recursive N factorial.  Used
  for testing recursion.  Also
  used to make sure that the
  answer from recursive n! is the
  same as iterative n!.
**************/

/* recursively call n!*/
int nfact(int n)
{
  if( n == 0 )
  {
    return 1;
  }
  else
  {
    return(n*nfact(n-1));
  }

  return 0;
}

void main(void)
{
  int n;
  n = input();
  n = nfact(n);
  output(n);
}
[EFS]>

 


Example #1
[EFS]>cminus nfact.mns

CMINUS COMPILATION: nfact.mns
IN NO PARSE
        10: reserved word: int
        10: ID, name= nfact
        10: (
        10: reserved word: int
        10: ID, name= n
        10: )
        11: {
        12: reserved word: if
        12: (
        12: ID, name= n
        12: ==
        12: NUM, val= 0
        12: )
        13: {
        14: reserved word: return
        14: NUM, val= 1
        14: ;
        15: }
        16: reserved word: else
        17: {
        18: reserved word: return
        18: (
        18: ID, name= n
        18: *
        18: ID, name= nfact
        18: (
        18: ID, name= n
        18: -
        18: NUM, val= 1
        18: )
        18: )
        18: ;
        19: }
        21: reserved word: return
        21: NUM, val= 0
        21: ;
        22: }
        24: reserved word: void
        24: ID, name= main
        24: (
        24: reserved word: void
        24: )
        25: {
        26: reserved word: int
        26: ID, name= n
        26: ;
        27: ID, name= n
        27: =
        27: ID, name= input
        27: (
        27: )
        27: ;
        28: ID, name= n
        28: =
        28: ID, name= nfact
        28: (
        28: ID, name= n
        28: )
        28: ;
        29: ID, name= output
        29: (
        29: ID, name= n
        29: )
        29: ;
        30: }
        31: EOF
[EFS]>


Example #2
[EFS]>c- nfact.mns

*********C- COMPILATION: nfact.mns*********

Scanning the source file...
        10: reserved word: int
        10: ID, name= nfact
        10: (
        10: reserved word: int
        10: ID, name= n
        10: )
        11: {
        12: reserved word: if
        12: (
        12: ID, name= n
        12: ==
        12: NUM, val= 0
        12: )
        13: {
        14: reserved word: return
        14: NUM, val= 1
        14: ;
        15: }
        16: reserved word: else
        17: {
        18: reserved word: return
        18: (
        18: ID, name= n
        18: *
        18: ID, name= nfact
        18: (
        18: ID, name= n
        18: -
        18: NUM, val= 1
        18: )
        18: )
        18: ;
        19: }
        21: reserved word: return
        21: NUM, val= 0
        21: ;
        22: }
        24: reserved word: void
        24: ID, name= main
        24: (
        24: reserved word: void
        24: )
        25: {
        26: reserved word: int
        26: ID, name= n
        26: ;
        27: ID, name= n
        27: =
        27: ID, name= input
        27: (
        27: )
        27: ;
        28: ID, name= n
        28: =
        28: ID, name= nfact
        28: (
        28: ID, name= n
        28: )
        28: ;
        29: ID, name= output
        29: (
        29: ID, name= n
        29: )
        29: ;
        30: }
        31: End of File
[EFS]>
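
The listing lines in both examples follow one pattern: a line number, a colon, and a description of the token. A formatter along the following lines would reproduce them. This is a sketch in Python; the function name format_token, the kind labels, and the fixed eight-space indent are assumptions read off the sample output, not a required interface.

```python
def format_token(line, kind, lexeme):
    """Format one token in the style of the sample listings.

    kind is one of "KEYWORD", "ID", "NUM", "EOF", or anything else for
    special symbols (these labels are assumptions of this sketch).
    """
    if kind == "KEYWORD":
        body = f"reserved word: {lexeme}"
    elif kind == "ID":
        body = f"ID, name= {lexeme}"
    elif kind == "NUM":
        body = f"NUM, val= {lexeme}"
    elif kind == "EOF":
        body = "EOF"
    else:
        # Special symbols are printed as they appear in the source.
        body = lexeme
    # Eight spaces of indent, as in the sample output (assumed fixed).
    return f"        {line}: {body}"
```

For instance, format_token(10, "KEYWORD", "int") gives the first token line of Example #1. Example #2 prints "End of File" instead of "EOF", which shows that the exact wording of such lines is up to you.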


 

7. What to hand in


The write-up should be approximately 2 pages excluding diagrams, tables and code. The assignment should have all of the following sections:

  1. Description of the solution
    • A description of the main functions.
    • The DFA used for generating all tokens in Appendix A.1 (even if you did not use a DFA in your code).
  2. Bugs in Code
  3. Solutions to these Bugs
  4. Description of Conventions
    • State each error message and its meaning.
    • State any other conventions that you may have produced.
  5. Compiling and Running Cminus code
    • State all the modified files in this section of the report.
    • Show how you compile and run your Cminus code using descriptions and/or examples.
    • Show how you run a compiled Cminus program using descriptions and/or examples.
  6. Code
    • State all the modified files in this section of the report.
    • Include all code.
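
For the DFA requested in item 1, it can help to write the automaton down as a transition table before drawing it. The fragment below is a partial sketch in Python covering only ID, NUM, <, <=, = and ==; the state names and the helpers char_class and run_dfa are assumptions made for illustration, and a full DFA for Appendix A.1 would also need the remaining symbols and comments.

```python
def char_class(ch):
    """Map a character to the class used as a transition label."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return ch


# state -> {character class -> next state}
TRANSITIONS = {
    "START": {"letter": "IN_ID", "digit": "IN_NUM",
              "<": "SEEN_LT", "=": "SEEN_ASSIGN"},
    "IN_ID": {"letter": "IN_ID"},
    "IN_NUM": {"digit": "IN_NUM"},
    "SEEN_LT": {"=": "SEEN_LTEQ"},
    "SEEN_ASSIGN": {"=": "SEEN_EQ"},
    "SEEN_LTEQ": {},
    "SEEN_EQ": {},
}

# Accepting states and the token each one produces.
ACCEPTING = {
    "IN_ID": "ID", "IN_NUM": "NUM",
    "SEEN_LT": "LT", "SEEN_LTEQ": "LTEQ",
    "SEEN_ASSIGN": "ASSIGN", "SEEN_EQ": "EQ",
}


def run_dfa(text, pos=0):
    """Follow transitions from pos as far as possible.

    Returns (token, lexeme) if the final state is accepting, else None.
    """
    state, end = "START", pos
    while end < len(text):
        next_state = TRANSITIONS[state].get(char_class(text[end]))
        if next_state is None:
            break
        state, end = next_state, end + 1
    if state in ACCEPTING:
        return ACCEPTING[state], text[pos:end]
    return None
```

Each key of TRANSITIONS corresponds to a circle in the diagram and each inner entry to a labelled arrow, so the table can be checked against the drawing state by state.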

 

8. Scanner Testing

The test code will only be made available after this part of the project has been marked, so that you can correct your code. It is your responsibility to develop your own tests and make sure this part of the project works. Note that you do not submit tests that you create yourself.

Scanner Tests (this link is only made active after this part of the project has been marked)