<html>

<head><script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
<!-- MyFirstUnitAd -->
<ins class="adsbygoogle"
     style="display:inline-block;width:970px;height:250px"
     data-ad-client="ca-pub-5778386704669218"
     data-ad-slot="1503492166"></ins>
<script>
(adsbygoogle = window.adsbygoogle || []).push({});
</script>

<meta http-equiv="Content-Language" content="zh-cn">
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<title>CFGReader</title>
</head>

<body>



<p align="left"><span lang="en-ca"><font size="6" color="#FF0000"><b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CFG 
Reader(</span>LR(1)-DFA<span lang="en-ca">) </span> </b></font></p>

<div align="left">
  <pre><b><font color="#ff0000" size="5">A. Eleventh Edition</font></b></pre>
</div>
<div align="left">
  <pre><b>This is eleventh <span lang="en-ca">edition of my </span></b><b><span lang="en-ca">CFG Reader which stands for Context-Free-Grammar Reader. In this edition, </span></b></pre>
</div>
<div align="left">
  <pre><b><span lang="en-ca">I </span>corrected several bugs in previous edition by constructing a DFA.</b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">B</span>.<span lang="en-ca"><a name="problem"></a>The problem</span></font></b></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="4" color="#FF0000"><b>The </b></font></span><b><span lang="en-ca"><font size="4" color="#FF0000">original grammar from teacher:</font></span></b></pre>
</div>
<div align="left">
  <font FACE="CMSLTT10">
  <p ALIGN="LEFT">S -&gt; ( S ) S | e<br>
  <br>
  <br>
  <b><font color="#FF0000"><span lang="en-ca">The  </span>DFA along with a LR(0) 
  parsing table constructed by my class Grammar is like following<span lang="en-ca">:</span></font></b></font></div>
<div align="left">
  <font FACE="CMSLTT10">
  <pre>
</pre>
  <pre>state no.0:
START ==&gt;  . S $
S ==&gt;  . ( S ) S
S ==&gt;  . e

state no.1:
START ==&gt; S  . $

state no.2:
S ==&gt;  . ( S ) S
S ==&gt;  . e
S ==&gt; (  . S ) S

state no.3:
S ==&gt; ( S  . ) S

state no.4:
S ==&gt;  . ( S ) S
S ==&gt;  . e
S ==&gt; ( S )  . S

state no.5:
S ==&gt; ( S ) S  .

S ==&gt; ( S ) S  | e
START ==&gt; S $
the following is LR table
    S,(,),e,START,$,
0|  1,s2,-1,-1,-1,-1,
1|  -1,-1,-1,-1,-1,-1,
2|  3,s2,-1,-1,-1,-1,
3|  -1,-1,s4,-1,-1,-1,
4|  5,s2,-1,-1,-1,-1,
5|  -1,r0,r0,r0,-1,r0,
Press any key to continue</pre>
  <pre>

</pre>
  </font></div>
<div align="left">
  <b><font color="#ff0000" size="5"><span lang="en-ca"><a name="explain"></a>C</span>.<span lang="en-ca">The
  </span></font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>idea of 
  program</b></font></span></div>
<div align="left">
  <pre><b>Constructing a DFA is a boring, routine job. Please note when I write this explanation, I just recalled that my</b></pre>
</div>
<div align="left">
  <pre><b>program has one more bug to fix: the epsilon at end a production rule should be removed, or ignored, whatever you</b></pre>
</div>
<div align="left">
  <pre><b>may call it. But the reduce-reduce-conflict and shift-reduce-conflict must be reported. I need to fix it next </b></pre>
</div>
<div align="left">
  <pre><b>edition.</b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5">D.<span lang="en-ca"><a name="Method"></a>The </span>major functions</font></b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">E</span>.</font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>Further improvement</b></font></span></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">F</span>.</font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>File listing</b></font></span></pre>
</div>
<div align="left">
  <pre><b><font size="3"><span lang="en-ca">1. CFGReader.h</span></font></b></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>2. </b></font></span><b><font size="3"><span lang="en-ca">CFGReader</span></font></b><span lang="en-ca"><font size="3"><b>.cpp  </b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>3. Grammar.h</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>4. Grammar.cpp</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>5. parser.h </b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>6. parser.cpp</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>7. scanner.h</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>8. scanner.cpp</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>9. errorno.h</b></font></span></pre>
  <pre><span lang="en-ca"><font size="3"><b>10. errorNo.cpp</b></font></span></pre>
  <pre><span lang="en-ca"><font size="3"><b>11. initialize.cpp</b></font></span></pre>
  <pre><span lang="en-ca"><font size="3"><b>12. main.cpp (main)</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: CFGReader.h</b></font></span></pre>
</div>
<pre>#include &quot;Grammar.h&quot;


class CFGReader
{
private:
	char buffer[BufferLength];
	FILE* stream;
	void newRule(const char* str);
	void readRule();
	void addRule(const char* str);
	int addToken(const char* str, bool isTerminal=true);
	void printRule(int index);
public:
	void readFromFile(const char* fileName);
	void optimize();
	void calculateLookAhead();
	void print();
};

</pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: CFGReader</b></font></span><font size="3" color="#FF0000"><b>.cpp </b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;CFGReader.h&quot;

using namespace std;

Grammar grammar;

const char* deliStr=&quot; |\n&quot;;

//this would only remove the immediate left recursion
//and should I expand it to remove all kinds of left-recursion?


void CFGReader::readFromFile(const char* fileName)
{
	if ((stream=fopen(fileName, &quot;r&quot;))==NULL)
	{
		cout&lt;&lt;&quot;cannot open file &quot;&lt;&lt;fileName&lt;&lt;endl;
	}
	else
	{		
		readRule();		
	}
}

void CFGReader::readRule()
{
	char* ptr=buffer;
	char* hold;
	int count=0;

	while (!feof(stream))
	{
		count=0;
		fgets(buffer, BufferLength-1, stream);
		ptr=buffer;	
		
		while ((ptr=strtok(ptr, &quot; \n&quot;))!=NULL)
		{
			if (strcmp(ptr, &quot;|&quot;)!=0)
			{
				//test the first two token to see if old rule continues
				if (count==0)				
				{
					hold=ptr;
				}
				if (count==1)
				{
					if (strcmp(&quot;-&gt;&quot;, ptr)==0)
					{			
						/*
						curIndex=addToken(hold, false);//the index of cur token	
						grammar[curIndex]-&gt;terminal=false;
						newRule();	
						*/
						newRule(hold);
					}
					else
					{								
						addRule(hold);
						addRule(ptr);															
					}
				}
				if (count&gt;=2)//always add
				{
					addRule(ptr);
				}
				count++;
			}
			else
			{
				//this is an alternative production rule
				newRule(NULL);						
			}
			ptr=NULL;
		}		
	}
}

void CFGReader::newRule(const char* str)
{
	grammar.newRule(str);
}

void CFGReader::optimize()
{
	grammar.optimize();
}

void CFGReader::calculateLookAhead()
{
	grammar.calculateLookAhead();
}

void CFGReader::addRule(const char* str)
{
	grammar.addRule(str);
}

int CFGReader::addToken(const char* str, bool isTerminal)
{
	return grammar.addToken(str, isTerminal);
}

void CFGReader::printRule(int index)
{
	grammar.printRule(index);
}

void CFGReader::print()
{
	grammar.print();
	//grammar.printToken(true);
	grammar.buildTable(false);
	grammar.printTable(false);
	
	//grammar.printAllRules();
}



</pre>
<pre>
<span lang="en-ca"><font size="3" color="#FF0000"><b>file name: Grammar.h</b></font></span></pre>
<pre>#ifndef GRAMMAR_H
#define GRAMMAR_H
#include &lt;bitset&gt;
#include &lt;iostream&gt;
using namespace std;



const int BufferLength=256;
const int MaxTokenCount=100;
const int MaxRHSCount=40;
const int MaxRuleCount=200;
const int MaxGrammarTokenLength=10;
const int MaxProduction=10;
const int TokenTypeCount=38;
const int MaxStateCount=200;
const int MaxItemCount=500;

//a macro to ease the job
//#define BitSet (bitset&lt;MaxItemCount&gt;)


struct GrammarToken
{
	//bool isMeta;
	bool terminal;
	bool isNull;
	char* name;
	int production[MaxProduction];//pointer to the production it gives
	int count;
	int firstCount;
	int followCount;
	int follow[MaxTokenCount];
	int first[MaxTokenCount];
};

struct Item
{
	int varIndex;
	int rulePos;
	int dotPos;	
};

bool operator == (Item&amp; first, Item&amp; second);
/*
Item&amp; operator= (Item&amp; first, Item&amp; second)
{
	first.dotPos=second.dotPos;
	first.rulePos=second.rulePos;
	first.varIndex=second.varIndex;
	return first;
}
*/

class BitSet: public bitset&lt;MaxItemCount&gt;
{
private:
	int current;
public:
	BitSet&amp; operator=(const BitSet&amp; theSet);
	int next();
	BitSet();
	void restart(){current=-1;}
};


class Grammar
{
	//to allow Parser to access Grammar
	friend class Parser;
	friend class LRParser;	
private:	
	int LRTable[MaxStateCount][MaxTokenCount];
	int terminalCount;
	int nonTermCount;
	GrammarToken* token[MaxTokenCount];
	int tokenCount;
	int curIndex;//the index of current token
	int curPos;//the position at production rule
	//MaxRuleCount&gt;=MaxVariableCount!!!
	//the last one in each rule is reserved for the length
	int production[MaxRuleCount][MaxRHSCount+1];
	
	int prodIndex;//pointing to current production, initialized to -1
	int checkRecursion(int tIndex, int curToken);
	void grammarError(int theVar, int theToken, int rule1, int rule2);
	void addFirstIntoTable(int theVariable, int theFirst, int theRule);
	void addFollowIntoTable(int theVariable, int theRule);
	void addBeginEnd();
	void addEpsilonForToken(int tIndex);
	bool addIntoFirst(int target, int source);
	bool addFirst(int target, int source);
	void shiftRule(int ruleIndex, bool Left2Right, int offset);
	void addAtEnd(int ruleIndex, int toAdd);
	void addRuleForToken(int tIndex, int ruleIndex);
	int copyRule(int source, int target);//return the position of -1
	void replaceRule(int curToken, int ruleIndex);
	int findMetaSymbol(int&amp; begin, int&amp; end);
	void initialize();
	void doReplaceMetaSymbol(int ruleIndex, int begin, int end);
	void removeRuleFromToken(int tIndex, int ruleIndex);
	void checkEpsilon(int ruleIndex);
	void removeImmediate(int tIndex, int ruleIndex);
	void doAddRule(int tIndex);
	void commonLeftFactor();
	int findCommonFactor(int tokenIndex, int ruleIndex);
	void removeLeftRecursion();
	void doCommonFactor(int tokenIndex, int ruleIndex, int index);
	int forgeToken(int index);//create a new variable
	void epsilonRule(int ruleIndex);
	bool isRedundant(int tIndex);
	void replaceMetaSymbol();
	void calculateFirst();
	void calculateFollow();
	void calculateNull();
	bool Null(int tIndex);
	bool addFollow(int target, int source);
	bool addIntoFollow(int target, int source);
	bool addFollowIntoFollow(int target, int source);
    void removeRedundant();
	void removeToken(int tIndex);
	void LRoptimize();
	void LLoptimize();
	void buildLLTable();
	void buildLRTable();
	int ruleLen(int ruleIndex);	
	void calculateProperty();

	int item2Rule(int itemIndex);
	int item2Var(int itemIndex);
	int shift2Mask(int input);
	int reduce2Mask(int input);
	void printDFA();
	bool* expandFinished;
	int mask2State(int input);
	bool isReduce(int input);
	bool isShift(int input);
	//items-related methods
	void initializeDFA();
	void uninitializeDFA();
	void constructDFA();
	//void addItems(int itemIndex, bitset&lt;MaxStateCount&gt;* theSet);
	int addItem(Item&amp; theItem);
	int itemCount;
	int stateCount;
	Item**items;
	//int addState(bitset
	void addSonItem(int tIndex, BitSet&amp; theSet);
	int createState(const BitSet&amp; theSet);
	BitSet** sets;
	int createItem(int theVar, int theNo, int position);
	void addClosure(int itemIndex, BitSet&amp; theSet);
	void expandState(int stateIndex);
	int findNextTokenOfItem(int itemIndex);
	void doExpandState(int itemIndex, int tIndex, BitSet&amp; theSet);
	void add2Table(int stateIndex, int targetState, int tIndex);//this is shift
	void add2Table(int stateIndex, int ruleIndex);//this is reduce
public:
	Grammar();
	void buildTable(bool forLLGrammar=true);
	void optimize(bool forLLGrammar=true);
	void printRule(int index);
	void printRule(int tIndex, int rulePos, int dotPos);
	void print();
	void printTable(bool isLL=true);
	void printAllRules();
	void printToken(bool onlyVar=false);
	int varCount();
	int termCount();
	void addRule(const char* str);
	void newRule(const char* str);//this is an internal method, not suitable for outside
	GrammarToken* operator[](int index);
	int addToken(const char* str, bool isTerminal=true);
	GrammarToken* createToken(const char* theName, bool isTerminal);
	int tokenIndex(const char* str);
	void calculateLookAhead();
};

#endif</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: Grammar</b></font></span><font size="3" color="#FF0000"><b>.cpp </b></font></pre>
<pre>#include &lt;iostream&gt;
#include &lt;iomanip&gt;
#include &quot;errorNo.h&quot;
#include &quot;Grammar.h&quot;

using namespace std;
/*
#define REDUCEMASK			0xd10000000
#define SHIFTMASK			0xd20000000
#define STATEMASK			0xd01111111
#define SHIFTREDUCEMASK		0xd30000000
*/

const  int REDUCEMASK  =			268435456;
const  int SHIFTMASK	 =		    536870912;
const  int STATEMASK	  =	    	268435455;  
const  int SHIFTREDUCEMASK =	    805306368;
const  int ACCEPTANCE		=       SHIFTREDUCEMASK;


const char* emptyStr=&quot;e&quot;;
const char* optionBegin=&quot;{&quot;;
const char* optionEnd=&quot;}&quot;;
const char* startStr=&quot;START&quot;;
const char* endStr=&quot;$&quot;;
int startSymbolIndex=-1;
int stackBottomIndex=-1;
int beginIndex=-1;
int endIndex=-1;
int emptyIndex=-1;
int epsilonIndex=-1;

int table[MaxTokenCount][MaxTokenCount];

char* terminalStr[TokenTypeCount]=
{
	//GENERAL TYPE 5
	&quot;i&quot;, &quot;n&quot;, &quot;c&quot;, &quot;COMMENT&quot;, &quot;ERROR&quot;,
	//THE FOLLOWING ARE SYMBOL TYPE	18
	&quot;(&quot;, &quot;)&quot;, &quot;;&quot;, &quot;+&quot;, &quot;-&quot;, &quot;*&quot;, 
    &quot;/&quot;, &quot;:=&quot;, &quot;&lt;&quot;, &quot;&gt;&quot;, &quot;=&quot;, &quot;&lt;=&quot;,
	&quot;&gt;=&quot;, &quot;!=&quot;, &quot;[&quot;, &quot;]&quot;, &quot;,&quot;, 
	&quot;:&quot;, 
	//THE FOLLOWING ARE RESERVED TYPE 15
	&quot;begin&quot;, &quot;end&quot;, &quot;program&quot;, &quot;variables&quot;,&quot;integer&quot;, &quot;array&quot;, &quot;char&quot;, 
	&quot;module&quot;, &quot;if&quot;, &quot;then&quot;, &quot;else&quot;, &quot;loop&quot;, &quot;exit&quot;, &quot;read&quot;, &quot;write&quot;
};

//these are &quot;hash-table-like&quot; tables
//this is exactly the opposite of below
int token2type[MaxTokenCount];
//this is to translate tokenType in scanner to token-index of grammar
int type2token[MaxTokenCount];

int matchTokens(const char* str)
{
	for (int i=0; i&lt;TokenTypeCount; i++)
	{
		if (strcmp(terminalStr[i], str)==0)
		{
			return i;
		}
	}
	return -1;
}



void Grammar::calculateProperty()
{
	terminalCount=nonTermCount=0;
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (token[i]-&gt;terminal)
		{
			terminalCount++;
			continue;
		}
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			int len;			
			len=ruleLen(token[i]-&gt;production[j]);
			production[token[i]-&gt;production[j]][MaxRHSCount]=len;
		}
		nonTermCount++;
	}
}
			

void Grammar::addEpsilonForToken(int tIndex)
{
	if (epsilonIndex==-1)
	{
		epsilonRule(++prodIndex);
		addRuleForToken(tIndex, prodIndex);
	}
	else
	{
		addRuleForToken(tIndex, epsilonIndex);
	}
}

void Grammar::printTable(bool isLL)
{
	/*
	cout&lt;&lt;&quot;         &quot;;
	for (int i=0; i&lt;tokenCount; i++)
	{
		cout&lt;&lt;&quot;| &quot;&lt;&lt;i;
	}
	cout&lt;&lt;endl;
	*/
	if (isLL)
	{
		for (int r=0; r&lt;tokenCount; r++)
		{
			if (token[r]-&gt;terminal)
			{
				continue;
			}
			cout&lt;&lt;token[r]-&gt;name&lt;&lt;&quot;:&quot;;
			for (int c=0; c&lt;tokenCount; c++)
			{
				if (table[r][c]!=-1)
				{
					cout&lt;&lt;token[c]-&gt;name&lt;&lt;&quot;=&quot;&lt;&lt;table[r][c]&lt;&lt;&quot;,&quot;;
				}
			}
			cout&lt;&lt;endl;
		}
	}
	else
	{
		cout&lt;&lt;&quot;the following is LR table\n&quot;;
		cout&lt;&lt;&quot;    &quot;;
		for (int col=0; col&lt;tokenCount; col++)
		{
			cout&lt;&lt;token[col]-&gt;name&lt;&lt;&quot;,&quot;;
		}
		cout&lt;&lt;endl;
		for (int r=0; r&lt;stateCount; r++)
		{
			cout&lt;&lt;r&lt;&lt;&quot;|  &quot;;
			for (int c=0; c&lt;tokenCount; c++)
			{
				int result=LRTable[r][c];
				if (isReduce(result))
				{
					cout&lt;&lt;&quot;r&quot;;
					cout&lt;&lt;mask2State(result);
				}
				else
				{
					if (isShift(result))
					{
						cout&lt;&lt;&quot;s&quot;;
						cout&lt;&lt;mask2State(result);
					}
					else
					{
						cout&lt;&lt;result;
					}
				}
				cout&lt;&lt;&quot;,&quot;;
			}
			cout&lt;&lt;endl;
		}
	}

}

void Grammar::buildLRTable()
{
	constructDFA();
	printDFA();
	cout&lt;&lt;&quot;\n&quot;;
	uninitializeDFA();
}
	
//initialized at &quot;initialize()&quot;
void Grammar::buildLLTable()
{
	int typeIndex;
	for (int r=0; r&lt;tokenCount; r++)
	{	
		if (token[r]-&gt;terminal)
		{
			typeIndex=matchTokens(token[r]-&gt;name);
			if (typeIndex!=-1)
			{
				type2token[typeIndex]=r;
				token2type[r]=typeIndex;
			}	
		}
		//initialize
		/*
		for (int c=0; c&lt;MaxTokenCount; c++)
		{
			table[r][c]=-1;
		}
		*/
	}

	for (int i=0; i&lt;tokenCount; i++)
	{	
		if (token[i]-&gt;terminal)
		{
			continue;
		}
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			int k=0, theRule=token[i]-&gt;production[j];
			int theToken=production[theRule][k];
			while (theToken!=-1)
			{
				addFirstIntoTable(i, theToken, theRule);
				if (!token[theToken]-&gt;isNull)
				{
					break;
				}
				k++;
				theToken=production[theRule][k];
			}
			if (theToken==-1)
			{
				addFollowIntoTable(i, theRule);
			}
		}
	}
}



void Grammar::buildTable(bool forLLGrammar)
{
	if (forLLGrammar)
	{
		buildLLTable();
	}
	else
	{
		buildLRTable();
	}
}
	/*
	nCount=tCount=0;

	for (int r=0; r&lt;MaxTokenCount; r++)
	{
		if (r&lt;tokenCount)
		{
			if (token[r]-&gt;terminal)
			{
				typeIndex=matchTokens(token[r]-&gt;name);
				if (typeIndex!=-1)
				{
					type2token[typeIndex]=r;
					token2type[r]=typeIndex;
				}
				//I am making a invertable table!!!
				tArray[r]=tCount++;
			}
			else
			{
				//it is an invertable table!!!
				nArray[r]=nCount++;
			}
		}

		for (int c=0; c&lt;MaxTokenCount; c++)
		{
			table[r][c]=-1;
		}
	}

	for (int i=0; i&lt;tCount; i++)
	{
		
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			int k=0, theRule=token[i]-&gt;production[j];
			int theToken=production[theRule][k];
			while (theToken!=-1)
			{
				addFirstIntoTable(theToken, j);
				if (!token[theToken]-&gt;isNull)
				{
					break;
				}
				k++;
				theToken=production[theRule][k];
			}
			if (theToken==-1)
			{
				addFollowIntoTable(j);
			}
		}
	}

}
*/

void Grammar::grammarError(int theVar, int theToken, int rule1, int rule2)
{
	cout&lt;&lt;&quot;error! the conflict of grammar at &quot;;
	cout&lt;&lt;token[theVar]-&gt;name&lt;&lt;&quot; for token of &quot;
		&lt;&lt;token[theToken]-&gt;name
		&lt;&lt;&quot; between rules of \n&quot;;
	printRule(rule1);
	cout&lt;&lt;&quot;     and      &quot;;
	printRule(rule2);
	cout&lt;&lt;endl;
}

void Grammar::addFollowIntoTable(int theVariable, int theRule)
{
	int temp;
	for (int i=0; i&lt;token[theVariable]-&gt;followCount; i++)
	{
		temp=token[theVariable]-&gt;follow[i];
		if (table[theVariable][temp]!=theRule)
		{
			if (table[theVariable][temp]!=-1)
			{
				grammarError(theVariable, temp, theRule, table[theVariable][temp]);
			}
			table[theVariable][temp]=theRule;
		}
	}
}

void Grammar::addFirstIntoTable(int theVariable, int theFirst, int theRule)
{
	for (int i=0; i&lt;token[theFirst]-&gt;firstCount; i++)
	{
		if (table[theVariable][token[theFirst]-&gt;first[i]]!=theRule)
		{
			if (table[theVariable][token[theFirst]-&gt;first[i]]!=-1)
			{
				grammarError(theVariable, token[theFirst]-&gt;first[i], theRule,
					table[theVariable][token[theFirst]-&gt;first[i]]);
			}
			table[theVariable][token[theFirst]-&gt;first[i]]=theRule;
		}
	}
}


int Grammar::varCount()
{
	int result=0;
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (!token[i]-&gt;terminal)
		{
			result++;
		}
	}
	return result;
}

int Grammar::termCount()
{
	int result=0;
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (token[i]-&gt;terminal)
		{
			result++;
		}
	}
	return result;
}



//leave the rule unremoved!!! as I have no way to do it!!!
void Grammar::removeToken(int tIndex)
{	
	delete[]token[tIndex]-&gt;name;
	delete token[tIndex];
	tokenCount--;
	for (int i=tIndex; i&lt;tokenCount; i++)
	{
		token[i]=token[i+1];
	}
}
	

bool Grammar::isRedundant(int tIndex)
{
	int k, theRule, theToken;
	for (int i=0; i&lt;tokenCount; i++)
	{
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			k=0;
			theRule=token[i]-&gt;production[j];
			theToken=production[theRule][k];
			while (theToken!=-1)
			{
				if (theToken==tIndex)
				{
					return false;
				}
				k++;
				theToken=production[theRule][k];
			}
		}
	}
	return true;
}


//what is redundant? except the start variable
//the non-terminal never appears in other rules!
void Grammar::removeRedundant()
{
	int tIndex=1;
	bool findNew=false;
	while (tIndex&lt;tokenCount)
	{
		findNew=false;
		if (!token[tIndex]-&gt;terminal)
		{
			if (isRedundant(tIndex))
			{
				removeToken(tIndex);
				findNew=true;
			}
		}
		if (findNew)
		{
			tIndex=1;
		}
		else
		{
			tIndex++;
		}
	}			
}

void Grammar::printAllRules()
{
/*
	cout&lt;&lt;&quot;\nnow print all rules\n&quot;;
	for (int i=0; i&lt;=prodIndex; i++)
	{
		cout&lt;&lt;&quot;rule index &quot;&lt;&lt;i&lt;&lt;&quot;: &quot;;
		printRule(i);
		cout&lt;&lt;&quot;\n&quot;;
	}
	*/
	int sum=0;
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (!token[i]-&gt;terminal)
		{
			sum+=token[i]-&gt;count;
		}
	}
	cout&lt;&lt;&quot; total rules is:&quot;&lt;&lt;prodIndex+1;
	cout&lt;&lt;&quot;\n and the sum is &quot;&lt;&lt;sum&lt;&lt;endl;

}



void Grammar::addBeginEnd()
{
	int begin=addToken(startStr, false);
	int end=addToken(endStr, true);
	prodIndex++;
	production[prodIndex][0]=0;
	production[prodIndex][1]=end;
	production[prodIndex][2]=-1;
	addRuleForToken(begin, prodIndex);
	startSymbolIndex=begin;
	stackBottomIndex=end;
}
/*
void Grammar::calculateProperty()
{
	calculateNull();
	calculateFirst();
	calculateFollow();
}
*/
void Grammar::printToken(bool onlyVar)
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (onlyVar)
		{
			if (token[i]-&gt;terminal)
			{
				continue;
			}
		}
		cout&lt;&lt;&quot;name: &quot;&lt;&lt;token[i]-&gt;name&lt;&lt;endl;
		cout&lt;&lt;&quot;   isNull: &quot;&lt;&lt;(token[i]-&gt;isNull?&quot;true&quot;:&quot;false&quot;)&lt;&lt;endl;
		cout&lt;&lt;&quot;   isTerminal: &quot;&lt;&lt;(token[i]-&gt;terminal?&quot;true&quot;:&quot;false&quot;)&lt;&lt;endl;
		cout&lt;&lt;&quot;   first: &quot;;
		for (int j=0; j&lt;token[i]-&gt;firstCount; j++)
		{
			if (j!=0)
			{
				cout&lt;&lt;&quot;,&quot;;
			}
			cout&lt;&lt;&quot;[&quot;&lt;&lt;j&lt;&lt;&quot;]=&quot;&lt;&lt;token[token[i]-&gt;first[j]]-&gt;name;			
		}
		cout&lt;&lt;&quot;\n   follow: &quot;;
		for (j=0; j&lt;token[i]-&gt;followCount; j++)
		{
			if (j!=0)
			{
				cout&lt;&lt;&quot;,&quot;;
			}
			cout&lt;&lt;&quot;[&quot;&lt;&lt;j&lt;&lt;&quot;]=&quot;&lt;&lt;token[token[i]-&gt;follow[j]]-&gt;name;
		}
		cout&lt;&lt;endl;
	}
}

//must use while loop to discover new set!!!!!
void Grammar::calculateFirst()
{
	int i;
	bool addNew;
	do
	{
		i=0;
		addNew=false;
		while (i&lt;tokenCount)
		{		
			//the terminal contains itself
			if (token[i]-&gt;terminal)
			{
				//addFirst don't judge if it is nullable!!
				addFirst(i, i);			
				//token[i]-&gt;first[token[i]-&gt;firstCount++]=i;
			}
			else
			{
				//for all its rules
				for (int j=0; j&lt;token[i]-&gt;count; j++)
				{
					int theToken, k=0;
					theToken=production[token[i]-&gt;production[j]][k];
					//for each token in each rule
					do 
					{
						//add non epsilon set
						if (addIntoFirst(i, theToken))
						{
							addNew=true;
						}
						//if it is not null, means it is end
						if (!token[theToken]-&gt;isNull)
						{
							break;
						}
						//if it is nullable, continue
						k++;
						theToken=production[token[i]-&gt;production[j]][k];
					}while (theToken!=-1);
					//it means all token in this rule is nullable, so
					if (theToken==-1)
					{
						//it is nullable
						//addEpsilonIntoFirst(i);
						addFirst(i, emptyIndex);
					}
				}
			}
			i++;
		}		
	}while (addNew);
}

bool Grammar::addFollowIntoFollow(int target, int source)
{
	bool addNew=false;
	for (int i=0; i&lt;token[source]-&gt;followCount; i++)
	{
		if (addFollow(target, token[source]-&gt;follow[i]))
		{
			addNew=true;
		}
	}
	return addNew;
}

bool Grammar::addIntoFollow(int target, int source)
{
	bool addNew=false;
	if (source==-1)
	{
		return false;
	}
	for (int i=0; i&lt;token[source]-&gt;firstCount; i++)
	{
		//add non-epsilon
		if (!token[token[source]-&gt;first[i]]-&gt;isNull)
		{
			if (addFollow(target, token[source]-&gt;first[i]))
			{
				addNew=true;
			}
		}
	}
	return addNew;
}


void Grammar::calculateFollow()
{		
	int i;
	bool addNew, started;
	//token[startSymbolIndex]-&gt;follow[0]=stackBottomIndex;
	//token[startSymbolIndex]-&gt;followCount=1;
	do
	{
		i=0;
		addNew=false;
		while (i&lt;tokenCount)
		{		
			//the terminal contains itself
			if (!token[i]-&gt;terminal)
			{
				for (int tIndex=0; tIndex&lt;tokenCount; tIndex++)
				{
					//for all its rules
					if (token[tIndex]-&gt;terminal)
					{
						continue;
					}
					//for each its rule
					for (int j=0; j&lt;token[tIndex]-&gt;count; j++)
					{
						int theToken, k=0, theRule=token[tIndex]-&gt;production[j];
						started=false;
						theToken=production[theRule][k];
						//for each token in each rule
						do 
						{
							//the token appears here
							if (started)
							{
								if (addIntoFollow(i, theToken))
								{
									addNew=true;
								}
								if (!token[theToken]-&gt;isNull)
								{
									break;
								}
							}
							if (theToken==i)
							{
								started=true;
								//add non epsilon set, including -1 situation!!!								
							}
							
							//if it is not null, means it is end
							
							//if it is nullable, continue
							k++;
							theToken=production[theRule][k];
						}while (theToken!=-1);
						//it means all token in this rule is nullable, so
						if (started&amp;&amp;theToken==-1)
						{
							//it is nullable
							//add current variable Follow(j) into Follow(i);
							if (addFollowIntoFollow(i, tIndex))
							{
								addNew=true;
							}
						}
					}
				}					
			}
			i++;
		}		
	}while (addNew);
}

void Grammar::addRuleForToken(int tIndex, int ruleIndex)
{
	token[tIndex]-&gt;production[token[tIndex]-&gt;count++]=ruleIndex;
}

void Grammar::shiftRule(int ruleIndex, bool left2Right, int offset)
{
	int end=0;
	/*
	while (production[ruleIndex][end]!=-1)
	{
		end++;
	}
	*/
	end=ruleLen(ruleIndex);
	if (left2Right)
	{
		for (int i=end; i&gt;=0; i--)
		{
			production[ruleIndex][i+offset]=production[ruleIndex][i];
		}
	}
	else
	{
		for (int i=0; i&lt;=end-offset; i++)
		{
			production[ruleIndex][i]=production[ruleIndex][i+offset];
		}
		checkEpsilon(ruleIndex);
	}
}

void Grammar::checkEpsilon(int ruleIndex)
{
	if (production[ruleIndex][0]==-1)
	{
		epsilonRule(ruleIndex);
	}
}


bool Grammar::addFollow(int target, int source)
{
	//check if it is already there
	for (int i=0; i&lt;token[target]-&gt;followCount; i++)
	{
		if (token[target]-&gt;follow[i]==source)
		{
			return false;
		}
	}
	//add at end
	token[target]-&gt;follow[token[target]-&gt;followCount++]=source;
	return true;
}

bool Grammar::addFirst(int target, int source)
{
	//check if it is already there
	for (int i=0; i&lt;token[target]-&gt;firstCount; i++)
	{
		if (token[target]-&gt;first[i]==source)
		{
			return false;
		}
	}
	//add at end
	token[target]-&gt;first[token[target]-&gt;firstCount++]=source;
	return true;
}

//add non epsilon into it.
bool Grammar::addIntoFirst(int target, int source)
{
	bool addNew=false;
	if (token[source]-&gt;terminal)
	{
		if (!token[source]-&gt;isNull)
		{
			return addFirst(target, source);
		}
		else
		{
			return false;
		}
	}
	for (int i=0; i&lt;token[source]-&gt;firstCount; i++)
	{
		//add non epsilon
		if (!token[token[source]-&gt;first[i]]-&gt;isNull)
		{
			if (addFirst(target, token[source]-&gt;first[i]))
			{
				addNew=true;
			}
		}
	}
	return addNew;
}


bool Grammar::Null(int tIndex)
{
	if (token[tIndex]-&gt;terminal)
	{
		return token[tIndex]-&gt;isNull;		
	}
	for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
	{
		int j=0, theToken;
		theToken=production[token[tIndex]-&gt;production[i]][j];
		do
		{
			if (theToken==tIndex||!Null(theToken))
			{
				break;
			}
			j++;
			theToken=production[token[tIndex]-&gt;production[i]][j];
		}while (theToken!=-1);
		if (theToken==-1)
		{			
			return true;
		}
	}
	return false;
}

void Grammar::calculateNull()
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		token[i]-&gt;isNull=Null(i);	
	}
}
	
int Grammar::findMetaSymbol(int&amp; begin, int&amp; end)
{
	int theRule, theToken, k;
	begin=end=-1;
	for (int i=0; i&lt;tokenCount; i++)
	{
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			k=0;
			theRule=token[i]-&gt;production[j];
			theToken=production[theRule][k];
			while (theToken!=-1)
			{
				if (theToken==beginIndex)
				{
					begin=k;
				}
				if (theToken==endIndex)
				{
					end=k;
					return theRule;
				}
				k++;
				theToken=production[theRule][k];
			}
		}
	}
	return -1;
}

	
void Grammar::doReplaceMetaSymbol(int ruleIndex, int begin, int end)
{
	int newTokenIndex=forgeToken(0), i=0;

	addRuleForToken(newTokenIndex, ++prodIndex);
	//token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=++prodIndex;
	//token[newTokenIndex]-&gt;terminal=false;

	copyRule(ruleIndex, prodIndex);
	//shrink 
	while (production[ruleIndex][i+end+1]!=-1)
	{
		production[ruleIndex][begin+i]=production[ruleIndex][i+end+1];
		i++;
	}
	production[ruleIndex][begin+i]=-1;
	addAtEnd(ruleIndex, newTokenIndex);
	/*
	production[ruleIndex][begin]=newTokenIndex;
	production[ruleIndex][begin+1]=-1;
	*/

	shiftRule(prodIndex, false, begin+1);
	
	production[prodIndex][end-begin-1]=newTokenIndex;
	production[prodIndex][end-begin]=-1;

	addEpsilonForToken(newTokenIndex);
	/*
	if (epsilonIndex==-1)
	{
		eRuleIndex=++prodIndex;
		epsilonRule(eRuleIndex);
	}
	else
	{
		eRuleIndex=epsilonIndex;
	}
	addRuleForToken(newTokenIndex, eRuleIndex);
	*/
}
		
void Grammar::replaceMetaSymbol()
{
	int begin, end, ruleIndex;
	while ((ruleIndex=findMetaSymbol(begin, end))!=-1)
	{
		doReplaceMetaSymbol(ruleIndex, begin, end);
	}
	/*
	for (int i=0; i&lt;tokenCount; i++)
	{
		//if (token[i]-&gt;isMeta)
		if (i==beginIndex||i==endIndex)
		{
			removeToken(i);
		}
	}
	*/
}


void Grammar::addAtEnd(int ruleIndex, int toAdd)
{
	int end;
	end=ruleLen(ruleIndex);
	production[ruleIndex][end++]=toAdd;
	production[ruleIndex][end]=-1;
}

//left-recursion:  A -&gt; A a | b | c
//change to: A  -&gt; b A' | c A'
//			 A' -&gt; a A' | empty
void Grammar::removeImmediate(int tIndex, int ruleIndex)
{
	int newIndex=forgeToken(tIndex);
	int holdRuleIndex=token[tIndex]-&gt;production[ruleIndex];
	//sequence matters!

	//change to: A -&gt; b A'
	for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
	{
		if (i!=ruleIndex)
		{
			addAtEnd(token[tIndex]-&gt;production[i], newIndex);
		}
	}
	
	//shift
	removeRuleFromToken(tIndex, ruleIndex);
	
	addRuleForToken(newIndex, holdRuleIndex);
	//token[newIndex]-&gt;production[token[newIndex]-&gt;count++]=holdRuleIndex;

	shiftRule(holdRuleIndex, false, 1);
	addAtEnd(holdRuleIndex, newIndex);
	
	//add epsilon rule for new variable
	addEpsilonForToken(newIndex);
	/*
	epsilonRule(++prodIndex);
	addRuleForToken(newIndex, prodIndex);
	*/
}


int Grammar::forgeToken(int index)
{
	char str[MaxGrammarTokenLength+2], ch;
	int len=strlen(token[index]-&gt;name);
	int temp=0, i=0;
	strcpy(str, token[index]-&gt;name);
	ch=str[len-1];//get last char of name
	if (ch&gt;='0'&amp;&amp;ch&lt;'9')
	{
		str[len-1]=ch+i+1;
	}
	else
	{
		str[len]='0'+i;
		str[len+1]='\0';
	}
	//I won't bother to check repetitation of name
	while (tokenIndex(str)!=-1)
	{
		i++;
		if (ch&gt;='0'&amp;&amp;ch&lt;'9')
		{
			str[len-1]=ch+i+1;
		}
		else
		{
			str[len]='0'+i;
			str[len+1]='\0';
		}
	}

	return addToken(str, false);//it is non-terminal for sure	
}

int Grammar::copyRule(int source, int target)
{
	int i=0;
	while (production[source][i]!=-1) 
	{
		production[target][i]=production[source][i];
		i++;
	}
	production[target][i]=-1;
	return i;
}

void Grammar::doAddRule(int tIndex)
{	
	token[tIndex]-&gt;production[token[tIndex]-&gt;count++]=++prodIndex;
}

void Grammar::addRule(const char* str)
{
	int index;
	index=addToken(str);
	production[prodIndex][curPos++]=index;
	production[prodIndex][curPos]=-1;//set end
}


//if the token is already there, it return the index
//otherwise, it add new node in token array
int Grammar::addToken(const char* str, bool isTerminal)
{
	int index;
	index=tokenIndex(str);
	if (index==-1)
	{
		index=tokenCount;
	}
	else
	{		
		return index;
	}
	token[index]=createToken(str, isTerminal);
	if (strcmp(str, optionBegin)==0)
	{
		beginIndex=index;
	}
	if (strcmp(str, optionEnd)==0)
	{
		endIndex=index;
	}
	tokenCount++;
	if (strcmp(str, emptyStr)==0)
	{
		token[index]-&gt;isNull=true;
		emptyIndex=index;
	}
	return index;
}

void Grammar::newRule(const char* str)
{
	if (str!=NULL)
	{
		curIndex=addToken(str, false);
	}
	//add one pointer
	token[curIndex]-&gt;production[token[curIndex]-&gt;count++]=++prodIndex;
	token[curIndex]-&gt;terminal=false;
	curPos=0;//reset to restart;
}

GrammarToken* Grammar::createToken(const char* theName, bool isTerminal)
{
	GrammarToken* ptr=new GrammarToken;
	ptr-&gt;name=new char[strlen(theName)+1];
	strcpy(ptr-&gt;name, theName);
	ptr-&gt;terminal=isTerminal;
	ptr-&gt;count=ptr-&gt;firstCount=ptr-&gt;followCount=0;
	ptr-&gt;isNull=false;
	return ptr;
}

int Grammar::tokenIndex(const char* str)
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (strcmp(str, token[i]-&gt;name)==0)
		{
			return i;
		}
	}
	return -1;
}

int Grammar::checkRecursion(int tIndex, int curToken)
{
	for (int i=0; i&lt;token[curToken]-&gt;count; i++)
	{
		//token[tIndex]-&gt;production[i]=ruleIndex
		//production[ruleIndex][0] is the first left-recursion one
		if (production[token[curToken]-&gt;production[i]][0]&lt;=curToken)
		{
			return i;
		}
	}
	return -1;
}

void Grammar::printRule(int index)
{
	int nodeIndex=0;
	//cout&lt;&lt;&quot; ~&quot;&lt;&lt;index&lt;&lt;&quot;~ &quot;;
	while (production[index][nodeIndex]!=-1)
	{
		cout&lt;&lt;token[production[index][nodeIndex]]-&gt;name&lt;&lt;&quot; &quot;;
		nodeIndex++;
	}
}


void Grammar::printRule(int tIndex, int rulePos, int dotPos)
{
	int nodeIndex=0;
	cout&lt;&lt;token[tIndex]-&gt;name&lt;&lt;&quot; ==&gt; &quot;;
	
	//cout&lt;&lt;&quot; ~&quot;&lt;&lt;index&lt;&lt;&quot;~ &quot;;
	while (production[token[tIndex]-&gt;production[rulePos]][nodeIndex]!=-1)
	{
		if (nodeIndex==dotPos)
		{
			cout&lt;&lt;&quot; . &quot;;
		}
		cout&lt;&lt;token[production[token[tIndex]-&gt;production[rulePos]][nodeIndex]]-&gt;name&lt;&lt;&quot; &quot;;
		nodeIndex++;
	}
	if (nodeIndex==dotPos)
	{
		cout&lt;&lt;&quot; . &quot;;
	}
	//printRule(token[tIndex]-&gt;production[rulePos]);
}


void Grammar::initialize()
{
	tokenCount=curIndex=curPos=0;
	prodIndex=-1;//in order to ++ blindly
	for (int i=0; i&lt;MaxStateCount; i++)
	{
		for (int j=0; j&lt;MaxTokenCount; j++)
		{
			LRTable[i][j]=-1;
		}
	}
	for (i=0; i&lt;MaxTokenCount; i++)
	{
		for (int j=0; j&lt;MaxTokenCount; j++)
		{
			table[i][j]=-1;
		}
	}
}

Grammar::Grammar()
{
	initialize();
}

void Grammar::removeLeftRecursion()
{
	int tIndex=0, curToken=0;
	bool newChange=false;
	while (tIndex&lt;tokenCount)
	{		
		if (!token[tIndex]-&gt;terminal)
		{
			for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
			{
				curToken=production[token[tIndex]-&gt;production[i]][0];
				if (curToken&lt;=tIndex&amp;&amp;!token[curToken]-&gt;terminal)
				{
					if (curToken!=tIndex)
					{
						replaceRule(tIndex, i);
					}
					else
					{
						removeImmediate(tIndex, i);
					}
					newChange=true;				
				}
			}
		}
		//whenever there is some new findings, restart
		if (!newChange)
		{
			tIndex++;
		}
		else
		{
			tIndex=0;
			newChange=false;
		}
	}
}

void Grammar::replaceRule(int tIndex, int ruleIndex)
{
	int pos, j, targetEnd, sourceEnd, curRule;
	curRule=token[tIndex]-&gt;production[ruleIndex];
	int curToken=production[curRule][0];
	for (int i=token[curToken]-&gt;count-1; i&gt;=0; i--)
	{
		if (i&gt;0)
		{
			doAddRule(tIndex);
			pos=copyRule(token[curToken]-&gt;production[i], prodIndex);
			j=0;
			while (production[curRule][j+1]!=-1)
			{
				production[prodIndex][pos+j]=production[curRule][j+1];
				j++;
			}
			production[prodIndex][pos+j]=-1;
			//addRuleForToken(curToken, prodIndex);
		}
		else
		{
			targetEnd=sourceEnd=0;
			//curRule=token[tIndex]-&gt;production[ruleIndex];
			while (true)
			{
				if (production[token[curToken]-&gt;production[0]][sourceEnd]==-1&amp;&amp;
				production[curRule][targetEnd]==-1)
				{
					break;
				}
				if (production[token[curToken]-&gt;production[0]][sourceEnd]!=-1)
				{
					sourceEnd++;
				}
				if (production[curRule][targetEnd]!=-1)
				{
					targetEnd++;
				}
			}
			j=targetEnd+sourceEnd-1;
			while (j&gt;=0)
			{
				if (j&gt;=sourceEnd)
				{
					production[curRule][j]=production[curRule][j-sourceEnd+1];
				}
				else
				{
					production[curRule][j]=production[token[curToken]-&gt;production[0]][j];
				}
				j--;
			}
		}
	}
}

void Grammar::calculateLookAhead()
{
	replaceMetaSymbol();
	calculateNull();
	cout&lt;&lt;setw(20)&lt;&lt;&quot;The Rule                         &quot;&lt;&lt;&quot;Look-Ahead      &quot;&lt;&lt;endl;
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (token[i]-&gt;terminal)
		{
			continue;
		}
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			int theRule, theToken, k=0;
			theRule=token[i]-&gt;production[j];			
			cout&lt;&lt;setw(5);
			cout&lt;&lt;token[i]-&gt;name&lt;&lt;&quot; -&gt; &quot;;
			cout&lt;&lt;setiosflags(ios::left);
			printRule(theRule);			
			cout&lt;&lt;&quot;               &quot;;
			theToken=production[theRule][k];
			while (theToken!=-1)
			{				
				if (k!=0)
				{
					cout&lt;&lt;&quot;+&quot;;
				}
				cout&lt;&lt;&quot;First(&quot;&lt;&lt;token[theToken]-&gt;name&lt;&lt;&quot;)&quot;;

				if (!token[theToken]-&gt;isNull)			
				{
					break;
				}
				k++;
				theToken=production[theRule][k];
			}
			if (theToken==-1)
			{
				if (k!=0)
				{
					cout&lt;&lt;&quot;+&quot;;
				}
				cout&lt;&lt;&quot;Follow(&quot;&lt;&lt;token[i]-&gt;name&lt;&lt;&quot;)&quot;;
			}
			cout&lt;&lt;&quot;\n&quot;;
		}
	}
}

void Grammar::LLoptimize()
{
	removeLeftRecursion();
	commonLeftFactor();
}


void Grammar::LRoptimize()
{
	calculateProperty();

}

//the return value is the count, not the index of rule
int Grammar::ruleLen(int ruleIndex)
{
	int end=0;
	while (production[ruleIndex][end]!=-1)
	{
		end++;
	}
	return end;
}

//optimize sequence is first common-factor then remove left recursion
//therefore I don't have to check if for same variable if there will be
//more than one left-recursion
void Grammar::optimize(bool forLLGrammar)
{
	replaceMetaSymbol();	
	addBeginEnd();
	if (forLLGrammar)
	{
		LLoptimize();
	}
	else
	{
		LRoptimize();
	}
	//removeRedundant();		
	calculateNull();
	calculateFirst();
	calculateFollow();
	buildTable(forLLGrammar);
}

int Grammar::findCommonFactor(int tIndex, int ruleIndex)
{
	for (int i=ruleIndex+1; i&lt;token[tIndex]-&gt;count; i++)
	{	
		//if the two rule has the same first token
		if (production[token[tIndex]-&gt;production[ruleIndex]][0]==
			production[token[tIndex]-&gt;production[i]][0])
		{
			/*
			//calculate if there is epsilon
			if (emptyIndex==-1)
			{
				emptyIndex=tokenIndex(emptyStr);
			}
			//if it is not epsilon
			if (production[token[tIndex]-&gt;production[i]][0]!=emptyIndex)
			{
				return i;
			}
			*/
			return i;
		}
	}
	return -1;
}

void Grammar::epsilonRule(int ruleIndex)
{
	production[ruleIndex][0]=addToken(emptyStr);
	production[ruleIndex][1]=-1;
}

//sample:  x -&gt; Aa 
//         x -&gt; Ab
//changed to:  x -&gt; Ax' //this is to change the old rule
//             x' -&gt; b //this is to change the old rule
//             x' -&gt; a //this is the new-added rule
void Grammar::doCommonFactor(int tIndex, int ruleIndex, int index)
{
	int newTokenIndex=forgeToken(tIndex);//create a new variable
	//move the second and after part to the new rule of new variable
	//doing: x' -&gt; a
	//curPos=0;
	//prepare to add one new rule 	
	addRuleForToken(newTokenIndex, ++prodIndex);
	//token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=++prodIndex;
	token[newTokenIndex]-&gt;terminal=false;

	copyRule(token[tIndex]-&gt;production[ruleIndex], prodIndex);
	shiftRule(prodIndex, false, 1);
	/*
	do
	{
		
		//do copying
		production[prodIndex][curPos]=
			production[token[tIndex]-&gt;production[ruleIndex]][curPos+1];
		curPos++;
	//even the -1 at end is copied
	}while (production[token[tIndex]-&gt;production[ruleIndex]][curPos]!=-1);
	*/

	//in order to show an empty string, in case the string is &quot;epsilon&quot;

	
	/*
	if (curPos==1)
	{
		epsilonRule(prodIndex);
	}
	*/


	//replace x -&gt; Aa with x -&gt; Ax'
	production[token[tIndex]-&gt;production[ruleIndex]][1]=newTokenIndex;
	production[token[tIndex]-&gt;production[ruleIndex]][2]=-1;//end
	
	//doing: x' -&gt; b
	//curPos=0;
	//prepare to add one new rule 
	//pointing new token to where old rule lies
	addRuleForToken(newTokenIndex, token[tIndex]-&gt;production[index]);
	/*
	token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=
		token[tIndex]-&gt;production[index];
		*/

	shiftRule(token[tIndex]-&gt;production[index], false, 1);

	removeRuleFromToken(tIndex, index);
}

void Grammar::removeRuleFromToken(int tIndex, int ruleIndex)
{
	token[tIndex]-&gt;count--;
	for (int i=ruleIndex; i&lt;token[tIndex]-&gt;count; i++)
	{
		token[tIndex]-&gt;production[i]=token[tIndex]-&gt;production[i+1];
	}	
}


void Grammar::commonLeftFactor()
{
	int index=-1, tIndex=0, ruleIndex=0;
	bool flag;
	//whenever a newrule is done, restart!
	while (tIndex&lt;tokenCount)
	{
		ruleIndex=0;
		flag=false;		
		while (ruleIndex&lt;token[tIndex]-&gt;count)
		{			
			index=findCommonFactor(tIndex, ruleIndex);
			if (index!=-1)
			{
				doCommonFactor(tIndex, ruleIndex, index);
				//restart
				flag=true;
				break;
			}
			else
			{
				ruleIndex++;
			}
		}
		if (flag)
		{
			tIndex=0;		
		}
		else
		{
			tIndex++;
		}
	}
}

GrammarToken* Grammar::operator[](int index)
{
	if (index&gt;=0&amp;&amp;index&lt;tokenCount)
	{
		return token[index];
	}
	else
	{
		return NULL;
	}
}

void Grammar::print()
{
	for (int i=0; i&lt;tokenCount; i++)
	{		
		if (!token[i]-&gt;terminal)
		{		
			cout&lt;&lt;token[i]-&gt;name&lt;&lt;&quot; ==&gt; &quot;;
			for (int j=0; j&lt;token[i]-&gt;count; j++)
			{			
				printRule(token[i]-&gt;production[j]);	
				if (j!=token[i]-&gt;count-1)
				{
					cout&lt;&lt;&quot; | &quot;;
				}
			}
			cout&lt;&lt;&quot;\n&quot;;
		}		
	}
}

void Grammar::uninitializeDFA()
{
	for (int i=0; i&lt;itemCount; i++)
	{
		delete items[i];
	}
	delete [] items;
	delete [] expandFinished;

	for (i=0; i&lt;stateCount; i++)
	{
		delete sets[i];
	}
	delete []sets;
}

void Grammar::initializeDFA()
{
	items=new Item*[MaxItemCount];
	for (int i=0; i&lt;MaxItemCount; i++)
	{
		items[i]=new Item;
		items[i]-&gt;rulePos=items[i]-&gt;dotPos=items[i]-&gt;varIndex=-1;	
	}

	sets= new BitSet*[MaxStateCount];
	expandFinished=new bool[MaxStateCount];
	for (i=0; i&lt;MaxStateCount; i++)
	{
		sets[i]=new BitSet;
		sets[i]-&gt;reset();
		expandFinished[i]=false;
	}
	itemCount=stateCount=0;
	//initialize LRTable
	for (i=0; i&lt;MaxStateCount; i++)
	{
		for (int j=0; j&lt;tokenCount; j++)
		{
			LRTable[i][j]=-1;
		}
	}
}


int Grammar::addItem(Item&amp; theItem)
{
	for (int i=0; i&lt;itemCount; i++)
	{
		if (*(items[i])== theItem)
		{
			return i;
		}
	}
	items[i]=new Item;
	*(items[i])=theItem;
	itemCount++;
	return i;
}


void Grammar::addSonItem(int tIndex, BitSet&amp; theSet)
{
	if (!token[tIndex]-&gt;terminal)
	{
		for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
		{
			Item temp;
			int itemIndex;//easy reading
			temp.varIndex=tIndex;
			temp.rulePos=i;
			temp.dotPos=0;
			itemIndex=addItem(temp);
			theSet.set(itemIndex);
			//recursive
			addSonItem(production[token[tIndex]-&gt;production[i]][0], theSet);
		}
	}
}

int Grammar::createItem(int theVar, int theNo, int position)
{
	Item temp;
	temp.varIndex=theVar;
	temp.rulePos=theNo;
	temp.dotPos=position;
	return addItem(temp);
}

void Grammar::constructDFA()
{
	int startIndex, itemIndex, stateIndex;
	BitSet theSet;

	//should initialize everything!!!
	initializeDFA();
	
	startIndex=createItem(startSymbolIndex, 0, 0);
	//theSet.set(startIndex);
	addClosure(startIndex, theSet);
	stateIndex=itemIndex=0;
	stateIndex= createState(theSet);
	expandState(stateIndex);//recursion inside

	//for all choices, do createState(startStart, tansition token) (

}

int Grammar::item2Var(int itemIndex)
{
	int tIndex, rulePos, dotPos, theVar, theRule;
	tIndex=items[itemIndex]-&gt;varIndex;
	rulePos=items[itemIndex]-&gt;rulePos;
	dotPos=items[itemIndex]-&gt;dotPos;
	theRule=token[tIndex]-&gt;production[rulePos];
	//so, get the var
	theVar=production[theRule][dotPos];
	return theVar;
}

void Grammar::addClosure(int itemIndex, BitSet&amp; theSet)
{
	int tIndex, rulePos, dotPos, theVar, theRule;
	
	if (!theSet.test(itemIndex))
	{
		theSet.set(itemIndex);//add itself first!!!
	}
	else
	{
		//prevent looping
		return;
	}
	
	tIndex=items[itemIndex]-&gt;varIndex;
	rulePos=items[itemIndex]-&gt;rulePos;
	dotPos=items[itemIndex]-&gt;dotPos;
	theRule=token[tIndex]-&gt;production[rulePos];
	//so, get the var
	theVar=production[theRule][dotPos];
	//to skip the epsilon
	if (theVar==emptyIndex)
	{
		theVar=production[theRule][dotPos+1];
	}
	//theVar=item2Var(itemIndex);
	if (theVar!=-1)
	{
		if (!token[theVar]-&gt;terminal)
		{
			for (int i=0; i&lt;token[theVar]-&gt;count; i++)
			{
				Item temp;
				int newItemIndex;//easy reading
				temp.varIndex=theVar;
				temp.rulePos=i;
				temp.dotPos=0;
				newItemIndex=addItem(temp);
				//theSet.set(itemIndex);
				//recursive
				addClosure(newItemIndex, theSet);
			}
		}
	}
}

int Grammar::createState(const BitSet&amp; theSet)
{
	for (int i=0; i&lt;stateCount; i++)
	{
		if (*(sets[i])==theSet)
		{
			return i;
		}
	}
	sets[i]=new BitSet;
	*sets[i] = theSet;
	stateCount++;
	return i;
}

//indicating if it is reduceable by returning true
void Grammar::doExpandState(int itemIndex, int tIndex, BitSet&amp; theSet)
{
	int theNewItemIndex;
	//int theNextVar=findNextTokenOfItem(itemIndex);

	theNewItemIndex=createItem(items[itemIndex]-&gt;varIndex, items[itemIndex]-&gt;rulePos,
		items[itemIndex]-&gt;dotPos+1);
	/*
	if (tIndex!=stackBottomIndex)
	{
		acceptedItemIndex=theNewItemIndex;
	}
	*/
	addClosure(theNewItemIndex, theSet);


	//	return findNextTokenOfItem(theNewItemIndex)==-1;//true=reduceable
}
	
int Grammar::item2Rule(int itemIndex)
{
	int ruleIndex, rulePos, varIndex;
	//only for easy reading
	varIndex=items[itemIndex]-&gt;varIndex;
	rulePos=items[itemIndex]-&gt;rulePos;
	ruleIndex =token[varIndex]-&gt;production[rulePos];
	return ruleIndex;
}


void Grammar::expandState(int stateIndex)
{
	int itemIndex, targetState, tIndex, reduceIndex=-1, shiftIndex;
	BitSet theSet;//a new temp
	bool findShift=false, findReduce=false, isAccepted=false;

	//this will prevent looping expanding!!!???
	expandFinished[stateIndex]=true;
	//to see if it is to reduce
	sets[stateIndex]-&gt;restart();
	while ((itemIndex=sets[stateIndex]-&gt;next())!=-1)
	{
		tIndex=findNextTokenOfItem(itemIndex);
		if (tIndex==-1)
		{
			findReduce=true;
			/*
			int ruleIndex, rulePos, varIndex;
			//only for easy reading
			varIndex=items[itemIndex]-&gt;varIndex;
			rulePos=items[itemIndex]-&gt;rulePos;
			ruleIndex =token[varIndex]-&gt;production[rulePos];
			*/
			int ruleIndex=item2Rule(itemIndex);
			add2Table(stateIndex, ruleIndex);
			if (reduceIndex!=-1)
			{
				errorHandle(ReduceReduceConflict, 
					(void*)(items[itemIndex]), (void*)(items[reduceIndex]));
			}
			else
			{
				reduceIndex=itemIndex;
			}
		}
		else
		{
			shiftIndex=itemIndex;			
			findShift=true;
		}
	}

	//how to do in LR(1)????
	//I am currently working on LR(0). I try not to think about it
	if (findReduce&amp;&amp;findShift)
	{
		//in LR(1), this is not necessarily a conflict, because you need to
		//check the first and follow set? I forgot it now. :)
		errorHandle(ShiftReduceConflict, (void*)(items[shiftIndex]),
			(void*)(items[reduceIndex]));
		//temparily
		int temp1, temp2;
		temp1=items[shiftIndex]-&gt;varIndex;
		temp2=items[shiftIndex]-&gt;rulePos;
		cout&lt;&lt;&quot;\nconflict at shift:\n&quot;;
		printRule(temp1, temp2, items[shiftIndex]-&gt;dotPos);
		temp1=items[reduceIndex]-&gt;varIndex;
		temp2=items[reduceIndex]-&gt;rulePos;
		cout&lt;&lt;&quot;\nconflict with reduce:\n&quot;;
		printRule(temp1, temp2, items[reduceIndex]-&gt;dotPos);
		cout&lt;&lt;endl;
		return;//currently ignore LR(1) and should be improved when LR(1)
	}

	for (int i=0; i&lt;tokenCount; i++)
	{
		if (i==emptyIndex||i==stackBottomIndex)
		{
			continue;
		}
		sets[stateIndex]-&gt;restart();
		//for all items, search if there is token to match to advance the dot
		while ((itemIndex=sets[stateIndex]-&gt;next())!=-1)
		{
			if (findNextTokenOfItem(itemIndex)== i)//
			{
				//do I need to add acceptance here?
				if (i==stackBottomIndex)
				{
				}
				doExpandState(itemIndex, i, theSet);//should finish addClosure
			}
		}
		//only expandState after you find one &quot;transition&quot;
		if (theSet.count()&gt;0)//if you do find some
		{
			targetState=createState(theSet);
			//here to write the LRTable
			add2Table(stateIndex, targetState, i);//3 params means to reduce
			findShift=true;
			theSet.reset();//prepare for next token
			//prevent infinite loop because DFA is cyclic graph
			if (!expandFinished[targetState])
			{
				expandState(targetState);
			}
		}//no one is set
	}
}

//this is for reduce of LR(0)
void Grammar::add2Table(int stateIndex, int ruleIndex)
{
	int result= reduce2Mask(ruleIndex);
	//currently this is for LR(0)
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (token[i]-&gt;terminal)
		{
			if (LRTable[stateIndex][i]!=-1)
			{
				errorHandle(OverWritingLRTable);
			}
			LRTable[stateIndex][i]=result;
		}
	}
}

//this is for shift
void Grammar::add2Table(int stateIndex, int targetIndex, int tIndex)
{
	int result=targetIndex;
	if (token[tIndex]-&gt;terminal)
	{
		result= shift2Mask(targetIndex);
	}
	if (LRTable[stateIndex][tIndex]!=-1)
	{
		errorHandle(OverWritingLRTable);
	}
	LRTable[stateIndex][tIndex]=result;
}


int Grammar::findNextTokenOfItem(int itemIndex)
{
	int tIndex, theNo, thePos, theVar;
	tIndex=items[itemIndex]-&gt;varIndex;//easy for reading
	theNo=items[itemIndex]-&gt;rulePos;
	thePos=items[itemIndex]-&gt;dotPos;
	//maybe I should write small function to wrap this kind of indexing!!!!
	theVar=production[token[tIndex]-&gt;production[theNo]][thePos];
	return theVar;
}



bool Grammar::isReduce(int input)
{
	return (SHIFTREDUCEMASK&amp;input)==REDUCEMASK;
}

bool Grammar::isShift(int input)
{
	return (SHIFTREDUCEMASK&amp;input)==SHIFTMASK;
}


int Grammar::mask2State(int input)
{
	return STATEMASK&amp;input;
}

//internal test purpose
void Grammar::printDFA()
{
	for (int i=0; i&lt;stateCount; i++)
	{
		cout&lt;&lt;&quot;\nstate no.&quot;&lt;&lt;i&lt;&lt;&quot;:\n&quot;;
		sets[i]-&gt;restart();
		int j=0;
		while ((j=sets[i]-&gt;next())!=-1)
		{
			printRule(items[j]-&gt;varIndex, items[j]-&gt;rulePos, items[j]-&gt;dotPos);
			cout&lt;&lt;endl;
		}

	}
}

//these are trivial functions and will later be changed to be inline
int Grammar::reduce2Mask(int input)
{
	return REDUCEMASK|input;
}

int Grammar::shift2Mask(int input)
{
	return SHIFTMASK|input;
}

BitSet::BitSet()
{
	current=-1;
	reset();
}

BitSet&amp; BitSet::operator =(const BitSet&amp; theSet)
{
	reset();
	this-&gt;operator |=(theSet);
	return *this;
}
	
int BitSet::next()
{
	for (int i=current+1; i&lt;bitset_size; i++)
	{
		if (at(i))
		{
			current=i;
			return i;
		}
	}
	current=-1;
	return -1;
}



bool operator == (Item&amp; first, Item&amp; second)
{
	return (first.dotPos==second.dotPos)&amp;&amp;(first.rulePos==second.rulePos)&amp;&amp;
		(first.varIndex==second.varIndex);
}


</pre>
<pre>

</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: parser.h</b></font></span></pre>
<pre>#include &quot;grammar.h&quot;

const int MaxStackLength=50;


class Parser
{
private:
	bool isLLParser;
	int stack[MaxStackLength];
	bool push(int num);
	bool pop(int&amp; num);
	int top;
	void initialize();
	bool pushToken(int tIndex, int theToken);
	void prepare();
	void pushRule(int theRule);
	void LLParse();
	void LRParse();
public:
	Parser(bool forLL=true);
	void parseFile(const char* fileName);	
	
};
</pre>
<pre>
</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: parser.cpp</b></font></span></pre>
<pre>#include &lt;iostream&gt;
#include &quot;parser.h&quot;
#include &quot;scanner.h&quot;
#include &quot;errorNo.h&quot;

using namespace std;

Scanner scanner;
extern Grammar grammar;
const char* defaultListFile=&quot;nickListFile.txt&quot;;
extern int startSymbolIndex;
extern int stackBottomIndex;
extern int table[MaxTokenCount][MaxTokenCount];
extern int token2type[MaxTokenCount];
extern int type2token[MaxTokenCount];
extern int emptyIndex;


bool Parser::pushToken(int tIndex, int theToken)
{
	if (table[tIndex][theToken]!=-1)
	{
		cout&lt;&lt;grammar.token[tIndex]-&gt;name&lt;&lt;&quot; =&gt; &quot;;
		grammar.printRule(table[tIndex][theToken]);
		cout&lt;&lt;endl;
		pushRule(table[tIndex][theToken]);
		return true;
	}
	return false;
}


void Parser::pushRule(int theRule)
{
	int len=0;
	while (grammar.production[theRule][len]!=-1)
	{
		len++;
	}
	for (int i=len-1; i&gt;=0; i--)
	{
		if (!push(grammar.production[theRule][i]))
		{
			errorHandle(StackOverFlow);
		}
	}
}


void Parser::LLParse()
{
	int theToken, theVar;
	bool canPop=true;

	while (scanner.nextToken())
	{
		if (!canPop)
		{
			errorHandle(UnexpectedEmptyStack);
		}
		if (scanner.token.type==COMMENTTYPE)
		{
			continue;
		}
		theToken=type2token[scanner.token.type];

		while ((canPop=pop(theVar))==true)
		{
			if (grammar.token[theVar]-&gt;terminal)
			{
				if (theToken==theVar)//match
				{					
					break;
				}
				else
				{
					if (theVar==emptyIndex)
					{
						continue;
					}
					errorHandle(IllegalGrammarToken);
				}
			}
			else
			{
				if (!pushToken(theVar, theToken))
				{
					errorHandle(IllegalGrammarToken);
				}
			}
		}
	}
	if (pop(theVar))
	{
		if (theVar!=stackBottomIndex)
		{
			errorHandle(NotEmptyStack);
		}
		/*
		if (theVar!=stackBottomIndex)
		{
			errorHandle(NotEmptyStack);
		}
		*/
	}
}


void Parser::LRParse()
{
	/*
	int theToken, theVar;
	bool canPop=true;

	while (scanner.nextToken())
	{
		if (!canPop)
		{
			errorHandle(UnexpectedEmptyStack);
		}
		if (scanner.token.type==COMMENTTYPE)
		{
			continue;
		}
		theToken=type2token[scanner.token.type];
		
		//if (grammar.isShift(LRTable[
	}

/*
		while ((canPop=pop(theVar))==true)
		{
			if (grammar.token[theVar]-&gt;terminal)
			{
				if (theToken==theVar)//match
				{					
					break;
				}
				else
				{
					if (theVar==emptyIndex)
					{
						continue;
					}
					errorHandle(IllegalGrammarToken);
				}
			}
			else
			{
				if (!pushToken(theVar, theToken))
				{
					errorHandle(IllegalGrammarToken);
				}
			}
		}
	}
	if (pop(theVar))
	{
		if (theVar!=stackBottomIndex)
		{
			errorHandle(NotEmptyStack);
		}
		/*
		if (theVar!=stackBottomIndex)
		{
			errorHandle(NotEmptyStack);
		}
		
	}
	*/

}


void Parser::parseFile(const char*fileName)
{
	scanner.readFromFile(fileName, defaultListFile);
	prepare();
	if (isLLParser)
	{
		LLParse();
	}
	else
	{
		LRParse();
	}
	
	/*

	while (scanner.nextToken())
	{
		if (!canPop)
		{
			errorHandle(UnexpectedEmptyStack);
		}
		if (scanner.token.type==COMMENTTYPE)
		{
			continue;
		}
		theToken=type2token[scanner.token.type];

		while ((canPop=pop(theVar))==true)
		{
			if (grammar.token[theVar]-&gt;terminal)
			{
				if (theToken==theVar)//match
				{					
					break;
				}
				else
				{
					if (theVar==emptyIndex)
					{
						continue;
					}
					errorHandle(IllegalGrammarToken);
				}
			}
			else
			{
				if (!pushToken(theVar, theToken))
				{
					errorHandle(IllegalGrammarToken);
				}
			}
		}
	}
	if (pop(theVar))
	{
		if (theVar!=stackBottomIndex)
		{
			errorHandle(NotEmptyStack);
		}
		/*
		if (theVar!=stackBottomIndex)
		{
			errorHandle(NotEmptyStack);
		}
		
	}
	*/
}


Parser::Parser(bool forLL)
{
	isLLParser=forLL;
	initialize();
}

void Parser::initialize()
{
	top=0;
}

bool Parser::pop(int&amp; num)
{
	if (top==0)
	{
		return false;
	}
	num=stack[--top];
	return true;
}

bool Parser::push(int num)
{
	if (top==MaxStackLength-1)
	{
		return false;
	}
	stack[top++]=num;
	return true;
}

void Parser::prepare()
{
	top=0;
	if (isLLParser)
	{
		pushRule(grammar.token[startSymbolIndex]-&gt;production[0]);
	}
	else
	{
		push(0);
	}
}
</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: scanner.h</b></font></span></pre>
<pre>///////////////////////////////////////////////////////////////////////////////////////////
//Program: SLang Scanner
//Author: Qingzhe Huang
//Date: Jan. 18, 2004
//FileName: scanner.h
//Features:
//	1.	I want to improve efficiency of scanning, so I used table-driven method.
//	2.	I used enum to represent character of all ASCII---CharType---where &quot;space, tab,
//		end of line, end of file are all considered to be White Space.
//	3.	All legal token is represented by enum TokenType.   
//	4.  I defined a huge amount of TokenState which is basically the state of a DFA. As
//		I don't want to search reserved keyword with linear search or whatever, I have 
//		many states for the reserved words.
//	5.  I deliberately make the sequence of first 38 TokenState elements exactly same as
//		all that of TokenType, so that each final state of DFA has a 1-1 correspondence with
//		type of token.
//	6.  I defined a struct of Token which may be used in future parser.
//	7.  I defined an errorNo variable to represent various errors. And a series error string
//		for displaying information.
//	8.  When class Scanner is created, it will initialize the big &quot;state-charType&quot; table.
//	9.  When readFromFile is called, it will first read one char in advance.
//	10. When an error is encountered, the caller of Scanner should understand that no further
//		char is read in. So, stop calling &quot;nextToken()&quot;. This is a bit controvercial, and I
//		plan to change it in next version.
//////////////////////////////////////////////////////////////////////////////////////////// 

/*////////////////////////////////////////////////////////////////////////////
Program: SLang Scanner
Author: Qingzhe Huang
Date: Jan. 21, 2004
FileName: scanner.h
Features:
	1. I restructured the struct Token, to make it a union field in order to store int 
	value for number.
	2. I restructured the function &quot;nextToken()&quot; in order to give out correct line no. 
	when error occurs.
*////////////////////////////////////////////////////////////////////////////////


#ifndef SCANNER_H
#define SCANNER_H
#include &lt;iostream&gt;

using namespace std;

extern enum ErrorCode;

const int TokenStateCount=138;
const int CharTypeCount=72;
const int MaxTokenLength=255;
const int MaxNumberLength=12;



enum CharType
{
	//all small letters 26
	SMALLA,SMALLB,SMALLC,SMALLD,SMALLE,SMALLF,SMALLG,SMALLH,SMALLI,SMALLJ,SMALLK,SMALLL,
	SMALLM,SMALLN,SMALLO,SMALLP,SMALLQ,SMALLR,SMALLS,SMALLT,SMALLU,SMALLV,SMALLW,SMALLX,
	SMALLY,SMALLZ,
	//all big letters 26
	BIGA,BIGB,BIGC,BIGD,BIGE,BIGF,BIGG,BIGH,BIGI,BIGJ,BIGK,BIGL,BIGM,BIGN,BIGO,BIGP,BIGQ,
	BIGR,BIGS,BIGT,BIGU,BIGV,BIGW,BIGX,BIGY,BIGZ,
	//all digit 1
	DIGIT, 
	//all symbols 16
	QUOTE, OPENPAR, CLOSEPAR, SEMICOLON,PLUS, MINUS, TIMES, SLASH, COLON,
	EQUAL,SMALLER,GREATER,EXCLAIM,OPENBRACKET, CLOSEBRACKET,COMMA,
	//space, tab, end of line are regarded as whitespace, 1
	WHITESPACE,
	//UNDERSCORE IS A SPECIAL SYMBOL 1
	UNDERSCORE,
	//all other ASCII is regarded as illegal 1
	ILLEGAL
};



//TOTAL 38, JUST 1-1 WITH THE FIRST 38 OF TOKENSTATE
enum TokenType 
{
	//GENERAL TYPE 5
	IDTYPE, NUMBERTYPE, CHARCONSTTYPE, COMMENTTYPE, ERRORTYPE,
	//THE FOLLOWING ARE SYMBOL TYPE	18
	OPENPARTYPE, CLOSEPARTYPE, SEMICOLONTYPE, PLUSTYPE, MINUSTYPE, TIMESTYPE, 
    SLASHTYPE, ASSIGNMENTTYPE, SMALLERTYPE, GREATERTYPE, EQUALTYPE, SMALLEREQUALTYPE,
	GREATEREQUALTYPE, NOTEQUALTYPE, OPENBRACKETTYPE, CLOSEBRACKETTYPE, COMMATYPE, 
	COLONTYPE, 
	//THE FOLLOWING ARE RESERVED TYPE 15
	BEGINTYPE, ENDTYPE, PROGRAMTYPE, VARIABLESTYPE,INTEGERTYPE, ARRAYTYPE, CHARTYPE, 
	MODULETYPE, IFTYPE, THENTYPE, ELSETYPE, LOOPTYPE, EXITTYPE, READTYPE, WRITETYPE
};


//total 138 states
enum TokenState
{
	//THE FINAL STATE 38, in order to easy initialize &quot;finalState&quot;, I put them in beginning
	//5 generals
	IDEND, NUMBEREND, CONSTCHAREND, COMMENTEND, ERROR,
	//18 symbols
	OPENPAREND, CLOSEPAREND, SEMICOLONEND, PLUSEND, MINUSEND, TIMESEND, 
	SLASHEND, ASSIGNMENTEND, SMALLEREND, GREATEREND, EQUALEND, SMALLEREQUALEND, 
	GREATEREQUALEND, NOTEQUALEND, OPENBRACKETEND, CLOSEBRACKETEND, COMMAEND, 
	COLONEND, 
	//15 reserved
	BEGINEND, ENDEND, PROGRAMEND, VARIABLESEND, INTEGEREND, ARRAYEND, CHAREND, 
	MODULEEND, IFEND, THENEND, ELSEEND, LOOPEND, EXITEND, READEND, WRITEEND,
	//THE FOLLOWING ARE ALL NON-FINAL STATES
	//THE very FIRST CHAR 1
	READY, 
	//THE FOLLOWING ARE ALL RESERVED STATE
	//the first char 12
	ARRAY1, BEGIN1, CHAR1, E1, I1, LOOP1, MODULE1, PROGRAM1, READ1, THEN1, VARIABLES1,
	WRITE1,
	//THE SECOND CHAR 15
	ARRAY2, BEGIN2, CHAR2, ELSE2, END2, EXIT2, IF2, INTEGER2, LOOP2, MODULE2, PROGRAM2, 
	READ2, THEN2, VARIABLES2, WRITE2, 
	//THE THIRD CHAR 14
	ARRAY3, BEGIN3, CHAR3, ELSE3, END3, EXIT3, INTEGER3, LOOP3, MODULE3, PROGRAM3, READ3, 
	THEN3, VARIABLES3, WRITE3,
	//THE FOURTH CHAR 13
	ARRAY4, BEGIN4, CHAR4, ELSE4, EXIT4, INTEGER4, LOOP4, MODULE4, PROGRAM4, READ4, THEN4, 
	VARIABLES4, WRITE4,

	//THE FIFTH CHAR 7
	ARRAY5, BEGIN5, INTEGER5, MODULE5, PROGRAM5, VARIABLES5, WRITE5,
	//THE SIXTH CHAR 4
	INTEGER6, MODULE6, PROGRAM6, VARIABLES6, 
	//THE SEVENTH CHAR 3
	INTEGER7, PROGRAM7, VARIABLES7,
	//THE EIGHTH CHAR 1
	VARIABLES8, 
	//THE NINETH CHAR 1
	VARIABLES9,

	//THESE ARE NON-RESERVED
	//THESE ARE GENERAL 9
	IDBEGIN, IDUNDERSCORE, NUMBERBEGIN, CONSTCHARQUOTEBEGIN, CONSTCHARBEGIN, COMMENTSTARBEGIN,
	COMMENTBEGIN, COMMENTSTAREND, COMMENTSLASHBEGIN,  
	//the SINGLE symbols 16
	QUOTEBEGIN, OPENPARBEGIN, CLOSEPARBEGIN, SEMICOLONBEGIN, 
	PLUSBEGIN, MINUSBEGIN, TIMESBEGIN, SLASHBEGIN, COLONBEGIN, SMALLERBEGIN, GREATERBEGIN, 
	EQUALBEGIN, EXCLAIMBEGIN, OPENBRACKETBEGIN, CLOSEBRACKETBEGIN, COMMABEGIN, 
	//MULTI SYMBOL 4
	ASSIGNMENTBEGIN, SMALLEREQUALBEGIN,
	GREATEREQUALBEGIN, NOTEQUALBEGIN 
};

//extern ErrorCode errorNo;

extern void errorHandle(ErrorCode errorNo);

struct Token
{
	TokenType type;	
	union
	{
		char name[MaxTokenLength+1];
		int number;
	};
};

class Scanner
{
private:	
	int tokenCount;
	unsigned char ch;
	void printLineNo();
	FILE* stream;
	bool nextChar();
	void initialize();
	bool resume();
public:
	Scanner();	
	static Token token;	
	bool readFromFile(const char* fileName, const char* listFileName);
	const char* getToken(){return token.name;}
	bool nextToken();
	void report();
};
	


void initialTokenState();

#endif
</pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: scanner</b></font></span><font size="3" color="#FF0000"><b>.cpp</b></font></pre>
<pre>/*////////////////////////////////////////////////////////////////////////////
Program: SLang Scanner
Author: Qingzhe Huang
Date: Jan. 21, 2004
FileName: scanner.cpp
Features:
	1. As Dr. Optrany said, the number should be stored as int or double whatever.
	2. I restructured the function &quot;nextToken()&quot; in order to give out correct line no. 
	when error occurs.
*////////////////////////////////////////////////////////////////////////////////


#include &lt;iostream&gt;
#include &lt;fstream&gt;
#include &quot;scanner.h&quot;
#include &quot;errorNo.h&quot;

using namespace std;

ofstream fList;

//this will determine how many errors of maximum the scanner will tolerant
const int MaxErrortolerant=10;
//as integer usually have max 12 digit roughly
int errorCount=0;
int lineCount=1;

//static memeber
Token Scanner::token;

const int TokenTypeCount=38;

//extern void errorHandle(int errorNo);

extern char* errorStr[ErrorCount];

//this is purely for displaying purpose
char* tokenTypeStr[TokenTypeCount]=
{
	//GENERAL TYPE 5
	&quot;ID&quot;, &quot;NUMBER&quot;, &quot;CHARACTER CONSTANT&quot;, &quot;COMMENT&quot;, &quot;ERROR&quot;,
	//THE FOLLOWING ARE SYMBOL TYPE	18
	&quot;(&quot;, &quot;)&quot;, &quot;;&quot;, &quot;+&quot;, &quot;-&quot;, &quot;*&quot;, 
    &quot;/&quot;, &quot;:=&quot;, &quot;&lt;&quot;, &quot;&gt;&quot;, &quot;=&quot;, &quot;&lt;=&quot;,
	&quot;&gt;=&quot;, &quot;!=&quot;, &quot;[&quot;, &quot;]&quot;, &quot;,&quot;, 
	&quot;:&quot;, 
	//THE FOLLOWING ARE RESERVED TYPE 15
	&quot;begin&quot;, &quot;end&quot;, &quot;program&quot;, &quot;variables&quot;,&quot;integer&quot;, &quot;array&quot;, &quot;char&quot;, 
	&quot;module&quot;, &quot;if&quot;, &quot;then&quot;, &quot;else&quot;, &quot;loop&quot;, &quot;exit&quot;, &quot;read&quot;, &quot;write&quot;
};

CharType charType[256];

TokenState tokenState[TokenStateCount][CharTypeCount];



//when error occurs, no message is immediately output, it is 
//postponed to next time, because when '\n' is read in, 
//lineCount is not incremented until token is decided.
//Therefore, when a token is ended with '\n', the line no is not updated
//until next round. So, we can keep the correct line no. for each token
bool Scanner::nextToken()
{
	TokenState state=READY;
	int digitCount=0;
	int value=0;
	int count=0;//to count the length of token
	char* ptr=token.name; 
	bool isComment=false;
	do
	{
		//map ch to CharType reducing 256 ASCII to 73 CharTypes
		//the table for &quot;state&quot; and &quot;CharType is 138x73, each entry is a
		//index for state.
		state=tokenState[state][charType[ch]];

		if (state==NUMBERBEGIN)
		{
			digitCount++;
			if (digitCount&gt;=MaxNumberLength)
			{
				errorHandle(ExceedNumberLimit);
				return resume();
			}
			//to accumulate the value
			value*=10;
			value+=ch-'0';
		}
		//because I put all final state in the first 38 positions
		if (state&lt;38)
		{
			//This is a dirty trick! Because I make the &quot;TokenType&quot; 1-1 with
			//TokenState for the 38 finals.
			*ptr='\0';
			token.type=(TokenType)(state);
			if (state==ERROR)
			{
				errorHandle(IllegalToken);
				//printLineNo();
				return resume();				
			}
			tokenCount++;
			if (state==NUMBEREND)
			{
				token.number=value;
			}
			//printLineNo();
			return true;
		}
		if (state==COMMENTBEGIN)
		{
			isComment=true;
		}
		if (count&gt;=MaxTokenLength)
		{
			errorHandle(TokenTooLong);
			token.type=ERRORTYPE;
			//printLineNo();
			return false;
		}
		//cout&lt;&lt;ch;
		if (!isComment&amp;&amp;state!=READY)
		{
			*ptr=ch;
			ptr++;
			count++;
		}
		//it is only at end to update line no.
		printLineNo();
	}while (nextChar());

	state=tokenState[state][charType[ch]];
	//at this point, it is either in ready state, or error state
	if (state==ERROR)
	{
		token.type=(TokenType)(state);
		errorHandle(UnexpectedReachEOF);			
	}		
	//but in all case it means end of file, so return false
	return false;
}

bool Scanner::resume()
{
	//the scanner will try to continue if error number is within 10
	if (errorCount==MaxErrortolerant)
	{
		return false;
	}
	return nextChar();
}

void Scanner::report()
{
	fList&lt;&lt;&quot;\ntotal number of tokens is &quot;&lt;&lt;tokenCount;
	fList&lt;&lt;&quot;\ntotal number of errors is &quot;&lt;&lt;errorCount;
}

void Scanner::printLineNo()
{
	if (ch=='\n')
	{
		fList&lt;&lt;++lineCount&lt;&lt;&quot;  &quot;;
	}
}

Scanner::Scanner()
{
	initialize();
}

void Scanner::initialize()
{
	errorCount=0;
	lineCount=1;
	tokenCount=0;
	initialTokenState();
}

bool Scanner::readFromFile(const char* fileName, const char* listFileName)
{
	if ((stream=fopen(fileName, &quot;r&quot;))==NULL)
	{
		errorHandle(CannotOpenFile);
		return false;
	}
	else
	{	
		fList.open(listFileName);
		fList&lt;&lt;lineCount&lt;&lt;&quot;  &quot;;
		//this is to prevent the empty file situation in which
		//you cannot even read one single char because my scanner need to read 
		//one char ahead
		if (!nextChar())
		{
			errorHandle(FileEmptyError);
			return false;
		}
	}
	return true;
}


bool Scanner::nextChar()
{
	ch=fgetc(stream);
	fList&lt;&lt;ch;
	return ch!=255;
}

</pre>
<pre></pre>
<pre>　</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: </b></font></span><font size="3" color="#FF0000"><b>e<span lang="en-ca">rrorno.h</span></b></font></pre>
<pre>#ifndef ERRORNO_H
#define ERRORNO_H



extern char* errorStr[];

void errorHandle(int errorNo, void* param1=NULL, void* param2=NULL);

/*
class ErrorInfo
{
private:
public:
	ErrorInfo(int errorNo);
	void addInt(int num);

};
*/

const int ScannerErrorCount=6;
const int ParserErrorCount=4;
const int LRParserErrorCount=3;
const int ErrorCount=ScannerErrorCount+ParserErrorCount+LRParserErrorCount;

//errors of scanner = 6
#define IllegalToken             0
#define TokenTooLong             1
#define UnexpectedReachEOF       2
#define  FileEmptyError          3
#define CannotOpenFile           4
#define ExceedNumberLimit        5

//error of parser =4
#define UnexpectedEmptyStack     6
#define IllegalGrammarToken      7
#define NotEmptyStack            8
#define StackOverFlow            9

//error of LR(0) = 3
#define ShiftReduceConflict      10
#define ReduceReduceConflict     11
#define OverWritingLRTable       12
#endif</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: </b></font></span><font size="3" color="#FF0000"><b>e<span lang="en-ca">rrorno.</span></b></font><span lang="en-ca"><font size="3" color="#FF0000"><b>cpp</b></font></span></pre>
<pre>#include &lt;iostream&gt;
#include &lt;fstream&gt;
#include &quot;scanner.h&quot;
#include &quot;errorNo.h&quot;
#include &quot;grammar.h&quot;

using namespace std;

extern Grammar grammar;


char* errorStr[ErrorCount]=
{
	//these are scanner errors:
	&quot;IllegalToken&quot;, 
	&quot;TokenTooLong&quot;, 
	&quot;UnexpectedReachEOF&quot;, 
	&quot;FileEmptyError&quot;, 
	&quot;CannotOpenFile&quot;, 
	&quot;ExceedNumberLimit&quot;,

	//these are parser errors
	&quot;UnexpectedEmptyStack&quot;,
	&quot;IllegalGrammarToken&quot;,
	&quot;NotEmptyStack&quot;,
	&quot;StackOverFlow&quot;,
	&quot;ShiftReduceConflict&quot;,
    &quot;ReduceReduceConflict&quot;,
	&quot;OverWritingLRTable&quot;
};



extern int errorCount;
extern int lineCount;
extern ofstream fList;



//this is going to be improved in future as parser need to 
//call it, too. so, more parameter should be added?
//No! the error no. itself specifies the error and it is
//error handler to try to find necessary info to display.
void errorHandle(int errorNo, void* param1, void* param2)
{
	void printItem(Item* ptr);
	//void printItem(Item* ptr)
	//{
	
	
	
	//}
	Item* ptr;

	if (errorNo&lt;ScannerErrorCount)
	{
		errorCount++;
		//the illegal token may be for various reason and I only suggest
		//a possible nearby place to spot the error occurs.
		fList&lt;&lt;&quot;\nerror of &quot;&lt;&lt;errorStr[errorNo]&lt;&lt;&quot; occurred at line &quot;
			&lt;&lt;lineCount&lt;&lt;&quot; near token &quot;&lt;&lt;Scanner::token.name&lt;&lt;endl;
	}
	else
	{
		if (errorNo&lt;ScannerErrorCount+ParserErrorCount)
		{
			errorCount++;
			fList&lt;&lt;&quot;\nerror of &quot;&lt;&lt;errorStr[errorNo]&lt;&lt;&quot; occurred at line &quot;
				&lt;&lt;lineCount&lt;&lt;&quot; near token &quot;&lt;&lt;Scanner::token.name&lt;&lt;endl;
			exit(errorNo);
		}
		else
		{
			if (errorNo&lt;ScannerErrorCount+ParserErrorCount+LRParserErrorCount)
			{
				errorCount++;
				//temparorily
				cout&lt;&lt;errorStr[errorNo]&lt;&lt;endl;
				switch (errorNo)
				{
					
				case ShiftReduceConflict:
					
					ptr=((Item*)(param1));					
					cout&lt;&lt;&quot;\nconflict at shift:\n&quot;;
					printItem(ptr);
				

					ptr=((Item*)(param2));
					
					cout&lt;&lt;&quot;\nconflict with reduce:\n&quot;;
					printItem(ptr);
					cout&lt;&lt;endl;
					break;
				case ReduceReduceConflict:
					
					ptr=((Item*)(param1));

					cout&lt;&lt;&quot;\nconflict at reduce:\n&quot;;
					printItem(ptr);

					ptr=((Item*)(param2));
					
					cout&lt;&lt;&quot;\nconflict with reduce:\n&quot;;
					printItem(ptr);
					cout&lt;&lt;endl;
					break;
				}
			}
		}
	}
}

void printItem(Item* ptr)
{
	int temp1, temp2;
	temp1=ptr-&gt;varIndex;
	temp2=ptr-&gt;rulePos;	
	grammar.printRule(temp1, temp2, ptr-&gt;dotPos);
}</pre>
<pre>　</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: </b></font></span><font size="3" color="#FF0000"><b><span lang="en-ca">initialize</span>.cpp </b></font></pre>
<pre>　</pre>
<pre>/*////////////////////////////////////////////////////////////////////////////
Program: SLang Scanner
Author: Qingzhe Huang
Date: Jan. 18, 2004
FileName: initialize.cpp
Features:
	1. This is purely mechnical job, you know to initialize a huge state table:
	138x72 is really a boring, routine job.
	2. For EOF, I want &quot;ch&quot; to be able to be an index in &quot;CharType&quot; array, so, it
	cannot be -1, but 255 for &quot;unsigned char&quot; which is declared in class Scanner.
*////////////////////////////////////////////////////////////////////////////////


#include &quot;scanner.h&quot;

extern enum CharType;
extern enum TokenState;


extern CharType charType[256];
extern TokenState tokenState[TokenStateCount][CharTypeCount];


void finalSymbolToken(TokenState state, TokenState endState);
void finalReservedToken(TokenState state, TokenState endState);
void initialCharType();
void setFinalTokenState();
void initialReserved(TokenState state);
void setRange(TokenState state, CharType start, CharType end, TokenState target);
void setState(TokenState state, TokenState targetState);
void setDefaultState();



void setDefaultState()
{
	//all states are by default error
	for (int i=0; i&lt;TokenStateCount; i++)
	{
		setState((TokenState)i, ERROR);
	}
	
	//the default for all letters are IDBEGIN
	setRange(READY, SMALLA, BIGZ, IDBEGIN);

	//THIS IS  another dirty trick, since I put all reserved states together
	//so you can initialize them together. 
	for (i=ARRAY1; i&lt;=VARIABLES9; i++)
	{
		initialReserved((TokenState)i);
	}
	setFinalTokenState();
}

void setFinalTokenState()
{
	//FOR ID
	finalReservedToken(IDBEGIN, IDEND);
	//for number
	finalReservedToken(NUMBERBEGIN, NUMBEREND);
	//THESE FOR RESERVED WORDS

	finalReservedToken(ARRAY5, ARRAYEND);
	finalReservedToken(BEGIN5, BEGINEND);
	finalReservedToken(CHAR4, CHAREND);
	finalReservedToken(ELSE4, ELSEEND);
	finalReservedToken(END3, ENDEND);
	finalReservedToken(EXIT4, EXITEND);
	finalReservedToken(IF2, IFEND);
	finalReservedToken(INTEGER7, INTEGEREND);
	finalReservedToken(LOOP4, LOOPEND);
	finalReservedToken(MODULE6, MODULEEND);
	finalReservedToken(PROGRAM7, PROGRAMEND);
	finalReservedToken(READ4, READEND);
	finalReservedToken(THEN4, THENEND);
	finalReservedToken(VARIABLES9, VARIABLESEND);
	finalReservedToken(WRITE5, WRITEEND);

	//THESE FOR SYMBOLS


	finalSymbolToken(OPENPARBEGIN, OPENPAREND);
	finalSymbolToken(CLOSEPARBEGIN, CLOSEPAREND);
	finalSymbolToken(SEMICOLONBEGIN, SEMICOLONEND);
	finalSymbolToken(PLUSBEGIN, PLUSEND);
	finalSymbolToken(MINUSBEGIN, MINUSEND);
	finalSymbolToken(TIMESBEGIN, TIMESEND);
	finalSymbolToken(SLASHBEGIN, SLASHEND);
	finalSymbolToken(ASSIGNMENTBEGIN, ASSIGNMENTEND);
	finalSymbolToken(SMALLERBEGIN, SMALLEREND);
	finalSymbolToken(GREATERBEGIN, GREATEREND);
	finalSymbolToken(EQUALBEGIN, EQUALEND);
	finalSymbolToken(SMALLEREQUALBEGIN, SMALLEREQUALEND);
	finalSymbolToken(GREATEREQUALBEGIN, GREATEREQUALEND);
	finalSymbolToken(NOTEQUALBEGIN, NOTEQUALEND);
	finalSymbolToken(OPENBRACKETBEGIN, OPENBRACKETEND);
	finalSymbolToken(CLOSEBRACKETBEGIN, CLOSEBRACKETEND);
	finalSymbolToken(COMMABEGIN, COMMAEND);
	finalSymbolToken(COLONBEGIN, COLONEND);

	//COMMENT
	finalSymbolToken(COMMENTSLASHBEGIN, COMMENTEND);
	//CONSTCHAR
	finalSymbolToken(CONSTCHARQUOTEBEGIN, CONSTCHAREND);

}

void initialTokenState()
{
	//initialize all charType
	initialCharType();
	//default is always error
	setDefaultState();

	//loop
	tokenState[READY][WHITESPACE]=READY;
	//number
	tokenState[READY][DIGIT]=NUMBERBEGIN;
	tokenState[NUMBERBEGIN][DIGIT]=NUMBERBEGIN;//HOW LONG SHOULD NUMBER BE?

	//ID
	//setRange(READY, SMALLA, BIGZ, IDBEGIN); THIS IS IN DEFAULT
	setRange(IDBEGIN, SMALLA, DIGIT, IDBEGIN);
	tokenState[IDBEGIN][UNDERSCORE]=IDUNDERSCORE;
	setRange(IDUNDERSCORE, SMALLA, DIGIT, IDBEGIN);

	//reserved words
	//ARRAY1, BEGIN1, CHAR1, E1, I1, LOOP1, MODULE1, PROGRAM1, READ1, THEN1, WRITE1,
	//VARIABLES1,
	tokenState[READY][SMALLA]=ARRAY1;
	tokenState[READY][SMALLB]=BEGIN1;
	tokenState[READY][SMALLC]=CHAR1;
	tokenState[READY][SMALLE]=E1;
	tokenState[READY][SMALLI]=I1;
	tokenState[READY][SMALLL]=LOOP1;
	tokenState[READY][SMALLM]=MODULE1;
	tokenState[READY][SMALLP]=PROGRAM1;
	tokenState[READY][SMALLR]=READ1;
	tokenState[READY][SMALLT]=THEN1;
	tokenState[READY][SMALLV]=VARIABLES1;
	tokenState[READY][SMALLW]=WRITE1;

	/* RESERVED WORDS
	ARRAY2 */
	tokenState[ARRAY1][SMALLR]=ARRAY2;
	//BEGIN2
	tokenState[BEGIN1][SMALLE]=BEGIN2;
	//CHAR2
	tokenState[CHAR1][SMALLH]=CHAR2;
	//ELSE2,
	tokenState[E1][SMALLL]=ELSE2;
	//EXIT2
	tokenState[E1][SMALLX]=EXIT2;
	//END2
	tokenState[E1][SMALLN]=END2;
	//IF2
	tokenState[I1][SMALLF]=IF2;
	//INTEGER2
	tokenState[I1][SMALLN]=INTEGER2;
	//LOOP2
	tokenState[LOOP1][SMALLO]=LOOP2;
	//MODULE2
	tokenState[MODULE1][SMALLO]=MODULE2;
	//PROGRAM2
	tokenState[PROGRAM1][SMALLR]=PROGRAM2;
	//READ2
	tokenState[READ1][SMALLE]=READ2;
	//THEN2
	tokenState[THEN1][SMALLH]=THEN2;
	//VARIABLES2
	tokenState[VARIABLES1][SMALLA]=VARIABLES2;
	//WRITE2
	tokenState[WRITE1][SMALLR]=WRITE2;

	/* RESERVED WORDS
	ARRAY3 */
	tokenState[ARRAY2][SMALLR]=ARRAY3;
	//BEGIN2
	tokenState[BEGIN2][SMALLG]=BEGIN3;
	//CHAR2
	tokenState[CHAR2][SMALLA]=CHAR3;
	//ELSE2,
	tokenState[ELSE2][SMALLS]=ELSE3;
	//END2
	tokenState[END2][SMALLD]=END3;
	//EXIT2
	tokenState[EXIT2][SMALLI]=EXIT3;
	//INTEGER2
	tokenState[INTEGER2][SMALLT]=INTEGER3;
	//LOOP2
	tokenState[LOOP2][SMALLO]=LOOP3;
	//MODULE2
	tokenState[MODULE2][SMALLD]=MODULE3;
	//PROGRAM2
	tokenState[PROGRAM2][SMALLO]=PROGRAM3;
	//READ2
	tokenState[READ2][SMALLA]=READ3;
	//THEN2
	tokenState[THEN2][SMALLE]=THEN3;
	//VARIABLES2
	tokenState[VARIABLES2][SMALLR]=VARIABLES3;
	//WRITE2
	tokenState[WRITE2][SMALLI]=WRITE3;

	/* RESERVED WORDS
	ARRAY3 */
	tokenState[ARRAY3][SMALLA]=ARRAY4;
	//BEGIN2
	tokenState[BEGIN3][SMALLI]=BEGIN4;
	//CHAR2
	tokenState[CHAR3][SMALLR]=CHAR4;
	//ELSE2,
	tokenState[ELSE3][SMALLE]=ELSE4;
	//EXIT2
	tokenState[EXIT3][SMALLT]=EXIT4;
	//INTEGER2
	tokenState[INTEGER3][SMALLE]=INTEGER4;
	//LOOP2
	tokenState[LOOP3][SMALLP]=LOOP4;
	//MODULE2
	tokenState[MODULE3][SMALLU]=MODULE4;
	//PROGRAM2
	tokenState[PROGRAM3][SMALLG]=PROGRAM4;
	//READ2
	tokenState[READ3][SMALLD]=READ4;
	//THEN2
	tokenState[THEN3][SMALLN]=THEN4;
	//VARIABLES2
	tokenState[VARIABLES3][SMALLI]=VARIABLES4;
	//WRITE2
	tokenState[WRITE3][SMALLT]=WRITE4;

	/* RESERVED WORDS
	ARRAY */
	tokenState[ARRAY4][SMALLY]=ARRAY5;
	//BEGIN2
	tokenState[BEGIN4][SMALLN]=BEGIN5;
	//INTEGER2
	tokenState[INTEGER4][SMALLG]=INTEGER5;
	//MODULE2
	tokenState[MODULE4][SMALLL]=MODULE5;
	//PROGRAM2
	tokenState[PROGRAM4][SMALLR]=PROGRAM5;
	//VARIABLES2
	tokenState[VARIABLES4][SMALLA]=VARIABLES5;
	//WRITE2
	tokenState[WRITE4][SMALLE]=WRITE5;

	// RESERVED WORDS*/
	//INTEGER2
	tokenState[INTEGER5][SMALLE]=INTEGER6;
	//MODULE2
	tokenState[MODULE5][SMALLE]=MODULE6;
	//PROGRAM2
	tokenState[PROGRAM5][SMALLA]=PROGRAM6;
	//VARIABLES2
	tokenState[VARIABLES5][SMALLB]=VARIABLES6;

	// RESERVED WORDS*/
	//INTEGER2
	tokenState[INTEGER6][SMALLR]=INTEGER7;
	//PROGRAM2
	tokenState[PROGRAM6][SMALLM]=PROGRAM7;
	//VARIABLES2
	tokenState[VARIABLES6][SMALLL]=VARIABLES7;
	// RESERVED WORDS*/
	//VARIABLES2
	tokenState[VARIABLES7][SMALLE]=VARIABLES8;
	//VARIABLES2
	tokenState[VARIABLES8][SMALLS]=VARIABLES9;

	/*
	CONSTCHAR, UNDERSCOREBEGIN, ASSIGNMENTBEGIN, SMALLEREQUALBEGIN,
	GREATEREQUALBEGIN, NOTEQUAL, COMMENTBEGIN, IDUNDERSCORE,*/

	//now is the symbols
	//QUOTEBEGIN, OPENPARBEGIN, CLOSEPARBEGIN, SEMICOLONBEGIN, 
	//PLUSBEGIN, MINUSBEGIN, TIMESBEGIN, SLASHBEGIN, COLONBEGIN, SMALLERBEGIN, GREATERBEGIN, 
	//EQUALBEGIN, EXCLAIMBEGIN, OPENBRACKETBEGIN, CLOSEBRACKETBEGIN, COMMABEGIN,
	//'
	tokenState[READY][QUOTE]=QUOTEBEGIN;
	//(
	tokenState[READY][OPENPAR]=OPENPARBEGIN;
	//)
	tokenState[READY][CLOSEPAR]=CLOSEPARBEGIN;
	//;
	tokenState[READY][SEMICOLON]=SEMICOLONBEGIN;
	//+
	tokenState[READY][PLUS]=PLUSBEGIN;
	//-
	tokenState[READY][MINUS]=MINUSBEGIN;
	//*
	tokenState[READY][TIMES]=TIMESBEGIN;
	///
	tokenState[READY][SLASH]=SLASHBEGIN;
	//:
	tokenState[READY][COLON]=COLONBEGIN;
	//&lt;
	tokenState[READY][SMALLER]=SMALLERBEGIN;
	//&gt;
	tokenState[READY][GREATER]=GREATERBEGIN;
	//=
	tokenState[READY][EQUAL]=EQUALBEGIN;
	//!
	tokenState[READY][EXCLAIM]=EXCLAIMBEGIN;
	//[
	tokenState[READY][OPENBRACKET]=OPENBRACKETBEGIN;
	//]
	tokenState[READY][CLOSEBRACKET]=CLOSEBRACKETBEGIN;
	//,
	tokenState[READY][COMMA]=COMMABEGIN;

	//AFTER QUOTE IT CAN BE ANY CHARACTER, INCLUDING ILLEGAL CHAR
	setRange(QUOTEBEGIN, SMALLA, ILLEGAL, CONSTCHARBEGIN);
	//ANY OTHER STATE IS BY DEFAULT ERROR
	tokenState[CONSTCHARBEGIN][QUOTE]=CONSTCHARQUOTEBEGIN;
	//FOR /, DEFAULT IS SLASHEND, EXCEPT * WHICH IS COMMENTSTARBEGIN
	tokenState[SLASHBEGIN][TIMES]= COMMENTSTARBEGIN; 

	//FOR :, DEFAULT IS COLONEND, EXCEPT FOR = WHICH IS ASSIGNMENTBEGIN
	tokenState[COLONBEGIN][EQUAL]= ASSIGNMENTBEGIN; 

	//FOR &lt;, DEFAULT IS SMALLEREND, EXCEPT FOR= WHICH IS SMALLEREQAULBEGIN
	tokenState[SMALLERBEGIN][EQUAL]=SMALLEREQUALBEGIN; 

	//FOR &gt;, DEFAULT IS GREATEREND, EXCEPT FOR= WHICH IS GREATEREQAULBEGIN
	tokenState[GREATERBEGIN][EQUAL]= GREATEREQUALBEGIN; 

	tokenState[EXCLAIMBEGIN][EQUAL]= NOTEQUALBEGIN; 
	//WITHIN COMMENT IT IS A LOOP, EXCEPT FOR * WHICH IS POSSIBLE FOR END OF COMMENT
	setRange(COMMENTSTARBEGIN, SMALLA, ILLEGAL, COMMENTBEGIN);
	tokenState[COMMENTSTARBEGIN][TIMES]=COMMENTSTAREND;
	setRange(COMMENTBEGIN, SMALLA, ILLEGAL, COMMENTBEGIN);
	tokenState[COMMENTBEGIN][TIMES]=COMMENTSTAREND;
	//FROM COMMENTSTARBEGIN, ALL IS BACK TO COMMENTBEGIN, EXCEPT / WHICH IS END OF COMMENT
	setRange(COMMENTSTAREND, SMALLA, ILLEGAL, COMMENTBEGIN);
	tokenState[COMMENTSTAREND][SLASH]=COMMENTSLASHBEGIN;
	//
}


void initialReserved(TokenState state)
{
	setRange(state, SMALLA, DIGIT, IDBEGIN);
	finalReservedToken(state, IDEND);
	tokenState[state][UNDERSCORE]=IDUNDERSCORE;//a_
}

void finalSymbolToken(TokenState state, TokenState endState)
{
	for (int i=SMALLA; i&lt;=WHITESPACE; i++)
	{
		tokenState[state][(CharType)i]=endState;
	}
}

void finalReservedToken(TokenState state, TokenState endState)
{
	//all non-letter, non-digit is regarded to be delimeter
	for (int i=QUOTE; i&lt;=WHITESPACE; i++)
	{
		tokenState[state][(CharType)i]=endState;
	}
}


//the default charType is ILLEGAL
void initialCharType()
{
	int chType;
	//the default charType is ILLEGAL
	for (int i=0; i&lt;256; i++)
	{
		charType[i]=ILLEGAL;
	}
	//chType is SMALLA
	chType=SMALLA;
	for (i='a'; i&lt;='z'; i++)
	{
		charType[i]=(CharType)(chType);
		chType++;
	}
	//chType is now BIGA
	chType=BIGA;//I don't want to rely on the trick.
	for (i='A'; i&lt;='Z'; i++)
	{
		charType[i]=(CharType)(chType);
		chType++;
	}
	chType=DIGIT;
	for (i='0'; i&lt;='9'; i++)
	{
		charType[i]=(CharType)(chType);
	}
	/*
	UNDERSCORE, QUOTE, OPENPAR, CLOSEPAR, SEMICOLON,PLUS, MINUS, TIMES, SLASH, COLON,
	EQUAL,SMALLER,GREATER,EXCLAIM,OPENBRACKET, CLOSEBRACKET,COMMA,
	SPACE,TAB, ENDLINE, ILLEGAL
	*/
	charType['_']=UNDERSCORE;
	charType['\'']=QUOTE;
	charType['(']=OPENPAR;
	charType[')']=CLOSEPAR;
	charType[';']=SEMICOLON;
	charType['+']=PLUS;
	charType['-']=MINUS;
	charType['*']=TIMES;
	charType['/']=SLASH;
	charType[':']=COLON;
	charType['=']=EQUAL;
	charType['&lt;']=SMALLER;
	charType['&gt;']=GREATER;
	charType['!']=EXCLAIM;
	charType['[']=OPENBRACKET;
	charType[']']=CLOSEBRACKET;
	charType[',']=COMMA;
	charType[' ']=WHITESPACE;
	charType['\t']=WHITESPACE;
	charType[10]=WHITESPACE;
	charType[13]=WHITESPACE;
	//pls note, since I changed the type of &quot;ch&quot; to be &quot;unsigned char&quot;
	//the EOF now is not -1, but 255
	charType[255]=WHITESPACE;//IT IS A KIND OF DELIMETER
}

void setRange(TokenState state, CharType start, CharType end, TokenState target)
{
	for (int i=start; i&lt;=end; i++)
	{
		tokenState[state][i]=target;
	}
}

void setState(TokenState state, TokenState targetState)
{
	for (int i=0; i&lt;CharTypeCount; i++)
	{
		tokenState[state][i]=targetState;
	}
}</pre>
<pre>　</pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: main</b></font></span><font size="3" color="#FF0000"><b>.cpp (main)</b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;CFGReader.h&quot;
#include &quot;Parser.h&quot;

using namespace std;

int main( int argc, char *argv[ ])
{
	if (!(argc==1||argc==3))
	{
		cout&lt;&lt;&quot;usage: CRGReader grammarSourceFileName  sourceFileToBeParsed&quot;;
		exit(1);
	}

	CFGReader R;
	Parser P(false);
	if (argc==1)
	{
		R.readFromFile(&quot;c:\\LRTest.txt&quot;);
	}
	else
	{
		R.readFromFile(argv[1]);
	}
    R.optimize(false);
	R.print();
	//R.calculateLookAhead();
	/*
	//R.print();
	cout&lt;&lt;&quot;\n now begin parsing...\n&quot;;
	if (argc==1)
	{
		P.parseFile(&quot;c:\\sourceTest.txt&quot;);
	}
	else
	{
		P.parseFile(argv[2]);
	}
*/
	return 0;
}</pre>
<pre>　</pre>
<pre>　</pre>
<pre><font color="#0000FF"><b><span lang="en-ca">The input of grammar file is something like following:(&quot;<a href="file:///c://ruleTest.txt">c:\\</span>LR<span lang="en-ca">Test.txt</span></a><span lang="en-ca"><span lang="en-ca">&quot;)</span></span></b></font></pre>
<pre>S -&gt; ( S ) S | e

</pre>
<pre></pre>
<pre><font color="#0000FF"><b>Here is the <a name="result"></a>result:</b></font></pre>

<pre>state no.0:
START ==&gt;  . S $
S ==&gt;  . ( S ) S
S ==&gt;  . e

state no.1:
START ==&gt; S  . $

state no.2:
S ==&gt;  . ( S ) S
S ==&gt;  . e
S ==&gt; (  . S ) S

state no.3:
S ==&gt; ( S  . ) S

state no.4:
S ==&gt;  . ( S ) S
S ==&gt;  . e
S ==&gt; ( S )  . S

state no.5:
S ==&gt; ( S ) S  .

S ==&gt; ( S ) S  | e
START ==&gt; S $
the following is LR table
    S,(,),e,START,$,
0|  1,s2,-1,-1,-1,-1,
1|  -1,-1,-1,-1,-1,-1,
2|  3,s2,-1,-1,-1,-1,
3|  -1,-1,s4,-1,-1,-1,
4|  5,s2,-1,-1,-1,-1,
5|  -1,r0,r0,r0,-1,r0,
Press any key to continue</pre>

<pre>　</pre>

<pre>
</pre>

<pre></pre>

<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;                                   
&nbsp;&nbsp;&nbsp; <a href="WhoAmI.htm">                  







                       <img src="picture/back.gif" style="border: medium none" alt="back.gif (341 bytes)" width="32" height="35"></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
<a href="index.htm">
<img src="picture/up.gif" style="border: medium none" alt="up.gif (335 bytes)" width="35" height="32"></a>       &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;                         
<img src="picture/next.gif" style="border: medium none" alt="next.gif (337 bytes)" width="32" height="35">          


</p>

</body>

</html>