<html>

<head><script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
<!-- MyFirstUnitAd -->
<ins class="adsbygoogle"
     style="display:inline-block;width:970px;height:250px"
     data-ad-client="ca-pub-5778386704669218"
     data-ad-slot="1503492166"></ins>
<script>
(adsbygoogle = window.adsbygoogle || []).push({});
</script>

<meta http-equiv="Content-Language" content="zh-cn">
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<title>CFGReader</title>
</head>

<body>



<p align="left"><span lang="en-ca"><font size="6" color="#FF0000"><b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CFG 
Reader(</span>First<span lang="en-ca">) </span> </b></font></p>

<div align="left">
  <pre><b><font color="#ff0000" size="5">A. Fifth Edition</font></b></pre>
</div>
<div align="left">
  <pre><b>This is fifth?fourth? <span lang="en-ca">edition of my </span></b><b><span lang="en-ca">CFG Reader which stands for Context-Free-Grammar Reader. In this edition, </span></b></pre>
</div>
<div align="left">
  <pre><b><span lang="en-ca">I </span>finished the first set calculation and add a meta symbol---&quot;{&quot;, &quot;}&quot;.</b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">B</span>.<span lang="en-ca"><a name="problem"></a>The problem</span></font></b></pre>
</div>
<div align="left">
  <pre><b>What is First set?</b></pre>
</div>
<div align="left">
  <pre><b>Example: A  -&gt; a | b c | V</b></pre>
</div>
<div align="left">
  <pre><b>The variable A will have a first set of First(A) = {a, b, First(V)}.</b></pre>
</div>
<div align="left">
  <font FACE="CMSLTT10">
  <p ALIGN="LEFT">program -&gt; stmt-sequence<br>
  stmt-sequence -&gt; stmt-sequence ; statement | statement<br>
  statement -&gt; if-stmt | repeat-stmt | assign-stmt | read-stmt | write-stmt<br>
  if-stmt -&gt; if boolean-exp then stmt-sequence end<br>
  | if boolean-exp then stmt-sequence else stmt-sequence end<br>
  repeat-stmt -&gt; repeat stmt-sequence until boolean-exp<br>
  assign-stmt -&gt; identifier := integer-exp<br>
  read-stmt -&gt; read identifier<br>
  write-stmt -&gt; write integer-exp<br>
  boolean-exp -&gt; integer-exp comparison-op integer-exp<br>
  comparison-op -&gt; &lt; | =<br>
  integer-exp -&gt; integer-exp addop term | term<br>
  addop -&gt; + | -<br>
  term -&gt; term mulop factor | factor<br>
  mulop -&gt; * | /<br>
  factor -&gt; ( integer-exp ) | number | identifier</font></div>
<div align="left">
  <font FACE="CMSLTT10">
  <p ALIGN="LEFT">　</font></div>
<div align="left">
  <b><font color="#ff0000" size="5"><span lang="en-ca"><a name="explain"></a>C</span>.<span lang="en-ca">The
  </span></font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>idea of 
  program</b></font></span></div>
<div align="left">
  　</div>
<div align="left">
  <pre><b>This problem has a standard algorithm in text. So, will you just refer to the text if you are really interested in?</b></pre>
</div>
<div align="left">
  <pre><b>I highly suspect if there is anyone in my reader group would be interested in reading this or even want to have </b></pre>
</div>
<div align="left">
  <pre><b>any idea of &quot;what is First&quot; and how to calculate. </b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5">D.<span lang="en-ca"><a name="Method"></a>The </span>major functions</font></b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">E</span>.</font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>Further improvement</b></font></span></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">F</span>.</font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>File listing</b></font></span></pre>
</div>
<div align="left">
  <pre><b><font size="3"><span lang="en-ca">1. CFGReader.h</span></font></b></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>2. </b></font></span><b><font size="3"><span lang="en-ca">CFGReader</span></font></b><span lang="en-ca"><font size="3"><b>.cpp  </b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>3. Grammar.h</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>4. Grammar.cpp</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>5. main.cpp (main)</b></font></span></pre>
  <pre>　</pre>
</div>
<div align="left">
  <pre>　</pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: CFGReader.h</b></font></span></pre>
</div>
<pre>#include &quot;Grammar.h&quot;


class CFGReader
{
private:
	char buffer[BufferLength];
	FILE* stream;
	void newRule(const char* str);
	void readRule();
	void addRule(const char* str);
	int addToken(const char* str, bool isTerminal=true);
	void printRule(int index);
public:
	void readFromFile(const char* fileName);
	void optimize();
	void print();
};

</pre>
<pre>

</pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: CFGReader</b></font></span><font size="3" color="#FF0000"><b>.cpp </b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;CFGReader.h&quot;

using namespace std;

Grammar grammar;

const char* deliStr=&quot; |\n&quot;;

//this would only remove the immediate left recursion
//and should I expand it to remove all kinds of left-recursion?


void CFGReader::readFromFile(const char* fileName)
{
	if ((stream=fopen(fileName, &quot;r&quot;))==NULL)
	{
		cout&lt;&lt;&quot;cannot open file &quot;&lt;&lt;fileName&lt;&lt;endl;
	}
	else
	{		
		readRule();		
	}
}

void CFGReader::readRule()
{
	char* ptr=buffer;
	char* hold;
	int count=0;

	while (!feof(stream))
	{
		count=0;
		fgets(buffer, BufferLength-1, stream);
		ptr=buffer;	
		
		while ((ptr=strtok(ptr, &quot; \n&quot;))!=NULL)
		{
			if (strcmp(ptr, &quot;|&quot;)!=0)
			{
				//test the first two token to see if old rule continues
				if (count==0)				
				{
					hold=ptr;
				}
				if (count==1)
				{
					if (strcmp(&quot;-&gt;&quot;, ptr)==0)
					{			
						/*
						curIndex=addToken(hold, false);//the index of cur token	
						grammar[curIndex]-&gt;terminal=false;
						newRule();	
						*/
						newRule(hold);
					}
					else
					{								
						addRule(hold);
						addRule(ptr);															
					}
				}
				if (count&gt;=2)//always add
				{
					addRule(ptr);
				}
				count++;
			}
			else
			{
				//this is an alternative production rule
				newRule(NULL);						
			}
			ptr=NULL;
		}		
	}
}

void CFGReader::newRule(const char* str)
{
	grammar.newRule(str);
}

void CFGReader::optimize()
{
	grammar.optimize();
}


void CFGReader::addRule(const char* str)
{
	grammar.addRule(str);
}

int CFGReader::addToken(const char* str, bool isTerminal)
{
	return grammar.addToken(str, isTerminal);
}



void CFGReader::printRule(int index)
{
	grammar.printRule(index);
}

void CFGReader::print()
{
	grammar.print();
	grammar.printToken();
}



</pre>
<pre>
</pre>
  <pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: Grammar.h</b></font></span></pre>
<pre>const int BufferLength=256;
const int MaxTokenCount=80;
const int MaxRHSCount=50;
const int MaxRuleCount=200;
const int MaxTokenLength=10;
const int MaxProduction=10;

struct Token
{
	bool isMeta;
	bool terminal;
	bool isNull;
	char* name;
	int production[MaxProduction];//pointer to the production it gives
	int count;
	int firstCount;
	int followCount;
	int follow[MaxTokenCount];
	int first[MaxTokenCount];
};

class Grammar
{
private:
	Token* token[MaxTokenCount];
	int tokenCount;
	int curIndex;//the index of current token
	int curPos;//the position at production rule
	//MaxRuleCount&gt;=MaxVariableCount!!!
	int production[MaxRuleCount][MaxRHSCount];
	int prodIndex;//pointing to current production, initialized to -1
	int checkRecursion(int tIndex, int curToken);
	bool addIntoFirst(int target, int source);
	bool addFirst(int target, int source);
	void shiftRule(int ruleIndex, bool Left2Right, int offset);
	void addAtEnd(int ruleIndex, int toAdd);
	void addRuleForToken(int tIndex, int ruleIndex);
	int copyRule(int source, int target);//return the position of -1
	void replaceRule(int curToken, int ruleIndex);
	void initialize();
	void removeRuleFromToken(int tIndex, int ruleIndex);
	void checkEpsilon(int ruleIndex);
	void removeImmediate(int tIndex, int ruleIndex);
	void doAddRule(int tIndex);
	void commonLeftFactor();
	int findCommonFactor(int tokenIndex, int ruleIndex);
	void removeLeftRecursion();
	void doCommonFactor(int tokenIndex, int ruleIndex, int index);
	int forgeToken(int index);//create a new variable
	void epsilonRule(int ruleIndex);
	//void calculateNull();
	void calculateFirst();
	void calculateFollow();
	void calculateNull();
	bool Null(int tIndex);
	bool addFirstTerminal(int target, int source);
	bool addFirstNonTerminal(int target, int source);
	void calculateProperty();
public:
	Grammar();
	void optimize();
	void printRule(int index);
	void print();
	void printToken();
	void addRule(const char* str);
	void newRule(const char* str);//this is an internal method, not suitable for outside
	Token* operator[](int index);
	int addToken(const char* str, bool isTerminal=true);
	Token* createToken(const char* theName, bool isTerminal);
	int tokenIndex(const char* str);
};</pre>
<pre>　</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: Grammar</b></font></span><font size="3" color="#FF0000"><b>.cpp </b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;Grammar.h&quot;

using namespace std;

const char* emptyStr=&quot;e&quot;;
const char* optionBegin=&quot;{&quot;;
const char* optionEnd=&quot;}&quot;;
int emptyIndex=-1;

void Grammar::calculateProperty()
{
	calculateNull();
	calculateFirst();
	calculateFollow();
}

void Grammar::printToken()
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		cout&lt;&lt;&quot;name: &quot;&lt;&lt;token[i]-&gt;name&lt;&lt;endl;
		cout&lt;&lt;&quot;   isNull: &quot;&lt;&lt;(token[i]-&gt;isNull?&quot;true&quot;:&quot;false&quot;)&lt;&lt;endl;
		cout&lt;&lt;&quot;   isTerminal: &quot;&lt;&lt;(token[i]-&gt;terminal?&quot;true&quot;:&quot;false&quot;)&lt;&lt;endl;
		cout&lt;&lt;&quot;   first: &quot;;
		for (int j=0; j&lt;token[i]-&gt;firstCount; j++)
		{
			if (j!=0)
			{
				cout&lt;&lt;&quot;,&quot;;
			}
			cout&lt;&lt;&quot;[&quot;&lt;&lt;j&lt;&lt;&quot;]=&quot;&lt;&lt;token[token[i]-&gt;first[j]]-&gt;name;			
		}
		cout&lt;&lt;endl;
	}
}

//must use while loop to discover new set!!!!!
void Grammar::calculateFirst()
{
	int i;
	bool addNew;
	do
	{
		i=0;
		addNew=false;
		while (i&lt;tokenCount)
		{		
			//the terminal contains itself
			if (token[i]-&gt;terminal)
			{
				//addFirst don't judge if it is nullable!!
				addFirst(i, i);			
				//token[i]-&gt;first[token[i]-&gt;firstCount++]=i;
			}
			else
			{
				//for all its rules
				for (int j=0; j&lt;token[i]-&gt;count; j++)
				{
					int theToken, k=0;
					theToken=production[token[i]-&gt;production[j]][k];
					//for each token in each rule
					do 
					{
						//add non epsilon set
						if (addIntoFirst(i, theToken))
						{
							addNew=true;
						}
						//if it is not null, means it is end
						if (!token[theToken]-&gt;isNull)
						{
							break;
						}
						//if it is nullable, continue
						k++;
						theToken=production[token[i]-&gt;production[j]][k];
					}while (theToken!=-1);
					//it means all token in this rule is nullable, so
					if (theToken==-1)
					{
						//it is nullable
						//addEpsilonIntoFirst(i);
						addFirst(i, emptyIndex);
					}
				}
			}
			i++;
		}		
	}while (addNew);
}

void Grammar::calculateFollow()
{
}

void Grammar::addRuleForToken(int tIndex, int ruleIndex)
{
	token[tIndex]-&gt;production[token[tIndex]-&gt;count++]=ruleIndex;
}

void Grammar::shiftRule(int ruleIndex, bool left2Right, int offset)
{
	int end=0;
	while (production[ruleIndex][end]!=-1)
	{
		end++;
	}
	if (left2Right)
	{
		for (int i=end; i&gt;=0; i--)
		{
			production[ruleIndex][i+offset]=production[ruleIndex][i];
		}
	}
	else
	{
		for (int i=0; i&lt;=end-offset; i++)
		{
			production[ruleIndex][i]=production[ruleIndex][i+offset];
		}
		checkEpsilon(ruleIndex);
	}
}

void Grammar::checkEpsilon(int ruleIndex)
{
	if (production[ruleIndex][0]==-1)
	{
		epsilonRule(ruleIndex);
	}
}

bool Grammar::addFirstTerminal(int target, int source)
{
	//addFirst
	token[target]-&gt;first[token[target]-&gt;firstCount++]=source;
	//if it is epsilon
	if (token[source]-&gt;isNull)
	{
		//token[target]-&gt;isNull=true;//unless all is
		return true;
	}
	return false;
}

bool Grammar::addFirst(int target, int source)
{
	//don't calculate meta symbol
	if (token[source]-&gt;isMeta)
	{
		return false;
	}
	//check if it is already there
	for (int i=0; i&lt;token[target]-&gt;firstCount; i++)
	{
		if (token[target]-&gt;first[i]==source)
		{
			return false;
		}
	}
	//add at end
	token[target]-&gt;first[token[target]-&gt;firstCount++]=source;
	return true;
}

//add non epsilon into it.
bool Grammar::addIntoFirst(int target, int source)
{
	bool addNew=false;
	for (int i=0; i&lt;token[source]-&gt;firstCount; i++)
	{
		//add non epsilon
		if (!token[token[source]-&gt;first[i]]-&gt;isNull)
		{
			if (addFirst(target, token[source]-&gt;first[i]))
			{
				addNew=true;
			}
		}
	}
	return addNew;
}
	/*
	int theToken;
	bool added, tokenIsNull=false;
	//if the token epsilon is added, surely the token will be nullable
	if (token[source]-&gt;terminal)
	{
		return addFirstTerminal(int target, int source));			
	}
	else
	{
		added=false;
		for (int i=0; i&lt;token[source]-&gt;count; i++)
		{
			int j=0;
			theToken=production[token[source]-&gt;production[i]][j];
			do
			{
				if (token[theToken]-&gt;terminal)
				{
					//it means it is terminal epsilon
					if (!addFirstTerminal(target, theToken))
					{
						break;
					}
				}
				else
				{
					//if it is nullable
					if (!addFirstNonTerminal(target, theToken))
					{
						break;
					}
				}
				j++;
				theToken=production[token[source]-&gt;production[i]][j];
			}while (theToken!=-1);
			//it must be the case that every choice is nullable
			if (theToken==-1)
			{
				token[target]-&gt;isNull=true;
			}					
		}
	}

	return true;
	*/


bool Grammar::Null(int tIndex)
{
	if (token[tIndex]-&gt;terminal)
	{
		return token[tIndex]-&gt;isNull;		
	}
	for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
	{
		int j=0, theToken;
		theToken=production[token[tIndex]-&gt;production[i]][j];
		do
		{
			if (!Null(theToken))
			{
				break;
			}
			j++;
			theToken=production[token[tIndex]-&gt;production[i]][j];
		}while (theToken!=-1);
		if (theToken==-1)
		{			
			return true;
		}
	}
	return false;
}

void Grammar::calculateNull()
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		token[i]-&gt;isNull=Null(i);	
	}
}
		

void Grammar::addAtEnd(int ruleIndex, int toAdd)
{
	int end=0;
	while (production[ruleIndex][end]!=-1)
	{
		end++;
	}
	production[ruleIndex][end++]=toAdd;
	production[ruleIndex][end]=-1;
}

//left-recursion:  A -&gt; A a | b | c
//change to: A  -&gt; b A' | c A'
//			 A' -&gt; a A' | empty
void Grammar::removeImmediate(int tIndex, int ruleIndex)
{
	int newIndex=forgeToken(tIndex);
	int holdRuleIndex=token[tIndex]-&gt;production[ruleIndex];
	//sequence matters!

	//change to: A -&gt; b A'
	for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
	{
		if (i!=ruleIndex)
		{
			addAtEnd(token[tIndex]-&gt;production[i], newIndex);
		}
	}
	
	//shift
	/*
	token[tIndex]-&gt;count--;
	for (i=ruleIndex; i&lt;token[tIndex]-&gt;count; i++)
	{
		token[tIndex]-&gt;production[i]=token[tIndex]-&gt;production[i+1];
	}
	*/
	removeRuleFromToken(tIndex, ruleIndex);
	
	addRuleForToken(newIndex, holdRuleIndex);
	//token[newIndex]-&gt;production[token[newIndex]-&gt;count++]=holdRuleIndex;

	shiftRule(holdRuleIndex, false, 1);
	addAtEnd(holdRuleIndex, newIndex);
	
	//add epsilon rule for new variable
	epsilonRule(++prodIndex);
	addRuleForToken(newIndex, prodIndex);
}


int Grammar::forgeToken(int index)
{
	char str[MaxTokenLength+2], ch;
	int len=strlen(token[index]-&gt;name);
	int temp=0, i=0;
	strcpy(str, token[index]-&gt;name);
	ch=str[len-1];//get last char of name
	if (ch&gt;='0'&amp;&amp;ch&lt;'9')
	{
		str[len-1]=ch+i+1;
	}
	else
	{
		str[len]='0'+i;
		str[len+1]='\0';
	}
	//I won't bother to check repetitation of name
	while (tokenIndex(str)!=-1)
	{
		i++;
		if (ch&gt;='0'&amp;&amp;ch&lt;'9')
		{
			str[len-1]=ch+i+1;
		}
		else
		{
			str[len]='0'+i;
			str[len+1]='\0';
		}
	}

	return addToken(str, false);//it is non-terminal for sure	
}

int Grammar::copyRule(int source, int target)
{
	int i=0;
	while (production[source][i]!=-1) 
	{
		production[target][i]=production[source][i];
		i++;
	}
	production[target][i]=-1;
	return i;
}

void Grammar::doAddRule(int tIndex)
{	
	token[tIndex]-&gt;production[token[tIndex]-&gt;count++]=++prodIndex;
}

void Grammar::addRule(const char* str)
{
	int index;
	index=addToken(str);
	production[prodIndex][curPos++]=index;
	production[prodIndex][curPos]=-1;//set end
}


//if the token is already there, it return the index
//otherwise, it add new node in token array
int Grammar::addToken(const char* str, bool isTerminal)
{
	int index;
	index=tokenIndex(str);
	if (index==-1)
	{
		index=tokenCount;
	}
	else
	{		
		return index;
	}
	token[index]=createToken(str, isTerminal);	
	tokenCount++;
	if (strcmp(str, emptyStr)==0)
	{
		token[index]-&gt;isNull=true;
		emptyIndex=index;
	}
	return index;
}

void Grammar::newRule(const char* str)
{
	if (str!=NULL)
	{
		curIndex=addToken(str, false);
	}
	//add one pointer
	token[curIndex]-&gt;production[token[curIndex]-&gt;count++]=++prodIndex;
	token[curIndex]-&gt;terminal=false;
	curPos=0;//reset to restart;
}

Token* Grammar::createToken(const char* theName, bool isTerminal)
{
	Token* ptr=new Token;
	ptr-&gt;name=new char[strlen(theName)+1];
	strcpy(ptr-&gt;name, theName);
	ptr-&gt;terminal=isTerminal;
	ptr-&gt;count=ptr-&gt;firstCount=ptr-&gt;followCount=0;
	ptr-&gt;isNull=false;
	if (strcmp(theName, optionBegin)==0||strcmp(theName, optionEnd)==0)
	{
		ptr-&gt;isMeta=true;
	}
	else
	{
		ptr-&gt;isMeta=false;
	}
	return ptr;
}

int Grammar::tokenIndex(const char* str)
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (strcmp(str, token[i]-&gt;name)==0)
		{
			return i;
		}
	}
	return -1;
}

int Grammar::checkRecursion(int tIndex, int curToken)
{
	for (int i=0; i&lt;token[curToken]-&gt;count; i++)
	{
		//token[tIndex]-&gt;production[i]=ruleIndex
		//production[ruleIndex][0] is the first left-recursion one
		if (production[token[curToken]-&gt;production[i]][0]&lt;=curToken)
		{
			return i;
		}
	}
	return -1;
}

void Grammar::printRule(int index)
{
	int nodeIndex=0;
	while (production[index][nodeIndex]!=-1)
	{
		cout&lt;&lt;token[production[index][nodeIndex]]-&gt;name&lt;&lt;&quot; &quot;;
		nodeIndex++;
	}
}



void Grammar::initialize()
{
	tokenCount=curIndex=curPos=0;
	prodIndex=-1;//in order to ++ blindly
}

Grammar::Grammar()
{
	initialize();
}

void Grammar::removeLeftRecursion()
{
	int tIndex=0, curToken=0;
	bool newChange=false;
	while (tIndex&lt;tokenCount)
	{		
		if (!token[tIndex]-&gt;terminal)
		{
			for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
			{
				curToken=production[token[tIndex]-&gt;production[i]][0];
				if (curToken&lt;=tIndex&amp;&amp;!token[curToken]-&gt;terminal)
				{
					if (curToken!=tIndex)
					{
						replaceRule(tIndex, i);
					}
					else
					{
						removeImmediate(tIndex, i);
					}
					newChange=true;				
				}
			}
		}
		//whenever there is some new findings, restart
		if (!newChange)
		{
			tIndex++;
		}
		else
		{
			tIndex=0;
			newChange=false;
		}
	}
}

void Grammar::replaceRule(int tIndex, int ruleIndex)
{
	int pos, j, targetEnd, sourceEnd, curRule;
	int curToken=production[token[tIndex]-&gt;production[ruleIndex]][0];
	for (int i=token[curToken]-&gt;count-1; i&gt;=0; i--)
	{
		if (i&gt;0)
		{
			doAddRule(tIndex);
			pos=copyRule(token[curToken]-&gt;production[i], prodIndex);
			j=0;
			while (production[token[tIndex]-&gt;production[ruleIndex]][j+1]!=-1)
			{
				production[prodIndex][pos+j]=
					production[token[tIndex]-&gt;production[ruleIndex]][j+1];
				j++;
			}
			production[prodIndex][pos+j]=-1;
		}
		else
		{
			targetEnd=sourceEnd=0;
			curRule=token[tIndex]-&gt;production[ruleIndex];
			while (true)
			{
				if (production[token[curToken]-&gt;production[0]][sourceEnd]==-1&amp;&amp;
				production[curRule][targetEnd]==-1)
				{
					break;
				}
				if (production[token[curToken]-&gt;production[0]][sourceEnd]!=-1)
				{
					sourceEnd++;
				}
				if (production[curRule][targetEnd]!=-1)
				{
					targetEnd++;
				}
			}
			j=targetEnd+sourceEnd-1;
			while (j&gt;=0)
			{
				if (j&gt;=sourceEnd)
				{
					production[curRule][j]=production[curRule][j-sourceEnd+1];
				}
				else
				{
					production[curRule][j]=production[token[curToken]-&gt;production[0]][j];
				}
				j--;
			}
		}
	}
}


//optimize sequence is first common-factor then remove left recursion
//therefore I don't have to check if for same variable if there will be
//more than one left-recursion
void Grammar::optimize()
{
	removeLeftRecursion();
	commonLeftFactor();
	calculateNull();
	calculateFirst();
	calculateFollow();
}

int Grammar::findCommonFactor(int tIndex, int ruleIndex)
{
	for (int i=ruleIndex+1; i&lt;token[tIndex]-&gt;count; i++)
	{	
		//if the two rule has the same first token
		if (production[token[tIndex]-&gt;production[ruleIndex]][0]==
			production[token[tIndex]-&gt;production[i]][0])
		{
			/*
			//calculate if there is epsilon
			if (emptyIndex==-1)
			{
				emptyIndex=tokenIndex(emptyStr);
			}
			//if it is not epsilon
			if (production[token[tIndex]-&gt;production[i]][0]!=emptyIndex)
			{
				return i;
			}
			*/
			return i;
		}
	}
	return -1;
}

void Grammar::epsilonRule(int ruleIndex)
{
	production[ruleIndex][0]=addToken(emptyStr);
	production[ruleIndex][1]=-1;
}

//sample:  x -&gt; Aa 
//         x -&gt; Ab
//changed to:  x -&gt; Ax' //this is to change the old rule
//             x' -&gt; b //this is to change the old rule
//             x' -&gt; a //this is the new-added rule
void Grammar::doCommonFactor(int tIndex, int ruleIndex, int index)
{
	int newTokenIndex=forgeToken(tIndex);//create a new variable
	//move the second and after part to the new rule of new variable
	//doing: x' -&gt; a
	//curPos=0;
	//prepare to add one new rule 	
	addRuleForToken(newTokenIndex, ++prodIndex);
	//token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=++prodIndex;
	token[newTokenIndex]-&gt;terminal=false;

	copyRule(token[tIndex]-&gt;production[ruleIndex], prodIndex);
	shiftRule(prodIndex, false, 1);
	/*
	do
	{
		
		//do copying
		production[prodIndex][curPos]=
			production[token[tIndex]-&gt;production[ruleIndex]][curPos+1];
		curPos++;
	//even the -1 at end is copied
	}while (production[token[tIndex]-&gt;production[ruleIndex]][curPos]!=-1);
	*/

	//in order to show an empty string, in case the string is &quot;epsilon&quot;

	
	/*
	if (curPos==1)
	{
		epsilonRule(prodIndex);
	}
	*/


	//replace x -&gt; Aa with x -&gt; Ax'
	production[token[tIndex]-&gt;production[ruleIndex]][1]=newTokenIndex;
	production[token[tIndex]-&gt;production[ruleIndex]][2]=-1;//end
	
	//doing: x' -&gt; b
	//curPos=0;
	//prepare to add one new rule 
	//pointing new token to where old rule lies
	addRuleForToken(newTokenIndex, token[tIndex]-&gt;production[index]);
	/*
	token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=
		token[tIndex]-&gt;production[index];
		*/

	shiftRule(token[tIndex]-&gt;production[index], false, 1);
	/*
	do 
	{
		//left-shift productions
		production[token[tIndex]-&gt;production[index]][curPos]=
			production[token[tIndex]-&gt;production[index]][curPos+1];
		curPos++;

	}while (production[token[tIndex]-&gt;production[index]][curPos]!=-1);
	*/
	//add epsilon
	/*
	if (curPos==1)
	{
		epsilonRule(token[tIndex]-&gt;production[index]);
	}
	*/
	//remove from token's production array the rule-index----&quot;index&quot;
	//shrink the array by one
	/*
	for (int i=index; i&lt;token[tIndex]-&gt;count; i++)
	{
		token[tIndex]-&gt;production[i]=token[tIndex]-&gt;production[i+1];
	}
	token[tIndex]-&gt;count--;
	*/
	removeRuleFromToken(tIndex, index);
}

void Grammar::removeRuleFromToken(int tIndex, int ruleIndex)
{
	for (int i=ruleIndex; i&lt;token[tIndex]-&gt;count; i++)
	{
		token[tIndex]-&gt;production[i]=token[tIndex]-&gt;production[i+1];
	}
	token[tIndex]-&gt;count--;
}


void Grammar::commonLeftFactor()
{
	int index=-1, tIndex=0, ruleIndex=0;
	bool flag;
	//whenever a newrule is done, restart!
	while (tIndex&lt;tokenCount)
	{
		ruleIndex=0;
		flag=false;		
		while (ruleIndex&lt;token[tIndex]-&gt;count)
		{			
			index=findCommonFactor(tIndex, ruleIndex);
			if (index!=-1)
			{
				doCommonFactor(tIndex, ruleIndex, index);
				//restart
				flag=true;
				break;
			}
			else
			{
				ruleIndex++;
			}
		}
		if (flag)
		{
			tIndex=0;		
		}
		else
		{
			tIndex++;
		}
	}
}

Token* Grammar::operator[](int index)
{
	if (index&gt;=0&amp;&amp;index&lt;tokenCount)
	{
		return token[index];
	}
	else
	{
		return NULL;
	}
}

void Grammar::print()
{
	for (int i=0; i&lt;tokenCount; i++)
	{		
		if (!token[i]-&gt;terminal)
		{
			cout&lt;&lt;token[i]-&gt;name&lt;&lt;&quot; ==&gt; &quot;;
			for (int j=0; j&lt;token[i]-&gt;count; j++)
			{			
				printRule(token[i]-&gt;production[j]);	
				if (j!=token[i]-&gt;count-1)
				{
					cout&lt;&lt;&quot; | &quot;;
				}
			}
			cout&lt;&lt;&quot;\n&quot;;
		}		
	}
}
</pre>
<pre>
</pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: main</b></font></span><font size="3" color="#FF0000"><b>.cpp (main)</b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;CFGReader.h&quot;

using namespace std;

int main()
{
	CFGReader R;
	R.readFromFile(&quot;c:\\ruleTest.txt&quot;);
    R.optimize();
	R.print();
	return 0;
}</pre>
<pre>　</pre>
<pre>　</pre>
<pre><font color="#0000FF"><b><span lang="en-ca">The input is something like following:</span></b></font></pre>
<pre><font color="#0000FF"><span lang="en-ca"><b>Please note that all tokens must be separated by space, including &quot;|&quot; and &quot;-&gt;&quot;</b></span><b> and &quot;{&quot;, &quot;}&quot; is meta symbol </b></font></pre>
<pre><font color="#0000FF"><b>equivalent to Kaleen star<span lang="en-ca">.</span> The <a href="ruleTest.txt">file is here</a>.</b></font></pre>
<pre>M' -&gt; program i ; Dl B
Dl -&gt; Dv Ml
Ml -&gt; Ml M | e
M -&gt; module i ( Vl ) Dv B
Dv -&gt; variables Vl | e
Vl -&gt; Vl V | V
v -&gt; Il : T ;
T -&gt; integer Ad ; | char Ad
Il -&gt; i | Il , i
Ad -&gt; e | array [ n ]
B -&gt; begin Sl end ;
Sl -&gt; S | S Sl
S -&gt; L := E ; | if C then S else S | loop Sl end ; | exit ; i ( Lp ) ; | B | read Ln ; | write Ln ; | e ;
Lp -&gt; e | Ln
Ln -&gt; L { , L }
L -&gt; i Ar
Ar -&gt; e | [ E ]
E -&gt; F { Oa F }
Oa -&gt; + | -
F -&gt; R { Om R }
Om -&gt; * | /
R -&gt; L | n | ( E ) | c
C -&gt; E Or E
Or -&gt; = | &lt; | &gt; | &lt;= | &gt;=


</pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre><font color="#0000FF"><b>Here is the <a name="result"></a>result:</b></font></pre>

<pre>M' ==&gt; program i ; Dl B
Dl ==&gt; Dv Ml
B ==&gt; begin Sl end ;
Dv ==&gt; variables Vl  | e
Ml ==&gt; e Ml0
M ==&gt; module i ( Vl ) Dv B
Vl ==&gt; V Vl0
v ==&gt; Il : T ;
Il ==&gt; i Il0
T ==&gt; integer Ad ;  | char Ad
Ad ==&gt; e  | array [ n ]
Sl ==&gt; S Sl0
S ==&gt; L := E ;  | if C then S else S  | loop Sl end ;  | exit ; i ( Lp ) ;  | begin Sl end ;  | read
 Ln ;  | write Ln ;  | e ;
L ==&gt; i Ar
E ==&gt; F { Oa F }
C ==&gt; F { Oa F } Or E
Lp ==&gt; e  | Ln
Ln ==&gt; i Ar { , L }
Ar ==&gt; e  | [ E ]
F ==&gt; R { Om R }
Oa ==&gt; +  | -
R ==&gt; i Ar  | n  | ( E )  | c
Om ==&gt; *  | /
Or ==&gt; =  | &lt;  | &gt;  | &lt;=  | &gt;=
Ml0 ==&gt; module i ( Vl ) Dv B Ml0  | e
Vl0 ==&gt; V Vl0  | e
Il0 ==&gt; , i Il0  | e
Sl0 ==&gt; e  | Sl
name: M'
   isNull: false
   isTerminal: false
   first: [0]=program
name: program
   isNull: false
   isTerminal: true
   first: [0]=program
name: i
   isNull: false
   isTerminal: true
   first: [0]=i
name: ;
   isNull: false
   isTerminal: true
   first: [0]=;
name: Dl
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=variables,[2]=module
name: B
   isNull: false
   isTerminal: false
   first: [0]=begin
name: Dv
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=variables
name: Ml
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=module
name: M
   isNull: false
   isTerminal: false
   first: [0]=module
name: e
   isNull: true
   isTerminal: true
   first: [0]=e
name: module
   isNull: false
   isTerminal: true
   first: [0]=module
name: (
   isNull: false
   isTerminal: true
   first: [0]=(
name: Vl
   isNull: false
   isTerminal: false
   first: [0]=V
name: )
   isNull: false
   isTerminal: true
   first: [0]=)
name: variables
   isNull: false
   isTerminal: true
   first: [0]=variables
name: V
   isNull: false
   isTerminal: true
   first: [0]=V
name: v
   isNull: false
   isTerminal: false
   first: [0]=i
name: Il
   isNull: false
   isTerminal: false
   first: [0]=i
name: :
   isNull: false
   isTerminal: true
   first: [0]=:
name: T
   isNull: false
   isTerminal: false
   first: [0]=integer,[1]=char
name: integer
   isNull: false
   isTerminal: true
   first: [0]=integer
name: Ad
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=array
name: char
   isNull: false
   isTerminal: true
   first: [0]=char
name: ,
   isNull: false
   isTerminal: true
   first: [0]=,
name: array
   isNull: false
   isTerminal: true
   first: [0]=array
name: [
   isNull: false
   isTerminal: true
   first: [0]=[
name: n
   isNull: false
   isTerminal: true
   first: [0]=n
name: ]
   isNull: false
   isTerminal: true
   first: [0]=]
name: begin
   isNull: false
   isTerminal: true
   first: [0]=begin
name: Sl
   isNull: false
   isTerminal: false
   first: [0]=begin,[1]=;,[2]=i,[3]=if,[4]=loop,[5]=exit,[6]=read,[7]=write
name: end
   isNull: false
   isTerminal: true
   first: [0]=end
name: S
   isNull: false
   isTerminal: false
   first: [0]=begin,[1]=;,[2]=i,[3]=if,[4]=loop,[5]=exit,[6]=read,[7]=write
name: L
   isNull: false
   isTerminal: false
   first: [0]=i
name: :=
   isNull: false
   isTerminal: true
   first: [0]=:=
name: E
   isNull: false
   isTerminal: false
   first: [0]=i,[1]=n,[2]=(,[3]=c
name: if
   isNull: false
   isTerminal: true
   first: [0]=if
name: C
   isNull: false
   isTerminal: false
   first: [0]=i,[1]=n,[2]=(,[3]=c
name: then
   isNull: false
   isTerminal: true
   first: [0]=then
name: else
   isNull: false
   isTerminal: true
   first: [0]=else
name: loop
   isNull: false
   isTerminal: true
   first: [0]=loop
name: exit
   isNull: false
   isTerminal: true
   first: [0]=exit
name: Lp
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=i
name: read
   isNull: false
   isTerminal: true
   first: [0]=read
name: Ln
   isNull: false
   isTerminal: false
   first: [0]=i
name: write
   isNull: false
   isTerminal: true
   first: [0]=write
name: {
   isNull: false
   isTerminal: true
   first:
name: }
   isNull: false
   isTerminal: true
   first:
name: Ar
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=[
name: F
   isNull: false
   isTerminal: false
   first: [0]=i,[1]=n,[2]=(,[3]=c
name: Oa
   isNull: false
   isTerminal: false
   first: [0]=+,[1]=-
name: +
   isNull: false
   isTerminal: true
   first: [0]=+
name: -
   isNull: false
   isTerminal: true
   first: [0]=-
name: R
   isNull: false
   isTerminal: false
   first: [0]=i,[1]=n,[2]=(,[3]=c
name: Om
   isNull: false
   isTerminal: false
   first: [0]=*,[1]=/
name: *
   isNull: false
   isTerminal: true
   first: [0]=*
name: /
   isNull: false
   isTerminal: true
   first: [0]=/
name: c
   isNull: false
   isTerminal: true
   first: [0]=c
name: Or
   isNull: false
   isTerminal: false
   first: [0]==,[1]=&lt;,[2]=&gt;,[3]=&lt;=,[4]=&gt;=
name: =
   isNull: false
   isTerminal: true
   first: [0]==
name: &lt;
   isNull: false
   isTerminal: true
   first: [0]=&lt;
name: &gt;
   isNull: false
   isTerminal: true
   first: [0]=&gt;
name: &lt;=
   isNull: false
   isTerminal: true
   first: [0]=&lt;=
name: &gt;=
   isNull: false
   isTerminal: true
   first: [0]=&gt;=
name: Ml0
   isNull: true
   isTerminal: false
   first: [0]=module,[1]=e
name: Vl0
   isNull: true
   isTerminal: false
   first: [0]=V,[1]=e
name: Il0
   isNull: true
   isTerminal: false
   first: [0]=,,[1]=e
name: Sl0
   isNull: true
   isTerminal: false
   first: [0]=e,[1]=begin,[2]=;,[3]=i,[4]=if,[5]=loop,[6]=exit,[7]=read,[8]=write
Press any key to continue</pre>

<pre>






</pre>

<pre></pre>

<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;                                   
&nbsp;&nbsp;&nbsp; <a href="WhoAmI.htm">                  







                       <img src="picture/back.gif" style="border: medium none" alt="back.gif (341 bytes)" width="32" height="35"></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
<a href="index.htm"><img src="picture/up.gif" style="border: medium none" alt="up.gif (335 bytes)" width="35" height="32"></a>       &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;                         
<img src="picture/next.gif" style="border: medium none" alt="next.gif (337 bytes)" width="32" height="35">          


</p>

</body>

</html>