<html>

<head><script async src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js"></script>
<!-- MyFirstUnitAd -->
<ins class="adsbygoogle"
     style="display:inline-block;width:970px;height:250px"
     data-ad-client="ca-pub-5778386704669218"
     data-ad-slot="1503492166"></ins>
<script>
(adsbygoogle = window.adsbygoogle || []).push({});
</script>

<meta http-equiv="Content-Language" content="zh-cn">
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<title>CFGReader</title>
</head>

<body>



<p align="left"><span lang="en-ca"><font size="6" color="#FF0000"><b>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; CFG 
Reader(</span>Follow<span lang="en-ca">) </span> </b></font></p>

<div align="left">
  <pre><b><font color="#ff0000" size="5">A. Sixth Edition</font></b></pre>
</div>
<div align="left">
  <pre><b>This is fifth?fourth? <span lang="en-ca">edition of my </span></b><b><span lang="en-ca">CFG Reader which stands for Context-Free-Grammar Reader. In this edition, </span></b></pre>
</div>
<div align="left">
  <pre><b><span lang="en-ca">I </span>finished the follow set calculation and removed meta symbol---&quot;{&quot;, &quot;}&quot; from last version because it is same</b></pre>
</div>
<div align="left">
  <pre><b>thing as &quot;|&quot;.</b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">B</span>.<span lang="en-ca"><a name="problem"></a>The problem</span></font></b></pre>
</div>
<div align="left">
  <pre><b>What is Follow set?</b></pre>
</div>
<div align="left">
  <pre><b>Example: B -&gt; A a</b></pre>
</div>
<div align="left">
  <pre><b>The variable A will have a follow set of Follow(A) = {a}.</b></pre>
</div>
<div align="left">
  <font FACE="CMSLTT10">
  <p ALIGN="LEFT">program -&gt; stmt-sequence<br>
  stmt-sequence -&gt; stmt-sequence ; statement | statement<br>
  statement -&gt; if-stmt | repeat-stmt | assign-stmt | read-stmt | write-stmt<br>
  if-stmt -&gt; if boolean-exp then stmt-sequence end<br>
  | if boolean-exp then stmt-sequence else stmt-sequence end<br>
  repeat-stmt -&gt; repeat stmt-sequence until boolean-exp<br>
  assign-stmt -&gt; identifier := integer-exp<br>
  read-stmt -&gt; read identifier<br>
  write-stmt -&gt; write integer-exp<br>
  boolean-exp -&gt; integer-exp comparison-op integer-exp<br>
  comparison-op -&gt; &lt; | =<br>
  integer-exp -&gt; integer-exp addop term | term<br>
  addop -&gt; + | -<br>
  term -&gt; term mulop factor | factor<br>
  mulop -&gt; * | /<br>
  factor -&gt; ( integer-exp ) | number | identifier</font></div>
<div align="left">
  <font FACE="CMSLTT10">
  <p ALIGN="LEFT">　</font></div>
<div align="left">
  <b><font color="#ff0000" size="5"><span lang="en-ca"><a name="explain"></a>C</span>.<span lang="en-ca">The
  </span></font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>idea of 
  program</b></font></span></div>
<div align="left">
  　</div>
<div align="left">
  <pre><b>This problem has a standard algorithm in text. So, will you just refer to the text if you are really interested in?</b></pre>
</div>
<div align="left">
  <pre><b>I highly suspect if there is anyone in my reader group would be interested in reading this or even want to have </b></pre>
</div>
<div align="left">
  <pre><b>any idea of &quot;what is First&quot; and how to calculate. </b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5">D.<span lang="en-ca"><a name="Method"></a>The </span>major functions</font></b></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">E</span>.</font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>Further improvement</b></font></span></pre>
</div>
<div align="left">
  <pre><b><font color="#ff0000" size="5"><span lang="en-ca">F</span>.</font></b><span lang="en-ca"><font size="5" color="#FF0000"><b>File listing</b></font></span></pre>
</div>
<div align="left">
  <pre><b><font size="3"><span lang="en-ca">1. CFGReader.h</span></font></b></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>2. </b></font></span><b><font size="3"><span lang="en-ca">CFGReader</span></font></b><span lang="en-ca"><font size="3"><b>.cpp  </b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>3. Grammar.h</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>4. Grammar.cpp</b></font></span></pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3"><b>5. main.cpp (main)</b></font></span></pre>
  <pre>　</pre>
</div>
<div align="left">
  <pre>　</pre>
</div>
<div align="left">
  <pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: CFGReader.h</b></font></span></pre>
</div>
<pre>#include &quot;Grammar.h&quot;


class CFGReader
{
private:
	char buffer[BufferLength];
	FILE* stream;
	void newRule(const char* str);
	void readRule();
	void addRule(const char* str);
	int addToken(const char* str, bool isTerminal=true);
	void printRule(int index);
public:
	void readFromFile(const char* fileName);
	void optimize();
	void print();
};

</pre>
<pre>　</pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: CFGReader</b></font></span><font size="3" color="#FF0000"><b>.cpp </b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;CFGReader.h&quot;

using namespace std;

Grammar grammar;

const char* deliStr=&quot; |\n&quot;;

//this would only remove the immediate left recursion
//and should I expand it to remove all kinds of left-recursion?


void CFGReader::readFromFile(const char* fileName)
{
	if ((stream=fopen(fileName, &quot;r&quot;))==NULL)
	{
		cout&lt;&lt;&quot;cannot open file &quot;&lt;&lt;fileName&lt;&lt;endl;
	}
	else
	{		
		readRule();		
	}
}

void CFGReader::readRule()
{
	char* ptr=buffer;
	char* hold;
	int count=0;

	while (!feof(stream))
	{
		count=0;
		fgets(buffer, BufferLength-1, stream);
		ptr=buffer;	
		
		while ((ptr=strtok(ptr, &quot; \n&quot;))!=NULL)
		{
			if (strcmp(ptr, &quot;|&quot;)!=0)
			{
				//test the first two token to see if old rule continues
				if (count==0)				
				{
					hold=ptr;
				}
				if (count==1)
				{
					if (strcmp(&quot;-&gt;&quot;, ptr)==0)
					{			
						/*
						curIndex=addToken(hold, false);//the index of cur token	
						grammar[curIndex]-&gt;terminal=false;
						newRule();	
						*/
						newRule(hold);
					}
					else
					{								
						addRule(hold);
						addRule(ptr);															
					}
				}
				if (count&gt;=2)//always add
				{
					addRule(ptr);
				}
				count++;
			}
			else
			{
				//this is an alternative production rule
				newRule(NULL);						
			}
			ptr=NULL;
		}		
	}
}

void CFGReader::newRule(const char* str)
{
	grammar.newRule(str);
}

void CFGReader::optimize()
{
	grammar.optimize();
}


void CFGReader::addRule(const char* str)
{
	grammar.addRule(str);
}

int CFGReader::addToken(const char* str, bool isTerminal)
{
	return grammar.addToken(str, isTerminal);
}



void CFGReader::printRule(int index)
{
	grammar.printRule(index);
}

void CFGReader::print()
{
	grammar.print();
	grammar.printToken();
	//grammar.printAllRules();
}



</pre>
<pre>
</pre>
<pre></pre>
  <pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: Grammar.h</b></font></span></pre>
<pre>const int BufferLength=256;
const int MaxTokenCount=80;
const int MaxRHSCount=50;
const int MaxRuleCount=200;
const int MaxTokenLength=10;
const int MaxProduction=10;

struct Token
{
	//bool isMeta;
	bool terminal;
	bool isNull;
	char* name;
	int production[MaxProduction];//pointer to the production it gives
	int count;
	int firstCount;
	int followCount;
	int follow[MaxTokenCount];
	int first[MaxTokenCount];
};

class Grammar
{
private:
	Token* token[MaxTokenCount];
	int tokenCount;
	int curIndex;//the index of current token
	int curPos;//the position at production rule
	//MaxRuleCount&gt;=MaxVariableCount!!!
	int production[MaxRuleCount][MaxRHSCount];
	int prodIndex;//pointing to current production, initialized to -1
	int checkRecursion(int tIndex, int curToken);
	void addBeginEnd();
	bool addIntoFirst(int target, int source);
	bool addFirst(int target, int source);
	void shiftRule(int ruleIndex, bool Left2Right, int offset);
	void addAtEnd(int ruleIndex, int toAdd);
	void addRuleForToken(int tIndex, int ruleIndex);
	int copyRule(int source, int target);//return the position of -1
	void replaceRule(int curToken, int ruleIndex);
	int findMetaSymbol(int&amp; begin, int&amp; end);
	void initialize();
	void doReplaceMetaSymbol(int ruleIndex, int begin, int end);
	void removeRuleFromToken(int tIndex, int ruleIndex);
	void checkEpsilon(int ruleIndex);
	void removeImmediate(int tIndex, int ruleIndex);
	void doAddRule(int tIndex);
	void commonLeftFactor();
	int findCommonFactor(int tokenIndex, int ruleIndex);
	void removeLeftRecursion();
	void doCommonFactor(int tokenIndex, int ruleIndex, int index);
	int forgeToken(int index);//create a new variable
	void epsilonRule(int ruleIndex);
	bool isRedundant(int tIndex);
	void replaceMetaSymbol();
	void calculateFirst();
	void calculateFollow();
	void calculateNull();
	bool Null(int tIndex);
	bool addFollow(int target, int source);
	bool addIntoFollow(int target, int source);
	bool addFollowIntoFollow(int target, int source);
    void removeRedundant();
	void removeToken(int tIndex);
	//	void calculateProperty();
public:
	Grammar();
	void optimize();
	void printRule(int index);
	void print();
	void printAllRules();
	void printToken();
	void addRule(const char* str);
	void newRule(const char* str);//this is an internal method, not suitable for outside
	Token* operator[](int index);
	int addToken(const char* str, bool isTerminal=true);
	Token* createToken(const char* theName, bool isTerminal);
	int tokenIndex(const char* str);
};</pre>
<pre>　</pre>
<pre>　</pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: Grammar</b></font></span><font size="3" color="#FF0000"><b>.cpp </b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;Grammar.h&quot;

using namespace std;

const char* emptyStr=&quot;e&quot;;
const char* optionBegin=&quot;{&quot;;
const char* optionEnd=&quot;}&quot;;
const char* startStr=&quot;START&quot;;
const char* endStr=&quot;$&quot;;
int startSymbolIndex=-1;
int stackBottomIndex=-1;
int beginIndex=-1;
int endIndex=-1;
int emptyIndex=-1;

//leave the rule unremoved!!! as I have no way to do it!!!
void Grammar::removeToken(int tIndex)
{	
	delete[]token[tIndex]-&gt;name;
	delete token[tIndex];
	tokenCount--;
	for (int i=tIndex; i&lt;tokenCount; i++)
	{
		token[i]=token[i+1];
	}
}
	

bool Grammar::isRedundant(int tIndex)
{
	int k, theRule, theToken;
	for (int i=0; i&lt;tokenCount; i++)
	{
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			k=0;
			theRule=token[i]-&gt;production[j];
			theToken=production[theRule][k];
			while (theToken!=-1)
			{
				if (theToken==tIndex)
				{
					return false;
				}
				k++;
				theToken=production[theRule][k];
			}
		}
	}
	return true;
}


//what is redundant? except the start variable
//the non-terminal never appears in other rules!
void Grammar::removeRedundant()
{
	int tIndex=1;
	bool findNew=false;
	while (tIndex&lt;tokenCount)
	{
		findNew=false;
		if (!token[tIndex]-&gt;terminal)
		{
			if (isRedundant(tIndex))
			{
				removeToken(tIndex);
				findNew=true;
			}
		}
		if (findNew)
		{
			tIndex=1;
		}
		else
		{
			tIndex++;
		}
	}			
}

void Grammar::printAllRules()
{
/*
	cout&lt;&lt;&quot;\nnow print all rules\n&quot;;
	for (int i=0; i&lt;=prodIndex; i++)
	{
		cout&lt;&lt;&quot;rule index &quot;&lt;&lt;i&lt;&lt;&quot;: &quot;;
		printRule(i);
		cout&lt;&lt;&quot;\n&quot;;
	}
	*/
	int sum=0;
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (!token[i]-&gt;terminal)
		{
			sum+=token[i]-&gt;count;
		}
	}
	cout&lt;&lt;&quot; total rules is:&quot;&lt;&lt;prodIndex+1;
	cout&lt;&lt;&quot;\n and the sum is &quot;&lt;&lt;sum&lt;&lt;endl;

}



void Grammar::addBeginEnd()
{
	int begin=addToken(startStr, false);
	int end=addToken(endStr, true);
	prodIndex++;
	production[prodIndex][0]=0;
	production[prodIndex][1]=end;
	production[prodIndex][2]=-1;
	addRuleForToken(begin, prodIndex);
	startSymbolIndex=begin;
	stackBottomIndex=end;
}
/*
void Grammar::calculateProperty()
{
	calculateNull();
	calculateFirst();
	calculateFollow();
}
*/
void Grammar::printToken()
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		cout&lt;&lt;&quot;name: &quot;&lt;&lt;token[i]-&gt;name&lt;&lt;endl;
		cout&lt;&lt;&quot;   isNull: &quot;&lt;&lt;(token[i]-&gt;isNull?&quot;true&quot;:&quot;false&quot;)&lt;&lt;endl;
		cout&lt;&lt;&quot;   isTerminal: &quot;&lt;&lt;(token[i]-&gt;terminal?&quot;true&quot;:&quot;false&quot;)&lt;&lt;endl;
		cout&lt;&lt;&quot;   first: &quot;;
		for (int j=0; j&lt;token[i]-&gt;firstCount; j++)
		{
			if (j!=0)
			{
				cout&lt;&lt;&quot;,&quot;;
			}
			cout&lt;&lt;&quot;[&quot;&lt;&lt;j&lt;&lt;&quot;]=&quot;&lt;&lt;token[token[i]-&gt;first[j]]-&gt;name;			
		}
		cout&lt;&lt;&quot;\n   follow: &quot;;
		for (j=0; j&lt;token[i]-&gt;followCount; j++)
		{
			if (j!=0)
			{
				cout&lt;&lt;&quot;,&quot;;
			}
			cout&lt;&lt;&quot;[&quot;&lt;&lt;j&lt;&lt;&quot;]=&quot;&lt;&lt;token[token[i]-&gt;follow[j]]-&gt;name;
		}
		cout&lt;&lt;endl;
	}
}

//must use while loop to discover new set!!!!!
void Grammar::calculateFirst()
{
	int i;
	bool addNew;
	do
	{
		i=0;
		addNew=false;
		while (i&lt;tokenCount)
		{		
			//the terminal contains itself
			if (token[i]-&gt;terminal)
			{
				//addFirst don't judge if it is nullable!!
				addFirst(i, i);			
				//token[i]-&gt;first[token[i]-&gt;firstCount++]=i;
			}
			else
			{
				//for all its rules
				for (int j=0; j&lt;token[i]-&gt;count; j++)
				{
					int theToken, k=0;
					theToken=production[token[i]-&gt;production[j]][k];
					//for each token in each rule
					do 
					{
						//add non epsilon set
						if (addIntoFirst(i, theToken))
						{
							addNew=true;
						}
						//if it is not null, means it is end
						if (!token[theToken]-&gt;isNull)
						{
							break;
						}
						//if it is nullable, continue
						k++;
						theToken=production[token[i]-&gt;production[j]][k];
					}while (theToken!=-1);
					//it means all token in this rule is nullable, so
					if (theToken==-1)
					{
						//it is nullable
						//addEpsilonIntoFirst(i);
						addFirst(i, emptyIndex);
					}
				}
			}
			i++;
		}		
	}while (addNew);
}

bool Grammar::addFollowIntoFollow(int target, int source)
{
	bool addNew=false;
	for (int i=0; i&lt;token[source]-&gt;followCount; i++)
	{
		if (addFollow(target, token[source]-&gt;follow[i]))
		{
			addNew=true;
		}
	}
	return addNew;
}

bool Grammar::addIntoFollow(int target, int source)
{
	bool addNew=false;
	if (source==-1)
	{
		return false;
	}
	for (int i=0; i&lt;token[source]-&gt;firstCount; i++)
	{
		//add non-epsilon
		if (!token[token[source]-&gt;first[i]]-&gt;isNull)
		{
			if (addFollow(target, token[source]-&gt;first[i]))
			{
				addNew=true;
			}
		}
	}
	return addNew;
}


void Grammar::calculateFollow()
{		
	int i;
	bool addNew, started;
	//token[startSymbolIndex]-&gt;follow[0]=stackBottomIndex;
	//token[startSymbolIndex]-&gt;followCount=1;
	do
	{
		i=0;
		addNew=false;
		while (i&lt;tokenCount)
		{		
			//the terminal contains itself
			if (!token[i]-&gt;terminal)
			{
				for (int tIndex=0; tIndex&lt;tokenCount; tIndex++)
				{
					//for all its rules
					if (token[tIndex]-&gt;terminal)
					{
						continue;
					}
					//for each its rule
					for (int j=0; j&lt;token[tIndex]-&gt;count; j++)
					{
						int theToken, k=0, theRule=token[tIndex]-&gt;production[j];
						started=false;
						theToken=production[theRule][k];
						//for each token in each rule
						do 
						{
							//the token appears here
							if (theToken==i)
							{
								started=true;
								//add non epsilon set, including -1 situation!!!
								if (addIntoFollow(i, production[theRule][k+1]))
								{
									addNew=true;
								}
							}
							//if it is not null, means it is end
							if (started&amp;&amp;!token[theToken]-&gt;isNull)
							{
								break;
							}
							//if it is nullable, continue
							k++;
							theToken=production[theRule][k];
						}while (theToken!=-1);
						//it means all token in this rule is nullable, so
						if (started&amp;&amp;theToken==-1)
						{
							//it is nullable
							//add current variable Follow(j) into Follow(i);
							if (addFollowIntoFollow(i, tIndex))
							{
								addNew=true;
							}
						}
					}
				}					
			}
			i++;
		}		
	}while (addNew);
}

void Grammar::addRuleForToken(int tIndex, int ruleIndex)
{
	token[tIndex]-&gt;production[token[tIndex]-&gt;count++]=ruleIndex;
}

void Grammar::shiftRule(int ruleIndex, bool left2Right, int offset)
{
	int end=0;
	while (production[ruleIndex][end]!=-1)
	{
		end++;
	}
	if (left2Right)
	{
		for (int i=end; i&gt;=0; i--)
		{
			production[ruleIndex][i+offset]=production[ruleIndex][i];
		}
	}
	else
	{
		for (int i=0; i&lt;=end-offset; i++)
		{
			production[ruleIndex][i]=production[ruleIndex][i+offset];
		}
		checkEpsilon(ruleIndex);
	}
}

void Grammar::checkEpsilon(int ruleIndex)
{
	if (production[ruleIndex][0]==-1)
	{
		epsilonRule(ruleIndex);
	}
}


bool Grammar::addFollow(int target, int source)
{
	//check if it is already there
	for (int i=0; i&lt;token[target]-&gt;followCount; i++)
	{
		if (token[target]-&gt;follow[i]==source)
		{
			return false;
		}
	}
	//add at end
	token[target]-&gt;follow[token[target]-&gt;followCount++]=source;
	return true;
}

bool Grammar::addFirst(int target, int source)
{
	//check if it is already there
	for (int i=0; i&lt;token[target]-&gt;firstCount; i++)
	{
		if (token[target]-&gt;first[i]==source)
		{
			return false;
		}
	}
	//add at end
	token[target]-&gt;first[token[target]-&gt;firstCount++]=source;
	return true;
}

//add non epsilon into it.
bool Grammar::addIntoFirst(int target, int source)
{
	bool addNew=false;
	if (token[source]-&gt;terminal)
	{
		return addFirst(target, source);
	}
	for (int i=0; i&lt;token[source]-&gt;firstCount; i++)
	{
		//add non epsilon
		if (!token[token[source]-&gt;first[i]]-&gt;isNull)
		{
			if (addFirst(target, token[source]-&gt;first[i]))
			{
				addNew=true;
			}
		}
	}
	return addNew;
}


bool Grammar::Null(int tIndex)
{
	if (token[tIndex]-&gt;terminal)
	{
		return token[tIndex]-&gt;isNull;		
	}
	for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
	{
		int j=0, theToken;
		theToken=production[token[tIndex]-&gt;production[i]][j];
		do
		{
			if (theToken==tIndex||!Null(theToken))
			{
				break;
			}
			j++;
			theToken=production[token[tIndex]-&gt;production[i]][j];
		}while (theToken!=-1);
		if (theToken==-1)
		{			
			return true;
		}
	}
	return false;
}

void Grammar::calculateNull()
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		token[i]-&gt;isNull=Null(i);	
	}
}
	
int Grammar::findMetaSymbol(int&amp; begin, int&amp; end)
{
	int theRule, theToken, k;
	begin=end=-1;
	for (int i=0; i&lt;tokenCount; i++)
	{
		for (int j=0; j&lt;token[i]-&gt;count; j++)
		{
			k=0;
			theRule=token[i]-&gt;production[j];
			theToken=production[theRule][k];
			while (theToken!=-1)
			{
				if (theToken==beginIndex)
				{
					begin=k;
				}
				if (theToken==endIndex)
				{
					end=k;
					return theRule;
				}
				k++;
				theToken=production[theRule][k];
			}
		}
	}
	return -1;
}

	
void Grammar::doReplaceMetaSymbol(int ruleIndex, int begin, int end)
{
	int newTokenIndex=forgeToken(0), i=0;

	addRuleForToken(newTokenIndex, ++prodIndex);
	//token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=++prodIndex;
	//token[newTokenIndex]-&gt;terminal=false;

	copyRule(ruleIndex, prodIndex);
	//shrink 
	while (production[ruleIndex][i+end+1]!=-1)
	{
		production[ruleIndex][begin+i]=production[ruleIndex][i+end+1];
		i++;
	}
	production[ruleIndex][begin+i]=-1;
	addAtEnd(ruleIndex, newTokenIndex);
	/*
	production[ruleIndex][begin]=newTokenIndex;
	production[ruleIndex][begin+1]=-1;
	*/

	shiftRule(prodIndex, false, begin+1);
	
	production[prodIndex][end-begin-1]=newTokenIndex;
	production[prodIndex][end-begin]=-1;
	epsilonRule(++prodIndex);
	addRuleForToken(newTokenIndex, prodIndex);
}
		
void Grammar::replaceMetaSymbol()
{
	int begin, end, ruleIndex;
	while ((ruleIndex=findMetaSymbol(begin, end))!=-1)
	{
		doReplaceMetaSymbol(ruleIndex, begin, end);
	}	
	for (int i=0; i&lt;tokenCount; i++)
	{
		//if (token[i]-&gt;isMeta)
		if (i==beginIndex||i==endIndex)
		{
			removeToken(i);
		}
	}
}

void Grammar::addAtEnd(int ruleIndex, int toAdd)
{
	int end=0;
	while (production[ruleIndex][end]!=-1)
	{
		end++;
	}
	production[ruleIndex][end++]=toAdd;
	production[ruleIndex][end]=-1;
}

//left-recursion:  A -&gt; A a | b | c
//change to: A  -&gt; b A' | c A'
//			 A' -&gt; a A' | empty
void Grammar::removeImmediate(int tIndex, int ruleIndex)
{
	int newIndex=forgeToken(tIndex);
	int holdRuleIndex=token[tIndex]-&gt;production[ruleIndex];
	//sequence matters!

	//change to: A -&gt; b A'
	for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
	{
		if (i!=ruleIndex)
		{
			addAtEnd(token[tIndex]-&gt;production[i], newIndex);
		}
	}
	
	//shift
	removeRuleFromToken(tIndex, ruleIndex);
	
	addRuleForToken(newIndex, holdRuleIndex);
	//token[newIndex]-&gt;production[token[newIndex]-&gt;count++]=holdRuleIndex;

	shiftRule(holdRuleIndex, false, 1);
	addAtEnd(holdRuleIndex, newIndex);
	
	//add epsilon rule for new variable
	epsilonRule(++prodIndex);
	addRuleForToken(newIndex, prodIndex);
}


int Grammar::forgeToken(int index)
{
	char str[MaxTokenLength+2], ch;
	int len=strlen(token[index]-&gt;name);
	int temp=0, i=0;
	strcpy(str, token[index]-&gt;name);
	ch=str[len-1];//get last char of name
	if (ch&gt;='0'&amp;&amp;ch&lt;'9')
	{
		str[len-1]=ch+i+1;
	}
	else
	{
		str[len]='0'+i;
		str[len+1]='\0';
	}
	//I won't bother to check repetitation of name
	while (tokenIndex(str)!=-1)
	{
		i++;
		if (ch&gt;='0'&amp;&amp;ch&lt;'9')
		{
			str[len-1]=ch+i+1;
		}
		else
		{
			str[len]='0'+i;
			str[len+1]='\0';
		}
	}

	return addToken(str, false);//it is non-terminal for sure	
}

int Grammar::copyRule(int source, int target)
{
	int i=0;
	while (production[source][i]!=-1) 
	{
		production[target][i]=production[source][i];
		i++;
	}
	production[target][i]=-1;
	return i;
}

void Grammar::doAddRule(int tIndex)
{	
	token[tIndex]-&gt;production[token[tIndex]-&gt;count++]=++prodIndex;
}

void Grammar::addRule(const char* str)
{
	int index;
	index=addToken(str);
	production[prodIndex][curPos++]=index;
	production[prodIndex][curPos]=-1;//set end
}


//if the token is already there, it return the index
//otherwise, it add new node in token array
int Grammar::addToken(const char* str, bool isTerminal)
{
	int index;
	index=tokenIndex(str);
	if (index==-1)
	{
		index=tokenCount;
	}
	else
	{		
		return index;
	}
	token[index]=createToken(str, isTerminal);
	if (strcmp(str, optionBegin)==0)
	{
		beginIndex=index;
	}
	if (strcmp(str, optionEnd)==0)
	{
		endIndex=index;
	}
	tokenCount++;
	if (strcmp(str, emptyStr)==0)
	{
		token[index]-&gt;isNull=true;
		emptyIndex=index;
	}
	return index;
}

void Grammar::newRule(const char* str)
{
	if (str!=NULL)
	{
		curIndex=addToken(str, false);
	}
	//add one pointer
	token[curIndex]-&gt;production[token[curIndex]-&gt;count++]=++prodIndex;
	token[curIndex]-&gt;terminal=false;
	curPos=0;//reset to restart;
}

Token* Grammar::createToken(const char* theName, bool isTerminal)
{
	Token* ptr=new Token;
	ptr-&gt;name=new char[strlen(theName)+1];
	strcpy(ptr-&gt;name, theName);
	ptr-&gt;terminal=isTerminal;
	ptr-&gt;count=ptr-&gt;firstCount=ptr-&gt;followCount=0;
	ptr-&gt;isNull=false;
	return ptr;
}

int Grammar::tokenIndex(const char* str)
{
	for (int i=0; i&lt;tokenCount; i++)
	{
		if (strcmp(str, token[i]-&gt;name)==0)
		{
			return i;
		}
	}
	return -1;
}

int Grammar::checkRecursion(int tIndex, int curToken)
{
	for (int i=0; i&lt;token[curToken]-&gt;count; i++)
	{
		//token[tIndex]-&gt;production[i]=ruleIndex
		//production[ruleIndex][0] is the first left-recursion one
		if (production[token[curToken]-&gt;production[i]][0]&lt;=curToken)
		{
			return i;
		}
	}
	return -1;
}

void Grammar::printRule(int index)
{
	int nodeIndex=0;
	while (production[index][nodeIndex]!=-1)
	{
		cout&lt;&lt;token[production[index][nodeIndex]]-&gt;name&lt;&lt;&quot; &quot;;
		nodeIndex++;
	}
}



void Grammar::initialize()
{
	tokenCount=curIndex=curPos=0;
	prodIndex=-1;//in order to ++ blindly
}

Grammar::Grammar()
{
	initialize();
}

void Grammar::removeLeftRecursion()
{
	int tIndex=0, curToken=0;
	bool newChange=false;
	while (tIndex&lt;tokenCount)
	{		
		if (!token[tIndex]-&gt;terminal)
		{
			for (int i=0; i&lt;token[tIndex]-&gt;count; i++)
			{
				curToken=production[token[tIndex]-&gt;production[i]][0];
				if (curToken&lt;=tIndex&amp;&amp;!token[curToken]-&gt;terminal)
				{
					if (curToken!=tIndex)
					{
						replaceRule(tIndex, i);
					}
					else
					{
						removeImmediate(tIndex, i);
					}
					newChange=true;				
				}
			}
		}
		//whenever there is some new findings, restart
		if (!newChange)
		{
			tIndex++;
		}
		else
		{
			tIndex=0;
			newChange=false;
		}
	}
}

void Grammar::replaceRule(int tIndex, int ruleIndex)
{
	int pos, j, targetEnd, sourceEnd, curRule;
	curRule=token[tIndex]-&gt;production[ruleIndex];
	int curToken=production[curRule][0];
	for (int i=token[curToken]-&gt;count-1; i&gt;=0; i--)
	{
		if (i&gt;0)
		{
			doAddRule(tIndex);
			pos=copyRule(token[curToken]-&gt;production[i], prodIndex);
			j=0;
			while (production[curRule][j+1]!=-1)
			{
				production[prodIndex][pos+j]=production[curRule][j+1];
				j++;
			}
			production[prodIndex][pos+j]=-1;
			//addRuleForToken(curToken, prodIndex);
		}
		else
		{
			targetEnd=sourceEnd=0;
			//curRule=token[tIndex]-&gt;production[ruleIndex];
			while (true)
			{
				if (production[token[curToken]-&gt;production[0]][sourceEnd]==-1&amp;&amp;
				production[curRule][targetEnd]==-1)
				{
					break;
				}
				if (production[token[curToken]-&gt;production[0]][sourceEnd]!=-1)
				{
					sourceEnd++;
				}
				if (production[curRule][targetEnd]!=-1)
				{
					targetEnd++;
				}
			}
			j=targetEnd+sourceEnd-1;
			while (j&gt;=0)
			{
				if (j&gt;=sourceEnd)
				{
					production[curRule][j]=production[curRule][j-sourceEnd+1];
				}
				else
				{
					production[curRule][j]=production[token[curToken]-&gt;production[0]][j];
				}
				j--;
			}
		}
	}
}


//optimize sequence is first common-factor then remove left recursion
//therefore I don't have to check if for same variable if there will be
//more than one left-recursion
void Grammar::optimize()
{
	replaceMetaSymbol();
	removeLeftRecursion();
	commonLeftFactor();
	removeRedundant();
	addBeginEnd();
	calculateNull();
	calculateFirst();
	calculateFollow();
}

int Grammar::findCommonFactor(int tIndex, int ruleIndex)
{
	for (int i=ruleIndex+1; i&lt;token[tIndex]-&gt;count; i++)
	{	
		//if the two rule has the same first token
		if (production[token[tIndex]-&gt;production[ruleIndex]][0]==
			production[token[tIndex]-&gt;production[i]][0])
		{
			/*
			//calculate if there is epsilon
			if (emptyIndex==-1)
			{
				emptyIndex=tokenIndex(emptyStr);
			}
			//if it is not epsilon
			if (production[token[tIndex]-&gt;production[i]][0]!=emptyIndex)
			{
				return i;
			}
			*/
			return i;
		}
	}
	return -1;
}

void Grammar::epsilonRule(int ruleIndex)
{
	production[ruleIndex][0]=addToken(emptyStr);
	production[ruleIndex][1]=-1;
}

//sample:  x -&gt; Aa 
//         x -&gt; Ab
//changed to:  x -&gt; Ax' //this is to change the old rule
//             x' -&gt; b //this is to change the old rule
//             x' -&gt; a //this is the new-added rule
void Grammar::doCommonFactor(int tIndex, int ruleIndex, int index)
{
	int newTokenIndex=forgeToken(tIndex);//create a new variable
	//move the second and after part to the new rule of new variable
	//doing: x' -&gt; a
	//curPos=0;
	//prepare to add one new rule 	
	addRuleForToken(newTokenIndex, ++prodIndex);
	//token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=++prodIndex;
	token[newTokenIndex]-&gt;terminal=false;

	copyRule(token[tIndex]-&gt;production[ruleIndex], prodIndex);
	shiftRule(prodIndex, false, 1);
	/*
	do
	{
		
		//do copying
		production[prodIndex][curPos]=
			production[token[tIndex]-&gt;production[ruleIndex]][curPos+1];
		curPos++;
	//even the -1 at end is copied
	}while (production[token[tIndex]-&gt;production[ruleIndex]][curPos]!=-1);
	*/

	//in order to show an empty string, in case the string is &quot;epsilon&quot;

	
	/*
	if (curPos==1)
	{
		epsilonRule(prodIndex);
	}
	*/


	//replace x -&gt; Aa with x -&gt; Ax'
	production[token[tIndex]-&gt;production[ruleIndex]][1]=newTokenIndex;
	production[token[tIndex]-&gt;production[ruleIndex]][2]=-1;//end
	
	//doing: x' -&gt; b
	//curPos=0;
	//prepare to add one new rule 
	//pointing new token to where old rule lies
	addRuleForToken(newTokenIndex, token[tIndex]-&gt;production[index]);
	/*
	token[newTokenIndex]-&gt;production[token[newTokenIndex]-&gt;count++]=
		token[tIndex]-&gt;production[index];
		*/

	shiftRule(token[tIndex]-&gt;production[index], false, 1);

	removeRuleFromToken(tIndex, index);
}

void Grammar::removeRuleFromToken(int tIndex, int ruleIndex)
{
	token[tIndex]-&gt;count--;
	for (int i=ruleIndex; i&lt;token[tIndex]-&gt;count; i++)
	{
		token[tIndex]-&gt;production[i]=token[tIndex]-&gt;production[i+1];
	}	
}


void Grammar::commonLeftFactor()
{
	int index=-1, tIndex=0, ruleIndex=0;
	bool flag;
	//whenever a newrule is done, restart!
	while (tIndex&lt;tokenCount)
	{
		ruleIndex=0;
		flag=false;		
		while (ruleIndex&lt;token[tIndex]-&gt;count)
		{			
			index=findCommonFactor(tIndex, ruleIndex);
			if (index!=-1)
			{
				doCommonFactor(tIndex, ruleIndex, index);
				//restart
				flag=true;
				break;
			}
			else
			{
				ruleIndex++;
			}
		}
		if (flag)
		{
			tIndex=0;		
		}
		else
		{
			tIndex++;
		}
	}
}

Token* Grammar::operator[](int index)
{
	if (index&gt;=0&amp;&amp;index&lt;tokenCount)
	{
		return token[index];
	}
	else
	{
		return NULL;
	}
}

void Grammar::print()
{
	for (int i=0; i&lt;tokenCount; i++)
	{		
		if (!token[i]-&gt;terminal)
		{
			cout&lt;&lt;token[i]-&gt;name&lt;&lt;&quot; ==&gt; &quot;;
			for (int j=0; j&lt;token[i]-&gt;count; j++)
			{			
				printRule(token[i]-&gt;production[j]);	
				if (j!=token[i]-&gt;count-1)
				{
					cout&lt;&lt;&quot; | &quot;;
				}
			}
			cout&lt;&lt;&quot;\n&quot;;
		}		
	}
}
</pre>
<pre>
</pre>
<pre></pre>
<pre></pre>
<pre><span lang="en-ca"><font size="3" color="#FF0000"><b>file name: main</b></font></span><font size="3" color="#FF0000"><b>.cpp (main)</b></font></pre>
<pre>#include &lt;iostream&gt;
#include &quot;CFGReader.h&quot;

using namespace std;

int main()
{
	CFGReader R;
	R.readFromFile(&quot;c:\\ruleTest.txt&quot;);
    R.optimize();
	R.print();
	return 0;
}</pre>
<pre>　</pre>
<pre>　</pre>
<pre>　</pre>
<pre><font color="#0000FF"><b><span lang="en-ca">The input is something like following:</span></b></font></pre>
<pre><font color="#0000FF"><span lang="en-ca"><b>Please note that all tokens must be separated by space, including &quot;|&quot; and &quot;-&gt;&quot;</b></span><b> and &quot;{&quot;, &quot;}&quot; is meta symbol </b></font></pre>
<pre><font color="#0000FF"><b>equivalent to Kaleen star<span lang="en-ca">.</span> </b></font></pre>
<pre>M' -&gt; program i ; Dl B
Dl -&gt; Dv Ml
Ml -&gt; Ml M | e
M -&gt; module i ( Vl ) Dv B
Dv -&gt; variables Vl | e
Vl -&gt; Vl V | V
V -&gt; Il : T ;
T -&gt; integer Ad ; | char Ad
Il -&gt; i | Il , i
Ad -&gt; e | array [ n ]
B -&gt; begin Sl end ;
Sl -&gt; S | S Sl
S -&gt; L := E ; | if C then S else S | loop Sl end ; | exit ; i ( Lp ) ; | B | read Ln ; | write Ln ; | e ;
Lp -&gt; e | Ln
Ln -&gt; L { , L }
L -&gt; i Ar
Ar -&gt; e | [ E ]
E -&gt; F { Oa F }
Oa -&gt; + | -
F -&gt; R { Om R }
Om -&gt; * | /
R -&gt; L | n | ( E ) | c
C -&gt; E Or E
Or -&gt; = | &lt; | &gt; | &lt;= | &gt;=

</pre>
<pre></pre>
<pre></pre>
<pre></pre>
<pre><font color="#0000FF"><b>Here is the <a name="result"></a>result:</b></font></pre>

<pre>M' ==&gt; program i ; Dl B 
Dl ==&gt; Dv Ml 
B ==&gt; Sl end S ; 
Dv ==&gt; V )  | module 
Ml ==&gt; module Sl0 
Vl ==&gt; Il START 
V ==&gt; : T integer ; 
Il ==&gt; i $ 
T ==&gt; Ad char ;  | , char 
Ad ==&gt; module  | [ n ] begin 
Sl ==&gt; L Sl0 
S ==&gt; := E if ;  | C then else L loop L  | exit end S ;  | Lp ; i Vl read variables ;  | Sl end S ;  | Ln write ;  | } write ;  | module ; 
L ==&gt; i R 
E ==&gt; Om Il0 
C ==&gt; Om Il0 &gt;= if 
Lp ==&gt; module  | write 
Ln ==&gt; i R Sl0 
R ==&gt; i R  | ]  | Vl if variables  | &lt;= 
Om ==&gt; &lt;  | &gt; 
Or ==&gt; M'0  | M'1  | M'2  | Ml0  | Vl0 
M'0 ==&gt; * Om Il0  | module 
M'1 ==&gt; array := Sl0  | module 
M'2 ==&gt; = Or START  | module 
Ml0 ==&gt; ( i Vl ) variables Dv B Sl0  | module 
Vl0 ==&gt; i $ T integer ; START  | module 
Il0 ==&gt; array i $  | module 
Sl0 ==&gt; module  | end 
START ==&gt; M' $ 
name: M'
   isNull: false
   isTerminal: false
   first: [0]=program
   follow: [0]=$
name: program
   isNull: false
   isTerminal: true
   first: [0]=program
   follow: 
name: i
   isNull: false
   isTerminal: true
   first: [0]=i
   follow: 
name: ;
   isNull: false
   isTerminal: true
   first: [0]=;
   follow: 
name: Dl
   isNull: false
   isTerminal: false
   first: [0]=module,[1]=:
   follow: [0]=i
name: B
   isNull: false
   isTerminal: false
   first: [0]=i
   follow: [0]=module,[1]=end
name: Dv
   isNull: false
   isTerminal: false
   first: [0]=module,[1]=:
   follow: [0]=module,[1]=i
name: Ml
   isNull: false
   isTerminal: false
   first: [0]=module
   follow: 
name: e
   isNull: true
   isTerminal: true
   first: [0]=e
   follow: 
name: module
   isNull: false
   isTerminal: true
   first: [0]=module
   follow: 
name: (
   isNull: false
   isTerminal: true
   first: [0]=(
   follow: 
name: Vl
   isNull: false
   isTerminal: false
   first: [0]=i
   follow: [0]=read,[1]=if,[2]=)
name: )
   isNull: false
   isTerminal: true
   first: [0]=)
   follow: 
name: variables
   isNull: false
   isTerminal: true
   first: [0]=variables
   follow: 
name: V
   isNull: false
   isTerminal: false
   first: [0]=:
   follow: [0]=)
name: Il
   isNull: false
   isTerminal: false
   first: [0]=i
   follow: [0]=program
name: :
   isNull: false
   isTerminal: true
   first: [0]=:
   follow: 
name: T
   isNull: false
   isTerminal: false
   first: [0]=,,[1]=module,[2]=[
   follow: [0]=integer
name: integer
   isNull: false
   isTerminal: true
   first: [0]=integer
   follow: 
name: Ad
   isNull: false
   isTerminal: false
   first: [0]=module,[1]=[
   follow: [0]=char
name: char
   isNull: false
   isTerminal: true
   first: [0]=char
   follow: 
name: ,
   isNull: false
   isTerminal: true
   first: [0]=,
   follow: 
name: array
   isNull: false
   isTerminal: true
   first: [0]=array
   follow: 
name: [
   isNull: false
   isTerminal: true
   first: [0]=[
   follow: 
name: n
   isNull: false
   isTerminal: true
   first: [0]=n
   follow: 
name: ]
   isNull: false
   isTerminal: true
   first: [0]=]
   follow: 
name: begin
   isNull: false
   isTerminal: true
   first: [0]=begin
   follow: 
name: Sl
   isNull: false
   isTerminal: false
   first: [0]=i
   follow: [0]=end
name: end
   isNull: false
   isTerminal: true
   first: [0]=end
   follow: 
name: S
   isNull: false
   isTerminal: false
   first: [0]=:=,[1]=exit,[2]=},[3]=module,[4]=write,[5]=i,[6]=&lt;,[7]=&gt;
   follow: [0]=;
name: L
   isNull: false
   isTerminal: false
   first: [0]=i
   follow: [0]=module,[1]=end,[2]=loop
name: :=
   isNull: false
   isTerminal: true
   first: [0]=:=
   follow: 
name: E
   isNull: false
   isTerminal: false
   first: [0]=&lt;,[1]=&gt;
   follow: [0]=if
name: if
   isNull: false
   isTerminal: true
   first: [0]=if
   follow: 
name: C
   isNull: false
   isTerminal: false
   first: [0]=&lt;,[1]=&gt;
   follow: [0]=then
name: then
   isNull: false
   isTerminal: true
   first: [0]=then
   follow: 
name: else
   isNull: false
   isTerminal: true
   first: [0]=else
   follow: 
name: loop
   isNull: false
   isTerminal: true
   first: [0]=loop
   follow: 
name: exit
   isNull: false
   isTerminal: true
   first: [0]=exit
   follow: 
name: Lp
   isNull: false
   isTerminal: false
   first: [0]=module,[1]=write
   follow: [0]=;
name: read
   isNull: false
   isTerminal: true
   first: [0]=read
   follow: 
name: Ln
   isNull: false
   isTerminal: false
   first: [0]=i
   follow: [0]=write
name: write
   isNull: false
   isTerminal: true
   first: [0]=write
   follow: 
name: }
   isNull: false
   isTerminal: true
   first: [0]=}
   follow: 
name: +
   isNull: false
   isTerminal: true
   first: [0]=+
   follow: 
name: -
   isNull: false
   isTerminal: true
   first: [0]=-
   follow: 
name: R
   isNull: false
   isTerminal: false
   first: [0]=i,[1]=],[2]=&lt;=
   follow: [0]=module,[1]=end
name: Om
   isNull: false
   isTerminal: false
   first: [0]=&lt;,[1]=&gt;
   follow: [0]=array,[1]=module
name: *
   isNull: false
   isTerminal: true
   first: [0]=*
   follow: 
name: /
   isNull: false
   isTerminal: true
   first: [0]=/
   follow: 
name: c
   isNull: false
   isTerminal: true
   first: [0]=c
   follow: 
name: Or
   isNull: false
   isTerminal: false
   first: [0]=*,[1]=module,[2]=array,[3]==,[4]=(,[5]=i
   follow: [0]=program
name: =
   isNull: false
   isTerminal: true
   first: [0]==
   follow: 
name: &lt;
   isNull: false
   isTerminal: true
   first: [0]=&lt;
   follow: 
name: &gt;
   isNull: false
   isTerminal: true
   first: [0]=&gt;
   follow: 
name: &lt;=
   isNull: false
   isTerminal: true
   first: [0]=&lt;=
   follow: 
name: &gt;=
   isNull: false
   isTerminal: true
   first: [0]=&gt;=
   follow: 
name: M'0
   isNull: false
   isTerminal: false
   first: [0]=*,[1]=module
   follow: 
name: M'1
   isNull: false
   isTerminal: false
   first: [0]=array,[1]=module
   follow: 
name: M'2
   isNull: false
   isTerminal: false
   first: [0]==,[1]=module
   follow: 
name: Ml0
   isNull: false
   isTerminal: false
   first: [0]=(,[1]=module
   follow: 
name: Vl0
   isNull: false
   isTerminal: false
   first: [0]=i,[1]=module
   follow: 
name: Il0
   isNull: false
   isTerminal: false
   first: [0]=array,[1]=module
   follow: [0]=&gt;=
name: Sl0
   isNull: false
   isTerminal: false
   first: [0]=module,[1]=end
   follow: 
name: START
   isNull: false
   isTerminal: false
   first: [0]=program
   follow: 
name: $
   isNull: false
   isTerminal: true
   first: [0]=$
   follow: 
</pre>

<pre>　</pre>

<pre>





</pre>

<pre></pre>

<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;                                   
&nbsp;&nbsp;&nbsp; <a href="WhoAmI.htm">                  







                       <img src="picture/back.gif" style="border: medium none" alt="back.gif (341 bytes)" width="32" height="35"></a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
<a href="index.htm"><img src="picture/up.gif" style="border: medium none" alt="up.gif (335 bytes)" width="35" height="32"></a>       &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;                         
<img src="picture/next.gif" style="border: medium none" alt="next.gif (337 bytes)" width="32" height="35">          


</p>

</body>

</html>