Fancy Coder

share my projects and discoveries

Monthly Archives: April 2011

Recursion -> Loop

Sometimes, we need to convert recursive functions into loop for efficiency consideration, or stack size constrain. Converting tail recursion into iteration is trivial. The method we discuss in this post is a general method to convert any recursive functions into iterative ones.

The idea is to simulate stack behavior: we store frame record as a structure into a simulated stack when calling a function, and restore the top record in stack when function returns.

The record should contain 2 part of data:

  • local variable, including parameter
  • next instruction to execute(used when restoring the record)

We also need a global variable to store return value (the same as eax register).

“Next instruction to execute” can be stored using label, and we can “goto” corresponding instruction after restoring record. However, “goto” statement is not supported in some languages, and also not recommended. Thus, we use integer number to represent label id, and use switch statement to jump to different part of code.

Following example shows recursive and iterative version of factorization function. Iterative version is significantly longer in code length, and slightly slower in practice. However, iterative version can work on large input where recursive version may get stack overflow.

We select javascript to demonstrate because:

  • Dynamic type saves declaration code
  • Closure can simulate behavior of frame record perfectly (stack elements are actually functions)
  • Powerful literal value representation
 var fact=function(n){
		return n==0?1:n*fact(n-1);
	    };
	    var Fact=function(n){
		var _ret;
		var stk=[];
		var construct=function(n){
		    var l=0;
		    return function(){
			switch(l){
			case 0:
			    l=1;
			    if(n==0){
				_ret=1;
				stk.pop();
				return;
			    }
			    stk.push(construct(n-1));
			    return;
			case 1:
			    _ret*=n;
			    stk.pop();
			    return;
			}
		    };
		};
		stk.push(construct(n));
		while(stk){
		    if(stk.length==0)return _ret;
		    stk[stk.length-1]();
		}
	    };

Now let’s look at more complicated examples:

fibonacci number

  var Fib=function(n){
		var _ret;
		var stk=[];
		var construct=function(n){
		    var l=0,a,b;
		    return function(){
			switch(l){
			case 0:
			    l=1;
			    if(n<=1){
				_ret=n;
				stk.pop();
				return;
			    }
			    stk.push(construct(n-1));
			    return;
			case 1:
			    l=2;
			    a=_ret;
			    stk.push(construct(n-2));
			    return;
			case 2:
			    b=_ret;
			    _ret=a+b;
			    stk.pop();
			    return;
			}
		    };
		};
		stk.push(construct(n));
		while(true){
		    if(stk.length==0)return _ret;
		    stk[stk.length-1]();
		}
	    }
hanoi tower

	    var Hanoi=function(a,b,c,n){
		var stk=[];
		var construct=function(a,b,c,n){
		    var l=0;
		    return function(){
			switch(l){
			case 0:
			    l=1;
			    if(n>0){
				stk.push(construct(a,c,b,n-1));
				return;
			    }
			    stk.pop();
			    return;
			case 1:
			    l=2;
			    document.write(a+"->"+c+"<br>");
			    stk.push(construct(b,a,c,n-1));
			    return;
			case 2:
			    stk.pop();
			    return;
			}
		    };
		};
		stk.push(construct(a,b,c,n));
		while(stk){
		    if(stk.length==0)return;
		    stk[stk.length-1]();
		}
	    };
merge sort
	    var MergeSort=function(a,i,j){
		if(!i)i=0;
		if(!j)j=a.length;
		var stk=[];
		var construct=function(a,i,j){
		    var l=0;
		    var mid,t=[],x,y;
		    return function(){
			switch(l){
			case 0:
			    l++;
			    if(j-i<=1){
				stk.pop();
				return;
			    }
			    mid=floor((i+j)/2);
			    stk.push(construct(a,i,mid));
			    return;
			case 1:
			    l++;
			    stk.push(construct(a,mid,j));
			    return;
			case 2:
			    x=i, y=mid;
			    while(x<mid && y<j){
				if(a[x]<a[y]){
				    t.push(a[x]);
				    x++;
				}else{
				    t.push(a[y]);
				    y++;
				}
			    }
			    while(x<mid)t.push(a[x++]);
			    while(y<j)t.push(a[y++]);
			    for(var k=i;k<j;k++)
				a[k]=t[k-i];
			    stk.pop();
			    return;
			}
		    };
		};
		stk.push(construct(a,i,j));
		while(stk){
		    if(stk.length==0)return;
		    stk[stk.length-1]();
		}
	    };

Example for TreeList and BinaryIndexedTree

We mentioned BinaryIndexedTree and TreeList in previous post. Now I will give an example problem where both data structure are needed.

Given array a[0…n-1], which stores a permutation of 0 to n-1. We define array b[0…n-1] using following equation:


b[i]=|{x|x occurs before i in a[], and x > i}|

The problem is: given a[], compute b[] in O(nlogn) time; given b[], compute a[] in O(nlogn) time.

To convert a[] into b[], assume we have an array t[0…n](initially all 0). We go through each elements in a[], and for each element a[i] we encounter, b[a[i]]=t[0]+…+t[a[i]], and then we set t[a[i]]=1. The correctness is obvious. Implementing this method naively gives O(n^2). Notice that binary indexed tree is designed for this kind of prefix sum operation:

vector<int> a2b(vector<int> a){
	vector<int> b(a.size());
	BITWraper t(a.size(),0);
	for (int i = 0; i < a.size(); i++) {
		b[a[i]]=i-t.sum(a[i]);
		t.add(a[i]);
	}
	return b;
}

By default, binary indexed tree’s array index starts from 1. Thus, we implemented a wrapper class, which allows arbitrary starting index.

How to computing a[] given b[]? Let’s look at large elements first. Let’s first write down the largest element, for each elements i, there are b[i] elements greater than it. Since we write down elements in descending order, all b[i] elements must have been written down. Thus, we insert i into b[i] position. After all elements are inserted, a[] is constructed. Again, implementing using array or linked list will result in O(n^2) complexity, while TreeList only need O(nlogn) time to do all insertions.

vector<int> b2a(vector<int> b){
	TreeList<int> a;
	for(int i=b.size()-1;i>=0;i--)
		a.insert(i,b[i]);
	return a.toVector();
}

C code that print its own source

Very interesting. The trick is that use s to print itself, thus s should not contain any character like “\n”, “\”” which will be displayed differently as its literal value. Thus, we use numbers to represent those special characters.

#include <stdio.h>
int main(){const char *s="#include<stdio.h>%cint main(){const char *s=%c%s%c;printf(s,10,34,s,34);}";printf(s,10,34,s,34);}

Binary Indexed Tree

Binary Indexed Tree (also called Fenwick tree), a good tutorial is available here

Binary indexed tree’s logical structure is an integer array. Initially array is filled with 0. Binary Indexed Tree allows you to update value of element in array, and compute sum of elements in any index interval. All operations are O(log n).

The implementation is simple and efficient, although quite tricky, and it is very popular in programming contests. Source code follows.

//index from 1 to N
template<size_t N>
class BinaryIndexedTree{
	int *a;
	int low_bit(int x){return x & -x;}
public:
	// i \in [1,N]
	void add(int i,int k=1){
		while(i<=N){
			a[i]+=k;
			i+=low_bit(i);
		}
	}
	// i \in [0,N]
	int sum(int i){
		int res=0;
		while(i){
			res+=a[i];
			i-=low_bit(i);
		}
		return res;
	}
	// i \in [0,N], j \in [i,N]
	int rangeSum(int i,int j){
		return sum(j)-sum(i-1);
	}
	BinaryIndexedTree(){
		a=new int[N+1];
		fill(a,a+N,0);
	}
	~BinaryIndexedTree(){
		delete[] a;
	}
};

TreeList

As we know, array data structure has O(1) access time and O(n) insertion/deletion(kth element); linked list need O(n) for all operations, and O(1) insertion/deletion only when node pointer is given.

We also know that binary search tree and skip-list provide O(log n) for accessing, insertion and deletion.

Now what if we want a data structure, which behaves like a list, which can insert/delete element at kth position efficiently?

One of my friend suggest “blocked linked list”, which is a combination of array and linked list. Assume the input data size N is known, we can use a list of arrays to store data. List has length sqrt(N), each array also has length sqrt(N). Insertion and deletion has complexity O(sqrt(N)), and array is split when length exceed sqrt(N), and adjacent blocks are merged when length smaller than sqrt(N). This data structure is very popular in high school programming contest, where data size is known in advance. However, O(sqrt(N)) is not good enough: we want O(log n) for all operations.

Another method is to use binary search tree: binary search tree with “size” field can find “kth” element in O(log n) time. Elements have a floating point value as key for comparison. When inserting kth element, we first look for element k-1 and element k, and take their mean as key for new element. Thus new element is inserted into kth position. However, this method is not that elegant: size of the list depends on floating point precision.

Finally, comes our “TreeList” data structure. I don’t think I am the first guy who invent it, and I don’t know whether there is any other name for such data structure. The TreeList support 3 operations:

  • insert element into kth position
  • remove element in kth position
  • get kth element

The implementation is based on treap. Actually, any binary search tree can be used. The complexity for all 3 operations are O(log n). Source code is shown below.

When do we need this data structure? Follow my latest post

template<class T>
class TreeList{
	// list structure, support 3 operations:
	// 	-insert element into kth position
	// 	-remove element into kth position
	// 	-get kth element
	// implemented using treap
	struct node{
		T value;
		int pri,size;
		node *left,*right;	
		node(){}
		node(const T& v,node* l,node* r){
			value=v;
			left=l;
			right=r;
			size=1;
			pri=rand();
		}
	}*root,*null;
	void resize(node* cur){
		if(cur!=null)cur->size=cur->left->size+cur->right->size+1;
	}
	void left_rotate(node* &cur){
		node *res=cur->left;
		cur->left=res->right;
		res->right=cur;
		resize(cur);
		cur=res;		 
		resize(cur);
	}
	void right_rotate(node* &cur){
		node *res=cur->right;
		cur->right=res->left;
		res->left=cur;
		resize(cur);
		cur=res;		 
		resize(cur);
	}
	void remove_node(node* &cur){
		if(cur->left == null && cur->right == null){
			cur=null;
		}else if(cur->left->pri < cur->right->pri){
			left_rotate(cur);
			remove_node(cur->right);
			resize(cur);	 
		}else{ 
			right_rotate(cur);
			remove_node(cur->left);
			resize(cur);	 
		}
	}

	void insert_to_kth_pos(const T& x,node* &cur,int k){	
		if(cur == null && k==0){
			cur = new node(x,null,null);
		}else if(k <= cur->left->size){
			insert_to_kth_pos(x,cur->left,k);
			if(cur->pri > cur->left->pri)
					left_rotate(cur);	 
			resize(cur);
		}else if(k <= cur->size){
			insert_to_kth_pos(x,cur->right,k-cur->left->size-1);	
			if(cur->pri > cur->right->pri)
				right_rotate(cur);	   
			resize(cur);
		}
	}
	void remove_kth_element(node* & cur,int k){
		if(k == cur->left->size){	
			remove_node(cur);
		}else if(k < cur->left->size){
			remove_kth_element(cur->left,k);
			if(cur->pri > cur->left->pri)
				left_rotate(cur);	 
			resize(cur);
		}else if(k < cur->size){
			remove_kth_element(cur->right,k-cur->left->size-1);	
			if(cur->pri > cur->right->pri)
				right_rotate(cur);	   
			resize(cur);
		}
	}
	void print(node* cur){
		if(cur == null)return;
		print(cur->left);
		cout<<cur->value<<" ";
		print(cur->right);
	}
public:
	TreeList (){
		null=new node();
		null->pri = 0x7fffffff;
		null->size = 0;
		null->left=null->right=null;
		root=null;
	}
	void print(){
		print(root);
		cout<<endl;
	}
	void insert(const T& x,int k){
		insert_to_kth_pos(x,root,k);
	}
	void remove(int k){
		remove_kth_element(root,k);
	}
	T get(int k){
		node* cur=root;
		while(cur!=null){
			if(k < cur->left->size){
				cur=cur->left;
			}else if(k == cur->left->size){
				break;
			}else{
				k-=cur->left->size+1;
				cur=cur->right;
			}
		}
		return cur->value;
	}
};

[ACM/ICPC][Hangzhou 2005-2006]Generator

Warning: this post is very technical. You may find it very hard to understand, or boring.

One of my friend asked me about this problem.

The description of this problem is simple: given a string pattern, find the expected position of first occurrence of such pattern in random strings.

The problem looks like dynamic programming, because there seems exists some recursive relations between its subproblems.

Let’s first assume there exists a random string generator, which generate random strings letter by letter. Left F[n] be the number of expected letters to be generated before the pattern occur, given that most recently generated n letters matches first n letters of the pattern. For example, assume the pattern is ABCDE, and the random string generator generates ….ABC, then F[3] would be the expected number of letters needed before ABCDE is generated.

We can consider “N-prefix” of the pattern as “state” for DP(described by N), and F[0] would be the answer we need: expected number of letters needed before pattern is observed, given no prefix match(random string generator hasn’t generated anything yet).

Now let’s see how one state transfer into another state. Given that most recent generated N letters match N-prefix of patter, if one more letters is generated, how would be the suffix-prefix match goes? In the best case, it will become N+1, if the newly generated letter matches the N+1th letter in the pattern. In other cases, the state would transfer to some number no larger than N, and that number can be found easily: find the largest prefix-suffix match between generated string and pattern.

Now the recursive relationship between states becomes clear: F[i] = 1+\sum_{c=’A’}^{‘Z’}F[next(i,c)]/26. Base case would be F[pattern.length]=0.
In the original problem, number of letters is a variable, rather than fixed 26. next(i,c) is state transition function.

Now comes another problem: this DP cannot be solved because states graph is not DAG, and no topological order can be found. Actually, this is not a DP problem. After listing out all equations of form F[i] = 1+\sum_{c=’A’}^{‘Z’}F[next(i,c)]/26, we can already solve the linear system: using Gaussian-Elimination.

By now, the problem is mostly solved. However, when I submit the code to the judge, I got “wrong answer”. I guess this is because of precision problem: I use double data type, since the expected length can be real number. However, the required output is integer, thus I cast the result into “long long” data type. Real number data type can store 15-16 significant digits. However, long long data type can represent numbers with almost 20 significant digits. Thus, in extreme case, converting double may lose necessary precision when the answer is very large.

I fixed this problem by using “long double” instead of “double”. To my best knowledge, “long double” is some system dependent data type in C. It has higher precision than normal double: on my machine(gcc), sizeof(long double) gives 12, and actually 80 bits are used to store data, 16 more bits are for padding. I am still not sure how many significant digits “long double” can have, but this problem shows that at least it is useful in some cases.

#include <vector>
#include <iostream>
#include <algorithm>
#include <string>
using namespace std;
const long double EPI=1e-6;
string s;
int N;
long double abs(long double a){return a>0?a:-a;}
int g(int i,char c){
	string t=s.substr(0,i)+c;
	for(int j=0;j<i+1;j++){
		if(t==s.substr(0,t.size()))
			return i+1-j;
		t=t.substr(1);
	}
	return 0;
}
void sub_from(vector<long double> & a,vector<long double> &b, long double k){
//	a=a-bk
	for(int i=0;i<a.size();i++)
		a[i]-=b[i]*k;
}
vector<long double> solve_linear_system(vector<vector<long double> >& a){

	for(int i=0;i<a.size();i++){
		int j;
		for(j=i;j<a.size();j++)
			if(abs(a[j][i]) > EPI)
				break;
		swap(a[j],a[i]);
		for(j=i+1;j<a.size();j++)
			sub_from(a[j],a[i],a[j][i]/a[i][i]);
	}
	for(int i=a.size()-1;i>=0;i--){
		for(int j=i-1;j>=0;j--)
			sub_from(a[j],a[i],a[j][i]/a[i][i]);
	}
	vector<long double> ans;
	for(int i=0;i<a.size();i++)
		ans.push_back(a[i].back()/a[i][i]);
	return ans;
}
long double solve(){
	int m=s.size();
	vector<vector<long double> > a;
	for(int i=0;i<m;i++){
		a.push_back(vector<long double>(m+2,0));
		a[i][i]=N;
		a[i][m+1]=N;
		for(int j=0;j<N;j++)
			a[i][g(i,'A'+j)]-=1;
	}
	a.push_back(vector<long double>(m+2,0));
	a[m][m]=1;	//xm=0
	vector<long double> ans=solve_linear_system(a);
	return ans[0];	//x0
}
int main(int argc, const char *argv[])
{
	int times;
	cin>>times;
	for(int i=0;i<times;i++){
		cin>>N>>s;
		if(i)cout<<endl;
		cout<<"Case "<<i+1<<":"<<endl<<(long long)solve()<<endl;
	}
	return 0;
}

My grep

Thanks to Robert Sedgewick(Princeton)’s lecture notes, I just implemented a ‘grep’ program, which perform regular expression matching.
The code is only about 100 lines long, with the help of c++ STL.
Matching is done by converting RE into NFA. The claimed complexity is O(NM), N is the length of string, M is the length of RE.

Now it only support very basic regular expression matching:

a-z,A-Z,0-9 literal meaning
. match any character
* repeat multiple times
() group regular expression
| logic or, must appear inside ()

Can do some improvement in the future:
Support RE like a|b, i.e. | without parentheses. This can be done by adding ( ) around the input.
Add support for RE like [a-z], [^a-z], [abc].(may need a parser for this)
Add support for meta characters like \d: may need an abstract character type, which can match certain set of characters.
To support ^ and $, I think we need another abstract type for input string characters.

GDB tips

gdb is quite a useful tool. Despite it’s text interface, it can do more than GUI debuggers. I am still learning it, and this post is used as a memo.

  • ptype is useful to print type, structure(class) definition. This is useful to check user defined data types(sometimes deeply defined in header files).
  • disassemble is useful to see assembly code of function/statement.
  • info registers shows registers information.
  • print a@10 shows 10 elements start from address of variable a.
  • x command examine memory content. x/8xw $eip displays 8 words since where eip register points to. x/i $eip
  • shows instruction eip register points to.