Technical notes: January 2011

Sunday, January 30, 2011

Successive shortest paths algorithm description with source

•Start with flow f with f(u,v)=0 for all u,v.

•repeat until value(f ) = r

–Find the shortest path P in Gf from s to t

–Let q be the minimum residual capacity of an edge on P.

–Send min(q,r – value(f)) additional units of flow across P.

On the successive shortest paths algorithm :

•May use exponential running time

•Assume G has no negative edge costs.

•Gives optimal answer.

–Invariant: f has minimum cost among all flows with value value(f).

•Suppose we obtain f’ from f by sending across P.

•Let f’’ be a minimum cost flow with same value as f’.

•Write f’’ – f as weighted sum of paths from s to t in Gf and circuits in Gf. Argue that cost(f’ – f) / cost(f’’ – f), using that:

–P is shortest path

–Circuits have non-negative costs, by optimality of f.

Source code link is given here

Minimum cost Maxflow or mincut maxflow or Successive Shortest Path source code in C++

when there is need of minimum cost along with maximum flow then comes this problem.
Sometimes it called successive shortest path. Here is a problem for better understanding 10498

*************************************************************************************

#include<iostream>

#include<string>

#include<vector>

#include<queue>

#include<algorithm>



using namespace std;



#define MV 102



typedef long long LL;



LL  cst[MV][MV],

    cap[MV][MV],

    par[MV],dis[MV],

    source,sink,flow;



vector<int>adj[MV];



struct pq{

    LL d,n;

    void ini(LL a,LL b){n=a;d=b;}

    bool operator<(const pq &b)const{return d>b.d;}

};





bool PFS(int s,int sk)

{

    priority_queue<pq>Q;

    pq u,v;

    int i;



    memset(dis,50,sizeof dis);

    dis[1]=0;

    u.ini(1,0);

    Q.push(u);



    while(!Q.empty())

    {

        u=Q.top();Q.pop();

        for(i=0;i<(int)adj[u.n].size();i++)

            if(cap[u.n][adj[u.n][i]] && dis[u.n]+cst[u.n][adj[u.n][i]]<dis[adj[u.n][i]])

            {

                dis[adj[u.n][i]] = dis[u.n]+cst[u.n][adj[u.n][i]];

                v.ini(adj[u.n][i],dis[adj[u.n][i]]);

                Q.push(v);

                par[adj[u.n][i]]=u.n;

            }

    }



    if(dis[sk]<dis[0])  return true;

    return false;

}



void MakeResidal(int u)

{

    cst[par[u]][u]=cst[u][par[u]]=-cst[par[u]][u];

    cap[u][par[u]]++;

    cap[par[u]][u]--;

    if(u!=1)    MakeResidal(par[u]);

}



LL MinCostFlow(int s,int sk,LL f,LL lc)

{

    source=s;

    sink = sk;

    LL tt=0,cur;

    if(!f)  return 0;

    while(f && PFS(s,sk))

    {

        cur=f;

        if(f>lc)  cur=lc;

        f-=cur;

        tt+=cur*dis[sk];

        if(!f)  return tt;

        MakeResidal(sk);

    }

    return -1;

}



int main()

{

    int n,m,i,u,v;

    LL w,f,lc;

    //freopen("Data Flow.in","r",stdin);



    while(scanf("%d%d",&n,&m)==2)

    {

        for(i=1;i<=n;i++)

            adj[i].clear();

        memset(cap,0,sizeof cap);

        memset(cst,0,sizeof cst);



        while(m--)

        {

            //scanf("%d%d%lld",&u,&v,&w);   //%d -> %lld for w

            cin>>u>>v>>w;

            cst[u][v]=cst[v][u]=w;

            cap[u][v]=cap[v][u]=1;

            adj[u].push_back(v);

            adj[v].push_back(u);

        }



        //scanf("%lld%lld",&f,&lc);   //%d -> %lld

        cin>>f>>lc;



        w = MinCostFlow(1,n,f,lc);

        if(w==-1)   printf("Impossible.\n");

        //else printf("%lld\n",w);  //%d -> %lld for w

        else cout<<w<<endl;

    }

    return 0;

}

Basic Maxflow or Network Flow source code in c++

When there is one source and one sink and prospect is that to send maximum unit to sink then comes the need of maxflow.It is normally a bfs which helps to find the maximum capacity over the line load and reduce this capacity from load until the load comes to zero.I mean until there is any path from source to sink.

Here is the code for maxflow.Hope it help.

#include<cstdio>

#include<cmath>

#include<cstring>

#include<string>

#include<iostream>

#include<queue>

#include<algorithm>

using namespace std;

#define INF  2147483647





/***************** Max Flow Algorithm***************

****************************************************/

const int maxINDX=102;

int flow[maxINDX][maxINDX],cap[maxINDX][maxINDX];

int path[maxINDX];

int src,dest;



int BFS(int node) {

    int i,item,cf;

    queue<int> q;

    for(i=1;i<=node;i++)

        path[i]=-2;// color it

    q.push(src);

    path[src]=-1;

    while(!q.empty()) {

        item=q.front();

        q.pop();

        if(item==dest) {

            i=dest;

            cf=INF;

            while(path[i]!=-1) {

                if(cf>flow[path[i]][i])

                    cf=flow[path[i]][i];

                i=path[i];

            }

            return cf;

        }

        for(i=1;i<=node;i++) {

            if(flow[item][i]!=0 && path[i]==-2) {

                q.push(i);

                path[i]=item;

            }

        }

    } // end of while loop

    return 0;

}



int maxflow(int node) {

    int i,j,cf,totalflow=0;

    while((cf=BFS(node))!=0) {

        totalflow+=cf;

        i=dest;

        while(path[i]!=-1) {

            flow[path[i]][i]-=cf;

            flow[i][path[i]]+=cf;

            i=path[i];

            cout <<i<<" ->"; // for available pathS

        }

        cout << endl;

    }

    return totalflow;

}



void graph_initial(int node){

    int i,j;

    for(i=1;i<=node;i++){

        for(j=1;j<=node;j++) {

            cap[i][j]=0;

            flow[i][j]=0;

        }

    }



}



int main()

{





    int node,edge,a,b,c,i,resultedFlow,j;



    while(scanf("%d",&node)==1 && node!=0) {



        graph_initial(node);



        scanf("%d %d",&src,&dest);

        scanf("%d",&edge);



        for(i=1;i<=edge;i++) {

            scanf("%d %d %d",&a,&b,&c);

            cap[a][b]+=c;

            cap[b][a]+=c;

            flow[a][b]+=c;

            flow[b][a]+=c; // For BI-Directed Graph

        }



        resultedFlow=maxflow(node);



        printf("%d\n",resultedFlow);



    /*    for(i=1;i<=node;i++)

        for(j=1;j<=node;j++)

        cout << i<<","<<j<<":"<<flow[i][j] <<" "<<i<<":"<<path[i]<<endl;*/

    }



return 0;

}



/*

7

1 7 8

1 2 3

2 4 3

3 4 5

1 3 1

4 7 2

3 5 4

5 6 2

6 7 3

*/

ACM problem submit in Java with BigInteger in uva online judge

When to use :

•First of all in c,c++ we have some problem with Big number Like 2^32 or more.This problem Can be solved by using c,c++ ,but it is more easy or faster to use Java here.

•Because Java support BigInteger And BigDecimal in java.math.BigInteger library.

•Sometimes we have to generate number series like Fibonacci upto 1000.It is more easy in java to generate these without C.Where C or c++ is more complex.

•Note: When time is factor it is not right to use java

Library to use:

•Import java.io.*;

•Import java.math.BigInteger;

•Import java.math.BigDecimal;

•Import java.util.*;

Taking Input:

•For taking input Try to use Scanner.

•Because it handle all space and newline itself.

•Format:

Scanner in = new Scanner(new BufferedReader(new InputStreamReader(System.in)));

It will take input from Commandline.

Use while(in.hasNext()) to take every input from file or input line.

After that you have to take it as BigInteger ,BigDecimal,

Int ,String.Because All of them can be taken by this.

Exp: int no = in.nextInt();

Usable Math Operation:

•Power

•Mod

•Add

•Substruct

•Divide

•Multiply

•Remainder

•Compare

Using Procedure is given bellow:

All type Declaration:

•BigInteger bi, bi1, bi2, bi3, bi4;

•BigInteger[] bia; // array holding division result and remainder. String s; int i;

•long lng;

•float f;

•Double d;

•

•Note : Be sure to Submit solution By class Name as “Main”.

How to Use :

•bi = new BigInteger(s); Create BigInteger with decimal value represented by decimal String s.

• bi = BigInteger.ONE;Predefined value 1.

•bi = BigInteger.ZERO;Predefined value 0.

•bi = BigInteger.valueOf(lng);Use this factory method to create BigIntegers from numeric expressions. An int parameter will be automatically promoted to long.

•bi1 = bi2.abs();Returns BigInteger absolute value.

•bi1 = bi2.add(bi3);Returns sum of bi2 and bi3.

•bi1 = bi2.divide(bi3);Returns division of bi2 and bi3.

• bia = bi2.divideAndRemainder(bi3);Returns array of two BigIntegers representing the result of division and remainder of bi2 and bi3.

• bi1 = bi2.gcd(bi3);Returns greatest common divisor of bi2 and bi3.

•bi1 = bi2.max(bi3);Returns maximum of bi2 and bi3.

• bi1 = bi2.min(bi3);Returns minimum of bi2 and bi3.

• bi1 = bi2.mod(bi3);Returns remainder after dividing bi2 by bi3 .

•bi1 = bi2.multiply(bi3);Returns product of bi2 and bi3.

• bi1 = bi2.pow(bi3);Returns bi2 to the bi3 power.

• bi1 = bi2.remainder(bi3);Returns remainder of dividing bi2 by bi3. May be negative.

•i = bi.signum();-1 for neg numbers, 0 for zero, and +1 for positive.

•bi1 = bi2.subtract(bi3);Returns bi2 - bi3.

•d = bi.doubleValue();Returns double value equivalent of bi.

•f = bi.floatValue();Returns float value equivalent of bi.

•i = bi.intValue();Returns int value equivalent of bi.

• lng = bi.longValue();Returns long value equivalent of bi.

• s = bi.toString();Returns decimal string representation of bi.

• s = bi.toString(i);Returns string representation of bi in radix i.

•b = bi1.compareTo(bi2);Returns negative number if bi1<bi2, 0 if bi1==bi2, or positive number if bi1>bi2.

Example of Code in Uva:

•/**

• * @(#)prob_948.java

• * @author Sharma

• * @version 1.00 2010/10/6

• */

•

•import java.io.*;

•import java.math.BigInteger;

•import java.util.*;

•public class prob_948 {

•

•public static void main (String[] args) {

•

• Scanner in = new Scanner(new BufferedReader(new InputStreamReader(System.in)));

•

• BigInteger rp,pt,op,sub,di,re,st;

• BigInteger[] fb = new BigInteger[1050];

• fb[0] = BigInteger.ONE;

• fb[1] = BigInteger.ONE;

•

• for(int i=2;i<=1010;i++)

• fb[i] = fb[i-1].add(fb[i-2]);

•

• int cas = in.nextInt();

• op = BigInteger.ZERO;

• while(in.hasNext())

• {

•

• di =in.nextBigInteger();

• re = di;

• String str="";int o=0;

• if(di.compareTo(op) != 0)

• {

• for(int i=1000;i>=1;i--)

• {

• if(di.compareTo(fb[i]) >= 0)

• {

• str += "1";

• di = di.subtract(fb[i]);

• o = 1;

• }

• else if(o == 1)

• {

• str += "0";

• }

• else

• {

• str += "0";

• }

•

• System.out.println(re+" = "+str+" (fib)");

• cas = cas-1;

• if(cas == 0)break;

• }

•

• }

•

•}

Just change class prob_948 into class Main then it will accept.

Summary of a Search Engine Conference Presentation

Summary of a Search Engine:

Abstract

Today web has become larger with its own prospect.Every Day about 500 hundreds of new website is added to Internet.So it is rapidly increasing.These new web has new information and new element that an indivisual need.But it is not possible to learn or find any particular element from these sites where every sites may have avarage of 100 pages.Where comes the need of search.So from the prospect it is looklike that the search must be efficient and also the memory of storing these info is also need to be compromized.Here comes the choice of good search engine.So how and what makes a search engine different from other we will deal with it.

Key word :

1.Page Ranking

2. Probability

3. Anchor Text

4.Vector Space model.

5.Doc. Indexing.

6.Metadata.

7.Lexicon

8.Hitlist

9.Barrels

10.Parsing

1).Page Ranking:As search engine is developed the information may found in several web pages.So which page will come first,Then which one is next.It is done in a fashioned way that....... If the link goes from a page is (T), And comes to this page is (C).Here we have a multiply factor d(0<d<1).Particularly d = 0.85 .So the page Rank of a page is PR = (1-d)+PR(T1)/C(T1) +...+PR(Tn)/C(Tn).

2.Probability:Here it is somewhat possible to choose random page when a user is get bored.So this can be handle by probability.Suppose there is (n) link in a page for traversing and the user used 10 link serially.So selecting 1 link from these is (1/(n-10)).(Mine)

3.Anchor Text:Text of link is treated in a special way.The link is ascosiate with page the link points to. This has several advantages. Previous Method, anchors often provide more accurate descriptions of web pages than the pages themselves. Second, anchors may exist for documents which cannot be indexed by a text-based search engine, such as images, programs, and databases. This makes it possible to return webpages which have not actually been crawled. Note that pages that have not been crawled can causeproblems, since they are never checked for validity before being returned to the user. In this case, thesearch engine can even return a page that never actually existed, but had hyperlinks pointing to it.

However, it is possible to sort the results, so that this particular problem rarely happens.

4.Vector Space Model:It is 3 stage divided procedure.

Doc.Indexing

Weight of Indexed term

Rank of doc with query

5.Doc.Indexing: Remove (the,an) is non-significant words.

6.Metadata: Data about data.

Metadata sometime givesfake information about a site or page.

Content:Has keywords.

7.Lexicon: The lexicon has several different forms. One important change from earlier systems is that the lexicon can fit in memory for a reasonable price. In the current implementation we can keep the lexicon in memory on a machine with 256 MB of main memory. The current lexicon contains 14 million words(though some rare words were not added to the lexicon). It is implemented in two parts -- a list of the words (concatenated together but separated by nulls) and a hash table of pointers. For various functions, the list of words has some auxiliary information which is beyond the scope of this paper to explain fully.

8.Hit list: A hit list corresponds to a list of occurrences of a particular word in a particular document includingposition, font, and capitalization information. Hit lists account for most of the space used in both theforward and the inverted indices. Because of this, it is important to represent them as efficiently aspossible. We considered several alternatives for encoding position, font, and capitalization -- simpleencoding (a triple of integers), a compact encoding (a hand optimized allocation of bits), and Huffmancoding. In the end we chose a hand optimized compact encoding since it required far less space than thesimple encoding and far less bit manipulation than Huffman coding.

9.Barrels : Each document is converted into a set of word occurrences called hits. The hits record the word, position in document, an approximation of font size, and capitalization. The indexer distributes these hits into a set of "barrels", creating a partially sorted forward index.

10.Parsing: Any parser which is designed to run on the entire Web must handle a huge array ofpossible errors. These range from typos in HTML tags to kilobytes of zeros in the middle of a tag,non-ASCII characters, HTML tags nested hundreds deep, and a great variety of other errors thatchallenge anyone’s imagination to come up with equally creative ones. For maximum speed,instead of using YACC to generate a CFG parser, we use flex to generate a lexical analyzer whichwe outfit with its own stack. Developing this parser which runs at a reasonable speed and is veryrobust involved a fair amount of work.

Note : I am more and more interested in lexical analyzer(which I learn in a course named --”Compilers”).I will give a huge description and analysis and everything as far as i know in this prospect in another post.

11. Forward Index: The forward index is actually already partially sorted. It isstored in a number of barrels (we used 64). Each barrel holds a range of wordID’s. If a document contains words that fall into a particular barrel, the docID is recorded into the barrel, followed by a list of wordID’s with hitlists which correspond to those words. This scheme requires slightly more storage because of duplicated docIDs but the difference is very small for a reasonable number of buckets and saves considerable time and coding complexity in the final indexing phase done by the sorter.

12.Backward Index: The inverted index consists of the same barrels as the forward index, except that they have been processed by the sorter. For every valid wordID, the lexicon contains a pointer into the barrel that wordID falls into. It points to a doclist of docID’s together with their corresponding hit lists. This doclist represents all the occurrences of that word in all documents.

13.Crawler: Running a web crawler is a challenging task. There are tricky performance and reliability issues and even more importantly, there are social issues. Crawling is the most fragile application since it involves interacting with hundreds of thousands of web servers and various name servers which are all beyond the control of the system.

Note : I will give huge description later.

Total Step of Running a Search Engine:

1. Parse the query.

2. Convert words into wordIDs.

3. Seek to the start of the doclist in the short barrel for every word.

4. Scan through the doclists until there is a document that matches all the search terms.

5. Compute the rank of that document for the query.

6. If we are in the short barrels and at the end of any doclist, seek to the start of the doclist in the full barrel for every word and go to step 4.

7. If we are not at the end of any doclist go to step 4. Sort the documents that have matched by rank and return the top k.

Technical notes