Approximation Algorithms

Approximation algorithms are algorithms used to find approximate solutions to optimization problems. An optimization problem is the problem of finding the best solution from all feasible solutions. Many optimization problems are polynomial-time solvable, such as Minimum Spanning Tree, Min-Cut, Max-Flow, and Maximum Matching. But many practical and significant optimization problems are NP-hard, and for these we still do not know any polynomial-time algorithm. A few such problems are:

  • Traveling Salesman Problem – find a minimum-cost tour of all cities
  • Vertex Cover – find a minimum set of vertices that covers all edges in the graph
  • Max SAT – find an assignment that satisfies the maximum number of clauses
  • Max Clique – find a largest complete subgraph

As of today there is no known polynomial-time algorithm for NP-hard problems. Therefore we try to come up with approximate solutions and to achieve the best possible bound with respect to the optimal solution. Algorithms that run in polynomial time and output a solution provably close to the optimal solution are called approximation algorithms.

Formal Definition

An \alpha-approximation algorithm for an optimization problem is a polynomial time algorithm that for all instances of the problem produces a solution whose value is within a factor of \alpha of an optimum solution.

\alpha is called the approximation ratio or approximation factor of the algorithm. For example, let P be a minimization problem and I an instance of P. Let A be an algorithm that finds a feasible solution to every instance of P, let A(I) be the cost of the solution returned by A for instance I, and let OPT(I) be the cost of the optimal (minimum) solution for I. Then A is said to be an \alpha-approximation algorithm for P if

\begin{aligned}  \forall I: \frac{A(I)}{OPT(I)} \leq \alpha  \end{aligned}

where \alpha \geq 1. For a maximization problem, the inequality is reversed, i.e.

\begin{aligned}  \forall I: \frac{A(I)}{OPT(I)} \geq \alpha  \end{aligned}

where \alpha \leq 1.
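As a small illustrative example for the minimization case (numbers chosen arbitrarily): an algorithm that returns a solution of cost A(I) = 8 on an instance whose optimum is OPT(I) = 5 is consistent with a 2-approximation guarantee, since

\begin{aligned}  \frac{A(I)}{OPT(I)} = \frac{8}{5} = 1.6 \leq 2  \end{aligned}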

 

Approximation Algorithm for Vertex Cover

Given a graph G=(V,E), find a minimum-size subset S \subseteq V such that S covers all edges in G, i.e. every edge is incident to at least one vertex in S.

Algorithm:

  1. Pick a maximal matching M in G
  2. S = all vertices of matching edges in M
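
A minimal Python sketch of this algorithm (the function name vertex_cover_2_approx is ours): scanning the edges and taking both endpoints of any edge whose endpoints are both still uncovered builds exactly the endpoint set of a greedily chosen maximal matching.

```python
def vertex_cover_2_approx(edges):
    """2-approximate vertex cover via a greedily built maximal matching.

    edges: iterable of (u, v) pairs describing an undirected graph.
    Returns a set of vertices that covers every edge.
    """
    cover = set()
    for u, v in edges:
        # If neither endpoint is covered yet, (u, v) can be added to the
        # matching: take both of its endpoints into the cover.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

# Example: the path 0-1-2-3. The optimum cover {1, 2} has size 2; the
# algorithm returns {0, 1, 2, 3}, which is within the factor of 2.
print(vertex_cover_2_approx([(0, 1), (1, 2), (2, 3)]))
```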

Analysis

The algorithm gives a 2-approximate solution for any graph G.

To prove this we need to show:

|S| \leq 2\,OPT, where S is the solution returned by our algorithm and OPT is the size of a minimum vertex cover.

LEMMA 1: |OPT(G)| \geq |M| for any maximal matching M.

By definition, we cannot add any more edges to a maximal matching, so every remaining edge is adjacent to at least one edge in M. If we pick both endpoints of every edge in M as the vertex cover S, then the matching edges themselves are obviously covered and all remaining edges are incident to S (as they are all adjacent to M). The size of OPT(G) is at least |M|, because OPT(G) must contain at least one endpoint of every edge in M to cover all edges of G, and since the edges of M share no endpoints, these are |M| distinct vertices.

Also, from the algorithm, |S| = 2|M|. Therefore, combining this with Lemma 1, we have

\begin{aligned}  \frac{|S|}{2} \leq OPT\\  \Rightarrow |S| \leq 2OPT  \end{aligned}

Hence proved.

 

Traveling Salesman Problem

Definition:

Given a list of cities and the distances between each pair of cities (i.e. a complete graph), what is the shortest possible route that visits each city exactly once and returns to the origin city?

Approximating General TSP is NP-Hard:

For any factor \alpha, finding an \alpha-approximate solution for general TSP is NP-hard.

Proof:

We will reduce Hamiltonian Cycle (HAM) to \alpha-approximate TSP. For HAM we have:

Input: an undirected (not necessarily complete) graph G = (V, E). Output: YES if G has a Hamiltonian cycle, NO otherwise

Reduction: 

Suppose A is an \alpha-approximation algorithm for TSP. We will use A to solve HAM in polynomial time.

Create a graph G' from G by completing it, i.e. connecting all pairs of vertices. Set the weight of the original edges to 1 and of all new edges to some large number M (say M > \alpha n). Pass G' to A. If G has a Hamiltonian cycle, the optimal tour in G' costs exactly n, so A returns a tour of cost at most \alpha n; otherwise every tour must use at least one new edge and therefore costs at least M > \alpha n. Hence G has a Hamiltonian cycle if and only if the cost of the tour returned by A is at most \alpha n.
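
The following is a minimal sketch of this reduction in Python, assuming a hypothetical oracle approx_tsp that takes a weight matrix and returns the cost of the tour produced by an \alpha-approximation algorithm:

```python
def ham_via_approx_tsp(n, edges, alpha, approx_tsp):
    """Decide Hamiltonian cycle on G = ({0..n-1}, edges) using a hypothetical
    alpha-approximate TSP solver approx_tsp(weight_matrix) -> tour cost."""
    big = alpha * n + 1                 # any weight greater than alpha * n works
    edge_set = {frozenset(e) for e in edges}
    # Complete G: original edges keep weight 1, all new edges get weight `big`.
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = 1 if frozenset((i, j)) in edge_set else big
    # Hamiltonian cycle exists  <=>  optimal tour costs n  <=>  A's tour costs <= alpha * n.
    return approx_tsp(w) <= alpha * n
```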

Therefore, we have shown that even approximating TSP is hard if no constraint is placed on the edge weights.

 

Metric TSP

We constrain the edge weights to satisfy the triangle inequality, i.e. d_{i,j} \leq d_{i,k} + d_{k,j} for all vertices i, j and k.

Intuition

We can modify the TSP requirement of visiting every vertex exactly once to visiting every vertex at least once, because by the triangle inequality the cost of the best "exactly once" tour is no higher than the cost of the best "at least once" tour. Concretely, whenever a walk leaves a vertex u and passes through already-visited vertices before reaching the next new vertex v, we can replace that stretch by the direct edge u-v (a shortcut); by the triangle inequality, the shortcut costs no more than the path it replaces.
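
A minimal sketch of this shortcutting step (the function name shortcut is ours):

```python
def shortcut(walk):
    """Shortcut a closed walk (which may revisit vertices) into a tour that
    visits every vertex exactly once and returns to the start. By the
    triangle inequality, this never increases the total cost."""
    seen, tour = set(), []
    for v in walk:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour + [tour[0]]  # close the tour

# Example: the closed walk 0-1-2-1-3-1-0 becomes the tour 0-1-2-3-0.
print(shortcut([0, 1, 2, 1, 3, 1, 0]))
```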

We know that, given an Eulerian graph, we can find an Eulerian tour in polynomial time. So if we had an Eulerian graph with the cities of a TSP instance as vertices, we could use such a method for finding an Eulerian tour to obtain a TSP tour. By the triangle inequality, the shortcut TSP tour can be no longer than the Eulerian tour, and as such we have an approximate solution for the TSP.

Algorithm 1

  1. Find a minimum spanning tree for the problem
  2. Duplicate every edge of the tree to create an Eulerian graph
  3. Find an Eulerian tour for this graph
  4. Convert to TSP using shortcuts.
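
A minimal from-scratch Python sketch of Algorithm 1 (the function name tsp_2_approx is ours). It uses the standard observation that shortcutting the Euler tour of the doubled spanning tree, rooted at the start vertex, visits the vertices in preorder, so steps 2-4 collapse into a single preorder traversal of the MST:

```python
def tsp_2_approx(dist):
    """Double-tree 2-approximation for metric TSP.

    dist: symmetric n x n matrix of distances obeying the triangle inequality.
    Returns a tour as a list of vertex indices, starting and ending at 0.
    """
    n = len(dist)
    # Step 1: Prim's algorithm for a minimum spanning tree rooted at 0.
    in_tree = [False] * n
    best = [float('inf')] * n        # cheapest known connection to the tree
    parent = [-1] * n
    best[0] = 0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] != -1:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v], parent[v] = dist[u][v], u
    # Steps 2-4: a preorder walk of the MST gives exactly the tour obtained by
    # doubling its edges, taking an Euler tour from the root and shortcutting.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour + [0]
```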

Analysis

Let A(I) be the cost of the solution given by our approximation algorithm on problem instance I, and let OPT(I) be the cost of an optimum solution.

Claim 1: A(I) \leq 2\,MST(I). After step 1 of the algorithm the cost is MST(I); after step 2 it becomes 2\,MST(I), so the Euler tour found in step 3 also has cost 2\,MST(I). By the triangle inequality, taking shortcuts does not increase the cost, so A(I) \leq 2\,MST(I).

Claim 2: OPT(I) \geq MST(I). Let \sigma be the cost of an optimum tour. Removing an edge (say, the one of minimum weight) from this TSP tour gives a spanning tree T, and the cost of any spanning tree is at least the cost of a minimum spanning tree. Therefore, OPT(I) = \sigma \geq \sigma(T) \geq MST(I).

From claim 1 and claim 2 we have, A(I) \leq 2OPT(I). Hence a 2-approximate solution.

A better 1.5-approximation algorithm

We will modify the step that duplicated every edge of the tree: instead of duplicating edges, we only add enough edges to make the odd-degree vertices even.

Claim: The number of odd-degree vertices is even.

The sum of all the degrees is equal to twice the number of edges. Since the sum of the degrees is even and the sum of the degrees of vertices with even degree is even, the sum of the degrees of vertices with odd degree must be even. If the sum of the degrees of vertices with odd degree is even, there must be an even number of those vertices.
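
In symbols, the handshake argument above is:

\begin{aligned}  \sum_{v \in V} \deg(v) = 2|E| \;\Rightarrow\; \sum_{\deg(v)\ \mathrm{odd}} \deg(v) = 2|E| - \sum_{\deg(v)\ \mathrm{even}} \deg(v),  \end{aligned}

which is even, and a sum of odd numbers can only be even if it has an even number of terms.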

Therefore we can make all degrees even by pairing up the odd-degree vertices, i.e. by adding a perfect matching on them.

Algorithm 2 (Christofides’ algorithm)

  1. Find a minimum spanning tree for the problem
  2. Add the edges of a minimum-weight perfect matching on the odd-degree vertices of the tree to create an Eulerian graph
  3. Find an Eulerian tour for this graph
  4. Convert to TSP using shortcuts
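
A Python sketch of Christofides' algorithm using NetworkX (the function name christofides_sketch is ours; it assumes a recent NetworkX version providing minimum_spanning_tree, min_weight_matching and eulerian_circuit):

```python
import networkx as nx

def christofides_sketch(dist):
    """Christofides' 1.5-approximation for metric TSP (sketch, assumes n >= 2).

    dist: symmetric n x n matrix of metric distances.
    Returns a tour as a list of vertex indices, starting and ending at 0.
    """
    n = len(dist)
    G = nx.Graph()
    for i in range(n):
        for j in range(i + 1, n):
            G.add_edge(i, j, weight=dist[i][j])

    # Step 1: minimum spanning tree.
    T = nx.minimum_spanning_tree(G)

    # Step 2: minimum-weight perfect matching on the odd-degree vertices of T.
    odd = [v for v in T.nodes if T.degree(v) % 2 == 1]
    M = nx.min_weight_matching(G.subgraph(odd))

    # Combine T and M into an Eulerian multigraph (all degrees become even).
    H = nx.MultiGraph(T)
    H.add_edges_from(M)

    # Steps 3-4: Euler tour, then shortcut repeated vertices.
    seen, tour = set(), []
    for u, _ in nx.eulerian_circuit(H, source=0):
        if u not in seen:
            seen.add(u)
            tour.append(u)
    return tour + [0]
```

Recent NetworkX releases also include a built-in Christofides implementation in their approximation module, which can be used instead of this sketch.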

Analysis

From step 1 of the algorithm the cost is MST(I). Step 2 adds a minimum-weight perfect matching M on the odd-degree vertices, and this matching costs at most \frac{1}{2}OPT(I): shortcut the optimal tour so that it visits only the odd-degree vertices; by the triangle inequality this cycle costs at most OPT(I), and its edges split into two perfect matchings on those vertices. Choosing the matching with lower cost therefore costs at most \frac{1}{2}OPT(I). The rest of the analysis is the same as for Algorithm 1. Therefore, A(I) \leq 1.5\,OPT(I).
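
In summary, writing c(\cdot) for total edge cost, M_1 and M_2 for the two matchings obtained from the shortcut optimal tour, M for the cheaper of the two, and using Claim 2 (MST(I) \leq OPT(I)):

\begin{aligned}  c(M_1) + c(M_2) \leq OPT(I) &\Rightarrow c(M) \leq \tfrac{1}{2}OPT(I),\\  A(I) \leq MST(I) + c(M) &\leq OPT(I) + \tfrac{1}{2}OPT(I) = 1.5\,OPT(I)  \end{aligned}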