- solving the dual may be more efficient (see this example);
- the dual variables relate to the reduced costs and shadow prices (see this post);
Given the solution of a linear program, how can we recover its dual variables?
We will discuss how to recover the dual solution of linear programs, but most of our arguments apply to conic programs as well. Recall our primal in standard form:
$$
\begin{aligned}
& \min \quad c^T x\\
& Ax - b = 0\\
& x \geq 0,
\end{aligned}
$$
and the corresponding dual
$$
\begin{aligned}
& \max \quad b^T y\\
& c - A^T y - s = 0\\
& s \geq 0.
\end{aligned}
$$
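Before looking at optimal solutions, it is worth convincing ourselves of weak duality: for any primal-feasible $x$ and dual-feasible $(y,s)$, we have $b^T y \leq c^T x$, with gap $c^T x - b^T y = s^T x \geq 0$. The snippet below is a minimal numerical sketch of this fact; the data is randomly generated for illustration and is not any example from this post.

```python
import numpy as np

# Weak duality sketch for the standard-form pair
#   (P) min c^T x  s.t. Ax = b, x >= 0
#   (D) max b^T y  s.t. c - A^T y - s = 0, s >= 0
# with randomly generated (made-up) data.
rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.normal(size=(m, n))

# A feasible primal point: pick x >= 0, then set b = A x.
x = rng.uniform(size=n)
b = A @ x

# A feasible dual point: pick y freely and s >= 0, then set
# c = A^T y + s so the dual constraint holds by construction.
y = rng.normal(size=m)
s = rng.uniform(size=n)
c = A.T @ y + s

# The duality gap c^T x - b^T y equals s^T x, hence it is non-negative.
gap = c @ x - b @ y
print(gap, s @ x)
```

Note that the gap is exactly $s^T x$: since every term of the product is non-negative, the gap vanishes precisely when $x_i s_i = 0$ for all $i$, which is the Complementary Slackness condition discussed next.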
For a given optimal primal solution $x^\star$ and optimal dual solution $(y^\star, s^\star)$, duality theory tells us:
(1) Strong Duality: the optimal values of (P) and (D) coincide.
(2) Complementary Slackness: it must hold that $x_i^\star s_i^\star = 0$ for all $i=1,\ldots,n$. Thus $s_i^\star=0$ for all $i$ such that $x_i^\star>0$.
(3) Optimal Basis: using the partition of the primal variables induced by the optimal basis, we can define a full-rank sub-matrix of the constraint coefficient matrix and use it to compute $y^\star$... but this goes way too far for a simple blog post; refer for instance to Section 3.5 in Papadimitriou & Steiglitz (1998).
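For the curious, point (3) can be sketched in a few lines of linear algebra: if $B$ is the sub-matrix of $A$ formed by the columns of the optimal basic variables, then $y^\star$ solves $B^T y = c_B$. The snippet below does this on a tiny made-up LP (max $x_1 + 2x_2$ s.t. $x_1 + x_2 \leq 4$, $x_1 \leq 3$, rewritten in standard form with slacks); it is not the example from the book.

```python
import numpy as np

# Standard form of: min -x1 - 2*x2  s.t.  x1 + x2 + u1 = 4,  x1 + u2 = 3,
# with all variables >= 0 (u1, u2 are slacks). The optimum is x* = (0, 4, 0, 3).
A = np.array([[1.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])
b = np.array([4.0, 3.0])
c = np.array([-1.0, -2.0, 0.0, 0.0])

basis = [1, 3]                        # optimal basic variables: x2 and slack u2
B = A[:, basis]
x_B = np.linalg.solve(B, b)           # basic part of x*:  (4, 3)
y = np.linalg.solve(B.T, c[basis])    # dual variables:    y* = B^{-T} c_B

reduced = c - A.T @ y                 # reduced costs, all >= 0 at optimality
print(y, reduced)
```

One can check strong duality directly here: $b^T y^\star = c_B^T x_B^\star = -8$.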
While points (1) and (2) are straightforward, point (3) involves some work and attention... should we really spend time on that? The answer is no! The reason is that solvers usually provide some functionality to retrieve the dual variables after the primal has been solved:
1. Primal-dual linear solvers (such as MOSEK) solve the primal and the dual at the same time, and thus provide the dual variables directly.
2. Primal or dual solvers based on the Simplex method have all the information needed to recover the dual variables easily from the optimal basis (see again Papadimitriou & Steiglitz, 1998).
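Most modeling interfaces expose this. As a small illustration (using SciPy's `linprog` with the HiGHS backend rather than MOSEK, and assuming SciPy >= 1.7, where the result object carries `marginals` fields), the duals of a made-up toy LP can be queried like this:

```python
import numpy as np
from scipy.optimize import linprog

# Made-up toy LP (not the book's example):
#   min -x1 - 2*x2  s.t.  x1 + x2 <= 4,  x1 <= 3,  x1, x2 >= 0
c = [-1.0, -2.0]
A_ub = [[1.0, 1.0], [1.0, 0.0]]
b_ub = [4.0, 3.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")

# The duals of the row constraints: sensitivities of the optimum w.r.t. b_ub.
y = res.ineqlin.marginals
print(res.fun, y)   # optimum is -8; only the first row is active,
                    # so the second dual is zero
```

Here `res.fun` equals $b_{ub}^T y$, as Strong Duality promises.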
So you just need to query the solver you use to retrieve the dual values. Let's play with MOSEK and a toy linear programming problem taken from Williams (1999), Section 6.2. In LP format the primal is
and its dual is
Note that the two problems are formulated in a natural way. MOSEK represents linear problems as
$$
\begin{aligned}
& \min \quad c^T x + c_f \\
& \text{s.t.} \quad l^c \leq Ax \leq u^c\\
& \phantom{\text{s.t.} \quad} l^x \leq x \leq u^x
\end{aligned}
$$
which is a more general and more natural formulation for the user. After some boring but straightforward calculations we obtain the dual problem in the form
$$
\begin{aligned}
& \max \quad (l^c)^T s_l^c - (u^c)^T s_u^c + (l^x)^T s_l^x - (u^x)^T s_u^x\\
& \text{s.t.} \quad A^T(s_l^c - s_u^c) + s_l^x - s_u^x = c\\
& \phantom{\text{s.t.} \quad} s_l^c, s_u^c, s_l^x, s_u^x \geq 0
\end{aligned}
$$
As you can see, there is exactly one dual variable for each primal constraint and each bound. Running the solver on the primal we get
and the solution looks like the following. The columns named DUAL report the corresponding dual variables. Running the solver on the dual we obtain
and the solution looks like this:
But wait a minute... it seems MOSEK just swaps primal and dual!?
True, in some sense! MOSEK implements a primal-dual algorithm that solves both primal and dual at the same time. For small problems there might be no difference at all. A few comments on the solver output:
- Feasibility and optimality are achieved at the same time: during the execution, the intermediate solutions are neither primal nor dual feasible. The PFEAS and DFEAS columns report the norms of the primal and dual constraint violations.
- The POBJ and DOBJ columns show the progress towards the optimal value and, as the Strong Duality Theorem tells us, they coincide once optimality is reached.
- Dual variables are reported in the LOWER/UPPER DUAL columns.
- To each non-active bound corresponds a zero dual variable, as Complementary Slackness requires.
- Dual values are zero for missing bounds as well: a solver such as MOSEK actually sets a very large bound (in absolute value) in their place, in order to avoid numerical issues. In practice, a missing bound behaves as a non-active bound, as in the previous point.
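The last two points are easy to verify numerically on any LP. The sketch below reuses SciPy's `linprog` with the HiGHS backend (an assumption: SciPy >= 1.7, where the result exposes `residual` and `marginals` for rows and bounds) on the same made-up toy problem as before, and checks that non-active constraints and bounds carry zero duals:

```python
import numpy as np
from scipy.optimize import linprog

# Made-up toy LP: min -x1 - 2*x2  s.t.  x1 + x2 <= 4,  x1 <= 3,  x >= 0.
res = linprog([-1.0, -2.0], A_ub=[[1.0, 1.0], [1.0, 0.0]], b_ub=[4.0, 3.0],
              bounds=[(0, None)] * 2, method="highs")

x = res.x
s_lower = res.lower.marginals   # duals of the lower bounds x >= 0
slack = res.ineqlin.residual    # row slacks b_ub - A_ub @ x
y = res.ineqlin.marginals       # duals of the row constraints

# Complementary Slackness: a non-active bound (x2 = 4 > 0) has a zero
# dual, and so does the non-active second row (slack 3 > 0).
print(x * s_lower, slack * y)
```

Both products print as (numerically) zero vectors, exactly the pattern you should expect in the LOWER/UPPER DUAL columns of the solver report.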
If you want to dig deeper into the meaning of the dual information, follow Section 6.2 in Williams (1999). For us, it is important that you get the message:
Dual information is readily available from your solver and can be useful!
Most of what we have said is also true, even more so, for second-order conic programs: duality still holds, but to exploit it you should ask your preferred solver.
 Papadimitriou, C. H., & Steiglitz, K. (1998). Combinatorial optimization: algorithms and complexity. Courier Dover Publications.
Williams, H. P. (1999). Model building in mathematical programming. Wiley.