Thursday, November 6, 2014

Linearize That!

For whatever reason, I've seen a bunch of questions posted on various fora boiling down to "How do I linearize <insert grossly nonlinear function>?" Whether by coincidence or due to some virtual viral epidemic, I've seen three or four in the past week that involved logarithms. So, without further ado, here is the quick solution:

Source: Norbert Schnitzler (reproduced under Creative Commons license)
Just put your function/constraint on the ground in front of it and back away carefully.

Which is to say, you cannot linearize an arbitrary nonlinear function. That's why they stuck the "non" in nonlinear.

I suspect the confusion stems in part from the fact that you can linearize certain otherwise nonlinear expressions involving binary (or, if you care to do binary expansions, general integer) variables. In fact, I've written about it myself (cf. Binary Variables and Quadratic TermsInteger Variables and Quadratic Terms). What is perhaps not apparent is that, under the hood, the linearization techniques amount to decomposing the feasible region into convex chunks on which the original nonlinear function is actually linear. For example, if we want to linearize the product $f(x,y)=xy$ where $x$ is a binary variable and $y$ is a continuous variable, the linearization described in the first of my two posts equates to splitting the feasible regions into two subregions, one where $x=0$ and $f(x,y) = 0$ (trivially linear) and the other where $x=1$ and $f(x,y)=y$ (also linear).

Now consider the nonlinear, concave and suspiciously logarithmic function $f(x)$ shown in red in the following plot:

 
Suppose that you have the constraint $f(x)\le b$ in a mathematical program, where $b$ is either a constant or a linear function. Since $f$ is concave and on the left side of a $\le$ constraint, the feasible region becomes nonconvex, a major problem.

Perhaps there is a way to linearize this, other than the analog method I alluded to in the photo, but if so I do not know one. What you can do, however, is a linear approximation. In the plot, I broke the (finite) domain of $x$ into intervals $[t_0, t_1], [t_1, t_2], \dots, [t_{N-1},t_N]$. Suppose now that we introduce continuous variables $\alpha_0,\dots,\alpha_N$ satisfying three requirements:
  • $\sum_{i=0}^N \alpha_i = 1$;
  • at most two of the $\alpha$ variables can be nonzero; and
  • if two are nonzero, they must be consecutive ($\alpha_j$ and $\alpha_{j+1}$ for some $j$).
We can now write $$x=\sum_{i=0}^N t_i \alpha_i$$(a linear constraint) and replace $f(x)$ with the linear expression $$\sum_{i=0}^N f(t_i) \alpha_i,$$noting that $f(t_i)$ is a constant. As an illustration, the midpoint of the interval $[t_1, t_2]$ in the plot would correspond to $\alpha_1 = \alpha_2 = 0.5$, all other $\alpha_j = 0$. The blue piecewise-linear curve shows the approximation $\sum_{i=0}^N f(t_i) \alpha_i,$.

The $\alpha$ variables form what is known as a Type 2 Special Ordered Set (SOS2). Although the $\alpha$ variables appear continuous, the SOS2 itself makes the problem a mixed integer program, effectively chopping the feasible region into subdomains (in this example, the intervals $[t_i, t_{i+1}]$) on which the approximation is linear.