Saturday, June 29, 2013

The End is Nigh

We're about two days from the demise of Google Reader (which also means the loss of a go-to topic for blog posts). Here are some last minute links for anyone who has not switched yet:
  • has a Twitter poll listing various options. (Hat tip to The Endeavour for the link.) Be advised that some of the options in the poll are not really replacements for Reader (they're ancillary services and such), some are single-platform options that will not sync across devices (for instance, desktop only), and at least one requires that you run your own server.
  • Google Reader: With the End Near, Here Are Some Alternatives is a recent (week-old) eWeek article that lists a few popular alternatives to Reader.
For what it's worth, I've been using the free version of Netvibes for a while. Netvibes is strictly browser-based, at least for the moment. The desktop version is quite good. On my Android tablet, the mobile version works, but it lacks some of the usability of the desktop version. That's to be expected, since the mobile version is presumably designed for use on a smart phone. My tablet has a 10" screen, though, and I want the missing features. The desktop version works tolerably well in the Chrome browser but does not display properly in Firefox, and there are quirks with scrolling and using certain links in Chrome. (I'm not sure if the quirks are the fault of the page design or something in Chrome.)

I've just started testing InoReader, the preferred choice of the author of the eWeek article. InoReader is also free and web-based. I like the interface, and the desktop version appears to work identically in the desktop and Android versions of Firefox, but there may be some quirks with it. Some posts are time-stamped incorrectly. More worrying, when I checked this morning I had 10 unread posts. When I checked back after lunch, there were only five, so some of the posts that I did not read either disappeared or were marked "read" (but not by me).

Update: I just discovered that one of the default settings in InoReader marks articles as read after you scroll past them. This, like "blow up building upon exit", strikes me as a setting that might possibly be useful but perhaps should not be the default. Barring evidence to the contrary, I'll assume that accounts for the "lost" posts. I also discovered that InoReader decided (again by default?) that I live in the Hawaiian time zone. (I wish!) We'll see if timestamps appear correctly now that I've disabused it of that idea. InoReader, by the way, has a truly impressive (if not dizzying) set of options that you can configure.

Update #2: InoReader is now my sole RSS reader, and I have no complaints at all. The one item on my wish list would be a mechanism to create RSS feeds on their servers to follow Twitter accounts. As mentioned in this post and this post, Twitter dropped support for RSS. Various third parties stepped in to provide free JSON-to-RSS bridges to follow Twitter accounts, but unfortunately they all seem to be getting slammed by use (despite Twitter's contention that RSS is "infrequently used today"). It would be nice if InoReader could provide a bridge of their own. Meanwhile, if you're still shopping for a GReader replacement, published an article "9 Great Google Reader Replacements" that might interest you.

Thursday, June 27, 2013

Updating JabRef

I rely on JabRef to created and edit bibliographic databases (for use with BibTeX in LaTeX documents). For whatever reason, the supported version in the Canonical repositories for Ubuntu (and Linux Mint) is lagging further behind than is usual for the repositories (which is saying something). The current stable version (as of this writing) is 2.9.2, while the current beta version is 2.10 beta 2. The "official" version in the repositories is 2.7 beta 1, which I estimate to be between 12 and 18 months old.

I looked around for more current Debian packages but ultimately decided the easiest way to update is manually. Most of the files from the official version do not need to be changed. Here are step-by-step instructions for installing a new version. They worked for me and should work for anyone else using a Linux distribution that employs the Debian packaging system. I assume here that you already have the "official" version of JabRef installed. Do not uninstall the old package, since we're going to retain a lot of its files.
  1. Download the latest version (stable or beta, your choice) from the SourceForge project page. Henceforth I will refer to the file (whose name should be JabRef-<version information>.jar) as <new jar>.
  2. Open a terminal in the download directory.
  3. Run the command
    sudo mv <new jar> /usr/share/jabref
    which will install the new version in parallel with the old one.
  4. Optional but recommended: run
    java -jar /usr/share/jabref/<new jar>
    just to make sure the downloaded version works. Once you're sure it works, exit JabRef.
  5. Still in the terminal, run the commands
    sudo rm /usr/share/java/jabref.jar
    sudo ln -s /usr/share/jabref/<new jar> /usr/share/java/jabref.jar
    This switches the version of JabRef being executed from the old one to the new one.
  6. Test JabRef by running it from the application menu and/or by double-clicking a .bib file.
  7. Optional: run
    sudo rm /usr/share/jabref/<old jar>
    to delete the old version. 
I'm not sure whether Synaptic will identify JabRef as a "broken" package as a result of this. We'll see.

Saturday, June 22, 2013

Accelerating Linux Hibernation

I'm about to commit what is probably analytics heresy: I'm going to give a subjective impression without collecting data (because I'm too lazy to do so).

I frequently hibernate my Linux Mint desktop rather than doing a full shutdown, just to save time both when I'm done for the day and when I'm resuming the next day. Both hibernation and wake-up seem to take progressively longer as the time between full boots increases. I notice this more if I've been using the NetBeans IDE for coding, to the point that I'll occasionally do a full restart just to make hibernation cycles go faster.

Looking for a way to speed things up, I found µswsusp (userspace software suspend -- I'm not sure why they want to start with a "mu", but for free software I'll forgive the occasional bit of fluff) and a very clear and helpful HOWTO. (Update: The HOWTO may have disappeared -- at least I can't get to it at the moment -- so here is an alternative source. Check the comment on Aug. 16, 2012.) I installed the small package from the Canonical repository, added the 00sleep_module file, and hibernated. Hibernation and reanimation both seem notably faster. Some of this may be the result of a percentage progress measure being displayed, but I think there is a real difference in both directions.

Sunday, June 16, 2013

No More Twitter RSS Feeds

I like to aggregate most of the social media content that I want to see (blog posts, forum posts, certain Twitter timelines) in one place. Before its announced (and impending) demise, that was Google Reader. Of late it has been Netvibes. The key is that, in both places, I obtain content via RSS feeds. I have a separate Twitter client, but prefer that a few "important" Twitter feeds be included in my RSS reader. The reason is that I regularly check the RSS reader for new content, but I check the Twitter client extremely sporadically. (If anyone needs proof that I'm not a teenager, the preceding sentence should suffice.)

Much of the world, myself included, subscribes to the mantra that "if it ain't broke, don't fix it". Unfortunately, software developers have a different mantra: "if it ain't broke, it needs more features". Throw in possibly some concerns over security and/or a desire to assert more control, and you arrive at Twitter's recent decision to do away with RSS feeds. They retired version 1.0 of their API and replaced it with version 1.1, which supports JSON only (no XML, no RSS, no anything else). Quoting their developer site:
Consequently, we've decided to discontinue support for XML, Atom, and RSS, which are infrequently used today.
If you're curious about the definition of "infrequently used", I suggest you monitor Twitter, tech forums and the blogosphere for the ensuing howls of outrage by those of us "infrequent users" whose feeds are now broken.

If you have a web site (for instance, a blog) on which you provide users an RSS feed to your Twitter account, you will need to replace that. If your site contains a widget displaying recent posts from your Twitter account, and if it used XML or RSS, it will be frozen in time (or broken) until you replace it with a newer widget. If, like me, you used RSS feeds to read other accounts, you will also need to find an alternative.

Netvibes recommends installing the updated version of their Twitter widget, but that's not an option I fancy. The Twitter widget is apparently a full-fledged Twitter client embedded in the Netvibes dashboard, and the last thing I need is another Twitter client to ignore (see first paragraph). Happily, there is at least one other option available: the aptly named Twitter RSS site. (Hat tip to Social Media Slant, which is where I found out about the site.) Twitter RSS, which is apparently not a site owned by Twitter, is a free service that is extremely simple to use. Type in the Twitter handle (minus the "@" symbol) for the account you want to follow, and they will supply you with a URL that provides an RSS feed for that account.

So I've replaced my original Twitter RSS links in Netvibes with links provided by Twitter RSS, and now I just need to sit down and catch up on my reading. (It took me a few days to get around to addressing the problem, and you know how fast stuff piles up on Twitter.) I have no idea who is behind Twitter RSS, how long they will be around, or how long it will take the "infrequent users" of RSS to overwhelm their servers, but for now we at least have an option.

Update: The Twitter RSS web site is gone -- the domain name lease seems to have expired. I've found another possible solution, described here.

Wednesday, June 12, 2013

The Signum Function in Mathematical Programs

In mathematics, the signum function is defined by\[ \DeclareMathOperator{\sgn}{sgn} \sgn(x)=\begin{cases} +1 & x>0\\ \phantom{+}0 & x=0\\ -1 & x\le0 \end{cases}. \]Occasionally, someone wants to create the equivalent of the signum function in a mathematical programming model. Since math programs are generally written using exclusively nonnegative variables, this effectively translates to tying a binary variable $z$ to the continuous variable $x$ so that $z=1\Longleftrightarrow x>0$. (If $x$ is unrestricted in sign, we use the difference $z^+-z^-$ of two binary variables $z^+$ and $z^-$, such that $z^+=1\Longleftrightarrow x>0$ and $z^- =1\Longleftrightarrow x<0$.) It turns out that this can be done in some cases but only approximated in others.

Suppose that $x$ and $z$ are components of some larger model\[ \begin{array}{lrcl} \mathrm{optimize} & f(x,y,z)\\ \mathrm{s.t.} & (x,y,z) & \in & \mathcal{P}\\ & x & \ge & 0\\ & y & \in & \mathcal{Y}\\ & z & \in & \{0,1\} \end{array} \]where $\mathcal{P}$ and the convex hull of $\mathcal{Y}$ are closed, convex polyhedra. These are the typical assumptions for a mathematical program. By requiring $\DeclareMathOperator{\conv}{conv} \conv(\mathcal{Y})$ rather than $\mathcal{Y}$ to be convex, I'm allowing $y$ to be an arbitrary mix of continuous and integer-valued variables. I'm also going to assume that the feasible region \[\mathcal{F}=\{(x,y,z)\in\mathcal{P}\,\left|\,x\ge 0,y\in\mathcal{Y},z\in \{0,1\}\right.\}\]is bounded. Models for real-world problems usually are, and arguably always should be, bounded. If $\mathcal{F}$ is unbounded, it should be possible to bound it by asserting liberal but finite bounds on $x$ and $y$.

Let $\gamma=\inf\{x\,\left|\,(x,y,z)\in\mathcal{F}, x\gt 0\}\right.\ge0$, and let $\Gamma=\sup\{x\,\left|\,(x,y,z)\in\mathcal{F}\right.\}<\infty$. Half the battle is solved by adding the constraint \[x\le \Gamma z,\]which enforces \[x>0 \implies z=1.\]The smaller $\Gamma$ is, the tighter the linear programming (LP) relaxation of the model is, and so (likely) the shorter the time required by the solver to optimize the model.

Now comes the tricky part. If $\gamma>0$ (and, more specifically, $\gamma$ is large enough to be distinguishable from 0 when the solver allows for rounding error), we can finish the job by adding the constraint \[x\ge\gamma z,\]which enforces\[x=0\implies z=0.\]The problematic case is when $\gamma=0$. If $\gamma=0$, we can find a feasible sequence $(x_n,y_n,z_n)$ such that $\lim_{n\rightarrow\infty}x_n=0$. Presumably $z_n=1$ for all $n$. Since $\conv(\mathcal{F})$ is closed and bounded (hence compact) and $\conv(\mathcal{Y})$ is closed, we can pass to a subsequence (which I will again write $(x_n,y_n,z_n)$ to avoid creating messy subscripts) such that $y_n\rightarrow y$ for some $y\in\conv(\mathcal{Y})$ and thus $(x_n,y_n,z_n)\rightarrow(0,y,1)\in\conv(\mathcal{F})$. With $(0,y,1)\in\conv(\mathcal{F})$ for some $y$, it is not possible to enforce\[x=0\implies z=0.\]The best we can do is to choose some $\epsilon>0$ (with $\epsilon$ large enough that the solver sees it as not within rounding error of zero), add the constraint\[x\ge\epsilon z,\]and in the process modify the model by eliminating all previously feasible solutions with $0\lt x \lt \epsilon$. Hopefully, we can choose $\epsilon$ so that the true optimum is not eliminated.

Incidentally, if we do not know $\gamma$ and $\Gamma$ a priori, and do not have at hand reasonable estimates $g$ and $G$ with $0\lt g\le \gamma$ and $\Gamma \le G\lt \infty$, we can obtain them (at some computational expense) by solving two LPs, first minimizing $x$ over $\conv(\mathcal{F})$ (to find $\gamma$ and then maximizing $x$ over $\conv(\mathcal{F})$ to find $\Gamma$.

Sunday, June 9, 2013

Arrowheads in TiKZ

For most publication graphics (including drawings used in blog posts), I favor the TiKZ package for LaTeX. Many of those drawings have included lines with arrowheads, and the arrowheads never posed any problems ... until recently. Consider the following small chunk of TiKZ code:

   \draw[->] (0,0) -- (1,1);

and the similar chunk

   \draw[->] (0,0) -- (1,1);

Both should draw a diagonal arrow, with the arrowheads differing only in terms of their style. (The optional argument in the first line of the second version specifies the use of LaTeX-style arrowheads.) Indeed, when compiling the document with latex (producing DVI output), ps2pdf (producing PDF output by way of an intermediate Postscript file) or XeTeX, I get the expected output:
output produced by ps2pdf: two line segments with arrowheads
On the other hand, when I compile the document with pdflatex (the program I normally use) or LuaTeX, the arrow head in the first version goes missing:

output produced by pdflatex: two line segments with an arrowhead only on the second
(As a sidebar, compiling with dvipdfm, which I never use, produces a PDF file with a single blank page.)

This occurs on a Linux Mint 14 (Nadia) system with the TeXLive 2012 LaTeX distribution. As I said, the problem is new to me. I've previously drawn arrows in TiKZ using TeXLive (perhaps the 2011 or 2010 version) without specifying LaTeX-style arrowheads and compiled successfully using pdflatex. The workaround is at least straightforward. I just have to remember to specify the arrowhead style.

On an unrelated note, if anyone is curious how I converted the PDF output to the PNG images above, I took a somewhat circular route. For whatever reason, the ImageMagick convert utility, which on a good day requires specification of various command line options to produce  decent looking PNG, insisted on producing empty image files from my PDF files. So I opened each PDF file in a viewer, took a screen capture by pressing Control-PrtScn, and pasted the image into IrfanView (running under WINE). In IrfanView, which has excellent image manipulation tools, I just cropped the image and saved it to a PNG file.

Friday, June 7, 2013

Objective Constraints: The Sequel

It belatedly occurred to me that I could have done a much better job explaining my previous post with a picture than I did with (copious) algebra. To recap the previous post, I claimed that adding a constraint to a mixed-integer programming model that puts an upper or lower bound on the objective value can cause solver performance to degrade of any of a several reasons, the least obvious of which is creating either dual degeneracy or multiple optimal solutions to the dual in the node linear program (LP) relaxation. We start with a model of the form\[ \begin{array}{lrclr} \mathrm{minimize} & c'x\\ \mathrm{s.t.} & Ax & \ge & b\\ & x & \ge & 0\\ & x_{i} & \in & \mathbb{Z} & (i\in I) \end{array} \](where $\mathbb{Z}$ is the set of integers), relax integrality and add a constraint $c'x\ge d$ to obtain the modified node LP\[ \begin{array}{lrclr} \mathrm{minimize} & c'x\\ \mathrm{s.t.} & Ax & \ge & b\\ & c'x & \ge & d\\ & x & \ge & 0. \end{array} \]The following picture depicts the scene at a typical node.
plot of the feasible region for the node LP, including a lower bound constraint on the objective
Vertex A represents the optimal solution in the absence of the added constraint (which runs through B and C), with optimal objective value $d^*$. As I mentioned in the previous post, at nodes where $d^*>d$ the added constraint will be nonbinding and just "extra baggage". Adding it can only benefit the user if $d>d^*$ at some nodes, as depicted here.

The shaded region is the feasible region including the objective constraint. The optimal value of the modified node LP is $d$, and B, C and every point on the line segment between them are all optimal solutions. So the node LP now has multiple optima, which will make the dual problem degenerate. The dual degeneracy can actually be seen in the picture. Slight increases or decreases in $d$ will shift the B-C edge up or down a bit, but that edge will clearly continue to be optimal. If we change $d$ to $d\pm \epsilon$ for small $\epsilon\gt 0$, the optimal objective value changes equally (to $d\pm \epsilon$), and so the dual variable $z$ corresponding to the objective constraint takes the value $z^*=1$.

Now observe what happens if we make small perturbations in the constraints that run through A and B or A and C. Either B or C will slide a bit in or out along the B-C edge (say, to B' or C' respectively), but the segment [B', C] or [B, C'] will continue to be optimal with the objective value $d$ unchanged. So the dual variables corresponding to those constraints must take value zero. Variables dual to the nonbinding constraints take value zero by virtue of the complementary slackness theorem of linear programming. We end up with all dual variables equal to zero except the one ($z$) connected to the added constraint. As I mentioned in the previous post, that is likely to have adverse consequences for pricing decisions, presolving, etc.

An upper bound constraint on the objective ($c'x\le d$) would switch the reduced feasible region to the triangle A-B-C, and A would be the sole optimum. This illustrates why the upper bound constraint is less likely to create algorithmic problems (other than drag). If it cuts through the feasible region (as depicted), it will be nonbinding, and the dual solution will be unaffected. If it misses the feasible region, the modified problem will be infeasible (and the node will be pruned). Only if the added constraint is tangent at A ($d=d^*$) do we run into misadventures.

Thursday, June 6, 2013

Objective Functions Make Poor Constraints

[Author note: This post has been edited to correct errors and reduce fluff.]

From time to time someone will mention the notion of adding an explicit constraint on the objective value of a mixed-integer programming (MIP) model to the model. If the original model can be written (without loss of generality) as\[ \begin{array}{lrclr} \mathrm{minimize} & c'x\\ \mathrm{s.t.} & Ax & \ge & b\\ & x & \ge & 0\\ & x_{i} & \in & \mathbb{Z} & (i\in I) \end{array} \](where $\mathbb{Z}$ is the set of integers), then the modified model is either\[ \begin{array}{lrclr} \mathrm{minimize} & c'x\\ \mathrm{s.t.} & Ax & \ge & b\\ & c'x & \ge & d\\ & x & \ge & 0\\ & x_{i} & \in & \mathbb{Z} & (i\in I) \end{array} \]if the user seeks to impose a lower bound on the objective value or\[ \begin{array}{lrclr} \mathrm{minimize} & c'x\\ \mathrm{s.t.} & Ax & \ge & b\\ & -c'x & \ge & -d\\ & x & \ge & 0\\ & x_{i} & \in & \mathbb{Z} & (i\in I) \end{array} \]if the user seeks to impose an upper bound on the objective value. Imposing a lower bound may serve to tighten bounds in the LP relaxation a bit; imposing an upper bound may allow for earlier pruning of portions of the search tree (much as supplying a good starting solution would, but without the starting solution and with the danger that too aggressive a bound might cause the solver to conclude incorrectly that the problem was infeasible).

There are two relatively obvious and possibly minor disadvantages to adding the objective constraint explicitly. First, it increases the number of rows in the constraint matrix by one, and thus adds a bit of drag to pivoting operations and a bit of extra memory consumption. Second, the added constraint may raise the density of the constraint matrix, which again creates computational drag. In large scale models, it is not uncommon for the constraints to be sparse -- indeed, they may have to be sparse if you are to solve the model -- but the objective function to be relatively dense. A dense objective function generally creates less computational drag than a dense constraint does.

A possibly more serious issue, though, is a bit less obvious. I first came across it on a support forum, so the ideas here are by no means original. (I believe I first saw it mentioned by IBM's Tobias Achterberg, but I won't swear to that.) Let's start with the linear relaxation of the expanded model in the first case (asserting a lower bound). The dual linear program (LP) is\[ \begin{array}{lrcl} \mathrm{maximize} & b'y+dz\\ \mathrm{s.t.} & A'y+cz & \le & c\\ & y,z & \ge & 0. \end{array} \]Suppose that $x=x^*$ is an optimal solution to the primal problem, and let $(y^*,z^*)$ be a corresponding optimal solution to the dual. The critical observation is that the added constraint can only be useful if it is binding, so we may assume that $$c'x^*=d=b'y^*+dz^*.$$The impact of this will be that the added constraint creates either dual degeneracy or an infinite number of optimal solutions to the dual.

Let me dispense with the most common case first. Suppose that $z^*=1$. It follows that $A'y^*\le 0$ and $b'y^*=0$, in which case $y=\lambda y^*,z=z^*=1$ is optimal in the dual for any $\lambda \ge 0$. So there are infinitely many solutions to the dual (in fact, an entire ray) unless $y=0$ (which is actually the most likely scenario), in which case the dual solution is massively degenerate. I illustrate this in my next post.

The remaining cases are "edge cases" (no pun intended), corresponding to the added constraint being tangent to the original feasible region (making the primal solution degenerate). I will grind through the algebra below, but feel free to skip it, as it is not clear that the added constraint will be tangent to the feasible region often enough to worry about.

Now assume that $z^*\lt 1$. Let $z=z^* + \epsilon$ and $$y=\left(1-\frac{\epsilon}{1-z^*}\right)y^*$$for an arbitrary $\epsilon$ with $0\lt \epsilon \le 1-z^*$. We verify easily that $y\ge 0$,$$\begin{gather*} A'y+cz=\left(1-\frac{\epsilon}{1-z^{*}}\right)A'y^{*}+\left(z^{*}+\epsilon\right)c\\ \le\left(1-\frac{\epsilon}{1-z^{*}}\right)\left(c-z^{*}c\right)+\left(z^{*}+\epsilon\right)c\\ \le\left(1-\frac{\epsilon}{1-z^{*}}\right)\left(1-z^{*}\right)c+\left(z^{*}+\epsilon\right)c\\ =\left(1-z^{*}-\epsilon\right)c+\left(z^{*}+\epsilon\right)c\\ =c, \end{gather*}$$and $$\begin{gather*} b'y+dz=\left(1-\frac{\epsilon}{1-z^{*}}\right)b'y^{*}+d\left(z^{*}+\epsilon\right)\\ =\left(1-\frac{\epsilon}{1-z^{*}}\right)\left(d-dz^{*}\right)+dz^{*}+d\epsilon\\ =d-dz^{*}-d\epsilon+dz^{*}+d\epsilon\\ =d. \end{gather*}$$So $(y,z)$ is another optimal dual solution for any $\epsilon\in (0,1-z^*)$, and we again have an infinite number of optimal solutions to the dual.

Finally, the case $z^*\gt 1$ is similar (and similarly tedious). Set $z=z^*-\epsilon$ where $0\lt \epsilon\lt z^*-1$ and set$$y=\left(1-\frac{\epsilon}{z^*-1}\right)y^*.$$Verification that $(y,z)$ is feasible in the dual and that $b'y+dz=d$ is left to the reader as an annoying exercise.

It turns out that adding an upper bound on the objective (lower bound in a maximization) is less likely to be problematic (as I discuss/hand-wave in my next post). The extra constraint still does add drag, and if it happens to be tangent to the feasible region, we can again encounter multiple dual solutions and/or dual degeneracy.

Why do we care that the dual has an infinite number of solutions? The dual solution comes into play in a number of ways when solving a MIP model, from pricing primal columns to fixing primal integer variables and other node presolve reductions. While there is no guarantee that having an infinite number of dual solutions will slow the algorithm, and in fact that is always possible even without constraining the original objective, it is reasonable to suspect that making the dual information at a node less precise is unlikely to help and plausibly likely to hurt ... a suspicion that I believe has been born out in practice.

There are alternatives to imposing constraints on the objective value. One is to change the objective to minimization of a new variable $w$ and add the constraint $c'x-w=0$ (or $c'x-w\le 0$, which is equivalent). You are still expanding the problem and possibly increasing the density of the constraint matrix, but I believe this will avoid the problem of introducing an infinite number of dual solutions. CPLEX has parameters CutUp and CutLo that allow you to impose an upper limit on the objective of a minimization problem and a lower limit on the objective of a maximization problem respectively. This gets you the potential earlier pruning of the search tree without introducing multiple dual solutions. I suspect that other solvers provide something similar.

Update: For a discussion of this topic, including why the use of $w$ may still create issues, see Tobias Achterberg's reply in this thread on the  CPLEX support forum.