Wednesday, December 31, 2014

HDMI Audio in Mythbuntu

My adventure replacing my old Mythbuntu box with a new one continues. It's been working for a while now, in terms of waking when it should, recording and replaying shows. Yesterday's immediate concern was sound. The PC hooks to my TV via an HDMI cable. While both live TV and recordings viewed through MythTV worked fine (including sound), applications run outside MythTV (such as VLC) had no sound.

As usual, Google was both my friend and my enemy -- it found lots of results for all of my search queries, most of which sent me off chasing wild geese. For some reason, Application > Settings > Settings Manager in Mythbuntu does not provide anything to configure sound, and installing the unity-control-center package did not remedy that. (I wonder if they block it to prevent users from configuring something that conflicts with the MythTV audio settings, which you set when configuring the MythTV front end?) I ran alsamixer and unmuted S/PDIF, but that did not fix things.

I did notice at some point that the configuration for ALSA (the sound system) at the operating system level seemed to default to PulseAudio. I read somewhere that MythTV, which uses ALSA, might kill the PulseAudio service to avoid conflicts with ALSA. That suggested either creating a configuration file for ALSA that avoids PulseAudio or configuring PulseAudio properly. I tried the latter and it worked. The steps were:
  1. Install the pavucontrol package from a repository.
  2. Run pavucontrol in a terminal.
  3. In the Configuration menu, choose the correct output device (in my case, "HDMI").
  4. Exit, cross fingers, reboot and test.
The fix may not be perfect. It works -- both MythTV and the VLC player (run from the desktop) have sound -- but I now get system error messages when I exit MythTV, which I was not getting before. The error messages are not specific (something unnamed apparently crashed), and I haven't poked through the logs. So far, the errors have not coincided with an perceptible loss of functionality, so they're just an annoyance. I can't be positive they are due to the sound fix, but they started immediately after it.

Thursday, December 18, 2014

Remotely Mounting a USB Drive

As I try to retire an old PC running Mythbuntu in favor of a new replacement, I continue to learn things about Linux ... with a gun to my head, as it were.

Without getting into the gory details, I can no longer work directly on the old PC -- I can't get a functioning display. I can, however, connect either by SSH terminal session or remote desktop. Once I get my recordings and database off the old box and onto the new one, I'm going to wipe the operating system, install something like Linux Mint, and then donate the old PC to anyone looking for a slightly noisy space heater.

The problem is moving the 300+ GB of stored recordings. I picked up a 1 TB Western Digital My Passport Ultra USB hard drive to use both for back and for file transfer. The trick was getting the bugger to mount without a direct connection. Using a remote desktop, I could see that Mythbuntu created a drive icon on my desktop, needing only to be mounted. Unfortunately, all that double-clicking the drive to open it, or right-clicking and selecting mount, got me was a message that I was lacking the authority to mount the drive. This is probably a consequence of remote connections having restricted permissions.

The answer proved to be mounting the drive (as root) from the command line -- once I found the key information on how to do so. It's actually quite painless when you know how. What follows was all done in a remote terminal. The first step was to run
sudo mkdir /media/passport 
to create a landing spot for the drive. (Feel free to substitute a different name for "passport".) Next, to get the particulars about the drive, I ran
sudo fdisk -l
(that's an "el", not a "1"), obtaining the following poop (omitting the parts about other drives):

Disk /dev/sdb: 1000.2 GB, 1000170586112 bytes
255 heads, 63 sectors/track, 121597 cylinders, total 1953458176 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x50dfffc7

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048  1953458175   976728064    7  HPFS/NTFS/exFAT

The key takeaways here, from the last line, are that the system sees the device as /dev/sdb1, and that it has an NTFS format (which I already knew from the packaging, which charitably assumes you're buying it for a Windows PC). Following the aforementioned directions, I ran
sudo mount -t ntsf-3g /dev/sdb1 /media/passport
and she was good to go. For FAT drives, refer to the instructions linked above.

Waking Up is Hard to Do

(Apologies to Neil Sedaka for the weak pun in the title.)

In the continuing war of man (me) against machine (my new Acer PC, which I'm configuring as a Mythbuntu PVR - see my last post), I've spent a full day trying to get the box to wake on a BIOS real-time clock alarm, so that MythTV can convince it to wake up when a recording is scheduled. The battle took so long because two things needed to be set correctly, and I kept getting stuck on incorrect combinations (this one right, that one wrong; vice versa; both wrong) while spending huge amounts of time searching the web and beating my head against the wall.

I copied a couple of time tested shell scripts from my previous Mythbuntu box, one that sets the BIOS alarm to an arbitrary time (used by MythTV to schedule actual recordings) and another that simply sets up an alarm for five minutes in the future (useful for testing). Before trying them out, I reviewed the official ACPI Wakeup wiki for MythTV. One of the first things it suggests is looking in the kernel log to make sure the BIOS supports ACPI wakeup (although I think pretty much all modern BIOSes do). In the log, I found the good news that the BIOS supports alarms up to one month in advance and this bit of seemingly bad news: "System wakeup disabled by ACPI".

I nonetheless followed the directions on the wiki for tweaking the shutdown script (under "Disable hwclock updates", although the tweak for Ubuntu does not so much disable them as reassert the alarm time after doing them). I ran the wake-in-5 test, and it failed.

Next, I got into the BIOS and checked the power settings. My BIOS has a "Deep Power Off Mode" (enabled by default) and a "Power on by RTC Alarm" feature (disabled by default). According to the BIOS help, the latter works only if you also disable the former. With "Deep Power Off Mode" disabled and "Power on by RTC Alarm" enabled, I was able to schedule a wakeup call for a fixed time either on a give day of every month (e.g., the 15th) or every day of the month. Neither is what I want -- I want to wake up only when there is something to record -- but at least I was able to confirm that the machine could wake on its own.

I'll skip a lot of gory details here, since they pertain to wasted effort. I was doing a lot of reboots during my experiments, and I noticed that one of the system logs was showing about five hours and change between reboots when actually they were minutes apart. Since I'm in the US Eastern time zone (GMT-5), I suspected that some confusion between local time and UTC (a five hour difference) was going on. Eventually I came to look in /etc/default/rcS, where I found the setting "UTC=no", telling the operating system that the BIOS was using local time. That seems to be the default setting, since my desktop PC has it as well. The BIOS on the Acer, however, was using UTC. I changed "no" to "yes", and I think that has fixed the five hour bumps. By itself, however, it did not fix the wakeup problem.

It turns out that having "Power on by RTC Alarm" not only schedules fixed wakeups I don't want, it also blocks the system from setting any other wakeup calls. This is covered in the "Fussy BIOS" section of the ACPI Wakeup document. So I went back into the BIOS, disabled "Power on by RTC Alarm" (but left "Deep Power Off Mode" also disabled), rebooted and ran the five minute wake-up test successfully.

The next step was to program the machine to wake and record a show ... which it did not do. So I went back, followed the directions for Configuring MythTV Automatic Wakeup and Shutdown (following directions -- what a concept!), and it seems to be working now.

Configuring wakeup/shutdown included setting up MythWelcome, which I had planned to do in any case. The directions are very clear and well-written, including explanations of one or two less obvious bits. There is one thing on the page that I find a bit confusing, though. In two different locations, the instructions call for you to exit MythWelcome by starting a terminal (F12 key) and then running a command to kill it. MythWelcome has a pop-up menu (accessed via the 'M' key), and one of the choices on that menu is to exit MythWelcome. It's both easier (much less typing) and a bit more natural/graceful, so I'm not sure why the instructions have you do it the hard way. In any case, I used the menu to exit, and things worked correctly.

Tuesday, December 16, 2014

RStudio Git Support

One of the assignments in the R Programming MOOC (offered by Johns Hopkins University on Coursera) requires the student to set up and utilize a (free) Git version control repository on GitHub. I use Git (on other sites) for other things, so I thought this would be no big deal. I created an account on GitHub, created a repository for my assignment, cloned it to my PC, and set about coding things. As a development IDE, I'm using the excellent (and free) RStudio, which I was happy to discover has built-in support for Git. All went well until I committed some changes and tried to push them up to the GitHub repo, at which point RStudio balked with the following error message(s):
error: unable to read askpass response from 'rpostback-askpass'
fatal: could not read Username for 'https://github.com': No such device or address
I searched high and low in RStudio but could not find any place to enter credentials for the remote repository. No worries, thought I; I'll just add my public encryption key on GitHub, and use the private key on the PC, which works for me when I'm using the NetBeans IDE with BitBucket. Alas, no joy.

According to the error messages, the immediate issue seems to be not my password (I don't think the challenge got that far) but my user name. Git has a global value for my user name recorded on my PC, but it's not the same as my user name on GitHub. I was able to set a "local" user name, matching the one I have on GitHub, by opening a terminal in my R project directory and entering the command
git config user.name <my GitHub name, in quotes>
git config user.email <my email address>
That's a bit more arcane than what I would expect a beginner to know, but so be it. I thought that would fix the problem. It did not; the error message remained unchanged. I suspect that the issue is that Git now has two names for me (global and local-to-the-R-project). If I run
git config -l
in the project directory, I see the following:
user.name=<my global user name>
user.email=<my global email address>

...
user.name=<my GitHub user name>
user.email=<my GitHub email address, same as the global one>
With two user names to choose from, perhaps RStudio is grabbing the global one? Or perhaps I'm barking up an entirely incorrect tree trying to find the source of the error.

At any rate, I can't seem to push updates from the PC to GitHub using RStudio. Not to worry, though. There are other options. You can do it from the command line (if you are command-line user of Git, which for the most part I'm not). You can also use a separate Git client program, which is what I did. My Git GUI of choice is SmartGit, from which it is no chore to push (or pull) updates.


Sunday, December 14, 2014

Updating Adobe Flash

Within the past week, give or take, Firefox started blocking the Adobe Shockwave plugin from running in web pages, due to a security problem with it. I could (and did, when I trusted the site) override the warning, but in general bypassing security is a bad idea. Even when the site is trusted, you have to consider the possibility that someone managed to sneak a malicious Flash application onto it.

I was initially happy to upgrade the bugger (to version 11.2.202.425, which I think is the first one Firefox will allow), but the universe conspired to make that ridiculously difficult. I'm running Linux Mint 16 Petra, which is plumbing-compatible with Ubuntu 13.10 Saucy Salamander. Shockwave is packaged for Ubuntu, not for Mint.

The "Plugins" tab of the Firefox Add-ons Manager flagged Shockwave and gave me a link for updating. That link took me to an Adobe page where I selected which Linux release of the latest version I wanted. I was hoping it would then download a .deb package file, but nooo, it gave me an "APT link", to which I needed to associate an application. Fortunately, I had AptURL installed, so I just had to point to it (in /usr/bin/apturl). Unfortunately, it did not work; I got a message about an unknown channel. So I was stymied there.

The Synaptic package manager was no help; it thought (incorrectly) that the version I had was the most recent. I added the Ubuntu Mozilla Security PPA, which has the latest version, as a source, but I still could not update.

I think I finally figured out what was causing all this wasted effort. As best I can tell, Adobe skipped Ubuntu 13.10 when they released 11.2.202.425. The version they released for Ubuntu 12.04 (Precise Pangolin) also apparently works on 13.10 ... but the package manager either can't see it or won't use it, because it's for an earlier version of the operating system than what I have. At least I'm guessing that's the problem. The "unknown channel" error I got trying to update via the Adobe site named "Precise" as the channel, so apparently it was looking for the "Precise" version but the package manager wasn't having any of that.

I did eventually succeed, though, and I'll document it here in case any weary pilgrim stumbles on this page. You can find the AMD64 version of the installer package on the Ubuntu packages site. (If you need a different version, backtrack along the bread crumbs -- and good hunting!) The link on that page will download the installer .deb file. Right-click that file, open it with GDebi, click "Install", give the necessary password, and it will download and install the actual Flash player. Restart Firefox and check on the plugins page to verify that you have the up-to-date version.

This was a rather remarkable time-suck compared to most upgrades on Mint or Ubuntu. :-(

Friday, December 12, 2014

Mythbuntu and USB WiFi

I'm in the (slow, painful) process of configuring a new PC to act as a video recorder. This post contains some notes on how I got it connected to my home WiFi network.

First, a note to self/strong suggestion to others: before screwing with anything that is likely to cause reboots, go into Applications > System > Mythbuntu Control Center > Startup behaviors and uncheck the box that starts the MythTV front end automatically at boot. Turn that back on only after all configuration agony is behind you. This is particularly important with the keyring nag (below), in which the front end started (and wiped out my access to the nag) before I could type the password, and then promptly froze the system.

The machine came with a flavor of Windows, but I immediately downloaded the latest version of Mythbuntu, burned the ISO file onto a DVD (using Brasero), and installed it on the PC, wiping away any vestige of Windows. I then rebooted and tried to configure Mythbuntu (but I'm pretty sure I will need to redo that later).

The PC comes with onboard Ethernet (one jack) but no WiFi. I will spare you the gory details (and myself the painful memories) of three failed attempts to add a USB WiFi adapter. Let's just say there were compatibility issues.

I finally ordered an Edimax EW-7811Un adapter. The Edimax is incredibly small, even more incredibly cheap (about a quarter the cost of some of the adapters I tried), and is advertised to be compatible with most Linux systems. With the power off, I plugged it into a USB slot and booted the new PC.

Oops. No network connection. Running lsusb in a terminal showed the Edimax device, but Settings > Network Connections showed two Ethernet connections and no WiFi connections. I don't think either of the "Ethernet" connections was the Edimax, but if it was, that was unhelpful, because my home LAN is secured (meaning the PC needs a password to get on it).

So I unplugged the Edimax. The Settings > Network Connections application still showed two Ethernet connections. I deleted both, not needing the onboard one and thinking the other was spurious. Big mistake. My external TV tuner is a Hauppauge WinTV-DCR, which MythTV mistakes for an HDHomeRun Prime (close enough). The first connection (which I eventually restored), eth0, was indeed the onboard Ethernet adapter. The second connection, eth1, was actually the Hauppauge tuner (even though it is coming in via a USB cable and not into an Ethernet jack). I wasted a lot of time unsuccessfully trying to install a channel lineup before it finally dawned on me that MythTV could not talk to the tuner because I had disconnected it. Fortunately, restoring the connection fixed that.

With the PC rebooted (and Settings > Network Connections showing no connections at all), I plugged in the Edimax. This time, the system immediately gave me a notification that a secured WiFi network was available. Back in Settings > Network Connections, I added a new connection, set it to WiFi, filled in my home LAN's SSID, set the security to "WPA & WPA2 Personal" and filled in the network password. Just like that, I had working WiFi.

To finish (hah! I should be so lucky) the job, I further edited the connection and made the IP address (obtained by running ifconfig in a terminal, and confirmed by looking at the status screen for my network modem) static, with mask 255.255.255.0 and gateway 192.168.1.254. I got those values by connecting my (Linux Mint) laptop to the WiFi network and running route -n in a terminal.

Next problem: on every reboot or log-in, I was nagged to type in a password to unlock the default keyring. (The correct password is my login password.) This was more than just annoying; it was fatal, as the front end would start without waiting for the password and then frequently freeze the system. Failure to unlock the keyring also seemed to keep the machine from gaining access to the WiFi network (at least I think that was the culprit). The solution I found was as follows:
  1. Unlock the bleeping keyring so that I could connect to WiFi.
  2. Install the "Passwords and Keys" application using the Ubuntu Software Center. (Seems to me this should be installed by default, even if the installation is purely a networked backend server.)
  3. Go to Applications > Settings > Passwords and Keys, and right-click the Default keyring entry.
  4. Select "Change password" and enter the current (login) password as the "old" password.
  5. Leave the new password blank (horrors! -- a major security hole!) and finish.
After a restart, the bloody keyring nag was gone and the machine connected to my WiFi LAN.

After all this, I found that I could ping outside servers by IP address but not by name. My LAN uses a (now obsolete?) 2Wire wireless DSL modem. Turns out I had to go back to Settings > Network Connections, edit the WiFi connection, and enter the gateway address (192.168.1.254, same as before) in the field for DNS servers. With that, I finally had a useable WiFi connection (which has survived a few reboots).

Sadly, there is still a lot of configuring to do.

Thursday, November 6, 2014

Linearize That!

For whatever reason, I've seen a bunch of questions posted on various fora boiling down to "How do I linearize <insert grossly nonlinear function>?" Whether by coincidence or due to some virtual viral epidemic, I've seen three or four in the past week that involved logarithms. So, without further ado, here is the quick solution:

Source: Norbert Schnitzler (reproduced under Creative Commons license)
Just put your function/constraint on the ground in front of it and back away carefully.

Which is to say, you cannot linearize an arbitrary nonlinear function. That's why they stuck the "non" in nonlinear.

I suspect the confusion stems in part from the fact that you can linearize certain otherwise nonlinear expressions involving binary (or, if you care to do binary expansions, general integer) variables. In fact, I've written about it myself (cf. Binary Variables and Quadratic TermsInteger Variables and Quadratic Terms). What is perhaps not apparent is that, under the hood, the linearization techniques amount to decomposing the feasible region into convex chunks on which the original nonlinear function is actually linear. For example, if we want to linearize the product $f(x,y)=xy$ where $x$ is a binary variable and $y$ is a continuous variable, the linearization described in the first of my two posts equates to splitting the feasible regions into two subregions, one where $x=0$ and $f(x,y) = 0$ (trivially linear) and the other where $x=1$ and $f(x,y)=y$ (also linear).

Now consider the nonlinear, concave and suspiciously logarithmic function $f(x)$ shown in red in the following plot:

 
Suppose that you have the constraint $f(x)\le b$ in a mathematical program, where $b$ is either a constant or a linear function. Since $f$ is concave and on the left side of a $\le$ constraint, the feasible region becomes nonconvex, a major problem.

Perhaps there is a way to linearize this, other than the analog method I alluded to in the photo, but if so I do not know one. What you can do, however, is a linear approximation. In the plot, I broke the (finite) domain of $x$ into intervals $[t_0, t_1], [t_1, t_2], \dots, [t_{N-1},t_N]$. Suppose now that we introduce continuous variables $\alpha_0,\dots,\alpha_N$ satisfying three requirements:
  • $\sum_{i=0}^N \alpha_i = 1$;
  • at most two of the $\alpha$ variables can be nonzero; and
  • if two are nonzero, they must be consecutive ($\alpha_j$ and $\alpha_{j+1}$ for some $j$).
We can now write $$x=\sum_{i=0}^N t_i \alpha_i$$(a linear constraint) and replace $f(x)$ with the linear expression $$\sum_{i=0}^N f(t_i) \alpha_i,$$noting that $f(t_i)$ is a constant. As an illustration, the midpoint of the interval $[t_1, t_2]$ in the plot would correspond to $\alpha_1 = \alpha_2 = 0.5$, all other $\alpha_j = 0$. The blue piecewise-linear curve shows the approximation $\sum_{i=0}^N f(t_i) \alpha_i,$.

The $\alpha$ variables form what is known as a Type 2 Special Ordered Set (SOS2). Although the $\alpha$ variables appear continuous, the SOS2 itself makes the problem a mixed integer program, effectively chopping the feasible region into subdomains (in this example, the intervals $[t_i, t_{i+1}]$) on which the approximation is linear.

Thursday, October 16, 2014

Java Gotchas

I was writing what should have been (and, in the end, was) a very simple Java program yesterday. It wound up taking considerably longer than it should have, due to my tripping over two "gotchas", which I will document here for my own benefit (the next time I trip over them).

Issue 1: Importing from the default package


The program I was writing was actually a homework assignment for a MOOC. It involved a single Java class, and I was writing it in the NetBeans IDE (which will prove germane). I had earlier taken another MOOC (Algorithms, Part I, taught by Robert Sedgewick and Kevin Wayne of my alma mater, Princeton University -- highly recommended!) and downloaded a pair of Java libraries associated with their book (Algorithms, 4th ed.). The libraries are licensed under the GPLv3, and I wanted to use a convenience class from one in the new program. So I added it to the library list in my NetBeans project and serenely tried to import the class I wanted. And failed. Miserably.

After various unproductive dinking around (and swearing), I realized that the problem was that my new Java class was in a named package but the class I was importing was in the library's default package. Using named packages is the norm in Java -- NetBeans will issue a warning if you try to park a class in the default package -- but for reasons related to grading in their MOOC, Sedgewick and Wayne had put all their classes in the default package (and required classes written by students for homework to be in the default package). It turns out that Java will not let a class in any other package import from the default package. Classes in the default package can import from each other and from classes in named packages. I guess this is a security matter. Sedgewick and Wayne provide alternative versions of their libraries using named packages, and point out this very issue in the "Q & A" section of their download page, which I had read and immediately forgotten. Hence this note to myself (since I will inevitably forget again, in some other context).

Issue 2: "Could not find or load main class ..."


Once I resolved the aforementioned problem, I thought I was good to go -- no compilation errors -- and so I tried to run my program (emphasis on the word "tried"). NetBeans announce that it could not find or load my main class. The message is a trifle unhelpful, as it does not specify whether the issue is that the main class could not be found or that it could not be loaded.

Since I selected the main class from NetBeans's run configuration menu, it seemed apparent to me that the class could be found, and for the life of me I had no idea why it could not be loaded. So I was off on a Google search that found a fair number of (somewhat dated) pages with people asking for help and specifying the same error message. Unfortunately, none of the specifics were germane to me, and I did not find any responses that helped.

Eventually I correctly guessed this issue, and I'll record it here in case anyone else is searching for that error message. The problem in my case was the path to the NetBeans project. Recall that the program I was writing was for another MOOC. I had named the parent folder with the full name of the MOOC, including spaces and a colon (which I escaped). My operating system (Linux Mint) has no problem with the presence of the colon, but apparently NetBeans (or Ant, which NetBeans uses to process build scripts) did. As a veteran of DOS and survivor of early versions of Windows, I cut my teeth on 8.3 file names (and directory names), so I suppose I should know better than to get creative with folder names. Renaming the parent folder without any punctuation marks solved that problem.

Not an issue: Regex


This last item is not a problem, it's just a neat web site I found (in an answer on Quora) whose link I wish to preserve. Online regex tester and debugger provides a web application that lets you test regular expressions to see if they do what you think/want them to do. Very useful -- which I'd seen it sooner!

Sunday, October 12, 2014

The Reciprocal Normal Distribution

A recent question on OR-Exchange dealt with the reciprocal normal distribution. Specifically, if $k$ is a constant and $X$ is a Gaussian random variable, the distribution of $Y=k/X$ is reciprocal normal. The poster had questions about approximating the distribution of $Y$ with a Gaussian (normal) distribution.

This gave me a reason (excuse?) to tackle something on my to-do list: learning to use Shiny to create an interactive document containing statistical analysis (or at least statistical mumbo-jumbo). I won't repeat the full discussion here, but instead will link the Shiny document I created. It lets you tweak settings for an example of a reciprocal normal variable and judge for yourself how well various normal approximations fit. I'll just make a few short observations here:
  • No way does $Y$ actually have a normal distribution.
  • Dividing by $X$ suggests that you probably should be using a distribution with finite tails (e.g., a truncated normal distribution) for $X$. In particular, the original question had $X$ being speed of something, $k$ being (fixed) distance to travel and $Y$ being travel time. Unless the driver is fond of randomly jamming the gear shift into reverse, chances are $X$ should be nonnegative; and unless this vehicle wants to break all laws of physics, $X$ probably should have a finite upper bound (check local posted speed limits for suggestions). That said, I yield to the tendency of academics to prefer tractible/well-known approximations (e.g., normal) over realistic ones.
  • The coefficient of variation of $X$ will be a key factor in determining whether approximating the distribution of $Y$ with a normal distribution is "good enough for government work". The smaller the coefficient of variation, the less likely it is that $X$ wanders near zero, where bad things happen. In particular, the less likely it is that $X$ gets anywhere near zero, the less skewness $Y$ suffers.
  • There is no one obvious way to pick parameters (mean and standard deviation) for a normal approximation to $Y$. I've suggested a few in the Shiny application, and you can try them to see their effect.
I'd also like to give a shout-out to the tools I used to generate the interactive document, and to the folks at RStudio.com for providing free hosting at ShinyApps.io. The tool chain was:
  • R (version 3.1.1) to do the computations;
  • R Studio as the IDE for development (highly recommended);
  • R Markdown as the "language" for the document;
  • Shiny to handle the interactive parts;
  • various R packages/tools to generate the final product.
It's obvious that a lot of loving effort (and probably no small amount of swearing) has gone into the development of all those tools.

Thursday, October 2, 2014

Multiple Children - Again!

I thought I had exhausted this topic, but no such luck ...

As noted in a previous post ("When the OctoMom Solves MILPs"), CPLEX (and I believe most other integer programming solvers) are have a design limitation of at most two child nodes per parent node in the search tree. Personally, I don't consider that limitation a problem, but occasionally someone comes along wanting more children. In the aforementioned post, I described a way to work around the limitation by creating nodes that combined two or more of the intended children and then splitting those nodes. In a follow-up post ("Birthing a Litter in CPLEX"), I proved a test case (assigning students to teams so that average GPAs for all teams are comparable) and supplied some Java code. I'll repeat the key illustrations here for context.

The search tree our hypothetical user wants contains a child under the root for each choice for the number of teams to create.

What can actually be accomplished, using the workaround, looks like the following.

Note the presence of three "meta-nodes" (with team ranges marked alongside them in blue) in addition to the nodes from the first diagram.

A point was recently raised in a discussion about this on OR-Exchange: those extra meta-nodes require the solution of additional node LPs. My first reaction is that the extra pivots are not likely to be a problem, for two reasons. First, the default for CPLEX (and, again, I suspect most solvers) is to use dual simplex pivots to restore feasibility when solving the child node's LP (which differs from the parent's LP just by the cut(s) added to create the child node). Second, I suspect that some of the same pivots would occur if you were to solve the leaf node LPs directly. If a substantial portion of the pivots that solve the "6-7 teams" node would occur in both the "6 teams" node and the "7 teams" node, were those nodes spawned directly from the root, then solving the meta-node LP might actually be more efficient (by virtue of doing those pivots once rather than twice).

All that said, I got curious whether I could con (er, "induce") CPLEX into creating meta-nodes that were direct clones of their parents. The goal was a tree looking as follows, where the nodes marked with an asterisk (*) are direct clones of (identical to) their parents.

The answer, for the moment at least, is "yes". I have placed sample Java code in a Git repository open to the public for download. (There's also an issue tracker in the unlikely event that you should find a bug.) See the side panel for license information. It's based on the code from the second post mentioned above, and as such is slightly over-engineered (I did not take the time to eliminate features from the previous code that are not needed here).

The key lines, appearing in the branch callback, are the ones that create a child that is a direct clone of the parent:

// second child clones the parent with adjusted node data
info = new BranchInfo(min + 1, max);
id = makeBranch(new IloRange[0], obj, info);

The info variable holds "node data" that is attached to the clone node; variable id contains the node ID assigned by CPLEX to the clone node. Fair warning: the code relies on an "undocumented feature" (programmer oversight?) of the Java API, which means it could cease to work in some future version. The trick is to pass the makeBranch() method an empty (length 0) array of "cuts" to add. Note that passing null, rather than an empty array, results in an uncaught exception.

I have no idea whether other APIs have the same "feature".

Friday, September 26, 2014

Should You Trust Your Analytics Model?

The short answer is "no". At least, don't put too much faith in them.
"Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful."  (George E. P. Box)
It's also useful to remember that your model, no matter how sophisticated or how well supported by theory, is calibrated using data that might be a tad suspect.
"The government are very keen on amassing statistics. They collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of these figures comes in the first instance from the chowky dar (village watchman in India), who just puts down what he damn pleases."  (Stamp's Law)
All that said, there was a very nice entry recently in the HBR Blog Network, "How to Tell If You Should Trust Your Statistical Models", that is well worth reading. The author is Theodoros Evgeniou, Professor of Decision Sciences and Technology Management at INSEAD. (Tip of the hat to +Aleksandr Blekh for providing the link on Google+.)

Tuesday, September 2, 2014

The Business of Moving Stuff

In my last post, I linked some articles about the role of operations research in planning/scheduling the movement of people. It occurred to me that the movement of packages and goods provides some more examples of what we OR types are up to. Fortuitously, I came across a link to a relevant article published earlier this summer in Fortune: "The shortest distance between two points? At UPS, it's complicated". The article discusses, in somewhat general terms, the use of OR models by UPS to route delivery vehicles. It contains an assessment of the impact of Orion, a routing/scheduling system developed by UPS:
By the end of 2013—after being applied to just 10,000 routes so far—Orion had already saved 1.5 million gallons of fuel and 14,000 metric tonnes of CO2 emissions.
The article omits any mention of the related problem of loading the vehicles. The order that packages are loaded (and unloaded) makes a difference in the time a driver spends at a delivery location, and of course time is money.

A related (but possibly even more complex) application is the operation of container ships and container terminals (ports). Deciding the order in which containers are loaded on a container ship (which in turn dictates the order in which they are unloaded) is complicated enough, since it impacts the sequencing of trucks making drop-offs or pickups if the containers move directly between truck and ship, and impacts the management of piles of containers in the yard when the containers do not move directly between truck and ship. Unfortunately, I was unable to turn up a "popular press" article describing the use of planning tools (OR) in container ship/terminal operations, although I found no shortage of technical papers, book chapters and promotional materials (by companies selling software or consulting services for port management). The best I could come up with is a Wikipedia entry which states that
In recent years methodological advances regarding container terminal operations have considerably improved. For a detailed description and a comprehensive list of references see, e.g., the operations research literature.
(Take a bow, OR people.) It then provides a couple of links to literature reviews.

Saturday, August 30, 2014

The Business of Moving People

Not infrequently, I am asked about my academic specialty. When I reply "operations research", as I usually do, I'm often met with a polite but blank stare. This happened to me a couple of times at a recent party. If the questioner inquires further, I'll try to give an example or two of what OR folks work on in the real world, while omitting any references to "integer programming models" or "Markovian decision processes".

Two recent articles in the popular press do a nice job of illustrating real-world problems to which operations research models and techniques are applied. Coincidentally, they both have to do with getting people from point A to point B, hence the title of the blog post. I thought I would link them here.

The first is an article at CityLab about bike sharing programs: "Balancing Bike-Share Stations Has Become a Serious Scientific Endeavor". Bicycle sharing services are a relatively recent innovation, primarily in urban environments. By allowing individuals to borrow or rent a bicycle (owned by the service) for commutes or to run errands, the services help drive down traffic congestion and air pollution, while presumably turning a profit (at least if privately owned). As with car rental services, bicycle sharing services allow you to pick up a bicycle at one location (station) and return it at another, a necessity if the bicycles are going to be used for commuting. (You do not want the bicycle out of service for eight or nine hours while the renter is at work, and the renter does not want to pay for entire day's use of the bicycle.) Also in common with car rental agencies is that this can result in some locations developing a surplus of available bicycles while others are starved. The CityLab article focuses on the twin problems of predicting where the imbalances will be and efficiently routing vehicles used to rebalance the stations (moving bicycles from overpopulated stations to underpopulated stations). Neither "analytics" nor "operations research" is mentioned in the article, but clearly that is what is going on.

The second article, appearing in the New Republic, has the rather amusing title "Why the $#&@! Did Your Airline Cancel Your Flight Today? They Had a Very Good Reason." The author, Amy Cohn, is an industrial and operations engineering professor at Unmentionable University in Ann Arbor. She gives a very persuasive argument, understandable I think by lay people, that airline flight cancellations are not motivated by concerns that the flight is underbooked and therefore unprofitable, but rather by a complex planning system (more operations research) reacting to problems and trying to find a solution that does as little damage as possible to the overall schedule, in the process hopefully inconveniencing as few passengers as possible. (Yes, I realize I just said something positive about someone from Ann Arbor. As soon as I post this, I will go do some act of charity in contrition.)

Of the two articles, I think I will be more likely to cite the first one when someone asks just what operations researchers do -- not because the second article is less compelling (or because its author has an unfortunate work address), but because associating "operations research" and "flight cancellations" may not have a positive impact on my popularity.

(Hat tip to @or_exchange, @HarlanH, @portlypoor and @Neil_Irwin for tweets leading me to those two articles.)

Update: Another tweet led me to a blog post by Jean-Francois Puget titled "Optimizing Real Time Decision Making". It uses a taxi service to illustrate how OR tools (optimization) can actually improve aggregate system response time for customers by intentionally delaying responses to some service requests (another people-moving example). Elevator dispatch algorithms provide a similar but not identical application (but good luck explaining to someone that there is actually a reason for them to wait for a car while another car sits idle).

(Hat tip to @dualnoise and @menkris for the tweet that led me to J-F's post.)

Thursday, August 28, 2014

A Musical Random Walk

I just read a blog post here at MSU about The Infinite Jukebox, a web tool that will analyze a song (either one you upload or one from their library) and map out links between segments of the song that have approximately the same beat. You then see a visualization of the song as a loop, with the links (arcs) added, forming a directed graph. (The visualization does not display arcs as arrows, though, just as, well, arcs.)

When you replay the song in The Infinite Jukebox, as it passes the tail node of each arc, it randomly chooses whether to keep going on the original path or cross the arc and continue playing, along the original path, from the head node. Thus the "infinite" part of the site's name -- the song will continue until you stop it.

I don't know the algorithm they use for playback, but it appears superficially to be a Markov process, so perhaps this really is a random walk over the song's graph.

If you want to try it out, I can recommend The Trashmen's version of "Surfin' Bird", which I was happy to discover is in the site's library. The randomly looped version sounds suspiciously like the original (which is what I expected). Although the visualization of the progress through the song is interesting, for the full multimedia experience you'll want to have the You Tube video playing (muted) on a second screen.

Sadly, their library does not include "Papa-Oom-Mow-Mow" by The Rivingtons, and I don't happen to own a copy, so I could not compare the action of The Infinite Jukebox on that with what it did to "Surfin' Bird". As I listened to the looped "Surfin' Bird", though, I couldn't help but think "and thus another enhanced interrogation technique is born".

Friday, August 15, 2014

Scheduling Instability

Fellow OR blogger Laura McLay recently wrote a post "in defense of model simplicity", which is definitely worth the read. It contains a slew of links to related material. As I read it, though, my contrarian nature had me thinking "yes ... as long as the model is not too simple".

A recent piece in the NY Times ("Working Anything but 9 to 5") had me thinking again about model simplicity. The gist of the Times article is that irregular (erratic, chaotic) work schedules for hourly employees, coupled with short notice of those schedules, creates stress for some families, particularly single-parent families where daycare needs to be arranged. Fairly deep into the article, I found someone saying what I was thinking:
Legislators and activists are now promoting proposals and laws to mitigate the scheduling problems. But those who manufacture and study scheduling software, including Mr. DeWitt of Kronos, advocate a more direct solution: for employers and managers to use the software to build in schedules with more accommodating core hours.

“The same technology could be used to create more stability and predictability,” said Zeynep Ton, a professor at M.I.T. who studies retail operations.
The multiperiod nature of scheduling models tends to make them a bit chewy, especially when you need to coordinate schedules of multiple individuals (or machines, or venues) across blocks of time while dealing with multiple constraints and possibly multiple conflicting criteria. As with most discrete optimization problems, simplicity of the model tends to pay dividends in reduced execution time. That said, models that take into account the cumulative schedules of workers do exist. In the U.S., the Federal Aviation Administration has rules limiting consecutive hours in the cockpit and rest times for pilots flying passenger routes. (If you think this is another example of the "nanny state", have a look at this article about an Indian airliner imitating a roller coaster.)

There are at least four dimensions in play that may be contributing to the high variance in schedules for some employees:
  1. Is the employer concerned enough about scheduling chaos to make it a policy to mitigate variance in schedules of individual employees?
  2. If so, is the policy being implemented?
  3. How does the employer assess tradeoffs between smoothing of schedules and other criteria (adequate staffing, staffing cost, matching staff skills with needs, ...)?
  4. Can/does the scheduling software (assuming software is used) implement variance-mitigation either as constraints or in the objective function?
The second dimension, in my opinion, is one best left to the management folks (my erstwhile colleagues). The fourth dimension clearly belongs to the OR crowd: we need to make provision for those sorts of constraints/criteria, even at the risk of making models more complicated (cough Laura cough),  and we need to communicate clearly to the users that those provisions exist (and how to use them). I'm inclined to think that the first and third dimensions straddle the disciplines of OR (or, if you prefer, analytics) and management. The management folks can provide evidence-based arguments of how more worker-friendly schedules can improve employee performance and retention, while those of us in OR can perhaps help assess the trade-offs and quantify costs and benefits.

[Update: Starbucks, at least, appears to be implementing policy changes. Thanks to Tallys Yunes for this link: "Starbucks to Change Scheduling as Some Employees Struggle".]

I'll conclude with a wonderful anecdote, told to me long ago by a colleague from our statistics department, about model simplicity. He attended a colloquium by what would now be called a biomathematician (I don't think the category existed back then). The presenter had developed a model for arterial blood flow in the human body, using a system of nonlinear differential equations. The right-hand sides of the equations represented a forcing function. The system was too complex for the presenter to solve, so he simplified things by solving the homogeneous case (setting the right sides of the equations to zero). At the conclusion of the talk, after the usual round of applause, the presenter asked if there were any questions. He got at least one: "If the forcing terms are all zero, doesn't that mean the patient is dead?" It does -- the simpler model that he had solved required that the body's heart not beat.

Simplicity in models is good, but arguably not worth dying for.

Thursday, August 14, 2014

Mythbuntu: The Upgrade from Hell

I foolishly let Mythbuntu update to version 14.04 overnight a few days ago. The installer ran into problems (which I could not decipher) regarding updating the MythTV database. I let it upload two separate bug reports and did my best to soldier on. When the installation was finally over, the back end would not load, which means it was doing a pretty good impression of a (software) brick.

I got a message telling me that the database schema was several versions out of date, and instructing me to run either mythtv-setup or mythbackend in order to update the database. I tried both (from a console) and both failed (with mythtv-setup trying to start mythbackend). Eventually, I discovered the first (of multiple) problems.

Problem 1: Time zone data


Somewhere along the line I got a message that MySQL, the database program used by MythTV, was lacking time zone data. I followed some instructions I found, and fixed it by running the command

mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -uroot -p mysql

supplying a password when prompted. After that, I was able to run mythtv-setup and mythfilldatabase. Apparently mythtv-setup automatically updated/fixed the database.

Problem 2: Security PIN


In mythtv-setup (general settings) I discovered that the security PIN was blank. According to the prompt, that will prevent any front end from connecting to the back end. Per the prompt, I changed it to 0000, allowing all front ends to connect. (I only have one, on the same machine as the back end, so that's fine.)

The IP address for the back end was correct, but since I have no remote front ends, I changed both the local and master addresses to 127.0.0.1. That was probably unnecessary, but I mention it just in case it had any unintended effect. [Update: In hindsight, I realized that might interfere with my using MythWeb to program recording schedules from other devices on my home LAN, so I switched both the local and master addresses back to the machine's static unrouteable IP on the local LAN, with no harm done.]

Problem 3: No Back End Service


Even after successfully running the setup program, the front end still could not connect to the back end .. because the back end was not running ... because it failed silently every time it was started. The good news was that I could start the back end manually (running mythbackend in a console), after which the front end could connect. It came as quite a relief to see that recordings, channel information and recording schedules had all survived the upgrade. Still, you can't do scheduled recordings very effectively is someone has to log in and start the back end manually each time.

Fortunately, I was able to get help from Thomas Mashos in the Mythbuntu Community on Google+. I could start the back end manually because my personal MythTV configuration file (~/.mythtv/config.xml) survived the upgrade intact. The system configuration file (/etc/mythtv/config.xml, also symlinked to /home/mythtv/.mythtv/config.xml) must have been replaced by what amounted to a placeholder. The host, user name and database name fields were empty; the password field was filled in incorrectly (or maybe that was the correct password for a null user access a null database on a null system?). I edited those four entries to match my config file and (woo-hoo!) the back end resumed playing nicely.

Problem 4: No Sound


Once the back end was running, I played back part of a saved recording to verify that things were working okay. The good news was that the video played fine. The bad news was that there was no audio. This turned out not to be limited to MythTV: I couldn't get the system to make sounds, period. (In contrast, the system had no trouble getting me to make sounds. Very, very aggravated sounds.)

I searched hither and yon on Google, and discovered that a nontrivial number of users have (or had) sound problems, many of them as a result of the upgrade to Mythbuntu 14.04. I've lost count of the number of options I tried, including purging the alsa-base, linux-sound-base, pulseaudio and pulseaudio-module-X11 packages (all at the same time), reinstalling them (I even deleted the archived copies and forced the package manager to download them again), and of course rebooting.

Eventually I stumbled on a fix (woo-hoo!). I had previously poked around in the ALSA mixer (by running alsamixer in a terminal) and unmuted the obvious suspects. I gave it one more try, disabling Auto-Mute Mode (because the last thing I want right now is something muting the sound again) and playing with settings I didn't understand. It turns out that unmuting both "S/PDIF" and "S/PDIF Default PCM" restored the sound. I have no idea what they stand for, let alone why they were muted. From aplay -L, I can see that they're associated with my NVidia graphics card (which also handles sound). So are eight bazillion other things, and I'm not sure why this one is special.

Problem 4.5: No Sound After Reboot


One might hope that unmuting the two settings above would be a permanent fix. Not quite. After a reboot, "S/PDIF Default PCM" remains unmuted but <sigh>"S/PDIF" finds itself muted again</sigh>.

In alsamixer (in a terminal), I unmuted both, then ran sudo alsactl store to store the correct settings and added alsactl restore to /etc/rc.local, which should hypothetically be run each time the system boots (but not, I believe, during reboots). The command works fine from a terminal, but after a full shutdown/start I find S/PDIF once again clobbered; so either /etc/rc.local is not running or, more likely, it restores the correct sound settings before the evil program that's muting S/PDIF gets executed.

Just for grins, after a reboot (and without restoring the ALSA settings, i.e., with a silent PC), I went into MythTV's audio setup menu. It showed a sizable number of options that all mentioned the correct NVidia card and matched what I saw when I ran aplay -L. The default choice produced no sound. I switched to a different setting (from "dmix" to "front", as I recall), ran the sound test and it worked. Unfortunately, while the setting survived a reboot, sound did not. So I tried again, switching to "ALSA:iec958:CARD=NVidia,DEV=0", and that setting works. It has survived three restarts, so I'm a cautiously optimistic that my long ordeal (most of two days chewed up fixing this) is finally over -- pending the next upgrade, of course.

Problem 5: [update] no MythWeb


After posting this, I discovered that MythWeb was not running. This turns out to be a known bug, due to a difference in the Apache versions used by Mythbuntu 14.04 and its predecessor. The fix specified in the bug report, which I repeat here, worked and MythWeb now runs as expected.
$ sudo rm /etc/apache2/
$ sudo dpkg-reconfigure mythweb
$ sudo /etc/init.d/apache2 start


Thursday, August 7, 2014

Fewer Zeros

A question I saw online not too long ago caused me a flashback to my days teaching linear programming (LP) to masters students. The poster had developed an optimization model -- I can't recall if it was an LP, a quadratic program (QP) or a mixed-integer program (MIP) -- and had no problem solving it. He was, however, unhappy with the number of decision variables taking value zero in the optimal solution, and was quite insistent on being shown how to get another (optimal?) solution containing fewer zeros.

My first reaction took me back even further, to my high school years. If the optimal solution has a bunch of zeros, and it's the unique optimal solution, then that's that. My second recollection was my lecturing on post-optimal sensitivity analysis, and the wonders of multiple optima (which, in the case of LP, you can detect by examining the reduced costs of the variables that did not make it into the final simplex basis). Most people, understandably, are happy to yank the final solution off the printer (or dump it to a file, or whatever) and run with it. I gave two rationales for poking it a bit to see if there were other optima, or perhaps near-optima. Both rationales were illustrated by product mix models (deciding what products to manufacture, in what quantities, given resource constraints, unit profit margins, etc.).
  1. Not all "optimal" solutions are created equal. Unless you are dabbling in multiobjective optimization, the model optimizes a single criterion function. There may be other criteria that can be used to distinguish between multiple optimal (or near-optimal) solutions. For instance, does the solution you found require a massive revamping of the production schedule, while another optimal solution lurks that is a minor tweak of the current schedule? Does the optimal solution found end production of the product your boss developed or championed back when, the one that got her promoted to her current position, while another optimal solution continues to produce it?
  2. Does your optimal solution kill off some products and, if so, are there other optimal/near-optimal solutions that continue to make them? Zeroing out production of PCs in favor of laptops and tablets may make sense given the current market (as reflected in the current values of demand and revenue parameters in your model), but the marketing weenies may tell you that, should PCs suddenly become hot, you will have a harder time climbing back into the market if you have not retained at least some presence. You may also lose some design or manufacturing competence for a particular product type if you stop making it entirely.
The second rationale is consistent with the original poster's desire for fewer zeros. I'm pretty sure that, given enough time (and enough beers), I could come up with similar rationales in contexts unrelated to production scheduling.

So how does one reduce the number of zeros in the solution? I'm assuming, for simplicity, that you would ideally like none of the variables to be zero; it's easy to tweak what follows to reflect a desire to make some variables positive combined with an indifference to the fate of other variables.

The first question you need to ask yourself is what you are willing to give up in trade, because it's entirely possible you are staring at the unique optimal solution (or that other optima don't do much better as far as number of zeros is concerned). Let's say that your original problem was to maximize $f(x)$ subject to $x\in X$, where $X$ is the original feasible region (defined by various constraints that I don't need to write done here). Let's say further that $x^*\in\Re^n$ is the optimal solution and $f^*=f(x^*)$ is its objective value. You decide that you are willing to sacrifice up to $\epsilon \ge 0$ from the objective value to get a solution with fewer zeros, or maybe up to $\epsilon$ for each variable whose value switches from zero to positive, or something along those lines. (If $\epsilon = 0$, you'd better hope there is an alternate optimum.)

The second question you need to ask yourself is, for each variable $x_i$, what is the minimum value ($L_i > 0$) that you would be willing to accept. Producing 50 truckloads of corn flakes per week combined with one box of wheat flakes is not a whole lot different from just making corn flakes.

The methods discussed below do not exhaust all possibilities; they're just what come to mind at the moment.

Method 1: Utopia


We can add the constraint $f(x)\ge f^*-\epsilon$ and pick a new objective function designed to push $x$ toward having fewer zeros. One possibility is to aim for a "utopia point", an idealized solution you likely will never reach. So consider the following problem (which remains an LP if $f()$ is linear, and is quadratically constrained but convex if $f()$ is concave -- anything else and you're on your own).
\[ \begin{array}{lrcl} \textrm{minimize} & \sum_{i=1}^{n} & w_{i}(y_{i}+z_{i})\\ \textrm{s. t.} & x & \in & X\subset\Re^{n}\\ & f(x) & \ge & f^{*}-\epsilon\\ & x_i - y_i + z_i & = & u_i \ (i\in 1,\dots,n) \\ & y,z & \ge & 0 \\ \end{array} \] The utopia point $u\in \Re^n$ is a vector of ideal values (presumably positive) for the $x$ variables; the weights $w_i > 0$ reflect your priorities for getting close the utopian value $u_i$ for each $x_i$. (Auxiliary variables $y$ and $z$ are used to get the absolute difference between $x$ and $u$.) The objective minimizes $\left\Vert x - u \right\Vert_1$, the $L_1$ norm of the difference; you can use a different norm ($L_0$ stays linear, $L_2$ results in a quadratic program) if you wish.

Method 2: Bounds


A second, perhaps simpler, approach is to assert a strictly positive lower bound $L_i > 0$ on every variable $x_i$.
\[ \begin{array}{lrcl} \textrm{maximize} & f(x)\\ \textrm{s.t.} & x & \in & X\subset\Re^{n}\\ & x & \ge & L\\ \end{array} \]There is a danger that this could make the problem infeasible (if you are too aggressive in the choice of $L$). Assessing the trade-off between objective value ($f$) and bounds ($L$) is largely a matter of trial and error. Dual variable values (LP) or KKT multipliers (QP) may allow you to incrementally adjust the bounds until a satisfactory outcome is achieved.

Method 3: Binary Variables


Yet another possibility is to introduce binary variables reflecting which original variables are "turned on", and then optimize the selection of those variables. Let $z_i\in\{0,1\}$ take the value 1 if $x_i$ is "sufficiently" positive and 0 otherwise. Suppose that we also have weights $w_i > 0$ reflecting the perceived importance of "turning on" each variable $x_i$. (If you just want as many nonzeros as possible, set $w_i = 1$ for all $i$.) Consider the following model:\[ \begin{array}{lrcl} \textrm{maximize} & \sum_{i=1}^{n} & w_{i}z_{i}\\ \textrm{s. t.} & x & \in & X\subset\Re^{n}\\ & f(x) & \ge & f^{*}-\epsilon\\ & x_i  & \ge & L_i z_i \ (i\in 1,\dots,n) \\ & z & \in & \{0,1\}^n \\ \end{array} \]We maximize the aggregate value of the choice of $x$ variables taking (sufficiently) positive value, while maintaining feasibility and taking an acceptable hit in the value of the original objective function. The computational cost is that we have converted our original problem, which might have been a relatively easy to solve LP or QP, into either a mixed-integer linear program or a mixed-integer quadratically constrained program, either of which will be harder to solve.

Method 4: Binary Variables Again


A variation of the previous approach is to maximize $f$ subject to a restriction on the number of zeros in the solution:\[ \begin{array}{lrcl} \textrm{maximize} & f(x)\\ \textrm{s. t.} & x & \in & X\subset\Re^{n}\\ & x_i  & \ge & L_i z_i \ (i\in 1,\dots,n) \\ & \sum_{i=1}^{n} w_{i}z_{i} & \ge & K \\ & z & \in & \{0,1\}^n \\ \end{array} \]where the weights $w_i$ again indicate relative importance of getting strictly positive values (all 1 if you just want to achieve a certain number of nonzeros) and $K$ is the minimum value/minimum count of nonzeros that you are willing to accept. Once again, assessing the trade-off between objective value and nonzeros is a trial and error process.

Wednesday, August 6, 2014

Updated Benders Example

Two years ago, I posted an example of how to implement Benders decomposition in CPLEX using the Java API. At the time, I believe the current version of CPLEX was 12.4; as of this writing, it is 12.6.0.1. Around version 12.5, IBM refactored the Java API for CPLEX and, in the process, made one or more non-backward-compatible changes that broke the sample code. This morning, I posted an updated version.

While I was at it, I decided to create a Git repository for the code on Bitbucket. You can access the repository at https://bitbucket.org/prubin/bendersexample. If you click the "Downloads" link in the navigation menu, you will be led to a page from which you can download the source code in a Zip archive. (There is no executable for this project, just the source.) There is also a link to an issue tracker in case you run into a bug (not that I ever write code with bugs in it) and wish to report it.

UPDATE: I moved the code to a new repository. The URL is https://gitlab.msu.edu/orobworld/BendersExample. Click "Files" in the left side navigation menu to get to the source code. There will be a link (upper right) on that page to download the source in a Zip archive.

The code and issue tracker should be accessible to anonymous users (no login required). If not, hopefully someone will let me know. As with the blog, the code is released under a rather permissive Creative Commons license.

Tuesday, July 29, 2014

New Linux Laptop

Earlier this month I decided to get a netbook, and of course I wanted to run Linux (preferably Linux Mint) on it. After shopping around, I settled on an Acer Aspire V5-131. I won't say it was my dream machine -- it has a conventional hard drive, whereas I would have preferred a solid-state drive -- but it satisfied my other key criteria:
  • size and weight are appropriate for a netbook;
  • it comes with Linpus Linux, ensuring that I could load a different Linux distribution without having to learn the new Windows 8 interface or fight through a lot of UEFI secure-boot obstacles (I think the BIOS supports UEFI, but I think it's off by default);
  • it has 4 GB of DDR3 RAM (a lot of the competing machines only had 2 GB); and
  • it was pretty cheap for a machine with its specs.
When I received the machine, I downloaded and ran UNetbootin (on a Windows box, as it happens), which allowed me to install a "live" (bootable) version of my preferred Linux distribution (Mint 17) to a USB drive. I use the Cinnamon version of Mint on my desktop, but I went with the Xfce version for the netbook because I figured it would put less strain on the processor.

From there, I just had to boot the Aspire, interrupt the boot sequence to get to the BIOS, change the BIOS to boot first from a USB drive, plug in the USB stick and reboot. That put me into Mint. After verifying that WiFi worked (there really wasn't much else to test), I chose the option to install Mint, and took defaults for pretty much everything. You can install Mint to coexist with another operating system, but all I wanted was Mint (no disrespect to Linpus intended). so I let it overwrite Linpus. Just like that, I had a working Mint netbook. There was no drama (possibly a first for me when installing/configuring a machine).

So far I've been quite happy with the Aspire. Some people seem be concerned about the bulge in the back caused by the battery. (The Aspire has better-than-typical battery life, as I understand things, so that bulge pays dividends.) The battery causes a slight downhill tilt to the keyboard, which I find comfortable.

The Aspire has a touchpad but does not have separate mouse buttons. Some of the reviews I read before purchasing seemed to think that was an issue, while other people said gestures (single finger tap for left button, double finger tap for right button) were good enough. After setting the machine up, I discovered that it does in fact have mouse buttons, or at least mouse button functionality. The touchpad appears to be hinged along the top edge and not anchored to the case along the bottom. If you push the front left (lower right) corner of the touchpad reasonably firmly, you get a left (right) mouse button click. For the left mouse button, I think it's easier to tap, but I find pushing the front right corner down to get the "hardware" right mouse button action much more accurate than two fingered taps (or tapping the lower right corner with one finger, which will generate a right mouse button event if are very precise).

The only issue I've encountered so far is that the touchpad is a bit too sensitive for my tastes. When I'm typing, I can generate a mouse event without meaning to, presumably because the angle of my hands brings my palm a tad too close to the pad. Flexing at the wrists eliminates the problem, but too much typing with flexed wrists is probably not good for the associated tendons.

Tuesday, July 8, 2014

Mint on a Stick

Due to a research project in which I'm currently engaged (and, trust me, you do not want to know the details), I find myself needing to wander into a public computer lab on our campus (where Windows rules the machines) and run a "virtual" Linux machine (because I'll be talking to a Linux server and, well, you really do not want to know the details). So I ended up using LinuxLive to create a USB flash drive from which I can run Linux Mint. Importantly, I can run it in a VM from Windows. Most USB installations of Linux distributions require that you boot from the USB, which in turn typically requires that you alter the order in which the PC polls devices looking for something to boot. That's not possible in my case; because the network police tend to lock down publicly accessible machines, I basically can't alter the boot sequence, and the boot sequence on the public machines does not include USB ports (let alone USB first). LinuxLive installs Oracle's VirtualBox. I just have to launch VirtualBox from Windows and then run my Mint machine in the VirtualBox VM ... in theory ...

After one full afternoon of futzing with this, I think I've got it working (knock on virtual wood), and I figured I'd write down a couple of the "gotchas" and their workarounds.

You install LinuxLive on a Windows machine and then supply it (besides the USB drive) either an .iso image or a bootable CD/DVD with your Linux distribution. It supports quite a few Linux distributions, including recent versions of Mint. The LinuxLive installer is very easy to use and the steps are documented quite well on the web site, so this should be a no-brainer. Except, um, the first time out it didn't work. I downloaded an .iso file for Mint 17 (the current version) and ran through the installation. Everything seemed to go fine, there were no error messages, and the installer told me it was done. When I tried to run Mint from the USB drive, though, I got an error message that the loader could not find vesamenu.c32. I located it manually (in the /syslinux folder), but it had a length of zero bytes, as did all but two files in that folder.

I was working in a campus computer lab, and by the time I tracked that problem down, I had to leave (and didn't have any media with me that could transport the .iso file). My next attempt was at home, using a CD with the .iso file for Mint 14. Small problem: the installer does not recognize Mint 14 as a known distribution (even though it is on the supported list) and consequently installs it as a generic Linux distribution. I cant' recall now if it did the same thing to me with Mint 17, but I think so. Among other things, this precluded me from setting up "persistence" (the ability to retain files on the USB stick between sessions), but I'm pretty sure I can work around that.

The installer told me it was done with no errors, and so it was time to test the USB stick. I had no trouble launching VirtualBox, where I found two virtual machines configured: "LinuxLive" (the production version of the Mint installation); and "Debug" (self-explanatory?). Starting the "LinuxLive" VM was an exercise in frustration. The good news was that it had no complaints about a missing vesamenu.c32. (Before trying to start it, I checked the driver and verified that this time the file had non-zero length.) The USB drive would flash, it would start to load something (couldn't really tell what from the largely blank display), and then ... nothing. It appeared to be frozen.

After assorted screwing around, I tried the "Debug" VM, where I at least got an error message: "This kernel requires an x86-64 CPU, but only detects an i686 CPU, unable to boot". Suffice it to say that my PC has an AMD 64-bit processor, and the Mint 14 distribution runs fine on it. After some searching, I discovered this was a consequence of the VM being set up for "other Linux" (i.e., fallout from the LinuxLive installer not recognize Mint 14). In the VirtualBox control program, I selected the VM and changed Settings > General > Basic > Version to "Ubuntu (64 bit)", which was as close as I could get to Mint (and, as it turns out, close enough). With that change, the Debug VM would boot.

The next problem was that the Mint asked for a user name and password. It turns out that, since I'm basically running a trial version, the user name is "mint" and the password is left empty. Once I figured that out, I could use the Debug VM ... but not the main (LinuxLive) VM.

Even after I switched the LinuxLive VM from "other Linux" to Ubuntu (64 bit), it would not boot fully. It would get to the point where the user name/password prompt occurs, then go dark, then reboot in an endless cycle, with the USB activity light flashing non-stop. So I compared the settings of the two VMs and discovered that, for unknown reasons, the Debug VM was configured to use 512 MB of base memory but the LinuxLive VM was configured for only 256 MB. I suspect the LinuxLive VM was being starved. At an rate, I changed Settings > System > Base Memory to 1024 MB for the LinuxLive VM and voila, it ran!

I'll screw around with adding persistence on another day.

Monday, June 23, 2014

NetBeans 8 Update Bugs

An update to the NetBeans IDE (8.0 patch 2) this morning seems to have introduced some problems that I was fortunately able to work around. I'll document them here in case anyone else runs into them.

For some context, when I opened NetBeans, it correctly showed three projects in the project navigator, the main project on which I am currently working (call it C) and two related library projects (A and B). My task for the morning was to make a minor tweak to B. Happily (for my sanity), I use Git for version control and keep everything backed up to a remote repository (Bitbucket).

The IDE informed me that there were three updates available, which I downloaded and installed, forcing a restart of the IDE. Unfortunately, I took no notes on which updates they were, so I can't point to a particular culprit. Following the restart, I made B my main project, made the necessary code changes, tested them, committed them and pushed the commit up to the repository. I'll note here that I changed the method I used to access the repository from HTTPS to SSH, which may be significant for the second and third bugs. Then I did a clean and build to get a new jar file. So far, so good.

Bug #1: Can't generate Javadoc


The first problem came when I went to regenerate the Javadoc files for B. The Generate Javadoc menu option was disabled (grayed out), and nothing I could do enabled it. I wasted a fair bit of time reading some very old bug reports that, as it turns out, were not germane, and generally futzing around before I finally did what I should have tried immediately: I shut down and reopened the IDE. Just like that, Generate Javadoc was enabled again, and I was able to generate the updated Javadoc files. I have no idea why the additional restart was required.

Bug #2: Project won't open


Okay, whatever, time to make C my main project again. Small problem: after the latest restart, C is no longer listed in the project navigator (although several of its source files, which had been open the entire time, are again open). Fine, I'll open it from the recent projects list. Oops, it's not listed there either (?!). So I navigate to it with File > Open Project, try to open it and discover the following stuffed into the project name field:
Error in project.xml: The content of elements must consist of well-formed character data or markup.
Looking at project.xml in a text editor, I find what appears to be some Git-related comments stuffed into it (as if Git was trying to reconcile two conflicting versions of the file). Note that (a) this corruption must have occurred some time after I first opened the IDE, since C was listed in the project navigator then (and still was after the first post-update restart), and (b) up to that point I had not touched C, and certainly had not done any commits etc. relating to C.

Anyway, the fix was to open a terminal window in the project main folder and run git reset --hard to revert to the most recent version in the remote repository ... or so I thought.

Bug #3: Projects merged


After the reset, I was able to open project C (without having to restart the IDE) and make it my main project. I was surprised to discover that, on top of C's own source files, C now contained all the packages and files from B (?!). Again reverting to a terminal (after closing the IDE), I ran git remote -v and found that, on top of listing the various known URLs for pushing and fetching C, it listed the SSH URL for project B as being one of C's URLs. How it got there I have no idea. When I added the SSH URL for B, both B and C were open, but B was the main project, and I added the URL while pushing B, not C.

Lacking an elegant way to fix this, I opened Git's configuration file (.git/config) in a text editor, found the line for B's repository and deleted it, saved that, deleted all the source files for C from C's project folder (while holding my breath), and ran git reset --hard again. That got C back to a state where it contained only its own files.

So everything is back to normal, I've only wasted one entire morning (sheesh!), and I've had the wisdom of remote backups reinforced.


Friday, June 20, 2014

Is the Gender Gap in STEM Degrees Closing?

Two recent posts by Randal S. Olson relating to STEM degrees and gender caught my eye, and I thought I'd share them. The posts are:
I won't attempt to recapitulate their content here; if you're interested, please have a look at them. With Randal's permission, though, I will reproduce his charts. The second chart is mathematically redundant but useful for visualization.
Percentage of bachelor's degrees conferred to women, by major (US, 1970-2012) 
(Source: Randal S. Olson)
Percentage of bachelor's degrees conferred to men, by major (US, 1970-2012)
(Source: Randal S. Olson)

As Randal points out, the proportion of science and math baccalaureate degrees awarded to women has of late been around 40%. There's still a gap, especially since (if I recall correctly) women now comprise more than half the undergraduate population, but we have definitely made significant progress. I do have a wee bit of concern that the trend in math/statistics degrees was negative in the first decade of this millennium. Nonetheless, the gap in "S/M" degrees does not seem as bad as has sometimes been suggested. (Note that I was careful to use "/" and not "&" there.)

The "T/E" portion of STEM remains problematic, as signaled by the curves for Computer Science and Engineering, and there seems to be a fairly lengthy downward trend in the proportion of CS degrees awarded to women. (This is probably a more pressing concern for unwed heterosexual male CS majors than it is for me.) Various efforts to bolster female enrollments in "T/E", particularly engineering, are underway.

I'll end here with some related links.

Previous posts here:

Posts by Laura McLay (Punk Rock Operations Research blog):

  • Too numerous to list. Just visit the blog, click the word "women" in the tag cloud ... and brace yourself.

Cool toys for future female engineers:

Other places to "get your pink on":