# Oh, and we also want to solve it as an integer program.

from LP FAQ

Integer LP models are ones whose variables are constrained to take integer (whole-number) values, as opposed to fractional ones. It may not be obvious that integer programming is a much harder problem than ordinary linear programming, but that is nonetheless the case, in both theory and practice.

Integer models are known by a variety of names and abbreviations, according to the generality of the restrictions on their variables. Mixed integer (MILP or MIP) problems require only some of the variables to take integer values, whereas pure integer (ILP or IP) problems require all variables to be integer. Zero-one (or 0-1 or binary) MIPs or IPs restrict their integer variables to the values zero and one. (The latter are more common than you might expect, because many kinds of combinatorial and logical restrictions can be modeled through the use of zero-one variables.)
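As a small illustration of how logical restrictions map onto zero-one variables, the sketch below pairs a few common conditions with linear inequalities and checks each pairing by enumerating every 0-1 assignment. The choice of conditions and the helper name `encoding_is_exact` are assumptions made for this example, not something taken from the FAQ.

```python
from itertools import product

# Each logical condition on two 0-1 variables is paired with a linear
# inequality (or equation) intended to express the same restriction.
encodings = {
    "x implies y":         (lambda x, y: (not x) or y,  lambda x, y: x <= y),
    "x or y":              (lambda x, y: x or y,        lambda x, y: x + y >= 1),
    "not both x and y":    (lambda x, y: not (x and y), lambda x, y: x + y <= 1),
    "exactly one of x, y": (lambda x, y: x != y,        lambda x, y: x + y == 1),
}

def encoding_is_exact(logic, linear):
    """True if the linear form holds for exactly the 0-1 points the logic allows."""
    return all(bool(logic(x, y)) == linear(x, y)
               for x, y in product((0, 1), repeat=2))

results = {name: encoding_is_exact(logic, linear)
           for name, (logic, linear) in encodings.items()}

for name, ok in results.items():
    print(f"{name}: {'exact' if ok else 'mismatch'}")
```

Enumeration is only practical here because there are two variables; in a real model the inequalities would simply be added as constraints.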

For the sake of generality, the following discussion uses the term MIP to refer to any kind of integer LP problem; the other kinds can be viewed as special cases. MIP, in turn, is a particular member of the class of combinatorial or discrete optimization problems. In fact the problem of determining whether a MIP has a feasible solution with an objective value less than a given target is a member of the class of "NP-complete" problems, all of which are very hard to solve (at least as far as anyone has been able to tell). Since any NP-complete problem is reducible to any other, virtually any combinatorial problem of interest can be attacked in principle by solving some equivalent MIP. This approach sometimes works well in practice, though it is by no means infallible.

People are sometimes surprised to learn that MIP problems are solved using floating point arithmetic. Most available general-purpose large-scale MIP codes use a procedure called "branch-and-bound" to search for an optimal integer solution by solving a sequence of related LP "relaxations" that allow some fractional values. Good codes for MIP distinguish themselves primarily by solving shorter sequences of LPs, and secondarily by solving the individual LPs faster. (The similarities between successive LPs in the "search tree" can be exploited to speed things up considerably.) Even more so than with regular LP, a costly commercial code may prove its value if your MIP model is difficult.
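To make the branch-and-bound idea concrete, here is a minimal sketch for a 0-1 knapsack problem, chosen because its LP relaxation has a single constraint and can be solved by a greedy value-to-weight rule rather than a full simplex code. The function names and the sample data are assumptions made for illustration, and the sketch uses exact fractions for simplicity, whereas the production codes described above work in floating point.

```python
from fractions import Fraction

def solve_relaxation(values, weights, capacity, fixed):
    """LP relaxation of a 0-1 knapsack with some variables fixed to 0 or 1.

    Assumes positive integer values and weights. Returns (bound, x),
    or None if the fixed variables already exceed the capacity.
    """
    room = capacity - sum(weights[i] for i, v in fixed.items() if v == 1)
    if room < 0:
        return None
    x = {i: Fraction(v) for i, v in fixed.items()}
    free = sorted((i for i in range(len(values)) if i not in fixed),
                  key=lambda i: Fraction(values[i], weights[i]), reverse=True)
    for i in free:  # fill greedily; at most one variable ends up fractional
        x[i] = min(Fraction(1), Fraction(room, weights[i]))
        room -= x[i] * weights[i]
    return sum(values[i] * x[i] for i in x), x

def branch_and_bound(values, weights, capacity):
    best_value = 0                                  # taking nothing is feasible
    best_x = {i: 0 for i in range(len(values))}
    stack = [{}]                                    # a node = variables fixed so far
    nodes = 0
    while stack:
        fixed = stack.pop()
        nodes += 1
        relaxed = solve_relaxation(values, weights, capacity, fixed)
        if relaxed is None:
            continue                                # prune: infeasible
        bound, x = relaxed
        if bound <= best_value:
            continue                                # prune: cannot beat incumbent
        fractional = [i for i, xi in x.items() if 0 < xi < 1]
        if not fractional:                          # relaxation is integral
            best_value, best_x = bound, {i: int(xi) for i, xi in x.items()}
            continue
        i = fractional[0]                           # branch on the fractional variable
        stack.append({**fixed, i: 0})
        stack.append({**fixed, i: 1})
    return best_value, best_x, nodes

best_value, best_x, nodes = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best_value, best_x, nodes)
```

The count of nodes explored stands in for the "sequence of LPs" mentioned above: tighter bounds and better branching choices shorten it.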

Another solution approach known generally as constraint logic programming (CLP) has drawn increasing interest of late. Having their roots in studies of logical inference in artificial intelligence, CLP codes typically do not proceed by solving any LPs. As a result, compared to branch-and-bound they search "harder" but faster through the tree of potential solutions. Their greatest advantage, however, lies in their ability to tailor the search to many constraint forms that can be converted only with difficulty to the form of an integer program; their greatest success tends to be with "highly combinatorial" optimization problems such as scheduling, sequencing, and assignment, where the construction of an equivalent IP would require the definition of large numbers of zero-one variables. More information and a list of available codes can be found in the Constraints FAQ.

Whatever your solution technique, you should be prepared to devote far more computer time and memory to solving a MIP problem than to solving the corresponding LP relaxation. (Or equivalently, you should be prepared to solve much smaller MIP problems than LP problems using a given amount of computer resources.) To further complicate matters, the difficulty of any particular MIP problem is hard to predict (in advance, at least!). Problems in no more than a hundred variables can be challenging, while others in tens of thousands of variables solve readily. The best explanations of why a particular MIP is difficult often rely on some insight into the system you are modeling, and even then tend to appear only after a lot of computational tests have been run. A related observation is that the way you formulate your model can be as important as the actual choice of solver.

Thus a MIP problem with hundreds of variables (or more) should be approached with a certain degree of caution and patience. A willingness to experiment with alternative formulations and with a MIP code's many search options often pays off in greatly improved performance. In the hardest cases, you may wish to abandon the goal of a provable optimum; by terminating a MIP code prematurely, you can often obtain a high-quality solution along with a provable upper bound on its distance from optimality. A solution whose objective value is within some fraction of 1% of optimal may be all that is required for your purposes. (Indeed, it may be an optimal solution. In contrast to methods for ordinary LP, procedures for MIP may not be able to prove a solution to be optimal until long after they have found it.)
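The stopping rule described here can be sketched as a relative-gap test between the best solution found so far (the incumbent) and the best bound proved so far. The gap formula, the function names, and the 0.5% tolerance below are assumptions for illustration; individual MIP codes define and name their gap tolerances differently.

```python
def relative_gap(incumbent, best_bound):
    """Distance between incumbent and proven bound, relative to the incumbent."""
    if incumbent == 0:
        return 0.0 if best_bound == 0 else float("inf")
    return abs(best_bound - incumbent) / abs(incumbent)

def should_stop(incumbent, best_bound, tolerance=0.005):
    """Stop once the incumbent is provably within `tolerance` of optimal."""
    return relative_gap(incumbent, best_bound) <= tolerance

# Hypothetical minimization run: the incumbent costs 1004, and no solution
# can cost less than the proven lower bound of 1000.
print(relative_gap(1004.0, 1000.0), should_stop(1004.0, 1000.0))
```

Note that a gap of zero is reached only when the bound catches up with the incumbent, which is why the proof of optimality can arrive long after the optimal solution itself.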

Once one accepts that large MIP models are not typically solved to a proved optimal solution, that opens up a broad area of approximate methods, probabilistic methods and heuristics, as well as modifications to branch-and-bound. See also the brief discussion of Genetic Algorithms and Simulated Annealing in the Nonlinear Programming FAQ.

A major exception to this somewhat gloomy outlook is that there are certain models whose LP solution always turns out to be integer, assuming the input data is integer to start with. In general these models have a "unimodular" constraint matrix of some sort, but by far the best-known and most widely used models of this kind are the so-called pure network flow models. It turns out that such problems are best solved by specialized routines, usually based on the simplex method, that are much faster than any general-purpose LP methods.
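The precise form of the "unimodular" property usually meant in this context is total unimodularity: every square submatrix has determinant -1, 0, or +1. The brute-force check below is a sketch that is only feasible for tiny matrices; it confirms the property for the node-arc incidence matrix of a small directed network and shows that it fails for an ordinary 0-1 matrix. The function names and both example matrices are assumptions made for illustration.

```python
from itertools import combinations

def det(m):
    """Integer determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def is_totally_unimodular(a):
    """Check every square submatrix; the cost grows exponentially with size."""
    rows, cols = len(a), len(a[0])
    for k in range(1, min(rows, cols) + 1):
        for r in combinations(range(rows), k):
            for c in combinations(range(cols), k):
                sub = [[a[i][j] for j in c] for i in r]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True

def incidence_matrix(num_nodes, arcs):
    """Node-arc incidence matrix: +1 where an arc leaves a node, -1 where it enters."""
    a = [[0] * len(arcs) for _ in range(num_nodes)]
    for j, (tail, head) in enumerate(arcs):
        a[tail][j] = 1
        a[head][j] = -1
    return a

network = incidence_matrix(4, [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)])
general = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]  # determinant 2, so not unimodular

print(is_totally_unimodular(network), is_totally_unimodular(general))
```

In practice nobody tests the matrix this way; the structure of the model (for example, a pure network flow formulation) is what guarantees the property.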

See the LP FAQ list for additional details.