Date of this version: August 1, 1997
Q1. "What is Linear Programming?"
A: (For rigorous definitions and theory, which are beyond the scope of this document, the interested reader is referred to the many LP textbooks in print, a few of which are listed in the references section.)
A Linear Program (LP) is a problem that can be expressed as follows (the so-called Standard Form):
    minimize    cx
    subject to  Ax = b
                x >= 0

where x is the vector of variables to be solved for, A is a matrix of known coefficients, and c and b are vectors of known coefficients. The expression "cx" is called the objective function, and the equations "Ax=b" are called the constraints. All these entities must have consistent dimensions, of course, and you can add "transpose" symbols to taste. The matrix A is generally not square, hence you don't solve an LP by just inverting A. Usually A has more columns than rows, and Ax=b is therefore quite likely to be under-determined, leaving great latitude in the choice of x with which to minimize cx.
The word "Programming" is used here in the sense of "planning"; the necessary relationship to computer programming was incidental to the choice of name. Hence the phrase "LP program" to refer to a piece of software is not a redundancy, although I tend to use the term "code" instead of "program" to avoid the possible ambiguity.
Although all linear programs can be put into the Standard Form, in practice it may not be necessary to do so. For example, although the Standard Form requires all variables to be non-negative, most good LP software allows general bounds l <= x <= u, where l and u are vectors of known lower and upper bounds. Individual elements of these bounds vectors can even be infinity and/or minus-infinity. This allows a variable to be without an explicit upper or lower bound, although of course the constraints in the A-matrix will need to put implied limits on the variable or else the problem may have no finite solution. Similarly, good software allows b1 <= Ax <= b2 for arbitrary b1, b2; the user need not hide inequality constraints by the inclusion of explicit "slack" variables, nor write Ax >= b1 and Ax <= b2 as two separate constraints. Also, LP software can handle maximization problems just as easily as minimization (in effect, the vector c is just multiplied by -1).
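To make the slack-variable and sign-flip remarks concrete, here is a minimal Python sketch of the conversion to Standard Form. The function name `to_standard_form` and the row representation are assumptions made for this illustration, not part of any LP package, and the sketch assumes every original variable is already non-negative.

```python
def to_standard_form(c, rows, maximize=False):
    """Convert a small LP to: minimize c_std.x subject to A_std x = b_std, x >= 0.

    rows is a list of (coefficients, sense, rhs) with sense in {"<=", ">=", "="}.
    Hypothetical helper for illustration; assumes all original variables are >= 0.
    """
    n_slack = sum(1 for _, sense, _ in rows if sense != "=")
    # Maximizing cx is the same as minimizing (-c)x; slack variables cost nothing.
    c_std = [(-v if maximize else v) for v in c] + [0] * n_slack
    A_std, b_std = [], []
    k = 0  # index of the next unused slack column
    for coeffs, sense, rhs in rows:
        slack = [0] * n_slack
        if sense == "<=":
            slack[k] = 1    # a.x + s = rhs, with s >= 0
            k += 1
        elif sense == ">=":
            slack[k] = -1   # a.x - s = rhs, with s >= 0
            k += 1
        A_std.append(list(coeffs) + slack)
        b_std.append(rhs)
    return c_std, A_std, b_std


# Example: maximize 3x + 2y subject to x + y <= 4, x - y >= 1, x, y >= 0.
c_std, A_std, b_std = to_standard_form(
    [3, 2],
    [([1, 1], "<=", 4), ([1, -1], ">=", 1)],
    maximize=True,
)
print(c_std)  # [-3, -2, 0, 0]
print(A_std)  # [[1, 1, 1, 0], [1, -1, 0, -1]]
print(b_std)  # [4, 1]
```

Good LP software does this kind of bookkeeping internally, which is why you rarely need to do it by hand.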
The importance of linear programming derives in part from its many applications (see further below) and in part from the existence of good general-purpose techniques for finding optimal solutions. These techniques take as input only an LP in the above Standard Form, and determine a solution without reference to any information concerning the LP's origins or special structure. They are fast and reliable over a substantial range of problem sizes and applications.
Two families of solution techniques are in wide use today. Both visit a progressively improving series of trial solutions, until a solution is reached that satisfies the conditions for an optimum. Simplex methods, introduced by Dantzig about 50 years ago, visit "basic" solutions computed by fixing enough of the variables at their bounds to reduce the constraints Ax = b to a square system, which can be solved for unique values of the remaining variables. Basic solutions represent extreme boundary points of the feasible region defined by Ax = b, x >= 0, and the simplex method can be viewed as moving from one such point to another along the edges of the boundary. Barrier or interior-point methods, by contrast, visit points within the interior of the feasible region. These methods derive from techniques for nonlinear programming that were developed and popularized in the 1960s by Fiacco and McCormick, but their application to linear programming dates back only to Karmarkar's innovative analysis in 1984.
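The notion of a "basic" solution can be illustrated on a tiny example. The sketch below is not the simplex method: it simply enumerates every choice of m columns of A, solves the resulting square system, and keeps the non-negative solutions, which is hopeless for anything but toy problems. The simplex method visits only a short path through these same points. All function names here are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination; None if M is singular."""
    m = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(m):
        pivot = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][m] for r in range(m)]


def basic_feasible_solutions(A, b):
    """Yield (basis, x) for every basic solution of Ax = b that has x >= 0."""
    m, n = len(A), len(A[0])
    for basis in combinations(range(n), m):
        B = [[A[i][j] for j in basis] for i in range(m)]
        z = solve_square(B, b)
        if z is None or any(v < 0 for v in z):
            continue  # singular basis, or a basic solution that is infeasible
        x = [Fraction(0)] * n  # nonbasic variables sit at their bound (zero)
        for j, v in zip(basis, z):
            x[j] = v
        yield basis, x


def best_vertex(c, A, b):
    """Return the basic feasible solution with the smallest objective value cx."""
    return min(
        basic_feasible_solutions(A, b),
        key=lambda bx: sum(ci * xi for ci, xi in zip(c, bx[1])),
    )


# minimize -3x - 2y  subject to  x + y + s1 = 4,  x + 3y + s2 = 6,  all >= 0
c = [-3, -2, 0, 0]
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
basis, x = best_vertex(c, A, b)
print(basis, [str(v) for v in x])  # (0, 3) ['4', '0', '0', '2']
```

Of the six possible bases in this example, four give feasible points, and each of those is a corner of the feasible region.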
The related problem of integer programming (or integer linear programming, strictly speaking) requires some or all of the variables to take integer (whole number) values. Integer programs (IPs) often have the advantage of being more realistic than LPs, but the disadvantage of being much harder to solve. The most widely used general-purpose techniques for solving IPs use the solutions to a series of LPs to manage the search for integer solutions and to prove optimality. Thus most IP software is built upon LP software, and this FAQ applies to problems of both kinds.
Linear and integer programming have proved valuable for modeling many and diverse types of problems in planning, routing, scheduling, assignment, and design. Industries that make use of LP and its extensions include transportation, energy, telecommunications, and manufacturing of many kinds. A sampling of applications can be found in many LP textbooks, in books on LP software systems, and among the application cases in the journal Interfaces.
Q2. "Where is there good software to solve LP problems?"
A: Thanks to the advances in computing of the past decade, linear programs in a few thousand variables and constraints are nowadays viewed as "small". Problems having tens or hundreds of thousands of continuous variables are regularly solved; tractable integer programs are necessarily smaller, but are still commonly in the hundreds or thousands of variables and constraints. The computers of choice for linear and integer programming applications are Pentium-based PCs and the several varieties of Unix workstations.
There is more to linear programming than optimal solutions and number-crunching, however. This can be appreciated by observing that modern LP software comes in two related but very different kinds of packages: algorithmic codes and modeling systems.
Most modeling systems support a variety of algorithmic codes, while the more popular codes can be used with many different modeling systems. Because packages of the two kinds are often bundled for convenience of marketing or operation, the distinction between them is sometimes obscured, but it is important to keep in mind when attempting to sort through the many alternatives available.
Large-scale LP algorithmic codes rely on general-structure sparse matrix techniques and numerous other refinements developed through years of experience. The fastest and most reliable codes thus represent considerable development effort, and tend to be expensive except in very limited demonstration or "student" versions. Those codes that are free -- to all, or at least for research and teaching -- tend to be somewhat less robust, though they are still useful for many problems. The ability of a code to solve any particular class of problems cannot easily be predicted from problem size alone; some experimentation is usually necessary to establish difficulty.
Large-scale LP modeling systems are commercial products virtually without exception, and tend to be as expensive as the commercial algorithmic codes (again with the exception of small demo versions). They vary so greatly in design and capability that a description in words is adequate only to make a preliminary decision among them; your ultimate choice is best guided by using each candidate to formulate a model of interest.
Listed below are summary descriptions of available free codes, and a tabulation of many commercial codes and modeling systems for linear (and integer) programming. A list of free demos of commercial software appears at the end of this section.
Another useful source of information is the Optimization Software Guide by Jorge Moré and Stephen Wright, available from SIAM Books. It contains references to about 75 available software packages (not all of them just LP), and goes into more detail than is possible in this FAQ; see in particular the sections on "linear programming" and on "modeling languages and optimization systems." An updated Web version of this book is available on the NEOS Guide. Another good source of feature summaries and contact information is the Linear Programming Software Survey compiled by OR/MS Today (which also has the largest selection of advertisements for optimization software). Much information can also be obtained through the web sites of optimization software developers, many of which are identified in the writeup and tables below.
To provide some idea of the relative performance of LP codes, a Web page of pointers to benchmarks for optimization software is being compiled by Hans Mittelmann of Arizona State University. It currently includes tests of several public-domain simplex and interior-point implementations. When evaluating any performance comparison, however, whether performed by a customer, vendor, or disinterested third party, keep in mind that all high-quality codes provide options that offer superior performance on certain difficult kinds of LP or IP problems. Benchmark studies of the "default settings" of codes will fail to reflect the power of the optional settings that are available.
Based on the simplex method:
There is an ftp-able code, written in C, called lp_solve that its author (Michel Berkelaar, email at michel@es.ele.tue.nl) says has solved models with up to 30,000 variables and 50,000 constraints. The author requests that people retrieve it from ftp://ftp.es.ele.tue.nl/pub/lp_solve (numerical address at last check: 131.155.20.126). There is an older version to be found in the Usenet archives, but it contains bugs that have been fixed in the meantime, and hence is unsupported. The author also made available a program that converts data files from MPS-format into lp_solve's own input format; it's in the same directory, in file mps2eq_0.2.tar.Z. The documentation states that it is not public domain, and the author wants to discuss it with would-be commercial users. As an editorial opinion, I must state that difficult models will give lp_solve trouble; it's not as good as a commercial code. But for someone who isn't sure what kind of LP code is needed, it represents a reasonable first try.
LP-Optimizer is a simplex-based code for linear and integer programs, written by Markus Weidenauer (nc-weidenma@netcologne.de). Free Borland Pascal 7.0 source is available for downloading, as are executables for DOS and OS/2.
SoPlex is an object-oriented implementation of the primal and dual simplex algorithms, developed by Roland Wunderling. Source code is available free for research uses at noncommercial and academic institutions.
Among the SLATEC library routines is a Fortran sparse implementation of the simplex method, SPLP, at ftp://netlib2.cs.utk.edu/slatec/src/splp.f. Its documentation states that it can solve LP models of "at most a few thousand constraints and variables".
Based on interior-point methods:
The Optimization Technology Center at Argonne and Northwestern has developed the interior-point code PCx. This code can be downloaded directly from the PCx home page; it is freely available, except that you must contact Argonne if you want to include it in a product for resale. A Windows 95/NT version of PCx was announced in April 1997, and is available under the same conditions as the original. (If you want to solve an LP without downloading a code to your own machine, you can execute PCx remotely through the NEOS Server.)
A Fortran 77 interior-point code, BPMPD, has been developed by Csaba Mészáros (meszaros@sztaki.hu) at the Computer and Automation Research Institute of the Hungarian Academy of Sciences. It is available as source code, as a Windows 95/NT executable (which is also extended to solve convex quadratic problems), and in a DLL version for Windows.
Jacek Gondzio (gondzio@divsun.unige.ch) has made source for his interior point LP solver HOPDM available at http://ecolu-info.unige.ch/~logilab/software/hopdm.html. Additionally, several papers devoted to the HOPDM code are available at this site. It uses a higher order primal-dual predictor-corrector logarithmic barrier algorithm, and according to David Gay, it "seems to work well in limited testing. For example, it happily solves all of the examples in netlib's lp/data directory." Prof. Gondzio notes that problem size is limited only by available memory, and on a virtual memory system it has been used to solve models with hundreds of thousands of constraints and variables. An older version of the source code is kept in netlib's opt directory: ftp://netlib.bell-labs.com/netlib/opt/hopdm.shar.Z
Other software of interest:
A web-based service by a group at Berkeley called Interactive Linear Programming appears to be useful for solving small models that can be entered by hand. Along similar lines, the NEOS Guide offers a Java-based Simplex Tool, which demonstrates the workings of the simplex method on small user-entered problems and is especially useful for educational purposes.
The Systems Analysis Laboratory at Seoul National University offers Linear Programming software (both Simplex and Barrier) at http://orly1.snu.ac.kr/Software.html
Will Naylor (naylor@mti.sgi.com) has a collection of software he calls WNLIB. Routines of interest include
- simplex method for linear programming: contains anti-cycling and numerical stability hacks. No optimization for sparse matrix.
- transportation problem/assignment problem routine: optimization for sparse matrix.
Read the INSTALL.txt file for further information. WNLIB also contains routines pertaining to nonlinear optimization.
The next several suggestions are for public-domain codes that are severely limited by the algorithm they use (tableau Simplex); they may be OK for models with (on the order of) 100 variables and constraints, but it's unlikely they will be satisfactory for larger models. In the words of Matt Saltzman (mjs@clemson.edu):
For Macintosh users there is a free package called LinPro that is available at ftp://ftp.ari.net/MacSciTech/programming/. Some users have reported that it performs well, while one correspondent informs me he had trouble getting it to solve any problems at all; perhaps this code is sensitive to memory size of the machine. It comes with a "large example" of 100 variables, which gives a hint of its design limits. It seems to be slower than commercial codes, but that should not be a surprise (or a criticism of a free code). LinPro has its own input format and does not support MPS format.
Walter C. Riley (73700.776@compuserve.com) writes:
Stephen F. Gale (sfgale@freenet.calgary.ab.ca) writes:
The following suggestions may represent low-cost ways of solving LPs if you already have certain software available to you.
If your models prove to be too difficult for free or add-on software to handle, then you may have to consider acquiring a commercial LP code. Dozens of such codes are on the market. There are many considerations in selecting an LP code. Speed is important, but LP is complex enough that different codes go faster on different models; you won't find a "Consumer Reports" article to say with certainty which code is THE fastest. I usually suggest getting benchmark results for your particular type of model if speed is paramount to you. Benchmarking can also help determine whether a given code has sufficient numerical stability for your kind of models.
Other questions you should answer: Can you use a stand-alone code, or do you need a code that can be used as a callable library, or do you require source code? Do you want the flexibility of a code that runs on many platforms and/or operating systems, or do you want code that's tuned to your particular hardware architecture (in which case your hardware vendor may have suggestions)? Is the choice of algorithm (Simplex, Interior-Point) important to you? Do you need an interface to a spreadsheet code? Is the purchase price an overriding concern? If you are at a university, is the software offered at an academic discount? How much hotline support do you think you'll need? There is usually a large difference in LP codes, in performance (speed, numerical stability, adaptability to computer architectures) and in features, as you climb the price scale.
In the following table is a condensed version of a survey of LP software that appeared in the June 1992 issue of OR/MS Today (a publication of INFORMS) and that has subsequently been updated in the October 1995 and April 1997 issues. Consult the full survey for more detailed information, or click on the product names to browse their developers' web pages.
The table is in two parts, the first consisting of packages that are primarily algorithmic codes, and the second containing modeling systems. Product names are linked to product or developer web sites where known.
Under "Platform" is an indication of common environments in which the code runs, with the choices being PC-DOS and/or versions of Microsoft Windows (PC), Macintosh OS (M), and Unix on various computer types (U). For other possibilities, check the full survey or contact the vendor.
Even more so than usual, I emphasize that you must use this information at your own risk. I cannot guarantee that every entry is completely correct and up-to-date, but I will gladly correct any mistakes that are pointed out to me.
Key to Features: S=Simplex I=Interior-Point or Barrier Q=Quadratic G=General-Nonlinear M=MIP N=Network V=Visualization
Solver Product | Features | Platform | Phone | E-mail address |
CPLEX | SIMNQ | PC M U | 702-831-7744 | info@cplex.com |
C-WHIZ | SM | PC U | 703-412-3201 | ketronms@erols.com |
FortMP | SIMQ | PC U | 630-971-2337 +44 1895-256484 | naginfo@nag.com hossein@unicom.co.uk |
HI-PLEX | S | PC U | +44 171-594-8334 | i.maros@ic.ac.uk |
HS/LP | SM | PC | 201-627-1424 | info@haverly.com |
ILOG Planner | M | PC U | 415-390-9000 | info@ilog.com |
LAMPS | SM | PC U | +44 181-870-8882 | info@amsoft.demon.co.uk |
LINDO | SMQ | PC | 312-988-7422 | info@lindo.com |
LOQO | GI | PC U | 609-258-0876 | rvdb@princeton.edu |
LPS-867 | SM | PC U | 609-737-6800 | info@main.aae.com |
LS-XLSOL | SM | PC | 702-831-0300 | info@frontsys.com |
MINOS | SQG | PC | 415-962-8719 | mike@sol-michael.stanford.edu |
MINTO | M | U | 404-894-6287 | martin.savelsbergh@isye.gatech.edu |
MPSIII | SMN | PC U | 703-412-3201 | ketronms@erols.com |
OSL | SIMNQ | PC U | 914-433-4740 | osl@vnet.ibm.com |
SAS/OR | SMNGQ | PC M U | 919-677-8000 | saseph@unx.sas.com |
SCICONIC | SM | PC U | +44 1908-284188 | msukwt03.gztltm@eds.com |
SOPT | SIMGQ | PC U | 908-264-4700 | saitech@monmouth.com |
XA | SM | PC M U | 818-441-1565 | sunsetsw@ix.netcom.com |
XPRESS-MP | SIMQ | PC M | 202-887-0296 +44 1604-858993 | info@dash.co.uk |
Modeling Product | Platform | Phone | E-mail address |
AIMMS | PC | +31 23-5350935 | info@paragon.nl |
AMPL | PC U | 702-322-7600 | info@ampl.com |
ANALYZE | PC | 303-796-7830 | hgreenbe@carbon.cudenver.edu |
DecisionPRO | PC | 919-859-4101 | vginfo@vanguardsw.com |
DATAFORM | PC U | 703-412-3201 | ketronms@erols.com |
GAMS | PC U | 202-342-0180 | sales@gams.com |
LINGO | PC U | 800-441-2378 | info@lindo.com |
MathPro | PC U | 202-887-0296 | mathpro@erols.com |
MIMI | PC U | 908-464-8300 | info@chesapeake.com |
MODLER | PC U | 303-796-7830 | hgreenbe@carbon.cudenver.edu |
MPL | PC | 703-522-7900 | info@maximal-usa.com |
OMNI | PC U | 201-627-1424 | info@haverly.com |
VMP | PC U | 301-622-4319 | j-welch@sundown-vmp.com |
What's Best! | PC M U | 800-441-2378 | info@lindo.com |
XPRESS-MP | PC M | 202-887-0296 +44 1604-858993 | info@dash.co.uk |
Downloadable free demos include:
Q3. "Oh, and we also want to solve it as an integer program."
A: Integer LP models are ones whose variables are constrained to take integer or whole number (as opposed to fractional) values. It may not be obvious that integer programming is a very much harder problem than ordinary linear programming, but that is nonetheless the case, in both theory and practice.
Integer models are known by a variety of names and abbreviations, according to the generality of the restrictions on their variables. Mixed integer (MILP or MIP) problems require only some of the variables to take integer values, whereas pure integer (ILP or IP) problems require all variables to be integer. Zero-one (or 0-1 or binary) MIPs or IPs restrict their integer variables to the values zero and one. (The latter are more common than you might expect, because many kinds of combinatorial and logical restrictions can be modeled through the use of zero-one variables.)
For the sake of generality, the following discussion uses the term MIP to refer to any kind of integer LP problem; the other kinds can be viewed as special cases. MIP, in turn, is a particular member of the class of combinatorial or discrete optimization problems. In fact the problem of determining whether a MIP has an objective value less than a given target is a member of the class of "NP-complete" problems, all of which are very hard to solve (at least as far as anyone has been able to tell). Since any NP-complete problem is reducible to any other, virtually any combinatorial problem of interest can be attacked in principle by solving some equivalent MIP. This approach sometimes works well in practice, though it is by no means infallible.
People are sometimes surprised to learn that MIP problems are solved using floating point arithmetic. Most available general-purpose large-scale MIP codes use a procedure called "branch-and-bound" to search for an optimal integer solution by solving a sequence of related LP "relaxations" that allow some fractional values. Good codes for MIP distinguish themselves primarily by solving shorter sequences of LPs, and secondarily by solving the individual LPs faster. (The similarities between successive LPs in the "search tree" can be exploited to speed things up considerably.) Even more so than with regular LP, a costly commercial code may prove its value if your MIP model is difficult.
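As a rough illustration of branch-and-bound, here is a Python sketch for one very special MIP, the 0-1 knapsack problem (maximize total value subject to a single weight limit). It is chosen because its LP relaxation can be solved by a simple greedy fill, so no LP code is needed; a general MIP code would call a real LP solver at that step. The function names, the depth-first search order, and the choice of branching variable are all assumptions of this sketch, and it assumes positive integer weights.

```python
from fractions import Fraction


def lp_relaxation(values, weights, capacity, fixed):
    """Solve the LP relaxation of a 0-1 knapsack with some variables fixed to 0 or 1.

    Returns (bound, x), or None if the fixed items alone exceed the capacity.
    For this special problem the LP optimum is a greedy fill by value/weight ratio.
    """
    n = len(values)
    x = [Fraction(0)] * n
    room = Fraction(capacity)
    for j, v in fixed.items():
        x[j] = Fraction(v)
        room -= weights[j] * v
    if room < 0:
        return None
    free = sorted(
        (j for j in range(n) if j not in fixed),
        key=lambda j: Fraction(values[j], weights[j]),
        reverse=True,
    )
    for j in free:
        take = min(Fraction(1), room / weights[j])  # possibly a fractional amount
        x[j] = take
        room -= take * weights[j]
        if room == 0:
            break
    return sum(values[j] * x[j] for j in range(n)), x


def branch_and_bound(values, weights, capacity):
    """Return (best value, best 0-1 solution, number of LP relaxations solved)."""
    best_value, best_x, lps_solved = Fraction(-1), None, 0
    stack = [{}]  # each node of the search tree is a dict of fixed variables
    while stack:
        fixed = stack.pop()
        result = lp_relaxation(values, weights, capacity, fixed)
        lps_solved += 1
        if result is None:
            continue  # node is infeasible
        bound, x = result
        if bound <= best_value:
            continue  # relaxation cannot beat the incumbent: prune this node
        fractional = [j for j, v in enumerate(x) if v.denominator != 1]
        if not fractional:
            best_value, best_x = bound, [int(v) for v in x]  # new incumbent
            continue
        j = fractional[0]  # branch on the fractional variable
        stack.append({**fixed, j: 0})
        stack.append({**fixed, j: 1})
    return best_value, best_x, lps_solved


value, chosen, lps = branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(value, chosen)  # 220 [0, 1, 1]
```

The third return value counts the relaxations solved, which is the quantity that good MIP codes work hardest to keep small.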
Another solution approach known generally as constraint logic programming (CLP) has drawn increasing interest of late. Having their roots in studies of logical inference in artificial intelligence, CLP codes typically do not proceed by solving any LPs. As a result, compared to branch-and-bound they search "harder" but faster through the tree of potential solutions. Their greatest advantage, however, lies in their ability to tailor the search to many constraint forms that can be converted only with difficulty to the form of an integer program; their greatest success tends to be with "highly combinatorial" optimization problems such as scheduling, sequencing, and assignment, where the construction of an equivalent IP would require the definition of large numbers of zero-one variables. More information and a list of available codes can be found in the Constraints FAQ (also posted to the newsgroup comp.constraints).
Whatever your solution technique, you should be prepared to devote far more computer time and memory to solving a MIP problem than to solving the corresponding LP relaxation. (Or equivalently, you should be prepared to solve much smaller MIP problems than LP problems using a given amount of computer resources.) To further complicate matters, the difficulty of any particular MIP problem is hard to predict (in advance, at least!). Problems in no more than a hundred variables can be challenging, while others in tens of thousands of variables solve readily. The best explanations of why a particular MIP is difficult often rely on some insight into the system you are modeling, and even then tend to appear only after a lot of computational tests have been run. A related observation is that the way you formulate your model can be as important as the actual choice of solver.
Thus a MIP problem with hundreds of variables (or more) should be approached with a certain degree of caution and patience. A willingness to experiment with alternative formulations and with a MIP code's many search options often pays off in greatly improved performance. In the hardest cases, you may wish to abandon the goal of a provable optimum; by terminating a MIP code prematurely, you can often obtain a high-quality solution along with a provable upper bound on its distance from optimality. A solution whose objective value is within some fraction of 1% of optimal may be all that is required for your purposes. (Indeed, it may be an optimal solution. In contrast to methods for ordinary LP, procedures for MIP may not be able to prove a solution to be optimal until long after they have found it.)
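The "distance from optimality" is commonly reported as a relative gap between the best integer solution found so far and the best bound proved so far. The Python sketch below shows one plausible way to compute it; the function name is an assumption, and individual codes differ in exactly how they define and scale the gap, so check your code's documentation before comparing numbers.

```python
def relative_gap(incumbent, bound):
    """Relative gap between the best solution found and the best proven bound.

    One common convention, used here as an assumption: |incumbent - bound|
    divided by |incumbent|. Other codes divide by the bound instead.
    """
    if incumbent == 0:
        return 0.0 if bound == 0 else float("inf")
    return abs(incumbent - bound) / abs(incumbent)


# A minimization run stopped early: best solution costs 1005, and the search
# has proved that no solution can cost less than 1000.
gap = relative_gap(1005.0, 1000.0)
print(gap < 0.005)  # True: the solution is within 0.5% of optimal
```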
Once one accepts that large MIP models are not typically solved to a proved optimal solution, that opens up a broad area of approximate methods, probabilistic methods and heuristics, as well as modifications to B&B. See [Balas] which contains a useful heuristic for 0-1 MIP models. See also the brief discussion of Genetic Algorithms and Simulated Annealing in the Nonlinear Programming FAQ.
A major exception to this somewhat gloomy outlook is that there are certain models whose LP solution always turns out to be integer, assuming the input data is integer to start with. In general these models have a "unimodular" constraint matrix of some sort, but by far the best-known and most widely used models of this kind are the so-called pure network flow models. It turns out that such problems are best solved by specialized routines, usually based on the simplex method, that are much faster than any general-purpose LP methods. See the section on Network models for further information.
Commercial MIP codes are listed with the commercial LP codes and modeling systems above. The April 1994 issue of OR/MS Today contains a survey of MIP codes, which largely overlaps the content of the earlier survey on LP: "Survey: Mixed Integer Programming" by Matthew Saltzman, pp 42-51. The following are notes on some publicly available codes for MIP problems.
ftp://ftp.mpi-sb.mpg.de/pub/guide/staff/barth/opbdp/opbdp.tar.Z along with a Postscript-format technical report (in file mpii952002.ps) describing the techniques used.
Q4. "I wrote an optimization code. Where are some test models?"
A: If you want to try out your code on some real-world LP models, there is a very nice collection of small-to-medium-size ones, with a few that are rather large, at ftp://netlib2.cs.utk.edu/lp/data, popularly known as the Netlib collection (although Netlib consists of much more than just LP). These files (after you uncompress them) are in a format called MPS, which is described in another section of this document. Note that, when you receive a model, it may be compressed both with the Unix utility (use `uncompress` if the file name ends in .Z) AND with an LP-specific program (grab either emps.f or emps.c at the same time you download the model, then compile/run the program to reverse the compression).
Also on netlib is a collection of infeasible LP models, located in ftp://netlib2.cs.utk.edu/lp/infeas.
There is a collection of MIP models, called MIPLIB, housed at Rice University in http://www.caam.rice.edu/~bixby/miplib/miplib.html. FTP users can use ftp://ftp.caam.rice.edu/pub/people/bixby/miplib. Or, send an email message containing "send catalog" to softlib@rice.edu , to get started, if you can't access the files by other means.
There's a Travelling Salesman Problem library (TSPLIB) in ftp://softlib.cs.rice.edu/pub/tsplib. (Alternate address: ftp://elib.zib-berlin.de/pub/mp-testdata/tsp.) A Web version is at http://www.iwr.uni-heidelberg.de/iwr/comopt/soft/TSPLIB95/TSPLIB.html.
There is a collection of network-flow codes and models at ftp://dimacs.rutgers.edu/pub/netflow. Another network generator is called NETGEN and is available at ftp://netlib2.cs.utk.edu/lp/generators.
The commercial modeling language GAMS comes with about 150 test models, which you might be able to test your code with. AIMMS also comes with some test models.
There is a collection called MP-TESTDATA available at Konrad-Zuse-Zentrum fuer Informations-technik Berlin (ZIB) in ftp://elib.zib-berlin.de/pub/mp-testdata. This directory contains various subdirectories, each of which has a file named "index" containing further information. Indexed at this writing are: assign, cluster, lp, ip, matching, maxflow, mincost, set-parti, steiner-tree, tsp, vehicle-rout, and generators.
John Beasley maintains the OR-Lib, at ftp://mscmga.ms.ic.ac.uk/pub/, which contains various optimization test problems. There is an index in ftp://mscmga.ms.ic.ac.uk/pub/info.txt. WWW access now available at http://mscmga.ms.ic.ac.uk/. Have a look in the Journal of the Operational Research Society, Volume 41, Number 11, Pages 1069-72. If you can't access these resources, send e-mail to umtsk99@vaxa.cc.imperial.ac.uk to get started. Information about test problems can be obtained by emailing o.rlibrary@ic.ac.uk with the email message being the file name for the problem areas you are interested in, or just the word "info".
A: MPS format was named after an early IBM LP product and has emerged as a de facto standard ASCII medium among most of the commercial LP codes. Essentially all commercial LP codes accept this format, but if you are using public domain software and have MPS files, you may need to write your own reader routine for this. It's not too hard. See also the comment regarding the lp_solve code, in another section of this document, for the availability of an MPS reader.
The main things to know about MPS format are that it is column oriented (as opposed to entering the model as equations), and everything (variables, rows, etc.) gets a name. MPS format is described in more detail in [Murtagh]. A brief description of MPS format is available at ftp://softlib.cs.rice.edu/pub/miplib
MPS is an old format, so it is set up as though you were using punch cards, and is not free format. Fields start in columns 2, 5, 15, 25, 40 and 50. Sections of an MPS file are marked by so-called header cards, which are distinguished by their starting in column 1. Although it is typical to use upper-case throughout the file (like I said, MPS has long historical roots), many MPS-readers will accept mixed-case for anything except the header cards, and some allow mixed-case anywhere. The names that you choose for the individual entities (constraints or variables) are not important to the solver; you should pick names that are meaningful to you, or will be easy for a post-processing code to read.
Here is a little sample model written in MPS format (explained in more detail below):
NAME          TESTPROB
ROWS
 N  COST
 L  LIM1
 G  LIM2
 E  MYEQN
COLUMNS
    XONE      COST                 1   LIM1                 1
    XONE      LIM2                 1
    YTWO      COST                 4   LIM1                 1
    YTWO      MYEQN               -1
    ZTHREE    COST                 9   LIM2                 1
    ZTHREE    MYEQN                1
RHS
    RHS1      LIM1                 5   LIM2                10
    RHS1      MYEQN                7
BOUNDS
 UP BND1      XONE                 4
 LO BND1      YTWO                -1
 UP BND1      YTWO                 1
ENDATA
For comparison, here is the same model written out in an equation-oriented format:
Optimize
 COST:    XONE + 4 YTWO + 9 ZTHREE
Subject To
 LIM1:    XONE + YTWO <= 5
 LIM2:    XONE + ZTHREE >= 10
 MYEQN:   - YTWO + ZTHREE = 7
Bounds
 0 <= XONE <= 4
-1 <= YTWO <= 1
End
Strangely, there is nothing in MPS format that specifies the direction of optimization. And there really is no standard "default" direction; some LP codes will maximize if you don't specify otherwise, others will minimize, and still others put safety first and have no default and require you to specify it somewhere in a control program or by a calling parameter. If you have a model formulated for minimization and the code you are using insists on maximization (or vice versa), it may be easy to convert: just multiply all the coefficients in your objective function by (-1). The optimal value of the objective function will then be the negative of the true value, but the values of the variables themselves will be correct.
The NAME card can have anything you want, starting in column 15. The ROWS section defines the names of all the constraints; entries in column 2 or 3 are E for equality rows, L for less-than ( <= ) rows, G for greater-than ( >= ) rows, and N for non-constraining rows (the first of which would be interpreted as the objective function). The order of the rows named in this section is unimportant.
The largest part of the file is in the COLUMNS section, which is the place where the entries of the A-matrix are put. All entries for a given column must be placed consecutively, although within a column the order of the entries (rows) is irrelevant. Rows not mentioned for a column are implied to have a coefficient of zero.
The RHS section allows one or more right-hand-side vectors to be defined; most people don't bother having more than one. In the above example, the name of the RHS vector is RHS1, and has non-zero values in all 3 of the constraint rows of the problem. Rows not mentioned in an RHS vector would be assumed to have a right-hand-side of zero.
The optional BOUNDS section lets you put lower and upper bounds on individual variables (no * wild cards, unfortunately), instead of having to define extra rows in the matrix. All the bounds that have a given name in column 5 are taken together as a set. Variables not mentioned in a given BOUNDS set are taken to be non-negative (lower bound zero, no upper bound). A bound of type UP means an upper bound is applied to the variable. A bound of type LO means a lower bound is applied. A bound type of FX ("fixed") means that the variable has upper and lower bounds equal to a single value. A bound type of FR ("free") means the variable has neither lower nor upper bounds.
There is another optional section called RANGES that I won't go into here. The final card must be ENDATA, and yes, it is spelled funny.
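For readers who want to see how the sections fit together, here is a rough sketch of an MPS reader in Python. It is a simplification under stated assumptions, not a full implementation of the format: it splits each line on whitespace instead of using the fixed columns (so names must not contain blanks), it treats lines starting with "*" as comments, it merges all RHS and BOUNDS sets into one, and it ignores RANGES entirely. The function name parse_mps and the layout of the returned dictionary are my own inventions.

```python
SAMPLE_MPS = """\
NAME          TESTPROB
ROWS
 N  COST
 L  LIM1
 G  LIM2
 E  MYEQN
COLUMNS
    XONE      COST                 1   LIM1                 1
    XONE      LIM2                 1
    YTWO      COST                 4   LIM1                 1
    YTWO      MYEQN               -1
    ZTHREE    COST                 9   LIM2                 1
    ZTHREE    MYEQN                1
RHS
    RHS1      LIM1                 5   LIM2                10
    RHS1      MYEQN                7
BOUNDS
 UP BND1      XONE                 4
 LO BND1      YTWO                -1
 UP BND1      YTWO                 1
ENDATA
"""


def parse_mps(text):
    """Parse a small MPS file into plain dictionaries (simplified sketch)."""
    model = {"name": None, "rows": {}, "columns": {}, "rhs": {}, "bounds": {}}
    section = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("*"):
            continue  # skip blank lines and comment lines
        fields = line.split()
        if not line[0].isspace():
            # Section headers start in column 1; data lines are indented.
            section = fields[0]
            if section == "NAME" and len(fields) > 1:
                model["name"] = fields[1]
            if section == "ENDATA":
                break
            continue
        if section == "ROWS":
            row_type, row_name = fields
            model["rows"][row_name] = row_type
        elif section == "COLUMNS":
            # One or two (row, value) pairs follow the column name.
            column = model["columns"].setdefault(fields[0], {})
            for row_name, value in zip(fields[1::2], fields[2::2]):
                column[row_name] = float(value)
        elif section == "RHS":
            # Assumption: a single RHS vector, so its name (fields[0]) is ignored.
            for row_name, value in zip(fields[1::2], fields[2::2]):
                model["rhs"][row_name] = float(value)
        elif section == "BOUNDS":
            # Assumption: a single bound set, so the set name is ignored.
            bound_type, _set_name, column_name = fields[:3]
            value = float(fields[3]) if len(fields) > 3 else None  # FR has no value
            model["bounds"].setdefault(column_name, {})[bound_type] = value
    return model


model = parse_mps(SAMPLE_MPS)
```

After this runs, model["columns"]["YTWO"] holds the three coefficients of YTWO keyed by row name, and any row absent from a column's dictionary has an implied coefficient of zero, as described above.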
Broadening the question slightly to cover front-ends in general, it's worth noting that people frequently use spreadsheet programs as an interface to some LP solvers.
Another approach is to apply auxiliary algorithms that identify constraints or groups of constraints that can be considered to "cause" the infeasibility in an LP. A software system called ANALYZE was developed by Harvey Greenberg to provide computer-assisted analysis, including rule-based intelligence; he has also compiled a bibliography of more than 400 references on the subject of model analysis. A system based on the MINOS solver, called MINOS(IIS), available from John Chinneck (chinneck@sce.carleton.ca), can also be used to identify a so-called Irreducible Infeasible Subset; the IIS feature is now available in CPLEX (command "display iis"), OSL, and LINDO (command "debug"). As a final comment, commercial codes sometimes have other built-in features to help track infeasibilities.
ADBASE is a package that computes all efficient (i.e., nondominated) extreme points of a multiple objective linear program. It is available without charge for research and instructional purposes. If someone has a genuine need for such a code, they should send a request to: Ralph E. Steuer, Faculty of Management Science, Brooks Hall, University of Georgia, Athens, GA 30602-6255.
Other approaches that have worked are:
A code in C called cdd is available at ftp://ifor13.ethz.ch/pub/fukuda/cdd and is written by Komei Fukuda; download the file named cdd-***.tar.Z, where *** is the version number (choose the most recent version, which at last check was 056). It solves the problem stated above, as well as that of enumerating all vertices. There is also a C++ version at this site (version 073).
A code in C called rs, by David Avis, is available at ftp://mutt.cs.mcgill.ca/pub/C/README and implements the reverse search method.
Ken Clarkson has written a program called Hull which is an ANSI C program that computes the convex hull of a point set in general dimension. The input is a list of points, and the output is a list of facets of the convex hull of the points, each facet presented as a list of its vertices. It can be downloaded from ftp://netlib.bell-labs.com/netlib/voronoi/hull.shar.Z.
There is a directory in ftp://elib.zib-berlin.de/pub/mathprog/polyth that contains pointers to some tools for such problems.
Nina Amenta has a list of computational geometry software at http://www.geom.umn.edu/locate/cglist.
Other algorithms for such problems are described in [Swart], [Seidel], and [Avis]. Such topics are said to be discussed in [Schrijver] (page 224), [Chvatal] (chapter 18), [Balinski], and [Mattheis] as well. Part of the method described in [Avis], to enumerate vertices, is implemented in a Mathematica package called VertexEnum.m at ftp://cs.sunysb.edu/pub/Combinatorica, in file VE041.Z.
Jeffrey Horn (horn@cs.wisc.edu) has compiled a bibliography of papers relating to research on parallel B&B. There is a survey article by Gendron and Crainic in the journal Operations Research, Vol. 42 (1994), No. 6, pp. 1042-1066.
If your particular model is a good candidate for decomposition (see Decomposition, above) then that form of parallelism could also be very useful, but you'll have to implement it yourself. Here's what I say to people who write parallel LP solvers as class projects:
You are probably working with the tableau form of the Simplex method. This method works well for small models, but it is inefficient for most real-world models because such models are usually <1% dense. Sparse matrix methods dominate here. It may well be true that you can get good parallel speedups with your code, but I'd wager that by the time you get to problems with 1000 rows, any parallel-dense LP code will be slower than a single-processor sparse code. And, worse yet, I think it's generally accepted that no one currently knows how to do a good (i.e., scalable) parallel sparse LP code. I wouldn't be harping on this point, except that most people's interest in parallelism is because of the promise of scalability, in which case large-scale considerations are important. Writing even a single-processor large-scale LP code is a multi-year project, realistically. The point is, don't get too enthralled by speedups in your code, unless there's something to what you are doing that I haven't guessed.
The network linear programming problem is to minimize the (linear) total cost of flows along all arcs of a network, subject to conservation of flow at each node, and upper and/or lower bounds on the flow along each arc. This is a special case of the general linear programming problem. The transportation problem is an even more special case in which the network is bipartite: all arcs run from nodes in one subset to the nodes in a disjoint subset. A variety of other well-known network problems, including shortest path problems, maximum flow problems, and certain assignment problems, can also be modeled and solved as network linear programs. Details are presented in many books on linear programming and operations research.
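As an illustration of why this is a special case of LP, the conservation-of-flow constraints can be written as Ax = b, where A is the node-arc incidence matrix. The following Python sketch builds that matrix for a tiny bipartite (transportation) network and checks one feasible flow against it. The node names, supplies, costs, and the flow itself are made-up numbers for illustration, and the flow shown is merely feasible, not claimed to be optimal.

```python
def node_arc_incidence(nodes, arcs):
    """One row per node, one column per arc: +1 at the tail, -1 at the head."""
    row_of = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(arcs) for _ in nodes]
    for j, (tail, head) in enumerate(arcs):
        matrix[row_of[tail]][j] = 1    # flow leaves the tail node
        matrix[row_of[head]][j] = -1   # flow enters the head node
    return matrix


# Hypothetical transportation network: every arc runs from a source to a sink.
nodes = ["S1", "S2", "T1", "T2"]
arcs = [("S1", "T1"), ("S1", "T2"), ("S2", "T1"), ("S2", "T2")]
supply = [5, 3, -4, -4]    # positive = supply at the node, negative = demand
unit_cost = [2, 3, 4, 1]   # cost per unit of flow on each arc
flow = [4, 1, 0, 3]        # a feasible (not necessarily optimal) flow

A = node_arc_incidence(nodes, arcs)

# Conservation of flow: net outflow at each node must equal its supply.
net_outflow = [sum(a * x for a, x in zip(row, flow)) for row in A]
conserves_flow = net_outflow == supply

# The (linear) objective: total cost of the flows along all arcs.
total_cost = sum(c * x for c, x in zip(unit_cost, flow))
```

Every column of A has exactly one +1 and one -1, which is the structure that specialized network algorithms exploit.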
Network linear programs can be solved 10 to 100 times faster than general linear programs of the same size, by use of specialized optimization algorithms. Some commercial LP solvers include a version of the network simplex method for this purpose. That method has the nice property that, if it is given integer flow data, it will return optimal flows that are integral. Integer network LPs can thus be solved efficiently without resort to complex integer programming software.
Unfortunately, many different network problems of practical interest do not have a formulation as a network LP. These include network LPs with additional linear "side constraints" (such as multicommodity flow problems) as well as problems of network routing and design that have completely different kinds of constraints. In principle, nearly all of these network problems can be modeled as integer programs. Some "easy" cases can be solved much more efficiently by specialized network algorithms, however, while other "hard" ones are so difficult that they require specialized methods that may or may not involve some integer programming. Contrary to many people's intuition, the statement of a hard problem may be only marginally more complicated than the statement of some easy problem.
A canonical example of a hard network problem is the "traveling salesman" problem of finding a shortest tour through a network that visits each node once. A canonical easy problem not obviously equivalent to a linear program is the "minimum spanning tree" problem to find a least-cost collection of arcs that connect all the nodes. But if instead you want to connect only some given subset of nodes (the "Steiner tree" problem) then you are faced with a hard problem. These and many other network problems are described in some of the references below.
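To give a feel for how simple an "easy" network problem can be, here is a short Python sketch of the minimum spanning tree problem solved by Kruskal's greedy algorithm. The FAQ text does not prescribe any particular method; Kruskal's is simply one well-known choice, and the five-arc example network is invented for illustration.

```python
def minimum_spanning_tree(node_count, edges):
    """Kruskal's algorithm; edges are (cost, u, v) with nodes 0..node_count-1."""
    parent = list(range(node_count))  # union-find forest, one tree per node

    def find(node):
        # Follow parent links to the root, shortening the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for cost, u, v in sorted(edges):      # cheapest arcs first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:              # arc joins two separate components
            parent[root_u] = root_v
            tree.append((cost, u, v))
    return tree


# Hypothetical network with 4 nodes and 5 arcs.
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
tree = minimum_spanning_tree(4, edges)
tree_cost = sum(cost for cost, _, _ in tree)
```

No comparably simple greedy rule is known for the Steiner tree or traveling salesman problems, which is the contrast the paragraph above is drawing.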
Software for network optimization is thus in a much more fragmented state than is general-purpose software for linear programming. Here are some places to look for software that might serve your purposes:
Recourse Problems are staged problems wherein one alternates decisions with realizations of stochastic data. The objective is to minimize total expected costs of all decisions. The main sources of code (not necessarily public domain) depend on how the data is distributed and how many stages (decision points) are in the problem. For discretely distributed multistage problems, a good package called MSLiP is available from Gus Gassmann (gassmann@ac.dal.ca, written up in Math. Prog. 47, 407-423). Also, for discretely distributed problems that are not huge, a deterministic equivalent can be formed, which can be solved with a standard solver. STOPGEN, available via anonymous FTP from this author, is a program which forms deterministic equivalent MPS files from stopro problems in standard format (Birge, et al., COAL newsletter 17). The most recent program for continuously distributed data is BRAIN, by K. Frauendorfer (frauendorfer@sgcl1.unisg.ch), written up in detail in the author's monograph "Stochastic Two-Stage Programming", Lecture Notes in Economics & Math. Systems #392 (Springer-Verlag).
CCP problems are not usually staged, and have a constraint of the form Pr( Ax <= b ) >= alpha. The solvability of CCP problems depends on the distribution of the data (A and/or b). I don't know of any public domain codes for CCP problems, but you can get an idea of how to approach the problem by reading Chapter 5, by Prof. A. Prekopa (prekopa@cancer.rutgers.edu), in Y. Ermoliev and R. J-B. Wets, eds., Numerical Techniques for Stochastic Optimization (Series in Comp. Math. 10, Springer-Verlag, 1988).
Both Springer-Verlag texts mentioned above are good introductory references to Stochastic Programming. This list of codes is far from comprehensive, but should serve as a good starting point.
For a MIP model with both integer and continuous variables, you could get a limited amount of information by fixing the integer variables at their optimal values, re-solving the model as an LP, and doing standard post-optimal analyses on the remaining continuous variables; but this tells you nothing about the integer variables, which presumably are the ones of interest. Another MIP approach would be to choose the coefficients of your model that are of the most interest, and generate "scenarios" using values within a stated range created by a random number generator. Perhaps five or ten scenarios would be sufficient; you would solve each of them, and by some means compare, contrast, or average the answers that are obtained. Noting patterns in the solutions, for instance, may give you an idea of what solutions might be most stable. A third approach would be to consider a goal-programming formulation; perhaps your desire to see post-optimal analysis is an indication that some important aspect is missing from your model.
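The scenario approach might be organized along the following lines. This Python sketch is only a skeleton under heavy assumptions: the "model" is a toy 0/1 knapsack solved by brute force, standing in for whatever MIP and solver you actually have, and the nominal values, the 10% range, and the function names are all invented for illustration.

```python
import itertools
import random


def make_scenarios(nominal, spread, count, seed=0):
    """Draw `count` coefficient vectors, each entry within nominal * (1 +/- spread)."""
    rng = random.Random(seed)  # seeded so the scenarios can be regenerated
    return [[c * rng.uniform(1 - spread, 1 + spread) for c in nominal]
            for _ in range(count)]


def solve_toy_knapsack(values, weights, capacity):
    """Brute-force 0/1 knapsack; a stand-in for a call to a real MIP solver."""
    best_value, best_choice = float("-inf"), None
    for choice in itertools.product((0, 1), repeat=len(values)):
        if sum(w * x for w, x in zip(weights, choice)) <= capacity:
            value = sum(v * x for v, x in zip(values, choice))
            if value > best_value:
                best_value, best_choice = value, choice
    return best_choice


# Hypothetical data: the objective coefficients are the uncertain ones.
nominal_values = [10.0, 13.0, 7.0, 8.0]
weights = [4, 6, 3, 5]
capacity = 10

scenarios = make_scenarios(nominal_values, spread=0.10, count=10, seed=1)
solutions = [solve_toy_knapsack(v, weights, capacity) for v in scenarios]

# How often is each integer variable set to 1 across the scenarios?
selection_counts = [sum(sol[i] for sol in solutions) for i in range(len(weights))]
```

A variable that is chosen in all ten scenarios (or in none) is a candidate for a "stable" decision; one that flips back and forth deserves a closer look.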
The simplest answer to the problem of degeneracy/cycling is often to "get a better optimizer", i.e. one with stronger pricing algorithms, and a better selection of features. However, obviously that is not always an option (money!), and even the best LP codes can run into degeneracy on certain models. Besides, they say it's a poor workman who blames his tools.
So, when one cannot change the optimizer, it's expedient to change the model. Not drastically, of course, but a little "noise" can usually help to break the ties that occur during the Simplex method. A procedure that can work nicely is to add, to the values in the RHS, random values roughly six orders of magnitude smaller. Depending on your model's formulation, such a perturbation may not even seriously affect the quality of the solution values. However, if you want to switch back to the original formulation, the final solution basis for the perturbed model should be a useful starting point for a "cleanup" optimization phase. (Depending on the code you are using, this may take some ingenuity to do, however.)
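A minimal sketch of such a perturbation in Python follows. The details are my assumptions rather than a recipe: the noise is drawn uniformly and symmetrically around zero, it is scaled relative to each RHS entry, and zero entries are treated as if their magnitude were 1. You may prefer one-sided noise, depending on the sense of each constraint.

```python
import random


def perturb_rhs(rhs, relative_scale=1e-6, seed=0):
    """Return a copy of rhs with small random noise added to each entry."""
    rng = random.Random(seed)  # fixed seed so the perturbation is repeatable
    perturbed = []
    for value in rhs:
        # Assumption: zero entries are perturbed as if their magnitude were 1.
        magnitude = abs(value) if value != 0 else 1.0
        noise = rng.uniform(-relative_scale, relative_scale) * magnitude
        perturbed.append(value + noise)
    return perturbed


# RHS values of the sample model used earlier in this FAQ (LIM1, LIM2, MYEQN).
original_rhs = [5.0, 10.0, 7.0]
noisy_rhs = perturb_rhs(original_rhs)
```

Keeping the seed fixed means the same perturbed model can be regenerated later, which helps when you go back to the original RHS for the "cleanup" phase.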
Another helpful tactic: if your optimization code has more than one solution algorithm, you can alternate among them. When one algorithm gets stuck, begin again with another algorithm, using the most recent basis as a starting point. For instance, alternating between a primal and a dual method can move the solution away from a nasty point of degeneracy. Using partial pricing can be a useful tactic against true cycling, as it tends to reorder the columns. And of course Interior Point algorithms are much less affected by (though not totally immune to) degeneracy. Unfortunately, the optimizers richest in alternate algorithms and features also tend to be least prone to problems with degeneracy in the first place.
Q7. "What references and web links are there in this field?"
A: What follows here is an idiosyncratic list, a few books that I like, or have been recommended on the net, or are recent. I have *not* reviewed them all.
Regarding the common question of the choice of textbook for a college LP course, it's difficult to give a blanket answer because of the variety of topics that can be emphasized: brief overview of algorithms, deeper study of algorithms, theorems and proofs, complexity theory, efficient linear algebra, modeling techniques, solution analysis, and so on. A small and unscientific poll of ORCS-L mailing list readers in 1993 uncovered a consensus that [Chvatal] was in most ways pretty good, at least for an algorithmically oriented class; of course, some new candidate texts have been published in the meantime. For a class in modeling, a book about a commercial code would be useful (LINDO, AMPL, GAMS were suggested), especially if the students are going to use such a code; and I have always had a fondness for the book by [Williams].
Q8. "How do I access the Netlib server?"
A: If you have FTP access, you can try "ftp netlib2.cs.utk.edu", using "anonymous" as the Name, and your email address as the Password. Do a "cd (dir)" where (dir) is whatever directory was mentioned, and look around, then do a "get (filename)" on anything that seems interesting. There often will be a "README" file, which you would want to look at first. Another FTP site is netlib.bell-labs.com although you will first need to do "cd netlib" before you can cd to the (dir) you are interested in. Alternatively, you can reach an e-mail server via "netlib@ornl.gov", to which you can send a message saying "send index from (dir)"; follow the instructions you receive. This is a list of sites mirroring the netlib repository:
Q9. "Who maintains this FAQ list?"
A: This list was established by John W. Gregory (ashbury@skypoint.com), and is currently being maintained by Robert Fourer (4er@iems.nwu.edu) and the Optimization Technology Center.
This article is Copyright 1997 by Robert Fourer and John W. Gregory. It may be freely redistributed in its entirety provided that this copyright notice is not removed. It may not be sold for profit or incorporated in commercial documents without the written permission of the copyright holder. Permission is expressly granted for this document to be made available for file transfer from installations offering unrestricted anonymous file transfer on the Internet.
The material in this document does not reflect any official position taken by any organization. While all information in this article is believed to be correct at the time of writing, it is provided "as is" with no warranty implied.
If you wish to cite this FAQ formally (hey, someone actually asked for this), you may use:
Fourer, Robert (4er@iems.nwu.edu) and Gregory, John W. (ashbury@skypoint.com), "Linear Programming FAQ" (1997). World Wide Web http://www.mcs.anl.gov/home/otc/faq/linear-programming-faq.html, Usenet sci.answers, anonymous FTP /pub/usenet/sci.answers/linear-programming-faq from rtfm.mit.edu.
There's a mail server on rtfm.mit.edu, so if you don't have FTP privileges, you can send an e-mail message to mail-server@rtfm.mit.edu containing:
send usenet/sci.answers/linear-programming-faq
as the body of the message to receive the latest version (it is posted on the first working day of each month). This FAQ is cross-posted to news.answers and sci.op-research.
Suggestions, corrections, topics you'd like to see covered, and additional material are all solicited. Send email to 4er@iems.nwu.edu.
END linear-programming-faq