12 March 2010 View Comments

Example of the Matlab Precision Issue


I noticed that there are many people comparing Matlab with Maple, Mathematica etc. to find out what’s the best Maths package for them. But many of them didn’t realize the limitations of Matlab. I put up an article about the Birthday Problem in probability, If you try to solve it analytically like:

[cc lang='matlab' ]
1 – factorial(365)/factorial(336)/365^30
[/cc]

You will have trouble getting the result you want. You may solve it using its symbolic toolbox. But, not everyone has the symbolic toolbox (or other access to Maple, or another symbolic math package). If you really need more than 15 digits of accuracy in a factorial, Matlab (without the symbokic toolbox) is not the appropriate tool to get them.

Notwithstanding that, read the help a little more closely. It says that factorial is accurate for N<=21, and that for larger N, the first 15 digits are good. Those are two separate statements. You need to look at the reasoning behind them. For N=21, Matlab gives: 5.109094217170944e+019 (if you’ve specified that the display should be “format long e”) and Maple gives 51090942171709440000. These are of course the same number, so Matlab is indeed accurate at this level. But look at the size of those numbers There are sixteen non-zero significant digits, followed by some zeros (The zeros are actually significant too, but let’s ignore that for now.). What about factorial(22)? Matlab gives 1.124000727777608e+021 and Maple gives 1124000727777607680000. Now the Maple result has eighteen significant digits and some zeros. Matlab is the same for fifteen of those digits (as it says in the help) and the sixteenth is rounded off. In fact, it’s actually somewhat lucky that Matlab is exact for factorials bigger than factorial(18) up to factorial(21). These work because of the zeros at the end of the number (which are actually significant in that they are generated by the 2*5, 10, 4*15 and 20 multiplied in the factorial., but Matlab doesn’t care about that–just that they are zero). In fact, Matlab works in IEEE double precision arithmetic on most computers (unless you are using the symbolic toolbox, as shown by Natalia, above). In this format, only 15-16 digits of precision are stored. There are lots of applications where you will see this problem occurring, and anyone doing numerical analysis on a computer needs to be aware of this. (For example, try a=0;for i=1:10;a=a+0.1;end; what does “a” equal at the end of this? It should be 1, right? Try it: a==1 –> this gives 0 indicating that a does not in fact exactly equal 1, although it’s very close to it.)
When you’re dealing with integers (as you are when working with factorials), you can be sure that Matlab stores them exactly if they are less than 2^53 in magnitude. (Unless, as in the above example, the integer is generated from non-integer values.) The smallest factorial less than this is factorial(18), which is why I said it’s lucky that Matlab gives you three more freebies–if the ending numbers weren’t zeros as part of the nature of the calculation, they would have been lost. Matlab just can’t store more than 15-16 digits, so if you have more significant (non-zero) digits than that, as in factorials larger than factorial(21), the trailing numbers will be lost, and you end up with only 15-16 accurate digits, as the help says.

Another limitation in calculating factorials, is that IEEE double precision can’t store numbers greater than 1.7977e308 (or smaller than 2.2251e-308, but that’s not relevant to factorials), which means you can’t calculate factorials larger than factorial(170). An example of when you might want to do this is in calculating the number of combinations: n!/(k!*(n-k)!) Matlabs nchoosek function has a neat bit of code it in which allows it to give results even if n is greater than 170, so long as the result is still less than 1.7977e308 (although it still only has no more than 15 digits of accuracy). In general, though, you can make use of the fact that factorial(n) is equal to gamma(n+1). Matlab has a gammaln function which instead of calculating gamma gives the natural log of it. The nchoosek function (or the part of it for scalar inputs) could then be rewritten as exp(gammaln(n+1)-gammaln(k+1)-gammaln(n-k+1)). There’s only one problem with this–it may not give an exact integer (which is only relevant if the result should be an exact integer–i.e., it is less than 2^53). But you can just round it to the nearest integer and be pretty confident in its results. The same theory can be used for other calculations involving factorials, where the individual factorials are too big, but where the end result shouldn’t be.

[Credit: Mathwoks newsgroup]

Tags:
  • The limitations of IEEE arithmetic apply to many packages, not just MATLAB. As you say, Maple and Mathematica can employ arbitrary precision arithmetic to solve problems that IEEE arithmetic cannot.

    However, this extra precision comes at a cost and that cost is execution speed. IEEE arithmetic is implemented in hardware which means it's fast. VERY fast. Arbitrary precision arithmetic is implemented in software which makes it a LOT slower.

    So, it all comes down to a trade-off. If IEEE arithmetic is good enough to solve your problem (and for many problems, it will be) then you are well advised to use it.

    Cheers,
    Mike
blog comments powered by Disqus