Let's say I want to check whether two numbers a and b are equal. Because of floating-point imprecision, I know that instead of simply checking a == b, I usually want to pick some small number eps and check instead that abs(a - b) < eps.
But what do I do if I want to take floating-point errors into account when checking that a > b? I'm guessing that instead of simply
if (a > b) {
...
}
I want to do something like:
if ((a > b) || abs(a - b) < eps) {
...
}
Is this correct? How do I check that a is "approximately greater than" b?
You are asking how to calculate a correct result (whether one value is greater than another value) from incorrect input (some values that have errors in them). Obviously, this is impossible in general: Incorrect input produces incorrect output. However, in some specific situations, we can salvage something. The following discusses one situation.
Let’s suppose you have calculated some a and b that approximate the ideal values A and B, where A and B are the results you would have if the calculations were done with exact mathematics. Also suppose that we know error bounds ea and eb such that A - ea ≤ a ≤ A + ea and B - eb ≤ b ≤ B + eb. In other words, the calculated a and b lie within some intervals around A and B, respectively. (Depending on the operations performed, it is possible that errors could cause a or b to lie in some unconnected intervals, possibly not even containing A or B. But we will suppose you have “well behaved” errors.)
In that case, if a - ea > b + eb, then you can be certain that A > B, because A ≥ a - ea and b + eb ≥ B.
Suppose you test for this condition and return true if it holds. Whenever the test returns true, you know that A > B. When it returns false, however, you cannot be sure that A > B is false. So this test is good if you want to perform some action only when you are certain that A > B, but it causes you to miss performing the action in some cases when A > B.
Suppose you do not want to miss any of those cases. Then consider the condition a + ea > b - eb. If A > B, then this condition must be true, because a + ea ≥ A and B ≥ b - eb. So, if you test for this condition and perform the desired action when it holds, then the action will always be performed when A > B. However, the action may also be performed sometimes when it is not true that A > B.
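As a rough illustration, the two conditions above might be written as a pair of functions. This is only a sketch in Python (the question does not name a language). The function names are made up for this example, ea and eb are assumed to be non-negative bounds you already know, and the small rounding errors in computing a - ea and b + eb themselves are ignored.

```python
def definitely_greater(a, ea, b, eb):
    """Strict test: True only if the lowest value in [a - ea, a + ea]
    is above the highest value in [b - eb, b + eb]."""
    return a - ea > b + eb


def possibly_greater(a, ea, b, eb):
    """Lenient test: True if the highest value in a's interval
    is above the lowest value in b's interval."""
    return a + ea > b - eb


# Sample values chosen to be exactly representable in binary floating point.
# Disjoint intervals [0.875, 1.125] and [0.375, 0.625]: both tests pass.
print(definitely_greater(1.0, 0.125, 0.5, 0.125))
# Overlapping intervals [0.875, 1.125] and [0.75, 1.0]: only the lenient test passes.
print(definitely_greater(1.0, 0.125, 0.875, 0.125))
print(possibly_greater(1.0, 0.125, 0.875, 0.125))
```

When the two functions disagree, the computed values are too close to decide, and your application has to choose which mistake it prefers.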
This shows that you have choices to make. If you have errors in your calculations, sometimes your application will do the wrong thing. You must choose:
- How acceptable it is for your application to perform the action when the ideal value of a is not actually greater than the ideal value of b. Is it always acceptable/unacceptable, or does it depend on how close a is to b?
- How acceptable it is for your application to not perform the action when the ideal value of a is actually greater than the ideal value of b. Is it always acceptable/unacceptable, or does it depend on how close a is to b?
If you can find some satisfactory compromise, then you set your condition to some intermediate level, and you test for the condition a - b > e, for some e that lies between -(ea + eb) and +(ea + eb), inclusive. The upper end, e = ea + eb, is the strict test that never acts wrongly; the lower end, e = -(ea + eb), is the lenient test that never misses a case. If you cannot find a satisfactory compromise, then you need to improve the calculations of a and b to reduce the errors, or you need to redesign your program in some way.
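A minimal sketch of that tunable test, again in Python with invented names (approx_greater and askers_test are not from any library). It also includes the condition proposed in the question, which in exact arithmetic and for eps > 0 is the same as a - b > -eps, that is, one particular choice of e on the lenient side.

```python
def approx_greater(a, b, e):
    """Tunable test: compare the computed difference a - b against a threshold e.
    e is assumed to lie in [-(ea + eb), ea + eb] for known error bounds ea, eb."""
    return a - b > e


def askers_test(a, b, eps):
    """The condition proposed in the question, assuming eps > 0."""
    return (a > b) or abs(a - b) < eps


# Hypothetical error bounds, exactly representable in binary floating point.
ea = eb = 0.125
bound = ea + eb

a, b = 1.0, 0.875                    # computed values that are close together
print(approx_greater(a, b, bound))   # strict end of the range
print(approx_greater(a, b, 0.0))     # plain a > b
print(approx_greater(a, b, -bound))  # lenient end of the range
```

Moving e from ea + eb down to -(ea + eb) trades missed actions for unwanted ones; where to stop depends on the answers to the two questions above.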
Note: The final test in this scenario is a-b > e rather than a > b+e because there may be a small rounding error calculating b+e. There may also be a rounding error calculating a-b, but only if a and b are not near each other, in which case the difference, even with rounding, is much larger than e (unless your error interval is atrocious). In the cases where we care about precision, when a is near b, the calculation of a-b is exact.
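To make the note concrete, here is a small experiment, assuming IEEE 754 double precision (which is what Python floats are on common platforms). It uses fractions.Fraction, which converts a float to its exact rational value, to check whether a sum or difference was rounded. The helper names and the sample values are chosen only for this illustration.

```python
import sys
from fractions import Fraction


def is_exact_difference(a, b):
    """True if the floating-point result of a - b equals the exact difference."""
    return Fraction(a - b) == Fraction(a) - Fraction(b)


def is_exact_sum(b, e):
    """True if the floating-point result of b + e equals the exact sum."""
    return Fraction(b + e) == Fraction(b) + Fraction(e)


u = sys.float_info.epsilon  # spacing between 1.0 and the next larger double
a = 1.0 + u                 # the double just above 1.0
b = 1.0
e = 0.75 * u                # a threshold smaller than the true difference a - b

print(is_exact_difference(a, b))  # a and b are neighbors: is a - b exact?
print(is_exact_sum(b, e))         # b + e is not representable: is it rounded?
print(a - b > e)                  # test written as a difference
print(a > b + e)                  # test written as a sum
```

With these values, b + e rounds up to a, so the sum form reports that a is not greater even though the exact difference u exceeds e, while the difference form gets it right.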
http://stackoverflow.com/questions/13774876/account-for-floating-point-imprecision-when-testing-approximately-greater-than