127640 – Calculations are randomly incorrect when 'Precision as shown' is Off

Issue 127640 - Calculations are randomly incorrect when 'Precision as shown' is Off

Summary: Calculations are randomly incorrect when 'Precision as shown' is Off

Status:	UNCONFIRMED

Alias:	None

Product:	Calc
Classification:	Application
Component:	ui (show other issues)
Version:	4.1.4
Hardware:	PC Windows 8, 8.1

Importance:	P5 (lowest) Normal (vote)
Target Milestone:	---
Assignee:	AOO issues mailing list
QA Contact:

URL:
Keywords:

Depends on:
Blocks:

Reported:	2017-12-31 01:24 UTC by ChasVA
Modified:	2018-01-01 16:57 UTC (History)
CC List:	2 users (show)

See Also:
Issue Type:	DEFECT
Latest Confirmation in:	---
Developer Difficulty:	---

Attachments
Example of when Calc randomly introduces a calculation error. (21.65 KB, application/vnd.oasis.opendocument.spreadsheet) 2017-12-31 01:24 UTC, ChasVA	no flags	Details
Program for analyzing E13 calculation in the prior attachment. (884 bytes, text/plain) 2017-12-31 14:18 UTC, Patricia Shanahan	no flags	Details
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this issue.

Description ChasVA 2017-12-31 01:24:22 UTC

Created attachment 86301 [details]
Example of when Calc randomly introduces a calculation error.

I created a simple Ledger to monitor my bank account. In one column, I enter the End of Day (EOD) balance shown in my online account and compare it to the calculated balance of the ledger. If they match, both numbers should appear in Bold Blue type. 

Calc is apparently calculating and comparing two-decimal digits as digits with more than 20 decimals (the maximum allowed to be displayed), as the first error displays when the numbers match up to 20 decimals. 

There are more explanatory notes in the spreadsheet itself; however, the error continues for several rows and then corrects itself for subsequent rows.

"Precision as shown" is OFF. 

If it matters, Excel doesn't reproduce the error. In that program, "Precision as displayed" is OFF.

There is a simple workaround using ROUND(); but one shouldn't have to round numbers that are never more than two decimals to two decimals.

Comment 1 Patricia Shanahan 2017-12-31 14:15:21 UTC

Numbers in Apache OpenOffice are represented in IEEE 754 64-bit Binary Floating Point. The key word here is "Binary". All the exactly representable numbers are terminating binary fractions. That includes decimal fractions ending in .00, .25, .50, .75. It excludes all other decimal fractions with only two digits after the decimal point. Other values can be approximated very closely, resulting in rounding error.

OpenOffice and Excel do at least three things to hide rounding error:

1. Print non-zero digits that probably represent rounding error as zero.
2. Treat almost equal numbers as equal.
3. Reduce very slightly non-zero results of addition or subtraction to zero.

These behaviors affected the example spreadsheet, and make it harder to do detailed analysis. I calculated several numbers used in this comment in a Java program that I will attach to this report.

I found by experiment that forcing E13 to be exactly 1853.13 changes the FALSE answers to TRUE. Changing E12 does not affect them, so the problem starts in the calculation of E13.

The exact inputs to the E13 subtraction are:

1873.1199999999998908606357872486114501953125

and

19.989999999999998436805981327779591083526611328125

The exact sum of these numbers is:

1853.129999999999892423829805920831859111785888671875

but the closest representable number is:

1853.129999999999881765688769519329071044921875

The closest representable number to 1853.13 is 1853.1300000000001091393642127513885498046875, which rounds away from zero.

At this point three rounding errors are all operating in the same direction, towards zero. As input to the subsequent calculations, this results in a larger deviation than OpenOffice suppresses at E37 and E38.

I do not recommend trying to reverse engineer the Excel behavior.

The knowledge that all column E numbers are the results of addition and subtraction of numbers with two decimal digits after the decimal point is valuable information. I recommend changing the spreadsheet to round the result of the addition and subtraction in the column E calculation to two decimal places.

Comment 2 Patricia Shanahan 2017-12-31 14:18:43 UTC

Created attachment 86302 [details]
Program for analyzing E13 calculation in the prior attachment.

This is a Java program that prints some useful numbers for understanding the details of the E13 calculation. It uses BigDecimal, a fixed point decimal number class, for exact decimal printing and arithmetic.

Comment 3 ChasVA 2017-12-31 19:04:19 UTC

(In reply to Patricia Shanahan from comment #2)
> Created attachment 86302 [details]
> Program for analyzing E13 calculation in the prior attachment.
> 
> This is a Java program that prints some useful numbers for understanding the
> details of the E13 calculation. It uses BigDecimal, a fixed point decimal
> number class, for exact decimal printing and arithmetic.

(In reply to Patricia Shanahan from comment #1)
> Numbers in Apache OpenOffice are represented in IEEE 754 64-bit Binary
> Floating Point. The key word here is "Binary". All the exactly representable
> numbers are terminating binary fractions. That includes decimal fractions
> ending in .00, .25, .50, .75. It excludes all other decimal fractions with
> only two digits after the decimal point. Other values can be approximated
> very closely, resulting in rounding error.
> 
> OpenOffice and Excel do at least three things to hide rounding error:
> 
> 1. Print non-zero digits that probably represent rounding error as zero.
> 2. Treat almost equal numbers as equal.
> 3. Reduce very slightly non-zero results of addition or subtraction to zero.
> 
> These behaviors affected the example spreadsheet, and make it harder to do
> detailed analysis. I calculated several numbers used in this comment in a
> Java program that I will attach to this report.
> 
> I found by experiment that forcing E13 to be exactly 1853.13 changes the
> FALSE answers to TRUE. Changing E12 does not affect them, so the problem
> starts in the calculation of E13.
> 
> The exact inputs to the E13 subtraction are:
> 
> 1873.1199999999998908606357872486114501953125
> 
> and
> 
> 19.989999999999998436805981327779591083526611328125
> 
> The exact sum of these numbers is:
> 
> 1853.129999999999892423829805920831859111785888671875
> 
> but the closest representable number is:
> 
> 1853.129999999999881765688769519329071044921875
> 
> The closest representable number to 1853.13 is
> 1853.1300000000001091393642127513885498046875, which rounds away from zero.
> 
> At this point three rounding errors are all operating in the same direction,
> towards zero. As input to the subsequent calculations, this results in a
> larger deviation than OpenOffice suppresses at E37 and E38.
> 
> I do not recommend trying to reverse engineer the Excel behavior.
> 
> The knowledge that all column E numbers are the results of addition and
> subtraction of numbers with two decimal digits after the decimal point is
> valuable information. I recommend changing the spreadsheet to round the
> result of the addition and subtraction in the column E calculation to two
> decimal places.

Thank you for your analysis and testing. You've far exceeded my skill set and I appreciate your providing a deeper look into the inner workings of Calc.

I'm well aware that the issue stems from using Binary numbers for the calculations; I'm also aware that the workaround is to use the ROUND function to provide accurate numbers for both display and calculation.

I disagree with your recommendation about reverse engineering the Excel solution. Apparently Excel is somehow treating the rounding differently. Whether employing a reverse engineering method, or just tweaking the calculation method, OO should handle numbers with two decimal places with greater accuracy. Likely the most common form of spreadsheet in use employs calculations on numbers to two decimal places; spreadsheets tracking currency transaction surely make up a large, if not the largest, subset of spreadsheets.

For a spreadsheet using only numbers with two decimal places, it shouldn't be necessary to use the ROUND function as a workaround, nor should "Precision as shown" need to be selected. Had I not set up the conditional formatting as I did, I would never have known that Calc was seeing numbers that appear to be identical on screen as being unequal. Pity the user who doesn't know to use the ROUND function and bases other calculations on comparisons of numbers that appear to be equal but aren't. 

I've been using spreadsheets in one form or another since the 1989s, starting with CalcStar, which was bundled with WordStar by MicroPro. I've used Excel extensively in the past, and have also occasionally used SoftMaker's FreeOffice. I've never encountered this error before in any other program. So there's obviously a different way to handle the vagaries of the results of calculating in binary. I wish I knew more about coding so that I could propose some sort of patch. Absent an internal fix, perhaps "Precision as shown" should be selected as the default when installing Calc?

Thanks again for all your work.

Comment 4 Peter 2018-01-01 03:31:52 UTC

Probably OpenOffice uses a different internal prediction. I had the same issue with excel and SA5 (a statistical language)
If I remember it directly Excel used a fixed 9 precision while SAS uses 14 and produced different results.

My guess would be you deal the same issue.

Comment 5 Patricia Shanahan 2018-01-01 09:21:58 UTC

(In reply to Peter from comment #4)
> Probably OpenOffice uses a different internal prediction. I had the same
> issue with excel and SA5 (a statistical language)
> If I remember it directly Excel used a fixed 9 precision while SAS uses 14
> and produced different results.
> 
> My guess would be you deal the same issue.

According to "Floating-point arithmetic may give inaccurate results in Excel" at https://support.microsoft.com/en-us/help/78113/floating-point-arithmetic-may-give-inaccurate-results-in-excel, Excel uses IEEE 754 64-bit binary floating point, like OpenOffice. The article suggests the use of ROUND and/or "precision as displayed".

Comment 6 Patricia Shanahan 2018-01-01 16:57:10 UTC

In considering solutions, it is important to understand that deviating from IEEE 754 will generally reduce precision. The standard was very carefully designed by people far more expert than me in numerical analysis and floating point arithmetic. Deviations also result in calculations getting different results from other programs that follow the standard more closely.

The deviations OpenOffice, like Excel, makes to try to hide some of the realities of floating point arithmetic should be kept to a minimum, to limit the loss of precision and non-standard results.

Turning on "Precision as shown" by default would produce far bigger differences from how Excel behaves, and far greater risk of significant loss of precision, than anything we are currently doing. Many spreadsheets do not display inputs and intermediate results to the precision that is needed in dependent calculations.

The problem description puts a lot of emphasis on the fact that the numbers only need two decimal places. That is very important and valuable information that should affect how the numbers are managed. Unfortunately, that information is not available in the spreadsheet calculations. There are two ways it could be conveyed in the example spreadsheet, using ROUND or turning on "Precision as shown".