Issue 71158 - ODFF: Inconsistent GCD() and LCM() results for non-integers
Summary: ODFF: Inconsistent GCD() and LCM() results for non-integers
Status: CLOSED FIXED
Alias: None
Product: Calc
Classification: Application
Component: ui (show other issues)
Version: OOo 2.0.4
Hardware: All All
: P4 Trivial (vote)
Target Milestone: ---
Assignee: oc
QA Contact: issues@sc
URL:
Keywords: oooqa
Depends on:
Blocks: 72764
  Show dependency tree
 
Reported: 2006-11-03 15:41 UTC by drking
Modified: 2013-08-07 15:15 UTC (History)
3 users (show)

See Also:
Issue Type: PATCH
Latest Confirmation in: ---
Developer Difficulty: ---


Attachments
Current GCD behaviour for non-integers (9.98 KB, application/vnd.sun.xml.calc)
2007-01-03 08:41 UTC, drking
no flags Details
A patch for i71158 (14.22 KB, text/plain)
2008-01-18 07:47 UTC, lvyue
no flags Details
Another patch for i71158 (11.00 KB, text/plain)
2008-01-22 06:26 UTC, lvyue
no flags Details
A patch based on patch 2, and enhanced the handling of negative. (14.39 KB, text/plain)
2008-02-01 05:40 UTC, lvyue
no flags Details
GCD and LCM testcases with Excel result comparison (12.83 KB, application/vnd.oasis.opendocument.spreadsheet)
2008-05-08 21:06 UTC, ooo
no flags Details
TestCaseSpecification (27.86 KB, text/html)
2008-06-09 10:45 UTC, oc
no flags Details

Note You need to log in before you can comment on or make changes to this issue.
Description drking 2006-11-03 15:41:49 UTC
Example
=GCD(1.2;2.4) returns 1.2
=GCD(1.2;3.6) returns 0

You could say that anyone using GCD with non-integers deserves this....  but the
behaviour really ought to be defined consistently - returning an error or zero.
It should never return a non-integer, or we might have GCD(5;6) giving 1.2
Comment 1 dridgway 2007-01-02 01:03:11 UTC
Excel's documentation for GCD() states that Excel truncates to integers before
applying, this approach makes sense to me. 

Zero shouldn't be the answer unless both a and b are zero: see
http://en.wikipedia.org/wiki/Greatest_common_divisor.

Excel also fails on negative integers, but GCD seems well defined for negative
integers. However, Calc will return negative results sometimes; I'm not sure
that's a good idea. I'd suggest always returning the positive GCD, regardless of
the signs of the arguments.
Comment 2 drking 2007-01-02 06:47:34 UTC
Perhaps it would be useful to talk to whoever implemented this function, if 
that is possible, to find out the thinking behind it? There is another function 
GCD_ADD which should be the same as GCD in Excel; one of the differences with 
Calc GCD is that Calc GCD can handle negative numbers.

If there is to be a difference between Calc and Excel I suppose this is a good 
one, and returning a negative number when the arguments have mixed signs is at 
least defined, even if not in maths textbooks. I'm not very convinced though.

Note that Calc GCD(0,0) currently returns an error so this probably needs 
fixing too (Excel does manage a 0 here).

Also note that GCD_ADD(A1;A2) where:
A1="text", A2=2 returns 2;
A1=2, A2="text"  returns error;
A1="text", A2=-2 returns error;
A1="text", A2 blank returns 0; 
A1 blank, A2 blank returns 0;

so there is some inconsistency that should really be fixed. I think all these 
should return errors as in Excel. [Calc GCD(A1;A2) works OK here]

There seem to be different mathematical definitions of the GCD function. 
MathWorld http://mathworld.wolfram.com/GreatestCommonDivisor.html is usually 
reliable. But it talks about the Mathematica implementation that deals with non-
integers. That might be useful to imitate?
 
It seems to me that we could do with a maths specialist to help decide what is 
really wanted.

Comment 3 dridgway 2007-01-02 18:48:25 UTC
As for sign convention, I guess I don't have a big problem with the convention,
"mixed signs returns negative GCD, same sign returns positive". It has a certain
symmetry. To extend to an arbitrary number of arguments, use even/odd parity. I
think this is what the current implementation does -- if so, the documentation
needs to be clear.

As for nonintegers, Mathworld defines it for rationals, which makes mathematical
sense to me. Under that definition, GCD(1.2;2.4) = GCD(1.2;3.6) = 1.2. It may be
that the implementor was trying for something along these lines, but didn't get
all the way there.

I worry about the difficulties (both for the implementor and potential users) of
the definition on noninteger rationals, because rounding in the conversion
between binary floats and decimal representations may cause unexpected results.
Ditto for doing integer arithmetic algorithms on floats. We'd have to be very
careful about defining and testing the behavior, but without a clear idea of
who'd use it. This isn't a problem for symbolic math packages like Mathematica
which know about rational numbers.

Another source to check is the OpenFormula specification-in-development,
available from
http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula . 
It appears to follow the OO documentation, so it will likely need updating as well.
Comment 4 drking 2007-01-03 08:41:47 UTC
Created attachment 41844 [details]
Current GCD behaviour for non-integers
Comment 5 drking 2007-01-03 08:44:38 UTC
I think you are absolutely right about the difficulties with non-integers. The 
Mathematica manual uses fractions for its examples: GCD(1/2, 3/4) is 1/4 - and 
this is pretty sensible. But with floats we get very messy.

For instance the open-formula draft says that Calc GCD(18.1;24.1) returns zero. 
It doesn't. It returns 0.0000000000001 or something similar.

I've attached a file showing the odd results that Calc produces at present. I 
don't think it is seriously trying to evaluate rationals - it's more likely a 
byproduct of the method of calculating with integers. No user could/should have 
relied on this behaviour - so I'd suggest we would be perfectly safe to abandon 
any pretence and simply truncate any non-integer arguments as Excel does.

We'd then be left with whether to calculate with negative integers and mixed 
pos/neg integers. Unfortunately this behaviour could have been relied on by 
users, so I think we are stuck with it. I'd be pleased if you disagreed...

Gnumeric says its GCD function is the same as Excel, so it could be that Calc 
is then the only one marching in step.

I'd suggest that negative numbers are truncated = rounded towards zero so that 
GCD(3.5;12) and GCD(-3.5;-12) both return 3.

And if we clean up GCD_ADD so that it returns appropriate errors and mimics 
Excel by being Calc new GCD without the negative arguments will we be home and 
dry on this one?

Comment 6 ooo 2007-01-08 15:24:12 UTC
Grabbing issue.
Comment 7 ooo 2007-01-08 19:50:16 UTC
Of course a result of zero is never correct except maybe if both
arguments are zero. Note however that the visible zeros of the testcases
in fact are only rounded for the display value, despite the fact that
GCD for non-integers isn't defined.

Strictly spoken GCD(0;0) is undefined because of the infinite number of
common divisors, however, it can be convenient to define GCD(0;0)=0 as
Excel does.

As for the handling of negative arguments I'm undecided. Mathematically
the relation

|a*b| = GCD(a,b) * LCM(a,b)

is given, so a negative result would play well. Some sources define GCD
to be the largest positive integer, but neither that nor the
non-negative argument seem to be a mathematical restriction.

That relationship btw possibly is the reason why the functions were
implemented the way they are, working on non-integers and delivering the
"strange" results; in the testcases, calculating the LCM of the same
numbers and multiplying with the GCD produces the same amount as
multiplying the numbers. However, the algorithm for finding the GCD for
fractional numbers is not appropriate, as can easily be seen with
GCD(1.2;3.6)

GCD_ADD of course should strictly mimic the Excel behavior.

Even though a GCD of rational numbers isn't defined, a LCM of rational
numbers actually may make sense to compute the recurrence of a wave
form. To comply with the relationship given above a GCD should also be
able to handle it. So either enhance the GCD algorithm or change both to
truncate fractional arguments to integers.
Comment 8 drking 2007-01-08 23:24:15 UTC
Thank you.

LCM doesn't currently work with non-integers eg LCM(0.4;1.2) so you'd probably 
need to tackle that as well if you wanted to have GCD working with non-integers.

The other issue about negative arguments is when they are mixed, eg -12 and +15.
GCD and LCM return negative results - I guess as a matter of definition rather 
than mathematics?

That still obeys  |a*b| = GCD(a,b) * LCM(a,b)
but the | | is then crucial. The formula does appear sometimes as
a*b = GCD(a,b) * LCM(a,b)
in which case it fails
Comment 9 ooo 2007-01-09 15:39:52 UTC
Well, actually LCM(0.4;1.2) "sort of" works, it just doesn't give the result one
would expect, which is because it uses the broken GCD method to determine the
value. LCM(0.4;1.2) * GCD(0.4;1.2) gives 0.48 that is correct.

As for negative results if one of the arguments is negative, a GCD always
returns one of the possible factors. Even GCD(12;3) has two solutions, 4 and -4.
Since the "greatest" semantically somehow implies to choose the positive value,
and especially as some sources say to not regard factorization into negative
numbers and some sources say the result is a positive integer (natural number),
I tend to use that.

For the |a*b|=... relationship, most sources omit the absolute value there,
including the http://en.wikipedia.org/wiki/Greatest_common_divisor page.
However, I think that's wrong. Several sources talk of integer values, not
mentioning positive integers nor natural numbers. The German
http://de.wikipedia.org/wiki/Gr%C3%B6%C3%9Fter_gemeinsamer_Teiler page states
the relationship, but what finally convinced me was that the
"Bronstein/Semendjajew" standard book of mathematics does so as well, and also
mentions only integer numbers for arguments, nothing about positive values.

As a conclusion we should allow negative arguments, truncate fractional values
to integer (we might come up with a proper fractional algorithm later as an
addition if someone feels challenged :) and return always positive numbers.
Comment 10 drking 2007-01-10 05:35:51 UTC
er: "actually LCM(0.4;1.2) "sort of" works, it just doesn't give the result one
would expect"
Enjoyed that, thank you :)

"As a conclusion we should allow negative arguments, truncate fractional values
to integer"
OK

"(we might come up with a proper fractional algorithm later as an
addition if someone feels challenged :)"
Doesn't seem likely does it ;)

"and return always positive numbers."
This is a change in behaviour. That's fine by me because the definition is 
sensible, but might it cause trouble for previous users? It's useful that the 
current behaviour is undocumented. I guess this is your decision to make.

It would also be useful to have GCD(0;0) return zero by definition.

Many thanks.
Comment 11 Martin Hollmichel 2007-11-09 16:51:49 UTC
change target from 2.x to 3.x according to
http://wiki.services.openoffice.org/wiki/Target_3x
Comment 12 ooo 2007-11-14 16:15:44 UTC
ODFF relevant.
Comment 13 ooo 2008-01-15 18:39:52 UTC
For the records: a word on negative arguments, forgot to update this issue back
then:

We discussed that in the OASIS ODFF subcommittee. Though
mathematically valid, for interoperability with other spreadsheet
applications we will not allow negative arguments, but generate an error
in that case. See also
http://lists.oasis-open.org/archives/office-formula/200701/msg00013.html
and discussion thread.
Comment 14 lvyue 2008-01-18 07:47:41 UTC
Created attachment 50953 [details]
A patch for i71158
Comment 15 ooo 2008-01-22 00:26:56 UTC
Hi Lvhue,

The patch looks good in general, but all hunks didn't apply. It looks like you
created the diff against some already modified file where tab indents were
replaced by spaces. Using the patch --ignore-whitespace option helped. Please
submit patches only as diff against an original file revision as present in CVS.
Thanks.

Some detail:

approxFloor() should be called before the argument value is checked for <0.0
because precision rounding errors may lead to very small negative values where
approxFloor() would still result in 0.0

I didn't compile and test the modification yet in the application. Does it meet
all needs as specified in the ODFF draft? Especially the "Given GCD(a;0), a is
returned, so GCD(0;0) is 0.", also with more than two arguments? It looks like
it does, just want to make sure.

Thanks
  Eike
Comment 16 lvyue 2008-01-22 06:26:12 UTC
Created attachment 51067 [details]
Another patch for i71158
Comment 17 lvyue 2008-01-22 07:00:27 UTC
Hi Eike,

Thanks for your comments, I have modified the patch according to that.

For the detail, calling approxFloor() after checking for <0.0 is just because
I thought GCD(-0.0000000000000000000001)=0 looks a littlt strange. :)
I have modified that.

And GCD(a;0) will return a, GCD(0;0) returns 0.

Please check this patch, thanks.
Comment 18 lvyue 2008-02-01 05:40:06 UTC
Created attachment 51301 [details]
A patch based on patch 2, and enhanced the handling of negative.
Comment 19 ooo 2008-05-08 20:55:17 UTC
In cws odff03:

sc/source/core/tool/interpr5.cxx  1.30.20.1

Actually rewrote the current implementation (too different that the patch would
apply) such that it matches more or less the changes of the second patch dated
Jan-22, plus got rid of some clumsiness by replacing
if(first){statement;while(next){statement}} with
if(first){do{statement}while(next);}. Since GCD(0,a)=>a also the
if(bFirst){fy=...}else{fx=...} was superfluous and fx=... applies always. See
cvs diff if all this sounds gibberish to you ;-)

The approxFloor() precision problem is solved with the changes from issue 86775.
Comment 20 ooo 2008-05-08 21:06:12 UTC
Created attachment 53488 [details]
GCD and LCM testcases with Excel result comparison
Comment 21 ooo 2008-05-09 17:13:35 UTC
Reassigning to QA for verification.
Comment 22 drking 2008-05-10 05:16:58 UTC
@er
Thank you!
Comment 23 oc 2008-06-09 10:45:09 UTC
Created attachment 54335 [details]
TestCaseSpecification
Comment 24 oc 2008-06-09 10:51:06 UTC
verified in internal build cws_odff3
Comment 25 andreschnabel 2008-09-09 19:20:30 UTC
ok in OOo3.00RC1->closed