26565 – Natural Sort Option in Sort Dialog

Summary: Natural Sort Option in Sort Dialog

Status:	CONFIRMED

Alias:	None

Product:	Calc
Classification:	Application
Component:	code
Version:	OOo 1.1.1
Hardware:	All All

Importance:	P3 Trivial with 1 vote
Target Milestone:	---
Assignee:	AOO issues mailing list
QA Contact:

URL:	http://kohei.us/ooo/nsort/
Keywords:

Depends on:
Blocks:	63864

Reported:	2004-03-16 14:44 UTC by kyoshida
Modified:	2017-05-20 11:11 UTC
CC List:	16 users

See Also:
Issue Type:	PATCH
Latest Confirmation in:	---
Developer Difficulty:	---

Attachments
Preliminary implementation against 111fix2 (26.44 KB, patch) 2004-05-11 18:18 UTC, kyoshida	no flags
Feature complete patch against OpenOffice 1.1.1 final (30.92 KB, patch) 2004-05-27 03:30 UTC, kyoshida	no flags
Removed one-line printf from global.cxx (30.86 KB, patch) 2004-05-27 03:58 UTC, kyoshida	no flags
Updated patch that removes the redundant XCharacterClassification instance (26.88 KB, patch) 2004-06-08 06:10 UTC, kyoshida	no flags
patch re-issued for OpenOffice_1_1_3 (27.04 KB, patch) 2004-10-07 02:35 UTC, kyoshida	no flags
Patch for SRC680_m84 (22.08 KB, patch) 2005-03-13 04:24 UTC, kyoshida	no flags

Description kyoshida 2004-03-16 14:44:03 UTC

When cells containing string-prefixed numbers (e.g. A34) are sorted using the
current sort function found under Tool - Sort, the result is often not what one
would normally expect, because the cell contents are compared as strings despite
the presence of the numeric component.  For instance, when cells containing the
series A1, A2, A3, ..., A19, A20 are sorted using the regular sorting
algorithm, the result is A1, A10, A11, ..., A19, A2, A20, A3, A4, ..., A8,
A9.  This is because the numeric part of a string is compared digit by digit,
which ranks A2 above A19, A3 above A20, and so on.  This is not how a person
would naturally order the values; therefore a natural sorting algorithm needs
to be introduced to solve this problem.
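
The difference between the two orderings can be illustrated with a minimal
Python sketch.  It is not taken from the attached patches; the helper name
natural_key is hypothetical, and it simply splits each string into digit and
non-digit runs and compares the digit runs as integers.

```python
import re

def natural_key(text):
    """Hypothetical helper: split text into digit and non-digit runs."""
    # re.split with one capturing group alternates text, digits, text, ...
    # so chunks at odd indices are always the digit runs.
    parts = re.split(r"(\d+)", text)
    # Tag each chunk so integers and strings are never compared directly.
    return [(0, int(p)) if i % 2 else (1, p) for i, p in enumerate(parts) if p]

cells = [f"A{n}" for n in range(1, 21)]     # A1 .. A20

plain = sorted(cells)                       # character-by-character comparison
natural = sorted(cells, key=natural_key)    # numeric runs compared as integers

print(plain[:4])    # ['A1', 'A10', 'A11', 'A12']
print(natural[:4])  # ['A1', 'A2', 'A3', 'A4']
```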

Comment 1 kyoshida 2004-05-11 18:18:37 UTC

Created attachment 15180 [details]
Preliminary implementation against 111fix2

Comment 2 kyoshida 2004-05-27 03:30:53 UTC

Created attachment 15491 [details]
Feature complete patch against OpenOffice 1.1.1 final

Comment 3 kyoshida 2004-05-27 03:38:28 UTC

The above patch enables case sensitivity and user-defined sort lists in the
natural sort algorithm.  The number of helper functions has been reduced from
three to two by use of XCharacterClassification::parseAnyToken(...).  It also
uses rtl::math::stringToDouble instead of rtl::OUString::toDouble in order to
handle conversion in multiple locales.  The reference to NativeNumberWrapper
has been removed, as its use in the algorithm would degrade sorting
performance to a great degree.

As far as I'm concerned, this patch is ready to go.

Kohei
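
How a case-sensitivity switch might interact with a natural sort can be
sketched as follows.  This is only an illustration in Python, not the patch's
C++ code: the function name and the case_sensitive parameter are assumptions,
and plain code-point comparison stands in for the locale-aware collation that
Calc would use.

```python
import re

def natural_key(text, case_sensitive=False):
    """Hypothetical sort key with an optional case-insensitive mode."""
    if not case_sensitive:
        # Fold case before splitting so "A1" and "a1" share the same key.
        text = text.casefold()
    parts = re.split(r"(\d+)", text)
    # Odd indices hold the digit runs produced by the capturing group.
    return [(0, int(p)) if i % 2 else (1, p) for i, p in enumerate(parts) if p]

cells = ["b10", "B2", "a3", "A1"]

insensitive = sorted(cells, key=natural_key)
sensitive = sorted(cells, key=lambda c: natural_key(c, case_sensitive=True))

print(insensitive)  # ['A1', 'a3', 'B2', 'b10']
print(sensitive)    # ['A1', 'B2', 'a3', 'b10']
```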

Comment 4 kyoshida 2004-05-27 03:58:49 UTC

Created attachment 15492 [details]
Removed one-line printf from global.cxx

Comment 5 kyoshida 2004-05-28 01:25:08 UTC

As Eike commented on dev@sc.ooo, I will try to use the existing 
ScGlobal::pCharClass instead of creating a whole new XCharacterClassification 
interface to implement the same algorithm.  Stay tuned.

Comment 6 kyoshida 2004-06-08 06:10:37 UTC

Created attachment 15728 [details]
Updated patch that removes the redundant XCharacterClassification instance

Comment 7 kyoshida 2004-06-08 14:18:24 UTC

This patch will probably be the last one for the 1.1 branch unless I 
find a bug in my code.  I plan to issue another patch for the 680 branch 
sometime in the future. 
 
Kohei

Comment 8 kyoshida 2004-06-30 15:30:46 UTC

This effort has already started.

Comment 9 kyoshida 2004-10-07 02:35:56 UTC

Created attachment 18188 [details]
patch re-issued for OpenOffice_1_1_3

Comment 10 frank 2004-10-07 07:54:59 UTC

Hi Kohei,

please open a new issue and attach your patch to it. As OOo 1.1.3 is already
available, the target should be OOo 1.1.4.

Frank

Comment 11 kyoshida 2004-10-07 14:35:26 UTC

Hi Frank,

Actually Niklas said that this feature will have to wait till post 2.0, which
means the target milestone is still a moving target (sorry about the pun ;). 
So, I prefer using this issue to hold the patch I just submitted.  The
re-issuance of this patch is mainly for those who want to try out this feature
in the stable branch.

If this is a problem, just let me know so I can go ahead and create a new issue
for the patch.  I'm okay either way. :)

Kohei

Comment 12 frank 2004-10-08 10:35:26 UTC

Hi Kohei,

as the decision was made not to include the patch for now, there is no longer
any need for a new issue with target 1.1.4.

Thanks.

Frank

Comment 13 soren 2005-01-27 21:23:49 UTC

Hi Kohei,

I have been taking a look at your work on "natural" sorting in the spreadsheet. 
I have a couple of observations that may be of interest: 

1. The "natural" sort you are implementing always reads numbers as floating 
point. It may be preferable to be able to read numbers as integers as well 
(regarding decimal points etc. as common text). Imagine for instance that you 
want to read a sequence of OpenOffice source-file version numbers, e.g. 1.8, 1.9, 
1.10, 1.11, which would still be sorted "unnaturally", as 1.10, 1.11, 1.8, 1.9. 
Also, when trying to sort other types of version numbers, e.g. 1.1.3, 1.1.4 etc. 
the comparison algorithm would probably throw an exception when trying to read 
the strings as floating point numbers.

2. If I have read your changes correctly, the new comparison function will 
always use the standard locale decimal points and thousand separators when 
parsing floating point numbers. Maybe it should be possible for the user to 
change this, for instance when importing files from other countries (the number 
formats change, but the strings do not).

Regards, Søren
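
The first observation can be reproduced with a short Python sketch.  It is not
the patch's parsing code; float_key and integer_chunks_key are hypothetical
names, and Python's float() stands in for the floating-point conversion
described above.

```python
import re

def float_key(text):
    """Read the whole string as one floating-point number."""
    return float(text)

def integer_chunks_key(text):
    """Read each digit run as an integer and keep separators as text."""
    parts = re.split(r"(\d+)", text)
    return [(0, int(p)) if i % 2 else (1, p) for i, p in enumerate(parts) if p]

versions = ["1.10", "1.8", "1.11", "1.9"]

by_float = sorted(versions, key=float_key)
by_int = sorted(versions, key=integer_chunks_key)

print(by_float)  # ['1.10', '1.11', '1.8', '1.9']
print(by_int)    # ['1.8', '1.9', '1.10', '1.11']
```

In Python, float("1.1.3") raises ValueError, which mirrors the concern about
strings with more than one decimal point; the integer-chunk key handles such
strings without special cases.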

Comment 14 kyoshida 2005-01-29 00:24:02 UTC

Hi Søren, thank you for your feedback.

I agree with you on both points.  With regard to 1, I should probably change it
so that, if there is more than one decimal point it will fall back to treating
the decimal separator as string.  If there is only one decimal point, then it
will be up to the user.

I will look into incorporating your observations when I port this code to the
680 base.

Regards,

Kohei
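
One possible reading of the fallback rule proposed in this comment is sketched
below in Python.  It is an assumption about how the rule could work, not what
the later patch does; fallback_key, single_point_is_decimal and decimal_sep
are hypothetical names, with decimal_sep also touching on the second
observation about separators from other locales.

```python
import re

def fallback_key(text, single_point_is_decimal=True, decimal_sep="."):
    """Hypothetical key: treat the separator as decimal only if it occurs once."""
    use_decimal = single_point_is_decimal and text.count(decimal_sep) == 1
    if use_decimal:
        # A digit run may carry one fractional part, e.g. "1.10" -> 1.1
        pattern = r"(\d+(?:" + re.escape(decimal_sep) + r"\d+)?)"
    else:
        # Separators stay in the text chunks; only digit runs are numeric.
        pattern = r"(\d+)"
    key = []
    for i, chunk in enumerate(re.split(pattern, text)):
        if not chunk:
            continue
        if i % 2:  # odd indices are the captured numeric chunks
            key.append((0, float(chunk.replace(decimal_sep, "."))))
        else:
            key.append((1, chunk))
    return key

print(fallback_key("1.10"))   # [(0, 1.1)]
print(fallback_key("1.1.3"))  # [(0, 1.0), (1, '.'), (0, 1.0), (1, '.'), (0, 3.0)]
```

Passing single_point_is_decimal=False models the user choosing to treat a
single separator as plain text, so "1.10" is keyed as the integers 1 and 10.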

Comment 15 kyoshida 2005-03-13 04:24:23 UTC

Created attachment 23753 [details]
Patch for SRC680_m84

Comment 16 kyoshida 2005-03-13 04:26:47 UTC

New patch re-issued for SRC680_m84.  It also addresses Søren's first point,
which turned out to be due to an incorrect use of the parseAnyToken method.  It
should also yield a slightly better performance.

Comment 17 kyoshida 2005-08-19 05:30:12 UTC

Added the external project URL.

I also want to set the target milestone to 2.1 (or 3.0, whichever it will be),
but such is not available yet. :(

Comment 18 kyoshida 2005-08-25 19:54:17 UTC

Set target milestone to "not determined".  This will be changed as soon as a
numbered target milestone for the next non-micro release becomes available.

Comment 19 kyoshida 2005-08-30 20:26:32 UTC

Tentatively setting target milestone to 3.0, and adding Falko to CC.

Comment 20 kyoshida 2005-08-30 20:27:48 UTC

The spec file is available here:

http://specs.openoffice.org/calc/ease-of-use/natural_sort_algorithm.sxw

Comment 21 kami911 2006-03-05 15:18:28 UTC

We have working builds based on 2.0.2. I will publish the links when they are
available... Could you target it for 2.0.3?

Comment 22 kyoshida 2006-03-06 15:03:33 UTC

setting target to 2.0.3 as kami_ requested.

Kohei

Comment 23 Uwe Fischer 2006-04-04 14:59:41 UTC

added "ufi" on cc for online help

Comment 24 pavel 2006-05-04 06:45:59 UTC

Setting the target means we have people assigned to work on the cws, QA etc.

Who will do all this?

Comment 25 kyoshida 2006-05-04 17:20:18 UTC

nn or er: IIRC this feature cannot be integrated because of the file format
issue.  Is this correct?  If so, should we abandon this feature?

Comment 26 mmeeks 2006-05-04 18:03:29 UTC

Stefan - an interesting patch with a spec [!?] blocking on file format work &
apparently interest from Sun ...

Comment 27 niklas.nebel 2006-05-04 18:12:37 UTC

In the future, there will certainly be features that require additional
information in the file format, so there will have to be a solution, and there's
no need to abandon this feature. As soon as we have a general procedure for file
format extensions, this one can be completed quite quickly.

Comment 28 kyoshida 2006-06-08 04:26:55 UTC

reassigning this issue to nn so that this issue will get properly tracked.

Comment 29 rail_ooo 2006-09-12 06:13:30 UTC

I think it would be great if this feature could sort IP addresses as well.
Example:

Before:
192.168.1.1
192.168.1.10
192.168.1.2

After:
192.168.1.1
192.168.1.2
192.168.1.10
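
The requested ordering can be sketched in Python by comparing the octets as
integers.  The helper name ip_key is hypothetical, and the sketch assumes
well-formed dotted IPv4 strings; it does not describe how the patch handles
such values.

```python
def ip_key(address):
    """Hypothetical key: compare each dot-separated octet as an integer."""
    return tuple(int(octet) for octet in address.split("."))

before = ["192.168.1.1", "192.168.1.10", "192.168.1.2"]

# A plain string sort leaves the list in the "Before" order shown above.
after = sorted(before, key=ip_key)

print(after)  # ['192.168.1.1', '192.168.1.2', '192.168.1.10']
```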

Comment 30 niklas.nebel 2006-10-13 16:24:19 UTC

changing target

Comment 31 niklas.nebel 2007-02-26 12:50:11 UTC

Just an update to show this hasn't been forgotten: The file format extension has
been proposed to the OASIS TC.

Comment 32 niklas.nebel 2007-05-24 12:49:56 UTC

With the file format issue coming to a solution, we need agreement from User
Experience for this. Frank, can you take a look at the spec
(http://specs.openoffice.org/calc/ease-of-use/natural_sort_algorithm.sxw)?

Comment 33 kyoshida 2007-08-02 20:51:05 UTC

relevant ODF Change proposal.

http://lists.oasis-open.org/archives/office/200702/msg00047.html

Comment 34 niklas.nebel 2007-12-04 18:25:34 UTC

Target 3.0

Comment 35 niklas.nebel 2008-05-29 18:22:27 UTC

It's too late now for UI changes for 3.0.
I know this has been open far too long, and I'll try to get it into 3.1.

Comment 36 niklas.nebel 2009-10-14 17:04:23 UTC

adjusting target

Comment 37 kyoshida 2010-02-13 18:33:24 UTC

*** Issue 109227 has been marked as a duplicate of this issue. ***

Comment 38 niklas.nebel 2010-08-25 15:41:10 UTC

retargeting to 3.4 for time reasons

Comment 39 Martin Hollmichel 2011-03-15 12:50:25 UTC

set target to 3.x since this is not release-relevant for 3.4.

Comment 40 Rob Weir 2013-03-11 14:58:27 UTC

I'm adding this comment to all open issues with Issue Type == PATCH.  We have 220 such issues, many of them quite old.  I apologize for that.  

We need your help in prioritizing which patches should be integrated into our next release, Apache OpenOffice 4.0.

If you have submitted a patch and think it is applicable for AOO 4.0, please respond with a comment to let us know.

On the other hand, if the patch is no longer relevant, please let us know that as well.

If you have any general questions or want to discuss this further, please send a note to our dev mailing list:  dev@openoffice.apache.org

Thanks!

-Rob

Comment 41 Marcus 2017-05-20 11:11:31 UTC

Reset assignee to the default "issues@openoffice.apache.org".