Apache OpenOffice (AOO) Bugzilla – Issue 69482
Breakiterator: mismatch of nextWord and getWordBoundary
Last modified: 2013-08-07 15:02:59 UTC
There are some oddities where nextWord and getWordBoundary do not match up as expected. (See macro in attached document)
Created attachment 39095 [details] Sample document with macro
Basically, the macro calls nextWord for a given index and then calls getWordBoundary with the starting index returned by nextWord. In most cases one would expect the two word boundaries to be the same, but they are not always. Some of the word boundaries chosen are also unexpected. For example:
- for index 0..2: the boundary is [3,6[, which puts "(11" in one word. One would expect "11" to be a word on its own.
- for index 3: nextWord returns [4,5[, which is only a single '1' instead of both of them. Also, getWordBoundary results in [3,6[, which is quite unexpected since according to nextWord the word should start at pos 4. (This one actually resulted in Calc freezing because of non-optimal code in svx. See issue 69416.)
- for index 4: [5,6[ is returned (the second '1' in "11"). Shouldn't that have been the '/' character? Also, nextWord and getWordBoundary do not match up once again.
- for index 5: ok
- for index 6: returns [7,9[ and thus denotes "8)". Again, the ')' should probably not be included with the number.
- for index 8: should have been [8,9[, that is the ')' char. At least if that char should not be included with the '8'.
TL->Karl: Please have a look. Thanks!
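The consistency check described above can be sketched in Python as follows. This is only an illustration: `segments`, `next_word`, `get_word_boundary` and `find_mismatches` are hypothetical names, and the regex-based segmentation is a toy stand-in that encodes the boundaries the reporter expects. It is not the i18npool BreakIterator, nor the attached Basic macro. The sample text "ab (11/8)" is the string mentioned later in this issue.

```python
import re
from typing import List, Optional, Tuple

Boundary = Tuple[int, int]  # half-open interval [start, end[


def segments(text: str) -> List[Boundary]:
    # Toy rule (assumption): a run of letters/digits is one word,
    # every other non-space character is a segment of its own.
    return [(m.start(), m.end()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def get_word_boundary(text: str, pos: int) -> Optional[Boundary]:
    # Boundary of the segment containing pos, or None on whitespace.
    for start, end in segments(text):
        if start <= pos < end:
            return (start, end)
    return None


def next_word(text: str, pos: int) -> Optional[Boundary]:
    # First segment that starts strictly after pos, or None at the end.
    for start, end in segments(text):
        if start > pos:
            return (start, end)
    return None


def find_mismatches(text: str) -> List[Tuple[int, Boundary, Optional[Boundary]]]:
    # For every index, the boundary reported by next_word must equal the
    # boundary reported by get_word_boundary at that word's start index.
    mismatches = []
    for pos in range(len(text)):
        nxt = next_word(text, pos)
        if nxt is None:
            continue
        bnd = get_word_boundary(text, nxt[0])
        if bnd != nxt:
            mismatches.append((pos, nxt, bnd))
    return mismatches


if __name__ == "__main__":
    sample = "ab (11/8)"
    for pos in range(len(sample)):
        print(pos, next_word(sample, pos))
    print("mismatches found:", len(find_mismatches(sample)))
```

With this toy segmentation the two functions agree at every index of "ab (11/8)", which is the property the macro tests for; a real break iterator would be plugged in where `next_word` and `get_word_boundary` are called.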
TL: wrong owner, reassigning...
Summary corrected.
TL->Karl: I just discussed the above mentioned Calc issue with FST and we agreed that it would be best if you fix this issue in the same CWS. Thus please fix it in CWS tl29.
fixed in cws tl29.
TL->Karl: The issue is fixed for the very version the macro checks in its current form. But if you modify the macro slightly, e.g. by using other WordTypes (1, 2 and 3), or by changing the language to "de-DE" as locale with the text "ab (11/8)", the problem still exists. Can you fix it in time for OOo 2.0.1?
Created attachment 39358 [details] new extended version of previous test
Hi Karl, can you fix the problem described in Thomas' latest comment in time for 2.1? I would like to have cws tl29 closed asap. Thanks, Frank
Hi Frank, Sorry, I fixed it last month, but forgot to update the issue. The updated file is i18npool/source/breakiterator/breakiterator_cjk.cxx. Karl.
fixed.
Hi Frank, I missed Thomas' last comment about the test for other languages, I will take a look today. Thanks, Karl.
Hi Thomas, I just downloaded the Windows build from cws tl29 and tested the macro in BI2, and I don't see the problem. Final result 'mismatchs found: 0'. Karl.
Hi, checked again and found it fixed. Don't know what went wrong with the test before. Frank
found fixed on master using Solaris, Windows and Linux build