Apache OpenOffice (AOO) Bugzilla – Issue 126720
no text imported from XLSX (xl/SharedStrings.xml instead of xl/sharedStrings.xml)
Last modified: 2023-01-07 17:23:18 UTC
Created attachment 85192: two files with different results of import

I have a lot of files created in MS Excel 2007 and saved as .xlsx. When I open them with OpenOffice 4.1.2 Calc, the cells contain no text, although the numbers are there. I opened one such file with MS Excel and saved it as .xls, and that file opened correctly in Calc. Both the xlsx and the xls file are attached.
I have confirmed that the xlsx file does not import the text values with AOO 4.1.2. The xls file opens correctly in AOO 4.1.2, and the xlsx file opens in the Excel Viewer with all the text entries shown.

System Configuration:
Processor: Intel Core i5 CPU M560 @2.67GHz
Installed Memory: 2.00 GB (1.6 usable)
Operating System: Windows 7 Home Premium 64 bit
Apache Open Office: AOO412m3(Build:9782) - Rev. 1709696 2015-10-21 09:53:29 (Mi, 21 Okt 2015)
Language: en_US
Additional Language Packs: None
Same issue as 127086: one of the files inside the package is named xl/SharedStrings.xml instead of xl/sharedStrings.xml. If you rename the document to a zip file, unzip it, change that filename, zip it back up, and rename it back to xlsx, it opens perfectly, with all the text visible. We should treat OOXML filenames case-insensitively, like Excel and LibreOffice do.
*** Issue 127086 has been marked as a duplicate of this issue. ***
Where in the code does this problem occur, and how can we fix it?

main/oox/source/xls/workbookfragment.cxx does this:

---snip---
    // read the shared string table substream (requires finalized styles buffer)
    OUString aSstFragmentPath = getFragmentPathFromFirstType( CREATE_OFFICEDOC_RELATION_TYPE( "sharedStrings" ) );
    if( aSstFragmentPath.getLength() > 0 )
        importOoxFragment( new SharedStringsFragment( *this, aSstFragmentPath ) );
---snip---

Debugging that code:

Thread 1 hit Breakpoint 1, oox::xls::WorkbookFragment::finalizeImport (this=0x80dc9ed00) at source/xls/workbookfragment.cxx:208
208         if( aSstFragmentPath.getLength() > 0 )
(gdb) print dbg_dump(aSstFragmentPath)
$1 = (const sal_Char *) 0x80a0ef168 "xl/sharedStrings.xml"

Eventually we get as far as this, trying to open that xl/sharedStrings.xml:

#0 OStorage::OpenStreamElement_Impl(rtl::OUString const&, int, unsigned char) (this=this@entry=0x80dca4bc0, aStreamName=..., nOpenMode=nOpenMode@entry=1, bEncr=bEncr@entry=0 '\000') at source/xstor/xstorage.cxx:2204
#1 0x000000080e0761d5 in OStorage::openStreamElement(rtl::OUString const&, int) (this=0x80dca4bc0, aStreamName=..., nOpenMode=1) at source/xstor/xstorage.cxx:2507
#2 0x000000080e076ab2 in non-virtual thunk to OStorage::openStreamElement(rtl::OUString const&, int) () at instsetoo_native/unxfbsdx/Apache_OpenOffice/installed/install/en-US/openoffice4/program/../program/libxstor.so
#3 0x000000080e60c795 in oox::ZipStorage::implOpenInputStream(rtl::OUString const&) (this=<optimized out>, rElementName=...) at source/helper/zipstorage.cxx:171
#4 0x000000080e609cb9 in oox::StorageBase::openInputStream(rtl::OUString const&) (this=0x80dc4b030, rStreamName=...) at source/helper/storagebase.cxx:164
#5 0x000000080e609c70 in oox::StorageBase::openInputStream(rtl::OUString const&) (this=0x80db770f0, rStreamName=...) at source/helper/storagebase.cxx:160
#6 0x000000080e4f9889 in oox::core::FilterBase::openInputStream(rtl::OUString const&) const (this=<optimized out>, rStreamName=...) at source/core/filterbase.cxx:370
#7 0x000000080e50340f in oox::core::FragmentHandler::openFragmentStream() const (this=0x80dc73c00) at source/core/fragmenthandler.cxx:123
#8 0x000000080e5096c2 in oox::core::XmlFilterBase::importFragment(rtl::Reference<oox::core::FragmentHandler> const&) (this=0x80daff000, rxHandler=...) at source/core/xmlfilterbase.cxx:208
#9 0x000000080e70ecf3 in oox::xls::WorkbookFragment::finalizeImport() (this=0x80dca4b20) at source/xls/workbookfragment.cxx:209

Then an exception is thrown, because the stream is not found. Now where is the best place to scan the zip file for names with different casing?
OStorage::openStreamElement() is in main/package, which isn't just used by OOXML but also by ODF (a breakpoint there gets hit many times while loading an ODF too), so I don't like making changes there for an OOXML-specific bug. oox::ZipStorage::implOpenInputStream() seems like a better place.
I've now patched oox::ZipStorage::implOpenInputStream() to fall back to case-insensitive filename matching when the case-sensitive lookup fails. With that change this file opens successfully and all the text shows. Fixed by commit 0f42b9a04e21324973f03349bb2929327cf84a20. Resolving FIXED :). Thank you for your bug report and sample file!
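For readers who want the idea without reading the commit, here is a minimal sketch of "exact match first, case-insensitive fallback second". It is not the code from the commit: it uses std::string and std::vector instead of OUString and the storage API, it needs C++17 for std::optional, and the names toLowerAscii, equalsIgnoreAsciiCase (as a free function) and resolveEntryName are hypothetical helpers made up for this illustration.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

// Lower-case a single character, ASCII letters only (locale independent).
// Hypothetical helper, not part of the AOO code base.
char toLowerAscii(char c)
{
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Compare two entry names, ignoring ASCII case only.
bool equalsIgnoreAsciiCase(const std::string& a, const std::string& b)
{
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](char x, char y) { return toLowerAscii(x) == toLowerAscii(y); });
}

// Resolve the requested name against the names actually stored in the package.
// Returns the stored name to open, or std::nullopt if nothing matches.
// Hypothetical helper: the real fix works on the storage's element names.
std::optional<std::string> resolveEntryName(const std::vector<std::string>& entries,
                                            const std::string& requested)
{
    // 1. Prefer an exact, case-sensitive match, so well-formed files behave as before.
    if (std::find(entries.begin(), entries.end(), requested) != entries.end())
        return requested;

    // 2. Only if that fails, take the first entry that differs in ASCII case alone.
    auto it = std::find_if(entries.begin(), entries.end(),
                           [&requested](const std::string& name)
                           { return equalsIgnoreAsciiCase(name, requested); });
    if (it != entries.end())
        return *it;

    return std::nullopt;
}
```

With the entries of the attached file, asking for "xl/sharedStrings.xml" would resolve to the stored "xl/SharedStrings.xml". Trying the exact match first keeps the behaviour unchanged for every file that already loads, and the fallback only costs a scan when the lookup would otherwise have failed.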
Cherry-picked for AOO42X with: https://github.com/apache/openoffice/commit/bd3f92fa7151c22b06c065512cbefd13960d9f7c
Cherry-picked for AOO41X with: https://github.com/apache/openoffice/commit/25c6f4b735608c9ccf2d582718536ff7c9470ddd