summaryrefslogtreecommitdiff
path: root/sal
AgeCommit message (Collapse)AuthorFilesLines
2018-02-26Use long path prefix in osl_getFileStatusSamuel Mehrbrodt1-4/+12
When installing an extension e.g., paths can get very long and they hit the 255 char limit, thus the installation fails. So we need to prefix the path with the long file name prefix when its longer than MAX_PATH for windows api calls to succeed. Change-Id: Ie62644192ba40a9d4802772cd9837fc84fae947a Reviewed-on: https://gerrit.libreoffice.org/50079 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit 50bf4eec6ff3cb3db23484331965f2e32cb0b5bc) Reviewed-on: https://gerrit.libreoffice.org/50228
2018-02-08ofz#6112 wrong start off sets for korean KSC5601 tableCaolán McNamara1-2/+2
Change-Id: If986352478f34f54015f1969c97c26e2ef05c06c Reviewed-on: https://gerrit.libreoffice.org/49445 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2018-02-04tdf#49134 tdf#114466 Transfer privilege to become foreground processMike Kaganski1-0/+8
... to already open soffice process from newly spawned one on Windows. When an application takes user input, a timeout is started during which other processes cannot create foreground windows that might steal focus, and thus interrupt user input. The timeout is defined by SPI_SETFOREGROUNDLOCKTIMEOUT (see SystemParametersInfo) and ForegroundLockTimeout registry setting (see https://technet.microsoft.com/en-us/library/cc957208). If an application that currently doesn't have right to become foreground tries to show popups in this interval, the popup will stay on background, and only flash in taskbar. The application that has the right to steal focus (see the list in https://msdn.microsoft.com/en-us/library/ms632668) may transfer its right to another process using AllowSetForegroundWindow function. So, the intended effect is this: 1. User interacts with some foreground process (e.g., Explorer); a timeout is started to prevent non-privileged processes from stealing focus; 2. As the result, the process launches a new soffice process, which has privilege to create foreground windows (as it is started by foreground process); 3. It communicates with already started soffice process, which is currently in background, and so doesn't have privilege to create foreground windows until timeout expires; 4. It transfers its right to the already started soffice process, and then issues the required commands that might lead to need to show popup windows. Change-Id: I4208665c2ae4106fa06e72269f4c3804af40d582 Reviewed-on: https://gerrit.libreoffice.org/48839 Reviewed-by: Michael Meeks <michael.meeks@collabora.com> Tested-by: Mike Kaganski <mike.kaganski@collabora.com> (cherry picked from commit 8d1d82dd63eada8faa2f6eb43ef900764a5fda62) Reviewed-on: https://gerrit.libreoffice.org/48853 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Andras Timar <andras.timar@collabora.com>
2018-01-17tdf#114939 sal: fix endMD5() off-by-oneMichael Stahl2-1/+93
Because of the odd non-standard rtl_digest_rawMD5() API that is apparently necessary for MS Office interop, and there not being any good reason for bug-compatibility here, just fix the bug. Change-Id: Iaa0f0af4e24a5ddb9113c1ebd126f9822b5af1f6 (cherry picked from commit 18b022cadfa590df9dbefe0433b58838bcc3d2af) Reviewed-on: https://gerrit.libreoffice.org/48019 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com>
2017-11-23loplugin:simplifybool can't invert conditions involving float typesNoel Grandin1-1/+1
so revert some of the changes from commit 7a1c21e53fc4733a4bb52282ce0098fcc085ab0e loplugin:simplifybool for negation of comparison operator Change-Id: I937d575b86c1e418805d399b0dc16ae91876b4fe Reviewed-on: https://gerrit.libreoffice.org/45130 Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk> Tested-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-11-23loplugin:simplifybool for negation of comparison operatorNoel Grandin3-11/+11
Change-Id: Ie56daf560185274754afbc7a09c432b5c2793791 Reviewed-on: https://gerrit.libreoffice.org/45068 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-11-22ofz#4366 Divide-by-zeroStephan Bergmann1-2/+10
Change-Id: I3d0eb3bb6a69d09e71ce8bf91051f66e204eb0df Reviewed-on: https://gerrit.libreoffice.org/45098 Reviewed-by: Eike Rathke <erack@redhat.com> Tested-by: Jenkins <ci@libreoffice.org>
2017-11-22Make loplugin:unnecessaryparen warn about (x) ? ... : ... after allStephan Bergmann1-1/+1
...which had been left out because "lots of our code uses this style, which I'm loathe to bulk-fix as yet", but now in <https://gerrit.libreoffice.org/#/c/45060/1/> "use std::unique_ptr" would have caused an otherwise innocent-looking code change to trigger a loplugin:unnecessaryparen warning for pFormat = (pGrfObj) ? ... (barring a change to ignoreAllImplicit in compilerplugins/clang/unnecessaryparen.cxx similar to that in <https://gerrit.libreoffice.org/#/c/45083/2> "Make not warning about !! in loplugin:simplifybool consistent", which should also have caused the warning to disappear for the modified code, IIUC). Change-Id: I8bff0cc11bbb839ef06d07b8d9237f150804fec2 Reviewed-on: https://gerrit.libreoffice.org/45088 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2017-11-21Disable custom allocatorDennis Francis1-1/+1
This has big positive effect on software interpreter threading performance scaling. Change-Id: I8fbb6bf8f7ed410fd53278acee63bf65f13bac38
2017-11-21Fix typosAndrea Gelmini2-6/+6
Change-Id: I40b3a46d46f0586d086bdbe41876c088f8c1cb58 Reviewed-on: https://gerrit.libreoffice.org/45007 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Jens Carl <j.carl43@gmx.de>
2017-11-20look for =() in loplugin:unnecessaryparenNoel Grandin1-1/+1
Change-Id: I4f9b71ff7767e90987bb40358fc46ed5d1d571d0 Reviewed-on: https://gerrit.libreoffice.org/44944 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-11-13Fix typosAndrea Gelmini1-1/+1
Change-Id: I677512213d97d01832bebe162fbf7de2998bf4d0 Reviewed-on: https://gerrit.libreoffice.org/44664 Reviewed-by: Julien Nabet <serval2412@yahoo.fr> Tested-by: Julien Nabet <serval2412@yahoo.fr>
2017-11-13IsValidFilePath: fix correction of double path delimitersMike Kaganski1-1/+3
Wuthout this fix, it had always tried to replace in original path, thus second occurence of double path delimiter, like in C:\dir1\\dir2\dir3\\file would result in invalid path reported. Change-Id: I63ce97b620229601e18f10016a759275aceeec4d Reviewed-on: https://gerrit.libreoffice.org/44675 Reviewed-by: Stephan Bergmann <sbergman@redhat.com> Tested-by: Jenkins <ci@libreoffice.org>
2017-11-13Fix typosAndrea Gelmini1-2/+2
Change-Id: Ia544298334364ece3b3963a4adc00c5e01189b91 Reviewed-on: https://gerrit.libreoffice.org/44654 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mark Page <aptitude@btconnect.com>
2017-11-11Avoid using O[U]StringConcat lvalues containing dangling refs to temporariesStephan Bergmann2-0/+14
...in code accidentally using auto like > auto const aURL = uri->getUriReference() + "/" > + INetURLObject::encode( > m_sEmbeddedName, INetURLObject::PART_FPATH, > INetURLObject::EncodeMechanism::All); > > uno::Reference<uno::XInterface> xDataSource(xDatabaseContext->getByName(aURL), uno::UNO_QUERY); in <https://gerrit.libreoffice.org/#/c/44569/1> "Properly construct vnd.sun.star.pkg URL" did (causing hard to debug test failures there). So make functions taking O[U]StringConcat take those by rvalue reference. Unfortunately, that also needed adaption of various functions that just forward their arguments. And some code in sc/qa/unit/ucalc_formula.cxx used CPPUNIT_ASSERT_EQUAL on OUStringConcat arguments in cases where that happened to actually compile (because the structure of the two OUStringConcats was identical), which needed adaption too (but which would arguably better use CPPUNIT_ASSERT_EQUAL_MESSAGE, anyway). Change-Id: I8994d932aaedb2a491c7c81c167e93379d4fb6e3 Reviewed-on: https://gerrit.libreoffice.org/44608 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2017-11-10rtl: change nullptr comparisonChris Sherlock1-2/+2
Change-Id: I0c356b100b732e259646d4ebb5d0aedd4dc4bcdd Reviewed-on: https://gerrit.libreoffice.org/44302 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Bartosz Kosiorek <gang65@poczta.onet.pl>
2017-11-04Make Windows error reporting more robustMike Kaganski1-7/+2
https://msdn.microsoft.com/en-us/library/ms679351 describes that "it is unsafe to take an arbitrary system error code returned from an API and use FORMAT_MESSAGE_FROM_SYSTEM without FORMAT_MESSAGE_IGNORE_INSERTS" Previously in case when an error string would contain inserts, function returned error, so the error message wasn't shown (at least it didn't crash, thanks to nullptr as the function's last argument). As the function may fail, we now pre-nullify the buffer pointer to avoid dereferencing uninitialized pointer later (though at least for some Windows versions, the function nullifies the pointer in case of FORMAT_MESSAGE_ALLOCATE_BUFFER, but there's no explicit guarantee of this). Also release of allocated buffer is changed to recommended use of HeapFree. The code that doesn't make use of OUString is left directly calling FormatMessage, to avoid introducing new dependencies. Where it makes sense, we now use WindowsErrorString from <comphelper/windowserrorstring.hxx> Change-Id: I834c08eb6d92987e7d3d01e2c36ec55e42aea848 Reviewed-on: https://gerrit.libreoffice.org/44206 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-11-02Avoid warning thrashStephan Bergmann1-2/+1
...between "error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]" (in some builds, for the original code) and "error: fallthrough annotation in unreachable code [-Werror,-Wimplicit-fallthrough]" (in other builds, after adding SAL_FALLTHROUGH). Change-Id: I75bbefd69c81f43f22117f8ad110237de24e18af
2017-11-02Clean up oslTranslateFileErrorStephan Bergmann2-105/+53
Change-Id: I004bcba518ead9c149f1671d62aa94986a38945d
2017-11-02coverity#1420539: dead codeStephan Bergmann1-7/+3
...after 0b413caadfbe68b278ca5ba33b6d204687586ea9 "loplugin:constantparam in sal,sax" Change-Id: Idf000bd1e0a261eac9ec0afbd2fb6f4c4ef8c7dc
2017-11-01-I$(dir $(3)) in gb_CObject__command_pattern is no longer neededStephan Bergmann1-1/+1
...at least in com_GCC_class.mk (com_MSC_class.mk will be addressed in a follow- up commit), after the recent loplugin:includeform clean-up. Two static libraries built from external sources needed adjustment, two compilerplugin tests needed adjustment (which wasn't found by loplugin:includeform, by design), and one more adjustment in sal/textenc/generate/. Change-Id: Idad5ae355a02ae130369a9a45b5f5925ab48ffef Reviewed-on: https://gerrit.libreoffice.org/44174 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2017-10-31loplugin:constantparam in sal,saxNoel Grandin7-66/+57
Change-Id: I7ca2fd05d1cf61f9038c529a853e72fedb1c9ed0 Reviewed-on: https://gerrit.libreoffice.org/44087 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-10-29tdf#94695 Replace gethostbyaddr with getnameinfoArkadiy Illarionov1-43/+67
Change-Id: I7ac99a6f470998364e9e43b749a0914d8a3fc096 Reviewed-on: https://gerrit.libreoffice.org/42769 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Thorsten Behrens <Thorsten.Behrens@CIB.de> Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2017-10-27loplugin:includeform: sal (Windows)Stephan Bergmann13-21/+21
Change-Id: Id60bcfadbfdf4b37f276159b12360347bde30a2e
2017-10-27-Werror,-Wtautological-constant-compare (clang-cl)Stephan Bergmann2-20/+41
...fixed similarly to recent fixes to sal/osl/unx/file.cxx Change-Id: I2c82366095e156cd0085a8f60f54f8c822655dcc
2017-10-26cid#1420316: Try silence CONSTANT_EXPRESSION_RESULTStephan Bergmann1-0/+1
(or does the comment need to go exactly at the line before the one containing the "return" token?) Change-Id: I5f92e59ee05d0b5ba3d6bda775e6ca02f185dbe8
2017-10-25ofz#3789 Integer-overflowCaolán McNamara1-1/+4
with input 69e9223372036854775807 Change-Id: Iaf5a85170d144be2db5604340d784a8982754e08 Reviewed-on: https://gerrit.libreoffice.org/43815 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com>
2017-10-25Also check whether negative uOffset exceeds std::numeric_limits<off_t>::min()Stephan Bergmann1-6/+10
...in the one case where uOffset is of signed type sal_Int64 Change-Id: I626f747f70fb720bcc5a4ee10fbce30fc0673ae9
2017-10-24Make testUtf8StringLiterals work when char is unsignedStephan Bergmann1-3/+3
...as is reportedly the case for Linux AArch64 Change-Id: I7e11c42f4437c8aad9dd734603fa7e0d458c9754
2017-10-24loplugin:includeform: sal (macOS)Stephan Bergmann1-1/+1
Change-Id: Ic4d73e1fc328aecd897163b71edf0b4c9b8a090b
2017-10-24loplugin:includeform: compat.cxx reduxStephan Bergmann1-6/+6
Change-Id: I935c5b4a219f3b673f62d8788504de85b53a47ca
2017-10-23loplugin:includeform: salStephan Bergmann122-371/+371
Change-Id: I539ca8b9dee5edc5fc2282a2b9b0ffd78bad8b11
2017-10-23overload std::hash for OUString and OStringNoel Grandin1-2/+1
no need to explicitly specify it anymore Change-Id: I6ad9259cce77201fdd75152533f5151aae83e9ec Reviewed-on: https://gerrit.libreoffice.org/43567 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-10-20Comment some values that 16 doesn't workEike Rathke1-2/+2
One of them was removed with e6611cc2ef5960e9f32c56da44fafd02446f53e6, reintroduce to prevent running into a dead end again. 16 or 17 digits will need some different approach. Change-Id: Iec213ed857121e323e13ee83763c51fa563c794f
2017-10-20coverity#1419948 silence Operands don't affect resultCaolán McNamara1-1/+1
Change-Id: I68c42f6d9c6cfda90446a07ecd6bbf02abb22af9 Reviewed-on: https://gerrit.libreoffice.org/43593 Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com>
2017-10-20loplugin:constmethod in codemaker,registry,storeNoel Grandin1-1/+1
Change-Id: Ie75875974f054ff79bd64b1c261e79e2b78eb7fc Reviewed-on: https://gerrit.libreoffice.org/43540 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-10-19tdf#113211: fix calculations with big integersMike Kaganski1-1/+52
... and munbers with few fractional bits Change-Id: I86c3e8021e803fed498fae768ded9c9e5337c8bd Reviewed-on: https://gerrit.libreoffice.org/43477 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Eike Rathke <erack@redhat.com>
2017-10-16Rephrase checks for exceeding off_t limitsStephan Bergmann1-10/+11
...in a way that, by using a function template, avoids new Clang 6 -Werror,-Wtautological-constant-compare when sal_Int64 and off_t are the same type. Change-Id: I06d4feb8ba8cb163e5208f107c28959755ff5633
2017-10-16Fix overflowStephan Bergmann1-2/+4
Change-Id: I1ddd1f7d749a99395fbd7be433db5884536bf7a7
2017-10-05Rename and move SAL_U/W to o3tl::toU/WMike Kaganski13-173/+186
Previosly (since commit 9ac98e6e3488e434bf4864ecfb13a121784f640b) it was expected to gradually remove SAL_U/W usage in Windows code by replacing with reinterpret_cast or changing to some bettertypes. But as it's useful to make use of fact that LibreOffice and Windows use compatible representation of strings, this commit puts these functions to a better-suited o3tl, and recommends that the functions be consistently used throughout Windows-specific code to reflect the compatibility and keep the casts safe. Change-Id: I2f7c65606d0e2d0c01a00f08812bb4ab7659c5f6 Reviewed-on: https://gerrit.libreoffice.org/43150 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-10-04lookupProfile: drop unreachable codeMike Kaganski1-207/+150
The check for -userid switch has never worked - its condition was broken since initial commit 9399c662f36c385b0c705eb34e636a9aec450282 Since this wasn't caught earlier, the functionality is unused (and was deprecated anyway - see https://wiki.openoffice.org/wiki/Framework/Article/Command_Line_Arguments), this commit just drops the code. Change-Id: Iae79f9cb7db454d72b11fb3954ebc456c7207d96 Reviewed-on: https://gerrit.libreoffice.org/43123 Reviewed-by: Stephan Bergmann <sbergman@redhat.com> Tested-by: Jenkins <ci@libreoffice.org>
2017-10-03Replace more reinterpret_cast with SAL_W/SAL_UMike Kaganski3-15/+13
Change-Id: Ia632e4083222ad9e7f17c2ad0d0825f189c700cc Reviewed-on: https://gerrit.libreoffice.org/43071 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-10-03new loplugin:blockblockNoel Grandin1-14/+12
Change-Id: I7b68b70fa4c7234e8882f7627026959a596968fd Reviewed-on: https://gerrit.libreoffice.org/43025 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Noel Grandin <noel.grandin@collabora.co.uk>
2017-09-30Use explicit function names for fooA/fooW WinAPI; prefer fooWMike Kaganski1-5/+9
We should only use generic foo function name when it takes params that are also dependent on UNICODE define, like LoadCursor( nullptr, IDC_ARROW ) where IDC_ARROW is defined in MSVC headers synchronised with LoadCursor definition. We should always use Unicode API for any file paths operations, because otherwise we will get "?" for any character in path that is not in current non-unicode codepage, which will result in failed file operations. Change-Id: I3a7f453ca0f893002d8a9764318919709fd8b633 Reviewed-on: https://gerrit.libreoffice.org/42935 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-09-28Missing includes (--disable-pch)Stephan Bergmann1-0/+1
Change-Id: Iaa87663255f815e4f837df25d5338439d79c70dd
2017-09-28Warn about missing text converterStephan Bergmann1-0/+3
(probably because the encoding is RTL_TEXTENCODING_DONTKNOW; that may still happen in various places, so can't use anything stronger than SAL_WARN here) Change-Id: I75993c9ad629104055c171389ed7708649747d9b
2017-09-27SAL: use more Unicode on WindowsMike Kaganski14-116/+183
Change-Id: I9f54c8e8c4e617cc1ed6b436ca8c162d381ecab3 Reviewed-on: https://gerrit.libreoffice.org/42828 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
2017-09-26ofz#3186: wrong starting offset for JOHAB 0x6D blockCaolán McNamara1-1/+1
Change-Id: I4de6d9d781b2f2313d8fd338b34dcb31434efe91 Reviewed-on: https://gerrit.libreoffice.org/41638 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Caolán McNamara <caolanm@redhat.com> Tested-by: Caolán McNamara <caolanm@redhat.com>
2017-09-25Fix typosAndrea Gelmini1-1/+1
Change-Id: I879a52820d78d9151ef64dd21612379f617f66e2 Reviewed-on: https://gerrit.libreoffice.org/42726 Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com> Tested-by: Tamás Zolnai <tamas.zolnai@collabora.com>
2017-09-24Map Windows code page 42 to RTL_TEXTENCODING_SYMBOLStephan Bergmann2-0/+3
<https://msdn.microsoft.com/en-us/library/windows/desktop/ dd374130(v=vs.85).aspx> "WideCharToMultiByte function" suggests that there now is CP_SYMBOL, "Windows 2000: Symbol code page (42)." And a little test program on Windows indicates that our RTL_TEXTENCODING_SYMBOL is working the same way as CP_SYMBOL, where MultiByteToWideChar maps 00..1F to U+0000..1F and 20..FF to U+F020..F0FF. At least CppunitTest_writerfilter_rtftok, when testing writerfilter/qa/cppunittests/rtftok/data/pass/EDB-18940-1.rtf, goes into case RTF_FCHARSET in RTFDocumentImpl::dispatchValue (writerfilter/source/rtftok/rtfdispatchvalue.cxx) with nParam matching aRTFEncodings[2] (i.e., a mapping from charset 2 to codepage 42, see writerfilter/source/rtftok/rtfcharsets.cxx), then passes 42 into rtl_getTextEncodingFromWindowsCodePage and obtains an unhelpful RTL_TEXTENCODING_DONTKNOW. testFdo72031 (sw/qa/extras/rtfexport/rtfexport2.cxx, CppunitTest_sw_rtfexport2) needed to be adapted, as the circled plus from the Symbol font is now internally represented as U+F0C5, not (somewhat bogusly) as U+00C5 (aka LATIN CAPTIAL LETTER A WITH RING ABOVE). But, when displayed with the Symobl font, the glyph that is actually shown remains the circled plus. Turns out changing rtl_getTextEncodingFromWindowsCodePage would start to make CppunitTest_sw_rtfimport fail: Sep 20 15:49:24 <sberg> vmiklos, with <https://gerrit.libreoffice.org/#/c/42477/>, testN823675 (sw/qa/extras/rtfimport/rtfimport.cxx) fails, the aFont.Name is not "Symbol"; sw/qa/extras/rtfimport/data/n823675.rtf contains a \fonttbl that specifies \f3 to have \fcharset2 (i.e., symbol font) and fontname "Symbol". However, RTFDocumentImpl::checkUnicode (writerfilter/source/rtftok/rtfdocumentimpl.cxx) converts m_aHexBuffer (containing "Symbol;") with nCurrentEncoding apparently being the encoding specified by \fcharset2 (i.e., now RTL_TEXTENCODING_SYMBOL instead of old RTL_TEXTENCODING_DONTKNOW), so the resulting OUString is garbage (instead of the byte-for-byte conversion to Unicode "Symbol;" that RTL_TEXTENCODING_DONTKNOW would do there); do you know whether such \fonttbl fontnames should actually be interpreted in the given \fcharset? Sep 20 15:49:24 <IZBot> gerrit: »Map Windows code page 42 to RTL_TEXTENCODING_SYMBOL« by Stephan Bergmann for master [NEW] Sep 20 15:51:15 <vmiklos> sberg: let me check if the spec covers that Sep 20 15:54:29 <mst_> sberg: i think the name is typically encoded in the font's encoding but probably they have to make a (likely undocumented) exception for symbol encoding Sep 20 15:57:46 <vmiklos> sberg: the spec only says that \fcharset is about the encoding of the content using that font, i don't see it described what would be the encoding of the font name itself Sep 20 15:58:51 <vmiklos> sberg: i'm not sure about if that encoding should or should not affect the encoding of the font name in general, but indeed at least for 2 (symbol encoding) you're right, Word doesn't encoding the font name with that encoding, either. Sep 20 15:59:30 <sberg> vmiklos, mst_, at the top of page 14 of Word2007RTFSpec9.docx I see "Note that runs of text marked with a particular font index (see \fN in the Font Table section) use the codepage for that font as given by \cpgN or implied by \fcharsetN, unless they use Unicode RTF described in the following section." Would that match what mst_ says? Sep 20 15:59:33 <vmiklos> so if it helps you case to handle at as e.g. ascii, just for that encoding, i think there would be no problem with that. Sep 20 16:00:07 <vmiklos> sberg: that still talks about the content using the font, not the strings (font names) in the font table itself, i think. Sep 20 16:01:17 <sberg> vmiklos, what's the control word to select such a font, also \fN? I don't see any such in n823675.rtf Sep 20 16:02:16 <mikekaganski> loircbot: e.g. \af3 Sep 20 16:02:31 <mikekaganski> sberg: ^ Sep 20 16:02:47 <mst_> 04d5a280beeeb6e056df68395dc9c3b3a674361b Sep 20 16:02:50 <IZBot> core - related: fdo#77979: writerfilter RTF import: read encoded font name - http://cgit.freedesktop.org/libreoffice/core/commit/?id=04d5a280beeeb6e056df68395dc9c3b3a674361b Sep 20 16:02:52 <mst_> sberg: ^ Sep 20 16:04:05 <sberg> mst_, thanks; so there's likely an (implicit?) exception for \fcharset2, as you say Sep 20 16:04:33 <mst_> that's most plausible, our own font code is full of exceptions for "symbol fonts" too Sep 20 16:05:19 <sberg> mikekaganski, ENOCONTEXT Sep 20 16:05:36 <mikekaganski> sberg: [17:01:16] sberg: vmiklos, what's the control word to select such a font, also \fN? I don't see any such in n823675.rtf Sep 20 16:06:32 <sberg> mikekaganski, so you say selection is done with \af3 instead of \f3? Sep 20 16:06:40 <mikekaganski> sberg: yes, in that case Sep 20 16:07:34 <mst_> i think there are several different keywords that apply fonts, but can't remember the whole list Sep 20 16:08:10 <mst_> \fN shoudl be one of them though Sep 20 16:22:18 <sberg> vmiklos, so who generated that sw/qa/extras/rtfimport/data/n823675.rtf, was it manually created and lacks a \cpgN before "Symbol"? Sep 20 16:29:17 <sberg> vmiklos, (after further reading of the RTF spec): disregard the "and lacks a \cpgN before 'Symbol'" part of my above question Sep 20 16:30:27 <mst_> sberg: i suggest not reading too much about encoding in RTF, it gets pretty lovecraftian pretty fast... Sep 20 16:32:58 <vmiklos> sberg: given how short that bugdoc is, i'm pretty sure i cut it down manually to something readable from a multi-MB real bugdoc Sep 20 16:33:07 <sberg> mst_, do you have a recommendation how I could get that "don't use symbol font encoding to read a symbol font's name" into writerfilter/source/rtftok/rtfdocumentimpl.cxx? RTFDocumentImpl::checkUnicode lacks the context to tell whether it is using m_aStates.top().nCurrentEncoding to convert a fontname, and the caller of checkUnicode (at the end of RTFDocumentImpl::resolveChars in this case) appears to lack the context, too Sep 20 16:33:12 <mst_> various Old Ones from The Time Before Unicode and their Backward Compatibility Tentacles etc. Sep 20 16:34:59 <sberg> vmiklos, anyway, that "so there's likely an (implicit?) exception for \fcharset2" hypothesis sounds sane, so we should probably implement it (if only you or mst_ can give me a good hint how...) Sep 20 16:35:13 <vmiklos> sberg: looking for a code pointer Sep 20 16:36:05 <mst_> sberg: m_aStates.top().eDestination == Destination::FONTENTRY should be the relevant check? Sep 20 16:36:17 <vmiklos> sberg: RTFDocumentImpl::text() is where the text is taken, Destination::FONTENTRY is the state on the parser stack which is a font entry in the font table. so to detect "your case" during decoding a byte array into a string, m_aStates.top().eDestination == Destination::FONTENTRY is what you want Sep 20 16:36:35 <vmiklos> ah good, two independent matching hints are promising ;) Sep 20 16:37:35 <sberg> mst_, vmiklos, ah; but what also looks dodgy is that checkUnicode operates there on "Symbol;" including the closing ";" of the full <fontinfo>, not just the <fontname> part of the <fontinfo> Sep 20 16:39:24 <vmiklos> sberg: i think we already assume that the only "token" in the font entry destination that is not bound to a control world (\foo) is the font name Sep 20 16:40:52 <vmiklos> sberg: writerfilter/source/rtftok/rtfdocumentimpl.cxx:1237 is where we simply strip away the trailing semicolon, there is no further separation between the font name and other character content inside the destination (apart from the control words and their arguments) Sep 20 16:42:18 <sberg> vmiklos, OK, thanks; I'll just pretend I haven't seen those dodgy details :) ...so I'm switching to (somewhat arbitrarily) RTL_TEXTENCODING_MS_1252 there now Change-Id: Iebd1bcecb7fa71c489798154d3356062b052775e Reviewed-on: https://gerrit.libreoffice.org/42477 Reviewed-by: Stephan Bergmann <sbergman@redhat.com> Tested-by: Stephan Bergmann <sbergman@redhat.com>