author | Rohit Deshmukh <rohit.deshmukh@synerzip.com> | 2013-12-06 15:42:53 +0530 |
---|---|---|
committer | Eike Rathke <erack@redhat.com> | 2014-01-08 19:38:45 +0000 |
commit | d8fd15875901d584a4bbcc07c927fa20332e4841 (patch) | |
tree | 2150e13c4e8c246a495c1e4046f9e7e6eccadf63 /i18npool/source/breakiterator | |
parent | 45b72633d1bea5e75a27f5fd93e91071e04c050c (diff) |
fdo#72219: Fix for corruption of symbols in docx
Issue:
OUString uses UTF-16, so a Unicode character outside the Basic Multilingual
Plane is stored as a surrogate pair of 2 code units, not just 1.
This caused an assertion failure in the "rtl_uString_iterateCodePoints" function.
erAck: Underlying cause was that the dictionary breakiterator misused UTF-16 positions as Unicode code point positions.
Change-Id: I923485f56c2d879b63687adaea2b489a3479991c
Reviewed-on: https://gerrit.libreoffice.org/6955
Reviewed-by: Eike Rathke <erack@redhat.com>
Tested-by: Eike Rathke <erack@redhat.com>
Diffstat (limited to 'i18npool/source/breakiterator')
-rw-r--r-- | i18npool/source/breakiterator/xdictionary.cxx | 6 |
1 files changed, 4 insertions, 2 deletions
diff --git a/i18npool/source/breakiterator/xdictionary.cxx b/i18npool/source/breakiterator/xdictionary.cxx
index 1200535f38cf..ab2dfd9a94e8 100644
--- a/i18npool/source/breakiterator/xdictionary.cxx
+++ b/i18npool/source/breakiterator/xdictionary.cxx
@@ -387,9 +387,11 @@ Boundary xdictionary::getWordBoundary(const OUString& rText, sal_Int32 anyPos, s
             if (u_isWhitespace(ch))
                 i--;
         }
+
         boundary.endPos = boundary.startPos;
-        rText.iterateCodePoints(&boundary.endPos, aCache.wordboundary[i]);
-        rText.iterateCodePoints(&boundary.startPos, aCache.wordboundary[i-1]);
+        boundary.endPos += aCache.wordboundary[i];
+        boundary.startPos += aCache.wordboundary[i-1];
+
     } else {
         boundary.startPos = anyPos;
         if (anyPos < len) rText.iterateCodePoints(&anyPos, 1);
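The effect of the change can be illustrated outside LibreOffice. The following is a minimal sketch, assuming (as the commit message states) that the cached word boundary values are UTF-16 positions. It uses `std::u16string` in place of `OUString`, and both helper functions are hypothetical stand-ins written for this illustration, not LibreOffice API. It shows that advancing by N code points and advancing by N code units land on different indices as soon as the text contains a surrogate pair.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not LibreOffice API): advance a UTF-16 index by
// `count` code points, treating a well-formed surrogate pair as one step.
// Roughly the behaviour the old code relied on via iterateCodePoints.
std::size_t advanceCodePoints(const std::u16string& text,
                              std::size_t pos, std::size_t count)
{
    assert(pos <= text.size());
    while (count > 0 && pos < text.size()) {
        const char16_t c = text[pos];
        const bool isHigh = c >= 0xD800 && c <= 0xDBFF;
        const bool hasLow = pos + 1 < text.size()
            && text[pos + 1] >= 0xDC00 && text[pos + 1] <= 0xDFFF;
        pos += (isHigh && hasLow) ? 2 : 1;  // a pair occupies two code units
        --count;
    }
    return pos;
}

// Hypothetical helper: advance by `count` UTF-16 code units, i.e. plain
// index arithmetic (clamped to the string length), as in the fixed code.
std::size_t advanceCodeUnits(const std::u16string& text,
                             std::size_t pos, std::size_t count)
{
    assert(pos <= text.size());
    return std::min(pos + count, text.size());
}
```

For the text `u"a\U0001F600b c"` (six code units, because U+1F600 is stored as a surrogate pair), a boundary stored as the UTF-16 offset 4 points at the space after the first word. `advanceCodeUnits(text, 0, 4)` returns 4, whereas `advanceCodePoints(text, 0, 4)` returns 5 and overshoots into the next word. With more supplementary characters the overshoot can reach past the end of the string.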