Age | Commit message (Collapse) | Author | Files | Lines |
|
ECMA-376-1:2016 states that w:dir is functionally equivalent to an
LRE/RLE+PDF pair around the enclosed runs, so this patch does just that.
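
The equivalence described above can be sketched as follows. This is only an illustration, not the patch itself: the function name wrap_dir is hypothetical, and it assumes the direction is given as "ltr" or "rtl". The control characters are the standard Unicode embedding codes.

```python
# Unicode bidirectional embedding controls.
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def wrap_dir(text: str, direction: str) -> str:
    """Emulate a w:dir element by wrapping the enclosed text in LRE/RLE + PDF.

    wrap_dir and its arguments are hypothetical names used for illustration.
    """
    if direction == "ltr":
        opener = LRE
    elif direction == "rtl":
        opener = RLE
    else:
        raise ValueError(f"unsupported direction: {direction!r}")
    # The opener starts the embedding, PDF closes it again.
    return opener + text + PDF
```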
Change-Id: Ibf9775338cc38a3bbc38a42a33fc64ae787b478f
Reviewed-on: https://gerrit.libreoffice.org/59643
Tested-by: Jenkins
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
Reviewed-on: https://gerrit.libreoffice.org/59672
Reviewed-by: Aron Budea <aron.budea@collabora.com>
Tested-by: Aron Budea <aron.budea@collabora.com>
(cherry picked from commit 348a1e11045ca8d9dbceab43a68d44dbde3f922c)
|
|
In docx a colour value is represented as a 6-digit hex RGB value, or
alternatively the word "auto" to represent automatic colour.
- Add support for reading the value "auto" as COL_AUTO. Previously
this would be read as if it were a hex value, stopping at the
letter 'u' which is not a valid hex digit, resulting in the colour
0x00000A - a very dark blue, which looks close enough to black that
it went unnoticed for a long time :-)
- Remove code which tried to handle this wrong 0x00000A value,
including the constant OOXML_COLOR_AUTO, as it is no longer needed
and will cause surprises for anyone who really wanted this exact
shade of dark blue
- Fix unit tests that were checking for 0x00000A
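
A minimal sketch of the bug and the fix described above, assuming a parser that consumes leading hex digits and stops at the first invalid character. The names parse_hex_prefix and parse_colour are hypothetical, and COL_AUTO here is a stand-in sentinel rather than the real constant's value.

```python
COL_AUTO = -1  # hypothetical sentinel standing in for the real COL_AUTO


def parse_hex_prefix(value: str) -> int:
    """Old behaviour: read leading hex digits, stop at the first non-hex character."""
    result = 0
    for ch in value:
        if ch not in "0123456789abcdefABCDEF":
            break
        result = result * 16 + int(ch, 16)
    return result


def parse_colour(value: str) -> int:
    """New behaviour: recognise the word "auto" before trying hex parsing."""
    if value == "auto":
        return COL_AUTO
    return parse_hex_prefix(value)
```

With the old behaviour, "auto" is read as the single hex digit 'a' (the parser stops at 'u'), which yields 0x00000A; the new behaviour returns the automatic colour instead.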
Reviewed-on: https://gerrit.libreoffice.org/50995
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
Reviewed-on: https://gerrit.libreoffice.org/51461
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
(cherry picked from commit 3967aebca94be9ceea3e36b43f7f53589473ad4e)
Change-Id: I6000070341931147ff9341ad6281cd3b53c02b46
(cherry picked from commit ccef956c4f11ac6c0612a0d22845d02743c91039)
|
|
Reviewed-on: https://gerrit.libreoffice.org/40930
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com>
(cherry picked from commit 4a764319cbad4e2589cc105145ac27defbf49ff6)
Change-Id: Iebf2ff65fcec3231acfc962fb2f1abc2ed2dc67a
Avoid warning in OleHandler
Related to ActiveX controls.
Change-Id: Ief7ee67ca8e4f086a1d5e0400d0eaf3ebc8cdaaf
Reviewed-on: https://gerrit.libreoffice.org/40934
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Tamás Zolnai <tamas.zolnai@collabora.com>
(cherry picked from commit 368b583b992f2e9cad46c2362c9529a07c36d7a9)
Reviewed-on: https://gerrit.libreoffice.org/41483
Reviewed-by: Andras Timar <andras.timar@collabora.com>
Tested-by: Andras Timar <andras.timar@collabora.com>
|
|
Word allows <w:tbl> to be a direct child of <w:p>, which is illegal
according to ECMA-376-1:2016.
This allows importing the data in such tables (previously, this text
was simply dropped, causing data loss), for bug-to-bug compatibility
with Word.
Change-Id: I19c17ab19915ea46685727c635476fe5df593212
Reviewed-on: https://gerrit.libreoffice.org/40909
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
(cherry picked from commit 67a61e54531801645d51ad89aac30064b8c4b4e8)
Reviewed-on: https://gerrit.libreoffice.org/40949
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
A better fix follows
This reverts commit 0eb0c7308ad57f4a20b5691d450b5185e52475f6.
Change-Id: If36f73c580d96445086d8ab3d87fff6a76cd8b6a
Reviewed-on: https://gerrit.libreoffice.org/40948
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
According to ISO/IEC 29500-1:2016(E) (17.6.17), the final <w:sectPr>
must be the last child element of the body element. This is also
enforced in the schema for the CT_Body complex type (Annex A. (normative)
Schemas – W3C XML Schema, A.1 WordprocessingML, page 3866), where
sectPr is part of an <xsd:sequence>, and thus *must* stay at a specific
place in the sequence, namely as the last element, with at most one
instance.
However, real-life documents (generated by some third-party software)
have sectPr before other body contents. Unfortunately, MS Word seems
to allow this standards-violating content, and thus encourages
creation of non-standard documents by third-party generators.
This patch doesn't assume that the current final (body-level) sectPr is
the last body element, and does not mark the current paragraph as the last
paragraph of the section. Thus, the current section (possibly started after
the previous paragraph-level sectPr) is continued after the final sectPr is
closed.
Change-Id: I8e88288bc6659d77d17986514b3b4fe16a5b45d9
Reviewed-on: https://gerrit.libreoffice.org/40161
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
(cherry picked from commit 4b4cd502806cfc9c9cc9754b8aae18a2c2632cdc)
Reviewed-on: https://gerrit.libreoffice.org/40216
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
This allows importing the data in such tables (previously, this text
was simply dropped, causing data loss). Layout problems are not fixed
yet.
Change-Id: Id7422adfe0998d1e2adcd4bf0b0e0a1dd7ed37bf
Reviewed-on: https://gerrit.libreoffice.org/40105
Reviewed-by: Aron Budea <aron.budea@collabora.com>
Tested-by: Aron Budea <aron.budea@collabora.com>
|
|
Change-Id: Ida55015363cac3ae29b82a60a9b9a5f1b39086a2
Reviewed-on: https://gerrit.libreoffice.org/39675
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
(cherry picked from commit f95f0ce163743706a3670c6e33593023c22af2ff)
Reviewed-on: https://gerrit.libreoffice.org/39677
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
LibreOffice doesn't accept the <w:br> element as a child of <w:body>.
ECMA-376-1:2016 17.3.3.1 describes br as an element of run content,
and points to CT_Br in §A.1.
CT_Br may appear only as part of EG_RunInnerContent.
In turn, EG_RunInnerContent may appear only inside CT_R.
So, using <w:br> outside of <w:r> produces ill-formed OOXML.
Open XML SDK 2.5 Productivity Tool for Microsoft Office confirms that,
showing an OpenXmlUnknownElement error.
However, Word accepts it as a direct child of <w:body>. It behaves as if
the <w:br> were used as the first element in the first run of the following
<w:p> (thus creating a page break after the next paragraph).
This is another Word bug that prompts third parties to create ill-formed
documents, and requires LibreOffice to be bug-to-bug compatible.
This commit makes the following changes:
1. Registers a dedicated complex type CT_Br_OutOfOrder to handle those
unusual breaks, with a corresponding handler function.
2. In the handler function, saves the gathered property set to the parser
state for later use in the next paragraph group handler.
This reproduces Word behaviour.
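
The deferral described in the two steps above can be sketched with a toy model. This is not the writerfilter code: import_body and the tuple-based event list are hypothetical, and only the stash-then-apply idea is taken from the commit message.

```python
def import_body(children):
    """Toy import of <w:body> children given as (tag, payload) tuples.

    A "br" entry models an out-of-order <w:br> directly under <w:body>;
    a "p" entry carries the list of runs of one paragraph.
    """
    pending_break = None  # stashed properties of an out-of-order break
    paragraphs = []
    for tag, payload in children:
        if tag == "br":
            # Step 2: remember the break instead of emitting it right away.
            pending_break = payload
        elif tag == "p":
            runs = list(payload)
            if pending_break is not None:
                # Apply the stashed break as the first element of the next paragraph.
                runs.insert(0, ("br", pending_break))
                pending_break = None
            paragraphs.append(runs)
    return paragraphs
```

In this model a break that is not followed by any paragraph is silently dropped; what the real importer does in that case is not stated in the message.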
Change-Id: I5df6927e2de9266b58f87807319ad1c4977e45a7
Reviewed-on: https://gerrit.libreoffice.org/39168
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
(cherry picked from commit a4a1467bc47b81ad68ecad0d5e2e163670582919)
Reviewed-on: https://gerrit.libreoffice.org/39303
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
w:ST_HpsMeasure is defined in ECMA-376 5th ed. Part 1, 17.18.42 as follows:
This simple type specifies that its contents contain either:
* A positive whole number, whose contents consist of a measurement in
half-points (equivalent to 1/144th of an inch), or
* A positive decimal number immediately followed by a unit identifier.
...
This simple type is a union of the following types:
* The ST_PositiveUniversalMeasure simple type (§22.9.2.12).
* The ST_UnsignedDecimalNumber simple type (§22.9.2.16).
This patch generalizes OOXMLUniversalMeasureValue to handle standard-
defined units, and introduces two typedefed specifications:
OOXMLTwipsMeasureValue (which is used where UniversalMeasure was
previously used), and new OOXMLHpsMeasureValue.
Unit test included.
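
As a rough illustration of the union type quoted above, the sketch below converts a value to half-points. It is an assumption-laden example, not the OOXMLHpsMeasureValue implementation: parse_hps_measure is a hypothetical name, only the mm, cm, in and pt units are handled, and rounding to the nearest integer is an arbitrary choice.

```python
import re

# Half-points per unit; one inch is 144 half-points, as quoted above.
HALF_POINTS_PER_UNIT = {
    "in": 144.0,
    "pt": 2.0,
    "cm": 144.0 / 2.54,
    "mm": 144.0 / 25.4,
}

# A positive decimal number immediately followed by a unit identifier.
_MEASURE = re.compile(r"([0-9]+(?:\.[0-9]+)?)(mm|cm|in|pt)")


def parse_hps_measure(value: str) -> int:
    """Return the measurement in half-points (hypothetical helper)."""
    if re.fullmatch(r"[0-9]+", value):
        # A plain whole number is already expressed in half-points.
        return int(value)
    match = _MEASURE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid measure: {value!r}")
    number, unit = match.groups()
    return round(float(number) * HALF_POINTS_PER_UNIT[unit])
```

For example, "24" and "12pt" both describe the same size of 24 half-points.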
Reviewed-on: https://gerrit.libreoffice.org/38562
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
(cherry picked from commit ea890b1d4bcd6dd59db9f52dce1609c020804e24)
Change-Id: Iccc6d46f717cb618381baf89dfd3e4bbb844b4af
Reviewed-on: https://gerrit.libreoffice.org/38591
Reviewed-by: Mike Kaganski <mike.kaganski@collabora.com>
Tested-by: Mike Kaganski <mike.kaganski@collabora.com>
|
|
* add only autoTxT gallery type
* new test with other types of entries
Change-Id: Ibf7751c73dcf3b6ebd69eec5f4931dbeaaf098c8
Reviewed-on: https://gerrit.libreoffice.org/37425
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Szymon Kłos <szymon.klos@collabora.com>
Tested-by: Szymon Kłos <szymon.klos@collabora.com>
(cherry picked from commit a470d16208a78ae6893d199b3b6bc77a8559b06a)
Reviewed-on: https://gerrit.libreoffice.org/37460
|
|
+ extended model to parse <docPartPr> and <name> marks
+ names are inserted to the document before content
of each entry
+ SwDOCXReader interprets first paragraph of each section
as a name
Change-Id: Ib7de152ba1c6bea4f4665f98d321019c3f68863e
Reviewed-on: https://gerrit.libreoffice.org/37124
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
|
|
+ each entry is placed in a separate section
+ extended model and dmapper to react to the docPart mark
Change-Id: I7e5213a09ae7352d1d09369bd0a209b6d4e18e82
Reviewed-on: https://gerrit.libreoffice.org/37107
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Szymon Kłos <szymon.klos@collabora.com>
|
|
- passing "ReadGlossaries" flag to the WriterFilter
- if set - WriterFilter reads glossary document
instead of the main content
- updated model.xml to read docParts and docPart nodes
- SwDOCXReader adds document content as an AutoText
entry
Change-Id: I9a0cc91c793d6accc8461e1c3aca791c5997d497
Reviewed-on: https://gerrit.libreoffice.org/36753
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Szymon Kłos <szymon.klos@collabora.com>
Tested-by: Szymon Kłos <szymon.klos@collabora.com>
|
|
When there are multiple sections in a document, every <w:p> element
triggers a handleLastParagraphInSection() call, and that's how the
previous section is ended and the next one is started if necessary. In
case the section contains no paragraphs at all, the section was lost on
import. Fix this by also calling handleLastParagraphInSection() on
<w:sectPr> as well.
It's not a problem if there are both <w:p> and <w:sectPr> in a section
(which is the usual situation) as only the first call closes the
previous section / starts the next one.
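
The "only the first call has an effect" behaviour can be illustrated with a toy model. Everything here is a simplification made for the example: count_sections, the flat event list and the boundary_pending flag are hypothetical and do not mirror the real importer's data structures.

```python
def count_sections(events, call_on_sectpr):
    """Count sections started while walking a flat list of "p"/"sectPr" events."""
    started = 0
    boundary_pending = True  # the first section still has to be opened

    def handle_last_paragraph_in_section():
        nonlocal started, boundary_pending
        if boundary_pending:
            # Only the first call after a section boundary does anything.
            started += 1
            boundary_pending = False

    for event in events:
        if event == "p":
            handle_last_paragraph_in_section()
        elif event == "sectPr":
            if call_on_sectpr:
                # The fix: a sectPr also triggers the handler.
                handle_last_paragraph_in_section()
            boundary_pending = True  # the next content belongs to a new section
    return started
```

In this model the event list ["p", "sectPr", "sectPr"] ends with a section that has no paragraphs: without the extra call only one section is started, with it both are.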
(cherry picked from commit 6603947329a7b372a173a3c60e013e532d0bc5cf)
Change-Id: I64f2c403dcb2ceca76d444ab06df3052235d2795
Reviewed-on: https://gerrit.libreoffice.org/34718
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Christian Lohmaier <lohmaier+LibreOffice@googlemail.com>
(cherry picked from commit 1e88c10327642e6867db5708e3fd0fb7065bc74c)
|
|
Every time a comment is referenced, the whole comment stream is parsed
but only the referenced comment is extracted. However, the symbol is always
processed, so it is added to all the comments.
Change-Id: I3264de2d011ff188ef64f6500ae426cde0106c16
Reviewed-on: https://gerrit.libreoffice.org/31584
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Michael Stahl <mstahl@redhat.com>
(cherry picked from commit 3caf89200c8fa7b38d6c340b666ca6cc8c2eb766)
Reviewed-on: https://gerrit.libreoffice.org/31759
|
|
The bug document has a normal table, then its C1 cell starts with a
nested table, which is floating. The problem is that converting the
nested table to a textframe invalidates the start text range of the C1
cell in the outer table we store, so the conversion of the outer table
from text to table fails.
This never worked, so to avoid the regression just don't convert inner
floating tables to textframes when they're anchored at the cell start.
A more general fix in the future can be addressing the actual
invalidation of the cell start/end text ranges, and then this specific
fix will not be necessary anymore.
Change-Id: I12cefa41977cf719b07b0fb3ef9ec423c17ef3b1
Reviewed-on: https://gerrit.libreoffice.org/30685
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
|
|
The image hyperlink is stored as a resource id in the document and needs to be translated into a real URL.
First, a new type CT_Hyperlink_URL is defined in the model and associated with the action handleHyperlinkURL.
OOXMLFastContextHandlerProperties::handleHyperlinkURL dispatches to OOXMLHyperlinkURLHandler, which translates the resource id to the real URL and sets it on the PropertySet.
The correct URL is then captured while resolving GraphicImport and stored in GraphicImport_Impl::sHyperlinkURL as an OUString.
Finally, GraphicImport::applyName sets the property if the length of sHyperlinkURL is not 0.
Also adds a test file, image-hyperlink.docx, and a test in ooxmlimport.cxx.
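
A simplified sketch of the flow above, assuming the relationships are available as a plain mapping from resource id to target. The function names, the dictionary-based property set and the "HyperLinkURL" key are illustrative assumptions, not the C++ API.

```python
def resolve_hyperlink(rel_id, relationships):
    """Translate a resource id into its URL, or "" if the id is unknown."""
    return relationships.get(rel_id, "")


def apply_name(shape_props, hyperlink_url):
    """Set the hyperlink property only when a URL was actually captured."""
    if len(hyperlink_url) != 0:
        shape_props["HyperLinkURL"] = hyperlink_url  # hypothetical key
    return shape_props
```

An unknown resource id resolves to an empty string here, so the property is simply left unset in that case.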
Change-Id: I6194b9cc6bcc1bfaa033ab05e94836fe96e33f14
Reviewed-on: https://gerrit.libreoffice.org/28432
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
|
|
This is similar to the w:gridBefore handling code introduced in commit
cf33af732ed0d3d553bb74636e3b14c55d44c153 (handle w:gridBefore by faking
cells (fdo#38414), 2014-04-23), except that the fake cells here are
inserted after the real ones, not before.
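
The idea of faking cells can be shown with a small sketch. pad_row and its parameters are hypothetical names, and empty strings stand in for the fake cells; the real code works on the table model, not on lists of strings.

```python
def pad_row(cells, grid_before=0, grid_after=0):
    """Return the row with fake empty cells added before and after the real ones."""
    return [""] * grid_before + list(cells) + [""] * grid_after
```

For instance, a row with two real cells and a gridAfter of 2 ends up with four cells, the last two being fake.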
Change-Id: I4c03bd49e52016a58e0e002ae85dede6a96e5f55
Reviewed-on: https://gerrit.libreoffice.org/28487
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
Tested-by: Jenkins <ci@libreoffice.org>
|
|
Change-Id: I813ca0510b6cfc26c307c510f3511c01c0f65c85
|
|
Change-Id: I9a5940027423ff0791fa7da0b79b617412ce6b86
Reviewed-on: https://gerrit.libreoffice.org/21209
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
|
|
Change-Id: I2c408a25880ad0e87f0b5a246a350e45c8378ce5
|
|
RTF import, export, and ooxml export for ruby text are implemented.
tdf#49073 - FILEOPEN: Furigana (ruby text) and characters with them are
missing in opened .docx files.
tdf#50786 - [TASK, METABUG] FILEOPEN, FILESAVE, FORMATTING : Japanese
ruby-character handling is broken
Change-Id: I4a5c30bad180241e3344e9da7efe7da4369fb325
Reviewed-on: https://gerrit.libreoffice.org/17241
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Michael Stahl <mstahl@redhat.com>
|
|
Fix the issue caused by a wrong assumption about the order of the symbol
character and symbol font attributes in writerfilter. Also allow
symbols to be displayed if the user's language is not Western.
Change-Id: I602d9fbfa79c33c90f655dbf5ee22738b6391ae6
Reviewed-on: https://gerrit.libreoffice.org/16543
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Michael Stahl <mstahl@redhat.com>
|
|
Change-Id: I1af1d6bc150c16a2c6b0fe788a41c8c18caee6c6
|
|
These were added by commit cfc4650c8594334edecc3b50ca54461f6bee2d43
(Added some teaks to 'model.xml', 2014-09-16), but the matching dmapper
part is missing, so they aren't useful in practice, and cause a crash on
import of crashtest's File_953.docx.
Change-Id: I3d1c138534a37dc9ba500f1134ca4bb9ebae0e96
|
|
Change-Id: I44630ebc4395b86ae4f44c85d596b589a93b54b0
Reviewed-on: https://gerrit.libreoffice.org/15159
Tested-by: Jenkins <ci@libreoffice.org>
Reviewed-by: Miklos Vajna <vmiklos@collabora.co.uk>
|
|
We can't do anything sensible with these CustomXML elements but now we
have to handle them because.
(regression from 9dbf817fe5c5253fba0831aefa17575ae0ba3af1)
Change-Id: If4247890ff9961a77434587802670d28608a7922
|
|
If a field is fixed, mark it as such and parse value to seed it.
This is the other half of the docx filter improvement for fdo#59886.
Reviewed-on: https://gerrit.libreoffice.org/13431
Change-Id: Id00c454921cd386589e04b9572f4040898625a6f
|
|
Change-Id: Ie4f4182e92dfd06b283dc86f5bfd611d7842a504
|
|
Change-Id: I57ca4ef567126321ab745c8d1d7290b66df23c05
|
|
Change-Id: I61a81bf1aab604d27441630dfb5d55f657211410
|
|
Again no need to adapt dmapper/rtftok for these, see commit
020f46d17065b8b00365eab7a809ce980ebfb59a (Use constants for ST_Em
values, 2014-10-07).
Change-Id: Ie67f7a4d251525b5f8799cf613bea56ad82f7a57
|
|
Change-Id: I307d7833fb5556c5509edd698b4b5ecd7b7a5fb3
|
|
No need to adapt dmapper/rtftok for this one, as those do not handle
<w:em> ATM.
Change-Id: I88da1d0dae804e3d054b7d4158a81cb64cc4b600
|
|
Change-Id: If8fbccf946f589abead0803b7ecbc63ecfc656b2
|
|
Change-Id: Ie0f83fd7111942912b0abd61473e654cc2f02360
|
|
Change-Id: Idd277a770a42d33a9c92f41f0452039eba60b6ce
|
|
Redlines changing formatting of runs and paragraphs are valid for the entire
run/paragraph, not just their existence in the XML. So store them
in the matching contexts, which will take care of it, instead of the endtrackchange
stuff.
Change-Id: Ie583e4be14e8df95829852bfbbbe25aa0684f02e
|
|
propagateCharacterPropertiesAsSet sends the properties only when ending
a text run (or maybe starting another one, I'm not quite sure), so it breaks
ordering by sending them later than expected (although it worked in many cases).
It's a question if propagateCharacterPropertiesAsSet is to be used by anything
actually, since it seems rather broken to use it in the ooxml frontend.
Using sendPropertiesWithId sends the properties properly at the right time,
as one would expect. I don't know why dmapper can't simply handle this on its
own, as I think it does handle entering and leaving other elements, but
spending more time on it with this overdesigned abomination, oh well.
Change-Id: Ie36c5f933ea3e6d789ea8f9e4ee3b60a5d1c920c
|
|
Change-Id: I3ac385b8f21409b5083b1224652283fec8bb2fa4
|
|
Change-Id: Icbce660a7f6678ae6c48ec03d8fc63c67f169072
|
|
Change-Id: I9fe7909bb8f6174ac05edb340a7d5606f077679d
|
|
Change-Id: Ib7d2ecfa2c5bcbda55859144af6b55bc8ef09c3d
|
|
Change-Id: Idfb554f2ee77e0315b3ed69c4fae8ad4e8e87b3f
|
|
Change-Id: I2c6576f8142a8fe3808a156f2047fe425f769cd9
|
|
Change-Id: If4226bbe1124ca21893840558559b2b0e24440d3
|
|
There were two problems here:
1) The CT_SdtContentCell handlers didn't emit the usual
NS_ooxml::LN_CT_SdtBlock_sdtContent /
NS_ooxml::LN_CT_SdtBlock_sdtEndContent tokens, so the dropdown control
was not created (and then was created with the wrong anchor).
2) In case the SDT was around the cell, the newline character was also
added to the text of the currently selected entry, resulting in an
invalid argument of SwXText::convertToTable(), so no table was created.
Change-Id: I4806626181f40c6d26ff7b25f5dbb863967d8077
|
|
It was typically the same as the "name" one, just with a .rng suffix,
i.e. redundant.
Change-Id: I8abb296b2ee963e214971ff748fd5b079320dfa9
|
|
Change-Id: I564bf16b0c8a6c12d43cf2e38b4ab07dfc96dfc5
|