Age | Commit message | Author | Files | Lines |
|
|
|
When centering vertically, we calculate the y offset based on the height of the text and the annotation.
When doing that we must ignore the border width; otherwise the text is offset downwards.
|
|
The border reduces the available height, so take it into account for the height too, not only the width
|
|
When 'CIDSystemInfo' dictionary is absent or
has invalid content, instead of aborting the font
because we cannot read the character collection,
let's assume in that case character collection
to be "Adobe-Identity".
Fixes #1465 - Does not show text of Apple-edited PDFs
|
|
|
|
Starting with C++20, the std::string class has methods
starts_with and ends_with, which do the same thing.
Use those instead.
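
For illustration, a minimal sketch of the kind of hand-written helper this change makes redundant. The names `startsWith` and `endsWith` are hypothetical and not taken from the code base; on a C++20 compiler the helpers simply forward to the standard member functions.

```cpp
#include <string>

// Hypothetical helper; with C++20 it forwards to std::string::starts_with.
inline bool startsWith(const std::string &s, const std::string &prefix)
{
#if __cplusplus >= 202002L
    return s.starts_with(prefix);
#else
    // Pre-C++20 equivalent: compare the first prefix.size() characters.
    return s.size() >= prefix.size() && s.compare(0, prefix.size(), prefix) == 0;
#endif
}

// Hypothetical helper; with C++20 it forwards to std::string::ends_with.
inline bool endsWith(const std::string &s, const std::string &suffix)
{
#if __cplusplus >= 202002L
    return s.ends_with(suffix);
#else
    // Pre-C++20 equivalent: compare the last suffix.size() characters.
    return s.size() >= suffix.size() && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
#endif
}
```

Once the whole project builds as C++20, call sites can use `s.starts_with(prefix)` directly and helpers like these can be dropped.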
|
|
According to the specification, see NOTE 2 in
https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G7.3882161
it appears that the clipping path should be reset
when the restore (Q) operator is encountered.
Fixes #739
|
|
|
|
|
|
The old algorithm restarts the inner loop for the RHS word from the
beginning on each match, i.e. the worst case complexity approaches
O(N^3), while O(N^2) is obviously sufficient for a pairwise compare of
all words. Fortunately, O(N^2) hardly ever happens, as the inner N
is limited by a) the maxBaseIdx, b) removing duplicates from the set.
For some pathological cases this changes the runtime from minutes to
seconds.
See poppler#1173.
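
The loop shape described above might be sketched as follows. This is not the actual poppler code: `Word`, `isDuplicate` and `removeDuplicates` are hypothetical stand-ins, and the duplicate test is deliberately simplistic. The point is only that the inner loop continues from its current index after a match instead of restarting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a text word; the real class carries far more state.
struct Word
{
    std::string text;
    double x;
};

// Hypothetical duplicate test: same text at (nearly) the same position.
inline bool isDuplicate(const Word &a, const Word &b)
{
    const double dx = a.x - b.x;
    return a.text == b.text && dx > -0.5 && dx < 0.5;
}

// Pairwise duplicate removal with one inner pass per left-hand word.
// After erasing words[j], the next candidate has moved into index j,
// so the loop carries on from there rather than restarting at i + 1.
inline void removeDuplicates(std::vector<Word> &words)
{
    for (std::size_t i = 0; i < words.size(); ++i) {
        std::size_t j = i + 1;
        while (j < words.size()) {
            if (isDuplicate(words[i], words[j])) {
                words.erase(words.begin() + static_cast<std::ptrdiff_t>(j)); // j stays put
            } else {
                ++j;
            }
        }
    }
}
```

Each pair is compared at most once, which keeps the number of comparisons at O(N^2) even when many words match.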
|
|
Currently, the word characters are allocated as a struct of arrays,
e.g. text and charcode are allocated separately.
This causes some space (6 pointers, 6 malloc chunk management
words (size_t/flags), alignment, ...) and runtime overhead (6 allocs/
frees per word).
Changing this to an array of struct reduces this overhead. It also allows
being more conservative with allocations, as resizing is less costly, i.e.
starting with a single character allocation instead of 16. It is also more
efficient, as most accesses affect multiple or all attributes, i.e.
values in the same or neighboring CPU cache lines.
Using a std::vector instead of separate raw arrays also reduces code
and manual data management.
The "charPos end index" and trailing "edge" attributes are no
longer stored as an additional entry in the array, but as dedicated
data members, `charPosEnd` and `edgeEnd`.
The memory saving is most notable for short words, but even for words
with 16 characters there are small savings, and still fewer allocations
(1 + 4 allocations instead of 6). Growing is fairly cheap, as the CharInfo
struct is trivially copyable.
See poppler#1173.
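
A minimal sketch of the array-of-struct layout described above. Only `CharInfo`, `charPosEnd` and `edgeEnd` are names from the commit message; the field names, the `Word` class shape and the accessors are assumptions made for illustration.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Per-character record; the field names are illustrative assumptions.
struct CharInfo
{
    unsigned int text;     // Unicode code point
    unsigned int charCode; // code in the font's encoding
    int charPos;           // start position of the character
    double edge;           // leading edge coordinate
};

// Trivially copyable elements keep vector growth cheap.
static_assert(std::is_trivially_copyable<CharInfo>::value, "CharInfo must stay trivially copyable");

// Hypothetical word class holding one vector instead of several raw arrays.
class Word
{
public:
    void addChar(unsigned int text, unsigned int charCode, int charPos, int charLen, double edgeStart, double edgeStop)
    {
        chars.push_back(CharInfo { text, charCode, charPos, edgeStart });
        // Trailing values live in dedicated members, not in an extra array slot.
        charPosEnd = charPos + charLen;
        edgeEnd = edgeStop;
    }

    std::size_t len() const { return chars.size(); }

    // Index len() yields the trailing values.
    double edge(std::size_t i) const { return i < chars.size() ? chars[i].edge : edgeEnd; }
    int charPosAt(std::size_t i) const { return i < chars.size() ? chars[i].charPos : charPosEnd; }

private:
    std::vector<CharInfo> chars;
    int charPosEnd = 0;
    double edgeEnd = 0.0;
};
```

All attributes of one character sit next to each other in memory, and the word owns a single allocation instead of one per attribute.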
|
|
emplace_back"
Says modernize-use-emplace
No need to pass the c, we will set it later so we can just use the
default constructed CharCodeToUnicodeString
|
|
This commit fixes the "across lines" text
search feature of TextPage::findText() when
the match happens from the last line of a
paragraph to the first line of next paragraph.
Includes tests for this bug.
Fixes #1475
Fixes https://gitlab.gnome.org/GNOME/evince/-/issues/2001
|
|
Redo the fix for issue #157 which is about doing
transparent selection for glyphless documents (e.g.
tesseract scanned documents) because it stopped
working after commit 29f32a47
|
|
|
|
Some encrypted files which need repairing (see
links below) failed to open due to a regression
introduced in commit b3e86dbdba where an 'if
condition' was added that's hit by encrypted files
which need repairing.
The removal of this 'if condition' does not affect
the original buggy file that commit b3e86dbdba
targeted[1].
This commit also adds Qt5 and Qt6 tests for opening
an encrypted pdf file affected by this issue.
Fixes #1447
Fixes https://gitlab.gnome.org/GNOME/evince/-/issues/1889
Regression issue:
https://bugs.freedesktop.org/show_bug.cgi?id=14303
[1] which can be found in this duplicate:
https://bugs.freedesktop.org/show_bug.cgi?id=14399
|
|
|
|
Related to https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=66523
|
|
Related to https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=66523
|
|
Related to https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=66523
|
|
|
|
|
|
|
|
|
|
|
|
otherwise it will result in broken output in Cairo backend.
Splash backend already works fine for this case because
it checks for singular matrix in Splash::drawImage().
This commit adds that check early in Gfx::doImage()
which fixes the Cairo backend and for Splash backend
means a perf improvement by avoiding a lot of color
computation and image preparation done in
SplashOutputDev::draw{Image,ImageMask,MaskedImage,softMaskedImage}
prior to calling Splash::drawImage which is the one
that checks singular matrix and skips.
Note: the singular matrix case is not mentioned in the PDF spec,
but Xpdf and other PDF readers de facto behave as done here,
i.e. they skip drawing an image when it has a singular
(non-invertible) matrix.
Fixes issue #1114
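
A minimal sketch of such an early singular-matrix test. The function names and the threshold of 1e-6 are assumptions for illustration, not the values used in Gfx::doImage(). The matrix uses the PDF layout [a b c d e f], where only the first four entries affect invertibility.

```cpp
#include <cmath>

// Returns true when the 2x2 linear part of a PDF-style matrix
// [a b c d e f] has a determinant that is (nearly) zero.
// The default threshold is an illustrative assumption.
inline bool isSingular(const double m[6], double epsilon = 1e-6)
{
    const double det = m[0] * m[3] - m[1] * m[2];
    return std::fabs(det) < epsilon;
}

// Hypothetical early-out: skip all image preparation when the current
// transformation matrix cannot be inverted.
inline bool shouldDrawImage(const double ctm[6])
{
    return !isSingular(ctm);
}
```

Doing the test before any color conversion or image decoding is what turns the check into a performance gain for a backend that already skipped such images later on.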
|
|
|
|
A carefully crafted pdf file could lead to writing files in wrong places
of the file system by using pdfdetach.
Thanks to jwilk for spotting the issue.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1026908
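
To illustrate the class of problem, here is one possible way to neutralize an untrusted embedded file name before writing it to disk. This is a sketch under stated assumptions, not the fix that was actually applied to pdfdetach; `safeBaseName` is a hypothetical helper.

```cpp
#include <string>

// Reduce an embedded file name taken from an untrusted PDF to a bare
// file name, dropping any directory components.
// Returns an empty string when nothing usable is left.
inline std::string safeBaseName(const std::string &embeddedName)
{
    // Treat both '/' and '\' as separators, whatever the host platform.
    const std::string::size_type sep = embeddedName.find_last_of("/\\");
    const std::string base = (sep == std::string::npos) ? embeddedName : embeddedName.substr(sep + 1);
    if (base == "." || base == "..") {
        return std::string();
    }
    return base;
}
```

A caller would join the result with the chosen output directory and refuse to write when the result is empty.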
|
|
|
|
Also remove a couple of unreferenced functions.
|
|
parsehex is quite a hot codepath when loading documents.
Various experiments showed that this lets the compiler generate slightly
faster code.
|
|
While profiling document loading, a lot was hitting this function; try
to let the compiler be smarter.
|
|
|
|
|
|
|
|
|
|
|
|
KDE bug #479732
|
|
|
|
|
|
There was one 'bytecounter increase' case
that was not imported from the Xpdf code.
That caused some JPEG streams to fail to
render when hitting that codepath, like
the file 'p1.blank_with_poppler.pdf'
posted on issue #1319
Fixes #1319
|
|
|
|
Requires the users to think about whether their stream is compressed
and whether compressing it is a good idea, rather than defaulting to 'not
compressed'
|
|
|
|
|
|
|
|
actualText has an internal pointer to the TextPage it's writing to, so
if you called takeText and then continued to output more pages to the
TextOutputDev, their text would be written to the page you'd taken
rather than the new one.
|
|
|
|
layout text #2
Happens only if the first character we're asking to draw can't be drawn
with the given font and we need to find a new one and the given
available space is negative (as said, this function must always lay out at
least one character)
|
|
layout text
Happens only if the first character we're asking to draw can't be drawn
with the given font and we need to find a new one
|