Age  Commit message  Author  Files  Lines
2012-09-24Fix for infinite-loop testSøren Sandmann Pedersen1-1/+1
The infinite loop detected by "affine-test 212944861" is caused by an overflow in this expression:

    max_x = pixman_fixed_to_int (vx + (width - 1) * unit_x) + 1;

where (width - 1) * unit_x doesn't fit in a signed int. This causes max_x to be too small so that this:

    src_width = 0
    while (src_width < REPEAT_NORMAL_MIN_WIDTH && src_width <= max_x)
        src_width += src_image->bits.width;

results in src_width being 0. Later on when src_width is used for repeat calculations, we get the infinite loop.

By casting unit_x to int64_t, the expression no longer overflows and affine-test 212944861 and infinite-loop no longer loop forever.
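The overflow described above can be reproduced with a small sketch. It assumes a 16.16 fixed-point type similar to pixman_fixed_t; the names fixed_16_16, max_x_narrow() and max_x_wide() are hypothetical and not taken from pixman. The 32-bit wraparound is emulated with unsigned arithmetic so that the sketch itself avoids undefined behaviour, and it relies on the usual two's complement conversion and arithmetic right shift.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed_16_16;   /* assumption: 16.16 fixed point */

/* Integer part of a 16.16 value (arithmetic shift assumed for negatives). */
int fixed_to_int (int64_t f)
{
    return (int) (f >> 16);
}

/* Product computed in 32 bits: wraps when (width - 1) * unit_x is too large. */
int max_x_narrow (fixed_16_16 vx, int width, fixed_16_16 unit_x)
{
    uint32_t wrapped;

    assert (width >= 1);
    wrapped = (uint32_t) vx + (uint32_t) (width - 1) * (uint32_t) unit_x;
    return fixed_to_int ((int32_t) wrapped) + 1;
}

/* Product computed in 64 bits, as in the fix: no wraparound. */
int max_x_wide (fixed_16_16 vx, int width, fixed_16_16 unit_x)
{
    int64_t end;

    assert (width >= 1);
    end = vx + (int64_t) (width - 1) * (int64_t) unit_x;
    return fixed_to_int (end) + 1;
}
```

With width = 1001 and unit_x = 0x300000 (48.0 in 16.16), the narrow version yields a negative max_x, so a loop guarded by src_width <= max_x never runs and src_width stays 0; the wide version yields 48001.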
2012-09-24test: Add infinite-loop testSøren Sandmann Pedersen2-0/+40
This test demonstrates a bug where a certain transformation matrix can result in an infinite loop. It was extracted as a standalone version of "affine-test 212944861". If given the option -nf, the test program will not call fail_after() and therefore potentially run forever.
2012-09-24affine-test: Print out the transformation matrix when verboseSøren Sandmann Pedersen1-4/+12
Printing out the translation and scale is a bit misleading because the actual transformation matrix can be modified in various other ways. Instead simply print the whole transformation matrix that is actually used.
2012-09-24MIPS: DSPr2: Added OVER combiner and two new fast paths:Nemanja Lukic2-0/+151
- over_8888_8888
- over_8888_8888_8888

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
over_8888_8888 = L1: 19.61 L2: 17.10 M: 11.16 ( 59.20%) HT: 16.47 VT: 15.81 R: 14.82 RT: 8.90 ( 50Kops/s)
over_8888_8888_8888 = L1: 13.56 L2: 11.22 M: 7.46 ( 79.18%) HT: 6.24 VT: 6.20 R: 6.11 RT: 3.95 ( 29Kops/s)

Optimized:
over_8888_8888 = L1: 46.42 L2: 36.70 M: 16.69 ( 88.57%) HT: 17.11 VT: 16.55 R: 15.31 RT: 9.48 ( 52Kops/s)
over_8888_8888_8888 = L1: 26.06 L2: 22.53 M: 11.49 (121.91%) HT: 9.93 VT: 9.62 R: 9.19 RT: 5.75 ( 36Kops/s)
2012-09-24MIPS: DSPr2: Added fast-paths for OVER operation:Nemanja Lukic2-0/+127
- over_0565_n_0565
- over_0565_8_0565

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
over_0565_n_0565 = L1: 7.56 L2: 7.24 M: 6.16 ( 16.38%) HT: 4.01 VT: 3.84 R: 3.79 RT: 1.66 ( 18Kops/s)
over_0565_8_0565 = L1: 7.43 L2: 7.05 M: 5.98 ( 23.85%) HT: 5.27 VT: 5.23 R: 5.09 RT: 3.14 ( 28Kops/s)

Optimized:
over_0565_n_0565 = L1: 15.47 L2: 14.52 M: 12.30 ( 32.65%) HT: 10.76 VT: 10.57 R: 10.27 RT: 6.63 ( 46Kops/s)
over_0565_8_0565 = L1: 15.47 L2: 14.61 M: 11.78 ( 46.92%) HT: 10.00 VT: 9.84 R: 9.40 RT: 5.81 ( 43Kops/s)
2012-09-24MIPS: DSPr2: Added fast-paths for OVER operation:Nemanja Lukic2-0/+123
- over_8888_n_0565
- over_8888_8_0565

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
over_8888_n_0565 = L1: 8.95 L2: 8.33 M: 6.95 ( 27.74%) HT: 4.27 VT: 4.07 R: 4.01 RT: 1.74 ( 19Kops/s)
over_8888_8_0565 = L1: 8.86 L2: 8.11 M: 6.72 ( 35.71%) HT: 5.68 VT: 5.62 R: 5.47 RT: 3.35 ( 30Kops/s)

Optimized:
over_8888_n_0565 = L1: 18.76 L2: 17.55 M: 13.11 ( 52.19%) HT: 11.35 VT: 11.10 R: 10.88 RT: 6.94 ( 47Kops/s)
over_8888_8_0565 = L1: 18.14 L2: 16.79 M: 12.10 ( 64.25%) HT: 10.24 VT: 9.98 R: 9.63 RT: 5.89 ( 43Kops/s)
2012-09-24MIPS: DSPr2: Added fast-paths for OVER operation:Nemanja Lukic3-0/+198
- over_8888_n_8888
- over_8888_8_8888

Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench results

Referent (before):
over_8888_n_8888 = L1: 9.92 L2: 11.27 M: 8.50 ( 45.23%) HT: 4.70 VT: 4.45 R: 4.49 RT: 1.85 ( 20Kops/s)
over_8888_8_8888 = L1: 12.54 L2: 10.86 M: 8.18 ( 54.36%) HT: 6.53 VT: 6.45 R: 6.41 RT: 3.83 ( 33Kops/s)

Optimized:
over_8888_n_8888 = L1: 28.02 L2: 24.92 M: 14.72 ( 78.15%) HT: 13.03 VT: 12.65 R: 12.00 RT: 7.49 ( 49Kops/s)
over_8888_8_8888 = L1: 26.92 L2: 23.93 M: 13.65 ( 90.58%) HT: 11.68 VT: 11.29 R: 10.56 RT: 6.37 ( 45Kops/s)
2012-09-22pixman-combine.c.template: Formatting clean-upsSøren Sandmann Pedersen1-31/+7
Various formatting fixes, and removal of some obsolete comments about strength reduction of operators.
2012-09-22Fix bugs in pixman-image.cSøren Sandmann Pedersen1-2/+2
In the checks for whether the transforms are rotation matrices "-1" and "1" were used instead of the correct -pixman_fixed_1 and pixman_fixed_1. Fixes test suite failure for rotate-test.
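The difference between the two comparisons can be illustrated with a minimal sketch. It assumes 16.16 fixed point, where 1.0 is stored as 65536; the names mat2_t, FIXED_1, is_quarter_turn() and is_quarter_turn_buggy() are hypothetical and only stand in for the kind of check the commit describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t fixed_16_16;
#define FIXED_1 ((fixed_16_16) 1 << 16)   /* 1.0 in 16.16 fixed point */

typedef struct { fixed_16_16 m[2][2]; } mat2_t;   /* hypothetical 2x2 matrix */

/* Correct check: entries are compared against fixed-point 1.0 and -1.0. */
int is_quarter_turn (const mat2_t *t)
{
    assert (t != NULL);
    return t->m[0][0] == 0        && t->m[0][1] == -FIXED_1 &&
           t->m[1][0] == FIXED_1  && t->m[1][1] == 0;
}

/* Buggy check: the plain integers 1 and -1 are 1/65536 in 16.16,
 * so a real rotation matrix never matches. */
int is_quarter_turn_buggy (const mat2_t *t)
{
    assert (t != NULL);
    return t->m[0][0] == 0 && t->m[0][1] == -1 &&
           t->m[1][0] == 1 && t->m[1][1] == 0;
}
```

In this sketch the buggy variant rejects a genuine quarter-turn matrix and instead accepts a near-zero matrix, which is the sort of misclassification that would make an optimized path fire for the wrong transforms.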
2012-09-22Add rotate-test.c test programSøren Sandmann Pedersen2-0/+112
This program exercises a bug in pixman-image.c where "-1" and "1" were used instead of the correct "- pixman_fixed_1" and "pixman_fixed_1".

With the fast implementation enabled:

    % ./rotate-test
    rotate test failed! (checksum=35A01AAB, expected 03A24D51)

Without it:

    % env PIXMAN_DISABLE=fast ./rotate-test
    pixman: Disabled fast implementation
    rotate test passed (checksum=03A24D51)

V2: The first version didn't have lcg_srand (testnum) in test_transform().
2012-09-22Fix bugs in component alpha combiners for separable PDF operatorsSøren Sandmann Pedersen2-3/+3
In general, the component alpha version of an operator is supposed to do this:

- multiply source with mask in all channels
- multiply mask with source alpha in all channels
- compute the regular operator in all channels using the mask value whenever source alpha is called for

The first two steps are usually accomplished with the function combine_mask_ca(), but for operators where source alpha is not used, such as SRC, ADD and OUT, the simpler function combine_mask_value_ca(), which doesn't compute the new mask values, can be used.

However, the PDF blend modes generally *do* make use of source alpha, so they can't use combine_mask_value_ca() as they do now. They have to use combine_mask_ca().

This patch fixes this in combine_multiply_ca() and the CA combiners generated by PDF_SEPARABLE_BLEND_MODE.
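The first two steps listed above can be sketched for a single pixel as follows. This is an illustration under assumptions, not pixman's code: pixel_t, mul_un8(), mask_value_ca() and mask_ca() are hypothetical names, and the 8-bit multiply uses a simple rounded division by 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t a, r, g, b; } pixel_t;   /* hypothetical layout */

/* Multiply two 8-bit channel values, treating 255 as 1.0 (rounded). */
uint8_t mul_un8 (uint8_t x, uint8_t y)
{
    return (uint8_t) ((x * y + 127) / 255);
}

/* Step 1 only: src = src * mask per channel; the mask is left untouched.
 * Enough for operators that never read source alpha. */
void mask_value_ca (pixel_t *src, const pixel_t *mask)
{
    assert (src != NULL && mask != NULL);
    src->a = mul_un8 (src->a, mask->a);
    src->r = mul_un8 (src->r, mask->r);
    src->g = mul_un8 (src->g, mask->g);
    src->b = mul_un8 (src->b, mask->b);
}

/* Steps 1 and 2: additionally mask = mask * (original source alpha),
 * so the mask can stand in wherever source alpha is called for. */
void mask_ca (pixel_t *src, pixel_t *mask)
{
    uint8_t sa;

    assert (src != NULL && mask != NULL);
    sa = src->a;                       /* alpha before step 1 changes it */
    mask_value_ca (src, mask);
    mask->a = mul_un8 (mask->a, sa);
    mask->r = mul_un8 (mask->r, sa);
    mask->g = mul_un8 (mask->g, sa);
    mask->b = mul_un8 (mask->b, sa);
}
```

An operator that reads source alpha after mask_value_ca() would see an unscaled mask, which is the kind of error the commit describes for the PDF blend modes.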
2012-09-22Fix bug in fast_composite_scaled_nearest()Søren Sandmann Pedersen1-1/+1
The fast_composite_scaled_nearest() function can be called when the format is x8b8g8r8. In that case pixels fetched in fetch_nearest() need to have their alpha channel set to 0xff. Fixes test suite failure in scaling-test. Reviewed-by: Matt Turner <mattst88@gmail.com>
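As a rough illustration of the fix, a fetched pixel from a format without an alpha channel can be made opaque by forcing the top byte to 0xff. The helper name fetch_fixup() and its has_alpha parameter are hypothetical; the actual change in fetch_nearest() may be expressed differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: formats such as x8b8g8r8 carry no alpha, so the
 * unused top byte must read as fully opaque before compositing. */
uint32_t fetch_fixup (uint32_t pixel, int has_alpha)
{
    return has_alpha ? pixel : (pixel | 0xff000000u);
}

/* Small self-check showing the intended behaviour. */
void fetch_fixup_check (void)
{
    assert (fetch_fixup (0x00112233u, 0) == 0xff112233u);
    assert (fetch_fixup (0x80112233u, 1) == 0x80112233u);
}
```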
2012-09-22Add PIXMAN_x8b8g8r8 and PIXMAN_a8b8g8r8 formats to scaling-testSøren Sandmann Pedersen1-9/+31
Update the CRC values based on what the general implementation reports. This reveals a bug in the fast implementation:

    % env PIXMAN_DISABLE="mmx sse2" ./test/scaling-test
    pixman: Disabled mmx implementation
    pixman: Disabled sse2 implementation
    scaling test failed! (checksum=AA722B06, expected 03A23E0C)

vs.

    % env PIXMAN_DISABLE="mmx sse2 fast" ./test/scaling-test
    pixman: Disabled fast implementation
    pixman: Disabled mmx implementation
    pixman: Disabled sse2 implementation
    scaling test passed (checksum=03A23E0C)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2012-09-19implementation: Rename delegate to fallbackSøren Sandmann Pedersen2-12/+12
At this point the chain of implementations has nothing to do with the delegation design pattern anymore, so rename the delegate pointer to 'fallback'.
2012-09-19_pixman_implementation_create(): Initialize implementation with memset()Søren Sandmann Pedersen1-24/+11
All the function pointers are NULL by default now, so we can just zero the struct. Also write the function a little more compactly.
2012-09-19Rename _pixman_lookup_composite_function() to _pixman_implementation_lookup_composite()Søren Sandmann Pedersen6-128/+128
And move it into pixman-implementation.c which is where it belongs logically.
2012-09-19Move delegation of src/dest iter init into pixman-implementation.cSøren Sandmann Pedersen6-37/+49
Instead of relying on each implementation to delegate when an iterator can't be initialized, change the type of iterator initializers to boolean and make pixman-implementation.c do the delegation whenever an iterator initializer returns FALSE.
2012-09-19Move fill delegation into pixman-implementation.cSøren Sandmann Pedersen7-164/+71
As in the blt commit, do the delegation in pixman-implementation.c whenever the implementation fill returns FALSE instead of relying on each implementation to do it by itself. With this change there is no longer any reason for the implementations to have one fill function that delegates and one that actually fills, so consolidate those in the NEON, DSPr2, SSE2, and MMX implementations.
2012-09-19Move blt delegation into pixman-implementation.cSøren Sandmann Pedersen6-225/+81
Rather than require each individual implementation to do the delegation for blt, just do it in pixman-implementation.c whenever the implementation blt returns FALSE. With this change, there is no longer any reason for the implementations to have one blt function that delegates and one that actually blits, so consolidate those in the NEON, DSPr2, SSE2, and MMX implementations.
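The central delegation described here can be sketched as a walk down a chain of implementations. The struct layout, the calls counter and the two example blt functions are hypothetical; the link is called fallback here, following the later rename in this log.

```c
#include <assert.h>
#include <stddef.h>

typedef struct implementation implementation_t;
typedef int (*blt_func_t) (implementation_t *imp, int width, int height);

struct implementation
{
    implementation_t *fallback;   /* next, more general implementation */
    blt_func_t        blt;        /* may be NULL */
    int               calls;      /* successful blts, for illustration */
};

/* Try each implementation in turn until one reports success. */
int implementation_blt (implementation_t *imp, int width, int height)
{
    assert (width >= 0 && height >= 0);

    while (imp)
    {
        if (imp->blt && imp->blt (imp, width, height))
            return 1;
        imp = imp->fallback;
    }
    return 0;
}

/* Hypothetical fast path: only handles widths that are a multiple of 4. */
int fast_blt (implementation_t *imp, int width, int height)
{
    (void) height;
    if (width % 4 != 0)
        return 0;                 /* decline; the caller moves on */
    imp->calls++;
    return 1;
}

/* Hypothetical general path: handles everything. */
int general_blt (implementation_t *imp, int width, int height)
{
    (void) width;
    (void) height;
    imp->calls++;
    return 1;
}
```

With this shape, an individual implementation only needs a single blt function that returns 0 for cases it cannot handle; it no longer needs a second wrapper whose only job is to delegate.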
2012-09-19implementation: Write lookup_combiner() in a less convoluted way.Søren Sandmann Pedersen1-12/+23
Instead of initializing an array on the stack, just use a simple switch to select which set of combiners to look up in.
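A switch-based lookup of this kind might look like the following sketch. The table names, the shortened operator list and the combiners_t struct are assumptions made for illustration and do not reproduce pixman's actual fields.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*combine_func_t) (void);

enum { OP_SRC, OP_OVER, N_OPS };   /* hypothetical, shortened operator list */

typedef struct
{
    combine_func_t combine_narrow[N_OPS];
    combine_func_t combine_narrow_ca[N_OPS];
    combine_func_t combine_wide[N_OPS];
    combine_func_t combine_wide_ca[N_OPS];
} combiners_t;

/* Placeholder combiners so the tables have something to point at. */
void over_narrow (void)  { }
void over_wide_ca (void) { }

/* Pick the table with a switch instead of building an array of
 * table pointers on the stack for every lookup. */
combine_func_t lookup_combiner (combiners_t *c, int op,
                                int component_alpha, int wide)
{
    combine_func_t *table = NULL;

    assert (c != NULL && op >= 0 && op < N_OPS);

    switch (((wide != 0) << 1) | (component_alpha != 0))
    {
    case 0: table = c->combine_narrow;    break;
    case 1: table = c->combine_narrow_ca; break;
    case 2: table = c->combine_wide;      break;
    case 3: table = c->combine_wide_ca;   break;
    }

    return table[op];
}
```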
2012-09-15build: Remove useless DEP_CFLAGS/DEP_LIBS variablesMatt Turner3-22/+7
2012-09-15build: Improve win32 build systemAndrea Canciani4-11/+19
Handle cross-directory dependencies using PHONY targets and clean up some redundancies.
2012-09-15mmx: Fix x86 build on MSVCAndrea Canciani1-12/+13
The MSVC compiler is very strict about variable declarations after statements. Move all the declarations of each block before any statement in the same block to fix multiple instances of:

    pixman-mmx.c(xxxx) : error C2275: '__m64' : illegal use of this type as an expression
2012-08-29test/utils.c: Use pow(), not powf() in sRGB conversion routinesSøren Sandmann Pedersen1-2/+2
These functions are operating on double precision values, so use pow() instead of powf().
2012-08-26pixel_checker: Move sRGB conversion into get_limits()Søren Sandmann Pedersen1-15/+13
The sRGB conversion has to be done every time the limits are being computed. Without this fix, pixel_checker_get_min/max() will produce the wrong results when called from somewhere other than pixel_checker_check().
2012-08-25Remove obsolete TODO fileSøren Sandmann Pedersen1-271/+0
2012-08-19Remove pointless declaration of _pixman_image_get_scanline_generic_64()Søren Sandmann Pedersen1-4/+0
This declaration used to be necessary when _pixman_image_get_scanline_generic_64() referred to a structure that itself referred back to _pixman_image_get_scanline_generic_64().
2012-08-09demos: Add srgb_trap_test.cSøren Sandmann Pedersen2-0/+121
This demo program composites a bunch of trapezoids side by side with and without gamma aware compositing.
2012-08-09Make show_image() cope with more formatsSøren Sandmann Pedersen4-26/+37
This makes show_image() deal with more formats than just a8r8g8b8, in particular, a8r8g8b8_sRGB can now be handled. Images that are passed to show_image with a format of a8r8g8b8_sRGB are displayed without modification under the assumption that the monitor is approximately sRGB. Images with a format of a8r8g8b8 are also displayed without modification since many other users of show_image() have been generating essentially sRGB data with this format. Other formats are also assumed to be gamma compressed; these are converted to a8r8g8b8 before being displayed. With these changes, srgb-test.c doesn't need to do its own conversion anymore.
2012-08-09Define TIMER_BEGIN and TIMER_END even when timers are not enabledSøren Sandmann Pedersen1-0/+5
This allows code that uses these macros to build when timers are disabled.
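The pattern is the usual one of giving the macros empty definitions when the feature is compiled out. A minimal sketch, assuming a hypothetical ENABLE_TIMERS switch and a clock()-based timer (pixman's real macros and configuration symbol may differ):

```c
#include <assert.h>
#include <time.h>

long total_ticks = 0;   /* accumulated time, only updated when timers are on */

#ifdef ENABLE_TIMERS    /* hypothetical configuration symbol */
#define TIMER_BEGIN(name) { clock_t timer_##name = clock ();
#define TIMER_END(name)   total_ticks += (long) (clock () - timer_##name); }
#else
/* Empty definitions: instrumented code still compiles unchanged. */
#define TIMER_BEGIN(name)
#define TIMER_END(name)
#endif

/* Example of instrumented code that builds either way. */
int sum_to (int n)
{
    int i, sum = 0;

    assert (n >= 0);

    TIMER_BEGIN (sum_to);
    for (i = 1; i <= n; i++)
        sum += i;
    TIMER_END (sum_to);

    return sum;
}
```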
2012-08-01Post-release version bump to 0.27.3Søren Sandmann Pedersen1-1/+1
2012-08-01Pre-release version bump to 0.27.2 (tag: pixman-0.27.2)Søren Sandmann Pedersen1-1/+1
2012-08-01Use angle brackets form of including config.hSebastian Bauer1-1/+1
2012-08-01Added HAVE_CONFIG_H check before including config.hSebastian Bauer1-1/+4
2012-07-31glyph-test: Avoid setting solid images as alpha maps.Søren Sandmann Pedersen1-2/+2
glyph-test would sometimes set a solid image as an alpha map, which is not allowed. When this happened and the debug spew was enabled, messages like this one would be generated:

    *** BUG ***
    In pixman_image_set_alpha_map: The expression !alpha_map || alpha_map->type == BITS was false
    Set a breakpoint on '_pixman_log_error' to debug

Fix this by not passing the ALLOW_SOLID flag to create_image() when the resulting image is to be used as an alpha map.
2012-07-31stress-test: Avoid overflows in clip rectanglesSøren Sandmann Pedersen1-0/+5
The rectangles in the clip region set in set_general_properties() would sometimes overflow, which would lead to messages like these:

    *** BUG ***
    In pixman_region32_union_rect: Invalid rectangle passed
    Set a breakpoint on '_pixman_log_error' to debug

when the micro version number of pixman is even.

Fix this by detecting the overflow and clamping such that the x2/y2 coordinates are less than INT32_MAX.
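One way to detect and clamp such an overflow is to do the addition in 64 bits. The sketch below is an assumption about the general technique, not the code in stress-test; clamp_extent() is a hypothetical helper that works on one axis at a time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: shrink extent so that x1 + extent (the x2 or y2
 * coordinate) stays strictly below INT32_MAX. */
uint32_t clamp_extent (int32_t x1, uint32_t extent)
{
    const int64_t limit = (int64_t) INT32_MAX - 1;   /* largest allowed x2 */

    assert (x1 < INT32_MAX);   /* otherwise no valid extent exists */

    /* The sum is formed in 64 bits, so it cannot overflow. */
    if ((int64_t) x1 + (int64_t) extent > limit)
        extent = (uint32_t) (limit - x1);

    return extent;
}
```

For example, a rectangle starting 10 pixels short of INT32_MAX with a width of 100 would be cut down to a width of 9.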
2012-07-31Add make-srgb.pl to EXTRA_DISTSøren Sandmann Pedersen1-0/+1
Otherwise make distcheck doesn't pass.
2012-07-30Add tests to validate new sRGB behaviorAntti S. Lankila4-6/+104
Composite checks random combinations of operations that now also have sRGB sources, masks and destinations, and stress-test validates the read/write primitives.
2012-07-30Add sRGB blending demo programAntti S. Lankila3-1/+100
Simple sRGB color blender test can be used to determine if the sRGB processing works as expected. It blends alpha ramps of purple and green together such that at midpoint of image, 50 % blend of both is realized. At that point, sRGB-aware processing yields a result close to #bbb rather than #888, which is the linear light blending result. The demo also contains the sample computation for sRGB premultiplied alpha.
2012-07-30Add support for sRGB surfacesAntti S. Lankila8-3/+272
sRGB format is defined as a new format type, PIXMAN_TYPE_ARGB_SRGB. One form of this type is provided, PIXMAN_a8r8g8b8_sRGB. Use of an sRGB format triggers wide processing, and the pixel fetch/store functions handle the relevant conversion between color spaces. Pixman itself is thought to compose in the linearized sRGB color space. sRGB conversion is tabularized. For sRGB to linear, we are using only 256 values because the current source format uses 8 bits per component precision. For linear to sRGB, it turns out that only 4096 brightness levels are required to generate all of the 256 sRGB color values, and therefore only 12 bits per component are considered during store. As a special case, a no-op sRGB->linear->sRGB conversion is constructed to be lossless by adjusting the sRGB->linear conversion table where necessary.
2012-07-29Remove unnecessary dst initializationAntti S. Lankila1-9/+0
The initialization work is already performed correctly in image_init().
2012-06-20Make pixman-mmx.c compile on x86-32 without optimizationSøren Sandmann Pedersen1-2/+11
When not optimizing, write _mm_shuffle_pi16() as a statement expression with inline assembly. That way we avoid __builtin_ia32_pshufw(), which is only available when compiling with -msse, while still allowing the non-optimizing gcc to understand that the second argument is a compile time constant. Tested-by: Knut Petersen <knut_petersen@t-online.de>
2012-06-20Cleanups and simplifications in x86 CPU feature detectionSøren Sandmann Pedersen1-191/+146
A new function pixman_cpuid() is added that runs the cpuid instruction and returns the results. On GCC this function uses inline assembly; on MSVC, the function calls the __cpuid intrinsic.

There is also a new function called have_cpuid() which detects whether cpuid is available. On x86-64 and MSVC, it simply returns TRUE; on 32-bit x86, it checks whether the 22nd bit of eflags can be modified. On MSVC this does have the consequence that pixman will no longer work on CPUs without cpuid (i.e., older than 486 and some 486 models).

These two functions together make it possible to write a generic detect_cpu_features() in plain C. This function is then used in a new have_feature() function that checks whether a specific set of feature bits is available.

Aside from the cleanups and simplifications, the main benefit from this patch is that pixman now can do feature detection on x86-64, so that newer instruction sets such as SSSE3 and SSE4.1 can be used. (And apparently the assumption that x86-64 CPUs always have MMX and SSE2 is no longer correct: Knight's Corner is x86-64, but doesn't have them).

V2: Rename the constants in the getisax() code, as pointed out by Alan Coopersmith. Also reinstate the result variable and initialize features to 0.

V3: Fixes for the fact that the upper 32 bits of a 64 bit register are zeroed whenever the corresponding 32 bit register is written to.

V4: Fixes for the fact that in 32 bit mode, when gcc is not optimizing there were not enough registers available. The new code uses the "a", "b", "c", and "d" constraints instead, and has two separate versions for 32 and 64 bit modes.
2012-07-08Changed the style of two function headersSebastian Bauer1-7/+7
Declare functions *_inverse() and *_contains_rectangle() in the same way as the other functions are declared. This doesn't imply any semantic changes. It's just a unification of coding styles.
2012-07-08MIPS: DSPr2: Added more bilinear fast paths (without mask)Nemanja Lukic4-0/+466
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -b

Referent (before):
src_8888_8888 = L1: 8.18 L2: 7.79 M: 6.32 ( 33.51%) HT: 5.78 VT: 5.70 R: 5.61 RT: 3.79 ( 29Kops/s)
src_8888_0565 = L1: 6.90 L2: 7.14 M: 6.47 ( 25.75%) HT: 5.54 VT: 5.51 R: 5.46 RT: 3.53 ( 28Kops/s)
src_0565_x888 = L1: 3.76 L2: 3.71 M: 3.37 ( 13.41%) HT: 3.26 VT: 3.22 R: 3.20 RT: 2.58 ( 23Kops/s)
src_0565_0565 = L1: 3.59 L2: 3.56 M: 3.47 ( 9.19%) HT: 3.19 VT: 3.18 R: 3.16 RT: 2.46 ( 22Kops/s)
over_8888_8888 = L1: 5.99 L2: 5.66 M: 4.95 ( 26.28%) HT: 4.40 VT: 4.38 R: 4.31 RT: 3.02 ( 26Kops/s)
add_8888_8888 = L1: 6.84 L2: 6.39 M: 5.48 ( 29.09%) HT: 4.80 VT: 4.79 R: 4.70 RT: 3.20 ( 27Kops/s)

Optimized:
src_8888_8888 = L1: 18.27 L2: 16.69 M: 12.87 ( 68.25%) HT: 11.80 VT: 11.61 R: 10.60 RT: 7.05 ( 41Kops/s)
src_8888_0565 = L1: 15.18 L2: 14.10 M: 11.75 ( 46.71%) HT: 10.64 VT: 10.50 R: 10.03 RT: 7.15 ( 41Kops/s)
src_0565_x888 = L1: 10.45 L2: 9.96 M: 9.23 ( 36.72%) HT: 8.39 VT: 8.29 R: 8.02 RT: 5.75 ( 37Kops/s)
src_0565_0565 = L1: 9.37 L2: 8.98 M: 8.50 ( 22.53%) HT: 7.71 VT: 7.66 R: 7.52 RT: 5.59 ( 37Kops/s)
over_8888_8888 = L1: 12.21 L2: 11.01 M: 8.56 ( 45.36%) HT: 7.71 VT: 7.64 R: 7.43 RT: 5.51 ( 36Kops/s)
add_8888_8888 = L1: 17.72 L2: 15.16 M: 10.78 ( 57.13%) HT: 9.46 VT: 9.30 R: 9.00 RT: 6.03 ( 38Kops/s)
2012-07-08MIPS: DSPr2: Added several bilinear fast paths with a8 maskNemanja Lukic3-0/+372
Performance numbers before/after on MIPS-74kc @ 1GHz:

lowlevel-blt-bench -b

Referent (before):
src_8888_8_8888 = L1: 6.37 L2: 6.08 M: 5.46 ( 32.57%) HT: 4.64 VT: 4.61 R: 4.52 RT: 2.85 ( 23Kops/s)
src_8888_8_0565 = L1: 5.89 L2: 5.66 M: 5.11 ( 23.71%) HT: 4.36 VT: 4.34 R: 4.26 RT: 2.71 ( 22Kops/s)
src_0565_8_x888 = L1: 3.32 L2: 3.27 M: 3.17 ( 14.71%) HT: 2.86 VT: 2.84 R: 2.81 RT: 2.07 ( 19Kops/s)
src_0565_8_0565 = L1: 3.19 L2: 3.15 M: 3.05 ( 10.11%) HT: 2.75 VT: 2.74 R: 2.71 RT: 2.00 ( 18Kops/s)
over_8888_8_8888 = L1: 4.99 L2: 4.71 M: 4.11 ( 27.22%) HT: 3.59 VT: 3.58 R: 3.50 RT: 2.36 ( 21Kops/s)
add_8888_8_8888 = L1: 5.60 L2: 5.26 M: 4.52 ( 29.95%) HT: 3.92 VT: 3.89 R: 3.80 RT: 2.49 ( 21Kops/s)

Optimized:
src_8888_8_8888 = L1: 13.19 L2: 12.13 M: 9.75 ( 58.22%) HT: 8.60 VT: 8.44 R: 7.90 RT: 5.06 ( 33Kops/s)
src_8888_8_0565 = L1: 11.64 L2: 10.81 M: 9.18 ( 42.63%) HT: 8.04 VT: 7.90 R: 7.57 RT: 5.02 ( 32Kops/s)
src_0565_8_x888 = L1: 8.34 L2: 7.95 M: 7.29 ( 33.85%) HT: 6.55 VT: 6.48 R: 6.25 RT: 4.35 ( 30Kops/s)
src_0565_8_0565 = L1: 7.71 L2: 7.35 M: 6.90 ( 22.90%) HT: 6.14 VT: 6.10 R: 5.94 RT: 4.07 ( 29Kops/s)
over_8888_8_8888 = L1: 9.73 L2: 8.99 M: 7.15 ( 47.41%) HT: 6.40 VT: 6.30 R: 6.11 RT: 4.28 ( 30Kops/s)
add_8888_8_8888 = L1: 13.01 L2: 11.72 M: 8.70 ( 57.68%) HT: 7.59 VT: 7.46 R: 7.20 RT: 4.74 ( 32Kops/s)
2012-07-07Simplify CPU detection on PPC.Søren Sandmann Pedersen1-75/+38
Get rid of the initialized and have_vmx static variables in pixman-ppc.c. There is no point to them since CPU detection only happens once per process. On Linux, just read /proc/self/auxv instead of generating the filename with getpid() and don't bother with the stack buffer. Instead just read the aux entries one by one.
2012-07-07Simplifications to ARM CPU detectionSøren Sandmann Pedersen1-157/+87
Organize pixman-arm.c such that each operating system/compiler exports a detect_cpu_features() function that returns a bitmask with the various features that we are interested in. A new function have_feature() then calls this function, caches the result, and returns whether the given feature is available. The result is that all the pixman_have_arm_<feature> functions become redundant and can be deleted.
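The detect-once, query-many arrangement can be sketched as below. The feature constants, the stubbed detect_cpu_features() and the detect_calls counter are hypothetical; a real detector would query the operating system or the CPU, and this simple caching is not thread-safe.

```c
#include <assert.h>

typedef enum
{
    FEATURE_V6     = (1 << 0),   /* hypothetical feature bits */
    FEATURE_VFP    = (1 << 1),
    FEATURE_NEON   = (1 << 2),
    FEATURE_IWMMXT = (1 << 3)
} cpu_features_t;

int detect_calls = 0;   /* counts detector runs, for illustration only */

/* Stub standing in for the per-OS/compiler detection routine. */
unsigned detect_cpu_features (void)
{
    detect_calls++;
    return FEATURE_V6 | FEATURE_NEON;
}

/* Run detection on first use, cache the bitmask, and report whether
 * every requested feature bit is present. */
int have_feature (unsigned feature)
{
    static int      initialized = 0;
    static unsigned features    = 0;

    if (!initialized)
    {
        features = detect_cpu_features ();
        initialized = 1;
    }

    return (features & feature) == feature;
}
```

Because the check compares against the full requested mask, a caller can ask for several features at once, for example have_feature (FEATURE_V6 | FEATURE_NEON).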
2012-07-07Simplify MIPS CPU detectionSøren Sandmann Pedersen1-35/+9
There is no reason to have pixman_have_<feature> functions when all they do is call pixman_have_mips_feature(). Instead rename pixman_have_mips_feature() to have_feature() and call it directly from _pixman_mips_get_implementations(). Also on non-Linux, just make have_feature() return FALSE.
2012-07-07Move the remaining bits of pixman-cpu into pixman-implementation.cSøren Sandmann Pedersen3-80/+51