Age | Commit message | Author | Files | Lines
2010-10-20Extend gradient-crash-testgradient-crashSøren Sandmann Pedersen2-43/+87
Test the gradients with various transformations, and test cases where the gradients are specified with two identical points.
2010-10-20Add enable_fp_exceptions() function in utils.[ch]Søren Sandmann Pedersen3-0/+37
This function enables floating point traps if possible.
2010-10-20test: Make composite test use some existing macros instead of defining its ownSøren Sandmann Pedersen2-30/+28
Also move the ARRAY_LENGTH macro into utils.h so it can be used elsewhere.
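A minimal sketch of how such a macro is commonly written and used. The definition below is the usual sizeof idiom and may differ in detail from the one in utils.h; `sample_values` and `sum_sample_values()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Number of elements in a true array. This only works on arrays,
 * not on pointers (including array parameters, which decay to pointers). */
#define ARRAY_LENGTH(A) (sizeof (A) / sizeof ((A)[0]))

/* Hypothetical table of test values. */
static const int sample_values[] = { 1, 2, 3, 5, 8 };

/* Sum the table without hard-coding its length. */
static int
sum_sample_values (void)
{
    int total = 0;
    size_t i;

    for (i = 0; i < ARRAY_LENGTH (sample_values); ++i)
        total += sample_values[i];

    return total;
}
```

Because the length is derived from the array itself, adding an entry to the table needs no other change.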
2010-10-12Add comments about errorsAndrea Canciani1-0/+42
Explain how errors are introduced in the computation performed for radial gradients.
2010-10-12Draw radial gradients with PDF semanticsAndrea Canciani2-198/+223
Change radial gradient computations and definition to reflect the radial gradients in PDF specifications (see section 8.7.4.5.4, Type 3 (Radial) Shadings of the PDF Reference Manual). Instead of having a valid interpolation parameter value for every point of the plane, define it only for points within the area covered by the family of circles generated by interpolating or extrapolating the start and end circles. Points outside this area are now transparent black (rgba 0 0 0 0). Points within this area have the color associated with the maximum value of the interpolation parameter at that point (if multiple solutions exist within the range specified by the extend mode).
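The definition above can be written out as a quadratic in the interpolation parameter t. This is a sketch of the standard derivation, not a transcription of the code. The symbols c_1, r_1 (start circle), c_2, r_2 (end circle), p (the point being shaded), and the shorthands c_d, p_d, d_r are notation chosen here. The constraint r(t) >= 0 is an assumption taken from the PDF semantics rather than from the commit message.

```latex
% Family of circles generated by the start and end circles
c(t) = c_1 + t\,(c_2 - c_1), \qquad r(t) = r_1 + t\,(r_2 - r_1)

% A point p has parameter t if it lies on circle t
\lVert p - c(t) \rVert^2 = r(t)^2, \qquad r(t) \ge 0

% With c_d = c_2 - c_1,\; p_d = p - c_1,\; d_r = r_2 - r_1:
\lVert p_d - t\,c_d \rVert^2 = (r_1 + t\,d_r)^2

% Expanding and collecting powers of t
\underbrace{(c_d \cdot c_d - d_r^2)}_{A}\, t^2
  \;-\; 2\,\underbrace{(p_d \cdot c_d + r_1 d_r)}_{B}\, t
  \;+\; \underbrace{(p_d \cdot p_d - r_1^2)}_{C} \;=\; 0

% Solutions for A \neq 0
t = \frac{B \pm \sqrt{B^2 - A\,C}}{A}
```

If B^2 - AC < 0, no circle of the family passes through p, which corresponds to the transparent black case. Otherwise the larger admissible root (within the range allowed by the extend mode) selects the color.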
2010-10-11Plug leak in the alphamap test.Søren Sandmann Pedersen1-1/+15
The images are being created with non-NULL data, so we have to free it ourselves. This is important because the Cygwin tinderbox is running out of memory and produces this:
mmap failed on 20000 1507328
mmap failed on 40000 1507328
mmap failed on 20000 1507328
mmap failed on 40000 1507328
mmap failed on 40000 1507328
mmap failed on 40000 1507328
http://tinderbox.x.org/builds/2010-10-05-0014/logs/pixman/#check
2010-10-11Add no-op combiners for DST and the CA versions of the HSL operators.Søren Sandmann Pedersen1-10/+21
We already exit early for DST, but for the HSL operators with component alpha, we crash at the moment. Fix that by adding a dummy combine_dst() function.
2010-10-11test: Add some more colors to the color table in composite.cSøren Sandmann Pedersen1-1/+3
Specifically, add transparent black and superluminescent white with alpha = 0.
2010-10-11test: Parallize composite.c with OpenMPSøren Sandmann Pedersen1-4/+6
Each test uses the test number as the random number seed; if it didn't, all the threads would run the same tests since they would all start from the same seed.
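A minimal sketch of the seeding idea, assuming a small linear congruential generator whose state is local to each test. `lcg_next()`, `run_one_test()`, and `run_all_tests()` are hypothetical names; the real test uses its own generator and test body.

```c
#include <stdint.h>

/* Hypothetical generator: all state lives in *state, so threads
 * never share or race on a global seed. */
static uint32_t
lcg_next (uint32_t *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fff;
}

/* Stand-in for one randomized test: the test number is the seed,
 * so the result depends only on which test this is. */
static uint32_t
run_one_test (int test_number)
{
    uint32_t state = (uint32_t) test_number;
    uint32_t checksum = 0;
    int i;

    for (i = 0; i < 8; ++i)
        checksum = checksum * 31u + lcg_next (&state);

    return checksum;
}

/* Run all tests, in parallel when compiled with OpenMP support. */
static void
run_all_tests (uint32_t *results, int n_tests)
{
    int i;

#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < n_tests; ++i)
        results[i] = run_one_test (i);
}
```

Since each iteration derives its seed from `i`, the outcome of test `i` is the same regardless of thread count or scheduling, which also makes a failing test number reproducible on its own.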
2010-10-11test: Change composite so that it tests randomly generated imagesSøren Sandmann Pedersen2-98/+111
Previously this test would try to exhaustively test all combinations of formats and operators, which meant that it would take hours to run. Instead, generate images randomly and test compositing those. Cc: chris@chris-wilson.co.uk
2010-10-11test: Fix eval_diff() so that it provides useful error values.Søren Sandmann Pedersen1-31/+15
Previously, this function would evaluate the error under the assumption that the format was 565 or wider. This patch changes it to take the actual format into account. With that fixed, we can turn on testing for the rest of the formats. Cc: chris@chris-wilson.co.uk
2010-10-11test: Fix bug in color_correct() in composite.cSøren Sandmann Pedersen1-15/+25
This function was using the number of bits in a channel as if it were a mask, which led to many spurious errors. With that fixed, we can turn on testing for all formats where all channels have 5 or more bits. Cc: chris@chris-wilson.co.uk
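To illustrate the distinction between a bit count and a mask, here is a hedged sketch. `channel_mask()` and `quantize_channel()` are hypothetical helpers, not the actual color_correct() code. They show that an n-bit channel has a maximum value of 2^n - 1, which is what should be used for scaling.

```c
#include <stdint.h>

/* Largest value representable in n_bits bits (0 <= n_bits <= 32).
 * For a 5-bit channel this is 31, not 5. */
static uint32_t
channel_mask (int n_bits)
{
    return (n_bits >= 32) ? 0xffffffffu : ((1u << n_bits) - 1u);
}

/* Round v in [0, 1] to the nearest value an n_bits-wide channel can hold. */
static double
quantize_channel (double v, int n_bits)
{
    uint32_t mask = channel_mask (n_bits);
    uint32_t q;

    if (mask == 0)
        return 0.0; /* a zero-width channel carries no information */

    q = (uint32_t) (v * mask + 0.5);
    return (double) q / (double) mask;
}
```

Scaling by the bit count instead of the mask would map 1.0 in a 5-bit channel to 5/31 rather than 31/31, which is the kind of mismatch that shows up as a spurious test failure.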
2010-10-11Remove broken optimizations in combine_disjoint_over_u()Søren Sandmann Pedersen2-10/+6
The first broken optimization is that it checks "a != 0x00" where it should check "s != 0x00". The other is that it skips the computation when alpha is 0xff. That is wrong because in the formula min (1, (1 - Aa)/Ab) the render specification states that if Ab is 0, the quotient is defined to be positive infinity. That is the case even if (1 - Aa) is 0.
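The corner case can be sketched in 8-bit fixed point, where 0xff stands for 1.0. `disjoint_factor()` is a hypothetical helper written for this illustration and is not the code in combine_disjoint_over_u(). It only evaluates min (1, (1 - Aa)/Ab) with the Ab = 0 rule applied first.

```c
#include <stdint.h>

/* Evaluate min (1, (1 - aa) / ab) with 0xff representing 1.0. */
static uint8_t
disjoint_factor (uint8_t aa, uint8_t ab)
{
    uint8_t one_minus_aa = (uint8_t) (0xff - aa);

    /* The quotient is defined as +infinity when ab is 0, even if
     * (1 - aa) is also 0, so the min () is 1. */
    if (ab == 0)
        return 0xff;

    /* Quotient >= 1: clamp to 1. */
    if (one_minus_aa >= ab)
        return 0xff;

    /* Quotient < 1: scale back into the 0..0xff range. */
    return (uint8_t) ((one_minus_aa * 0xff) / ab);
}
```

Note that `disjoint_factor (0xff, 0)` is 0xff, not 0. An early exit that treats full alpha as "nothing to compute" gets this case wrong.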
2010-10-11ARM: restore fallback to ARMv6 implementation from NEON in the delegate chainSiarhei Siamashka1-2/+6
After fast path cache introduction, the overhead of having this fallback is insignificant. On the other hand, some of the ARM assembly optimizations (for example nearest neighbor scaling) do not need NEON.
2010-10-11Use more unrolling for scaled src_0565_0565 with nearest filterSiarhei Siamashka1-3/+48
Benchmark from Intel Core i7 860:
== before ==
op=1, src_fmt=10020565, dst_fmt=10020565, speed=1335.29 MPix/s
== after ==
op=1, src_fmt=10020565, dst_fmt=10020565, speed=1550.96 MPix/s
== performance of nonscaled src_0565_0565 operation as a reference ==
op=1, src_fmt=10020565, dst_fmt=10020565, speed=2401.31 MPix/s
Benchmark from ARM Cortex-A8:
== before ==
op=1, src_fmt=10020565, dst_fmt=10020565, speed=81.79 MPix/s
== after ==
op=1, src_fmt=10020565, dst_fmt=10020565, speed=89.55 MPix/s
== performance of nonscaled src_0565_0565 operation as a reference ==
op=1, src_fmt=10020565, dst_fmt=10020565, speed=197.44 MPix/s
2010-10-04ARM: added 'neon_composite_out_reverse_8_0565' fast pathSiarhei Siamashka2-0/+53
== before ==
outrev_8_0565 = L1: 22.91 L2: 22.40 M: 18.75 ( 10.47%) HT: 12.62 VT: 12.22 R: 11.32 RT: 5.30 ( 58Kops/s)
== after ==
outrev_8_0565 = L1: 176.27 L2: 151.70 M:108.79 ( 60.81%) HT: 50.43 VT: 37.16 R: 32.26 RT: 9.62 ( 97Kops/s)
2010-10-04ARM: added 'neon_composite_add_0565_8_0565' fast pathSiarhei Siamashka2-0/+56
== before ==
add_0565_8_0565 = L1: 14.05 L2: 14.03 M: 11.57 ( 12.94%) HT: 8.31 VT: 8.10 R: 7.47 RT: 3.64 ( 42Kops/s)
== after ==
add_0565_8_0565 = L1: 123.36 L2: 94.70 M: 74.36 ( 83.15%) HT: 31.17 VT: 23.97 R: 21.06 RT: 6.42 ( 70Kops/s)
2010-10-04ARM: NEON: added forgotten cache preload for over_n_8888/over_n_0565Siarhei Siamashka1-0/+2
Prefetch provides up to 40-50% better performance when working with large images and/or when having lots of L2 cache misses on ARM Cortex-A8 @ 720MHz:
== before ==
over_n_8888 = L1: 225.83 L2: 181.02 M: 55.57 ( 41.41%) HT: 38.96 VT: 36.92 R: 32.84 RT: 14.15 ( 123Kops/s)
over_n_0565 = L1: 153.91 L2: 149.69 M: 83.17 ( 30.95%) HT: 50.41 VT: 49.15 R: 40.56 RT: 15.45 ( 131Kops/s)
== after ==
over_n_8888 = L1: 222.39 L2: 170.95 M: 76.86 ( 57.27%) HT: 58.80 VT: 53.03 R: 45.51 RT: 14.13 ( 124Kops/s)
over_n_0565 = L1: 151.87 L2: 149.54 M:125.63 ( 46.80%) HT: 67.85 VT: 57.54 R: 50.21 RT: 15.32 ( 130Kops/s)
2010-10-04Fix "syntax error: empty declaration" warnings.Mika Yrjola3-3/+7
These minor changes should fix a large number of macro-declaration-related "syntax error: empty declaration" warnings that appear when compiling the code with the Solaris Studio compiler.
2010-10-04Delete simple repeat codeSøren Sandmann Pedersen3-136/+21
This was supposedly an optimization, but it has pathological cases where it definitely isn't. For example a 1 x n image will cause it to have terrible memory access patterns and to generate a ton of modulus operations. Since no one has ever measured whether it actually is an improvement, and since it is doing the repeating at the wrong stage in the pipeline, and since with the previous commit it can't be triggered anymore because we now require SAMPLES_COVER_CLIP for regular fast paths, just delete it.
2010-10-04Fix bug in FAST_PATH_STD_FAST_PATHSøren Sandmann Pedersen3-40/+30
The standard fast paths deal with two kinds of images: solids and bits. These two image types require different flags, but PIXMAN_STD_FAST_PATH uses the same ones for both. This patch makes it so that solid images just get the standard flags, while bits images must be untransformed and contain the destination clip within the sample grid. This means that the old FAST_PATH_COVERS_CLIP flag is no longer used, so it can be deleted.
2010-09-29Some clean-ups in fence_malloc() and fence_free()Dmitri Vorobiev1-21/+6
This patch removes an unnecessary typecast of MAP_FAILED, replaces an erroneous free() with the correct munmap() in the error path for a failing mprotect(), and, finally, removes redundant calls to mprotect(), because munmap() doesn't require any specific memory protection.
2010-09-28Fix search-and-replace issue in lowlevel-blt-bench.cSøren Sandmann Pedersen1-1/+1
2010-09-28Rename all the fast paths with _8000 in their names to _8Søren Sandmann Pedersen8-68/+68
This inconsistent naming somehow survived the refactoring from a while back.
2010-09-27Remove cache prefetch code.Liu Xinyun1-659/+0
Cache prefetching decreases performance, especially on Atom, so remove this code. The experiment follows.
old: 0.19.5-with-cache-prefetch
new: 0.19.5-without-cache-prefetch
CPU: Intel Atom N270@1.6GHz
OS: MeeGo (32 bits)
Speedups
========
image-rgba poppler-0 17125.68 (17279.58 0.92%) -> 14765.36 (15926.49 3.54%): 1.16x speedup
image-rgba ocitysmap-0 9008.25 (9040.41 7.50%) -> 8277.94 (8343.09 5.44%): 1.09x speedup
image-rgba xfce4-terminal-a1-0 18020.76 (18230.68 0.97%) -> 16703.77 (16712.42 1.22%): 1.08x speedup
image-rgba gnome-terminal-vim-0 25081.38 (25133.38 0.24%) -> 23407.47 (23652.98 0.54%): 1.07x speedup
image-rgba firefox-talos-gfx-0 57916.97 (57973.20 0.11%) -> 54556.64 (54624.55 0.39%): 1.06x speedup
image-rgba firefox-planet-gnome-0 102377.47 (103496.63 0.70%) -> 96816.65 (97075.54 0.15%): 1.06x speedup
image-rgba swfdec-giant-steps-0 12376.24 (12616.84 1.02%) -> 11705.30 (11825.20 1.06%): 1.06x speedup
CPU: Intel Core(TM)2 Duo CPU T9600@2.80GHz
OS: Ubuntu 10.04 (64bits)
Speedups
========
image-rgba ocitysmap-0 2671.46 (2691.82 8.55%) -> 2296.20 (2307.26 5.77%): 1.16x speedup
image-rgba swfdec-giant-steps-0 1614.55 (1615.18 1.68%) -> 1532.84 (1538.52 0.72%): 1.05x speedup
Signed-off-by: Liu Xinyun <xinyun.liu@intel.com>
Signed-off-by: Chen Miaobo <miaobo.chen@intel.com>
2010-09-23Use <sys/mman.h> macros only when they are availableDmitri Vorobiev1-1/+1
Not all systems are regular Unices, so let's be careful with the mmap()-related stuff, which might be unavailable. This patch makes sure that mmap() and friends are used only when the <sys/mman.h> header is found.
2010-09-21Revert "add enable-cache-prefetch option"Søren Sandmann Pedersen1-0/+659
Revert this accidentally committed patch. This reverts commit 19ea0e16b958e5abe491365c203293ab372f3586.
2010-09-21If MAP_ANONYMOUS is not defined, define it to MAP_ANON.Søren Sandmann Pedersen1-0/+5
This hopefully fixes the build failure on OS X.
2010-09-21add enable-cache-prefetch optionLiu Xinyun1-659/+0
OK. here is the work to clear all cache prefetch. Please review it. 3x
On Tue, Sep 21, 2010 at 11:36:30PM +0800, Soeren Sandmann wrote:
> Liu Xinyun <xinyun.liu@intel.com> writes:
>
> > This patch is to add a new configuration option: enable-cache-prefetch,
> > which is default yes.
> >
> > Here is a link which talks on cache issue.
> > http://lists.freedesktop.org/archives/pixman/2010-June/000218.html
> >
> > When disable it on Atom CPU(configured with --enable-cache-prefetch=no),
> > it will have a little performance gain. Here is the patch.
>
> I think the cache prefetch code should just be deleted outright. No
> benchmarks that I'm aware of show it to be an improvement.
>
>
> Thanks,
> Soren
>From bca2192ef524bcae4eea84d0ffed9e8c4855675f Mon Sep 17 00:00:00 2001
From: Liu Xinyun <xinyun.liu@intel.com>
Date: Wed, 22 Sep 2010 00:11:56 +0800
Subject: [PATCH] remove cache prefetch
2010-09-21Post-release version bump to 0.19.5Søren Sandmann Pedersen1-1/+1
2010-09-21Pre-release version bump to 0.19.4Søren Sandmann Pedersen1-1/+1
2010-09-21compute_composite_region32: Zero extents before returning FALSE.Søren Sandmann Pedersen1-0/+4
If the extents of the composite region are broken such that x2 <= x1 or y2 <= y1, then we need to zero the extents before returning so that the region won't be completely broken when calling pixman_region32_fini().
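The idea can be sketched with a stand-in box type. `box_t` and `clip_extents()` are hypothetical and much simpler than compute_composite_region32(). They only show the "zero before returning failure" step.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a region's extents box. */
typedef struct
{
    int x1, y1, x2, y2;
} box_t;

/* Intersect extents with clip. If the result is empty or inverted,
 * leave a well-formed empty box behind so later cleanup sees sane data. */
static bool
clip_extents (box_t *extents, const box_t *clip)
{
    if (extents->x1 < clip->x1) extents->x1 = clip->x1;
    if (extents->y1 < clip->y1) extents->y1 = clip->y1;
    if (extents->x2 > clip->x2) extents->x2 = clip->x2;
    if (extents->y2 > clip->y2) extents->y2 = clip->y2;

    if (extents->x2 <= extents->x1 || extents->y2 <= extents->y1)
    {
        /* Zero the extents rather than returning with x2 <= x1. */
        extents->x1 = 0;
        extents->y1 = 0;
        extents->x2 = 0;
        extents->y2 = 0;
        return false;
    }

    return true;
}
```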
2010-09-21Add a lowlevel blitter benchmarkJonathan Morton2-2/+721
This test is a modified version of Siarhei's compositor throughput benchmark. It's expanded with explicit reporting of memory bandwidth consumption for the M-test, and with an additional 8x8-random test intended to determine peak ops/sec capability. There are also quite a lot more operations tested for.
2010-09-21Add noinline macroDmitri Vorobiev1-0/+5
This patch adds a noinline macro, which expands to compiler-dependent keywords that tell the compiler to never inline a function.
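A sketch of what such a macro typically looks like. The exact set of compilers handled in the patch may differ, and `add_one()` is a hypothetical function used only to show the macro in a declaration.

```c
/* Expand to the compiler-specific "never inline" keyword, or to
 * nothing when the compiler is not recognized. */
#if defined(__GNUC__)
#  define noinline __attribute__((noinline))
#elif defined(_MSC_VER)
#  define noinline __declspec(noinline)
#else
#  define noinline
#endif

/* Hypothetical function that must stay a real call, for example so
 * that a benchmark measures the call rather than an inlined body. */
static noinline int
add_one (int x)
{
    return x + 1;
}
```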
2010-09-21Add gettime() routine to test utilsDmitri Vorobiev3-1/+31
Impending benchmark code will need a function to get the current time in seconds, and this patch introduces such a routine. We try to use the POSIX gettimeofday() function when available, and fall back to clock() when not.
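A hedged sketch of the fallback described above. `USE_GETTIMEOFDAY` is a hypothetical macro standing in for whatever configure-time check the build actually uses, and the platform test that sets it is an assumption made so the example is self-contained.

```c
#include <time.h>

/* Assumption: treat Unix-like systems as having gettimeofday(). */
#if defined(__unix__) || defined(__APPLE__)
#  include <sys/time.h>
#  define USE_GETTIMEOFDAY 1
#endif

/* Return a timestamp in seconds as a double. */
static double
gettime (void)
{
#ifdef USE_GETTIMEOFDAY
    struct timeval tv;

    gettimeofday (&tv, NULL);
    return (double) tv.tv_sec + (double) tv.tv_usec / 1000000.0;
#else
    /* Fallback: processor time, which is coarser and measures CPU
     * time rather than wall-clock time. */
    return (double) clock () / (double) CLOCKS_PER_SEC;
#endif
}
```

A benchmark would call `gettime()` before and after the measured loop and use the difference. Only differences between two calls are meaningful across both branches.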
2010-09-21Move aligned_malloc() to utilsDmitri Vorobiev3-15/+19
The aligned_malloc() routine will be used in more than one test utility. At least, a low-level blitter benchmark needs it. Therefore, let's make this function a part of common test utilities code.
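One way such a helper can be written, assuming posix_memalign() is available. The argument order and the absence of fallbacks here are assumptions for illustration and may not match the routine in the test utilities.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes aligned to align, which must be a power of two
 * and a multiple of sizeof (void *). Release the result with free(). */
static void *
aligned_malloc (size_t align, size_t size)
{
    void *result = NULL;

    if (posix_memalign (&result, align, size) != 0)
        return NULL;

    return result;
}
```

A blitter benchmark can use this to keep buffers aligned to a cache-line-sized boundary, so that timing differences are not an artifact of where malloc() happened to place them.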
2010-09-21Enable bits_image_fetch_bilinear_affine_normal_r5g6b5Søren Sandmann Pedersen1-4/+0
2010-09-21Enable bits_image_fetch_bilinear_affine_reflect_r5g6b5Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_none_r5g6b5Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_pad_r5g6b5Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_normal_a8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_reflect_a8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_none_a8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_pad_a8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_normal_x8r8g8b8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_reflect_x8r8g8b8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_none_x8r8g8b8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_pad_x8r8g8b8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_normal_a8r8g8b8Søren Sandmann Pedersen1-2/+2
2010-09-21Enable bits_image_fetch_bilinear_affine_reflect_a8r8g8b8Søren Sandmann Pedersen1-2/+2