path: root/uxa
Age | Commit message | Author | Files | Lines
2010-05-28 | uxa: Fix prepare_solid being called without check_solid first. | Eric Anholt | 1 | -0/+5
Fixes GPU hang on gen6.
2010-05-28 | uxa: Skip the redundant miComputeCompositeRects() when adding to the mask | Chris Wilson | 1 | -33/+20
As we are in full control of the destination (the temporary glyph mask) and the source (the glyph cache) we know that there are no clip regions on either and so can skip computing the composite rectangles. (We trust the device clipping to prevent compositing outside the target.) x11perf on PineView: 701/686 -> 881/856 kglyphs/s [aa/rgb] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-28 | uxa: Make the glyph caches' fixed size explicit. | Chris Wilson | 2 | -9/+6
Until we actually resize the glyph cache dynamically, make it obvious to the reader and the compiler that the size is fixed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-28 | uxa: Use a glyph private rather than a hash table. | Chris Wilson | 3 | -196/+141
Store the cache position directly on the glyph using a devPrivate rather than through an auxiliary hash table. x11perf on PineView: 650/638 kglyphs/s -> 701/686 kglyphs/s [aa/rgb] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
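The idea of keeping the cache position on the glyph itself can be sketched roughly as follows. The types and function names here are hypothetical stand-ins; the real driver goes through the X server's devPrivates machinery, which is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the X server's types. */
struct cache_pos {
    int x, y;          /* location of the glyph inside the cache pixmap */
};

struct glyph {
    unsigned id;
    void *priv;        /* per-glyph private slot, NULL when not cached */
};

/* Record where the glyph lives in the cache. */
static void glyph_set_cache_pos(struct glyph *g, struct cache_pos *pos)
{
    assert(g != NULL);
    g->priv = pos;
}

/* Lookup is a single pointer read: no hashing, no probing. */
static struct cache_pos *glyph_get_cache_pos(const struct glyph *g)
{
    assert(g != NULL);
    return g->priv;
}
```

Compared with an auxiliary hash table, the lookup cost no longer depends on table occupancy, which is one plausible reading of the x11perf gain quoted above.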
2010-05-26 | uxa: Perform manual damage for CompositeRects | Chris Wilson | 1 | -0/+5
[xserver-1.8] The damage layer doesn't wrap CompositeRects, so we need to manually append the damaged region ourselves. This works for miCompositeRects since that translates the call into multiple invocations of either PolyFillRectangle or Composite, which themselves cause damage. Fixes: Bug 28120 - Tint2's tooltip borders end up at 0,0 and do not disappear https://bugs.freedesktop.org/show_bug.cgi?id=28120 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-26 | uxa: Force the alpha value to 0xffff when treating Over as Src | Chris Wilson | 1 | -1/+3
Since we have at most 8 bits of alpha, we treat >= 0xff00 as opaque. However, being paranoid we should set the alpha value to 0xffff in case something unexpected happens when converting from the xRenderColor to the pixel value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
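A minimal sketch of that reduction, assuming a 16-bit-per-channel colour struct. `struct color16`, `enum op` and `reduce_over()` are illustrative names, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a 16-bit-per-channel render colour. */
struct color16 {
    uint16_t red, green, blue, alpha;
};

enum op { OP_SRC, OP_OVER };

/* With at most 8 bits of alpha in the pixel, anything >= 0xff00 counts
 * as fully opaque, so Over behaves like Src. Pin alpha to 0xffff so a
 * later colour-to-pixel conversion cannot round it down. */
static enum op reduce_over(enum op op, struct color16 *c)
{
    assert(c != NULL);
    if (op == OP_OVER && c->alpha >= 0xff00) {
        c->alpha = 0xffff;
        return OP_SRC;
    }
    return op;
}
```

A colour with alpha 0xff80 would come back as Src with alpha 0xffff, while alpha 0x8000 is left untouched and stays Over.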
2010-05-26 | uxa: Use Composite rather than solid blitter for PolyRect | Chris Wilson | 1 | -22/+105
Due to the relocation overhead, using a single composite with many rectangles outperforms many solid blits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-26 | uxa: Add PICT format mapping for depth 4 pixmaps. | Chris Wilson | 1 | -0/+1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-25 | uxa: Apply the drawable offset to the solid rects | Chris Wilson | 1 | -6/+9
Fixes: Bug 28120 - Tint2's tooltip borders end up at 0,0 and do not disappear https://bugs.freedesktop.org/show_bug.cgi?id=28120 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 | uxa: Use temporary dest when target is too large for compositor | Chris Wilson | 4 | -102/+364
If the destination cannot fit into the 3D pipeline when we need to composite, we fall back to doing the operation on the CPU. This is very slow, and quite easy to trigger on i915 by plugging in an external display. An alternative is to extract the extents of the operation from the destination using the blitter, which can usually handle much larger operations. This gives us a temporary target that can fit into the 3D pipeline and thus be accelerated, before copying back into the larger real destination. For x11perf this boosts glyph rendering on PineView from 38kglyphs/s to 480kglyphs/s, just a little shy of the native performance of 601kglyphs/s. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
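The extract, composite, copy-back sequence described above might look roughly like this on plain memory buffers. Everything here is an assumption for illustration: `MAX_3D_DIM`, the one-byte-per-pixel `struct surface`, and the `blit()` and `composite_add()` stand-ins are not the driver's code, and the extents are assumed to lie inside the destination.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limit standing in for the 3D pipeline's maximum size. */
#define MAX_3D_DIM 4

struct surface {
    int width, height;
    unsigned char *pixels;   /* one byte per pixel, row-major */
};

struct box { int x, y, w, h; };

/* Stand-in for the blitter: copy a w x h block between surfaces. */
static void blit(const struct surface *src, int sx, int sy,
                 struct surface *dst, int dx, int dy, int w, int h)
{
    for (int row = 0; row < h; row++)
        memcpy(dst->pixels + (size_t)(dy + row) * dst->width + dx,
               src->pixels + (size_t)(sy + row) * src->width + sx,
               (size_t)w);
}

/* Stand-in for an accelerated composite: saturating add of a constant. */
static void composite_add(struct surface *dst, unsigned char v)
{
    assert(dst->width <= MAX_3D_DIM && dst->height <= MAX_3D_DIM);
    for (int i = 0; i < dst->width * dst->height; i++) {
        int s = dst->pixels[i] + v;
        dst->pixels[i] = (unsigned char)(s > 255 ? 255 : s);
    }
}

/* Composite into `extents` of a destination that may be too large:
 * extract the extents, composite on the small copy, copy back.
 * Returns 0 on success, -1 if even the extents do not fit. */
static int composite_via_temp(struct surface *dst, struct box extents,
                              unsigned char v)
{
    if (extents.w > MAX_3D_DIM || extents.h > MAX_3D_DIM)
        return -1;

    struct surface tmp = { extents.w, extents.h, NULL };
    tmp.pixels = malloc((size_t)extents.w * extents.h);
    if (tmp.pixels == NULL)
        return -1;

    blit(dst, extents.x, extents.y, &tmp, 0, 0, extents.w, extents.h);
    composite_add(&tmp, v);
    blit(&tmp, 0, 0, dst, extents.x, extents.y, extents.w, extents.h);

    free(tmp.pixels);
    return 0;
}
```

The two extra copies are the price of staying on the accelerated path; the figures in the commit message suggest that trade is worthwhile compared with a CPU fallback.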
2010-05-24 | uxa: Composite glyphs directly onto dst when possible. | Chris Wilson | 3 | -158/+324
Compositing directly onto the destination, without using a mask, takes us from 580 kglyphs/s to 850 kglyphs/s on i945 [x11perf -aa10text]. However, the extra intersection check almost entirely cancels out the speedup, and we discover that the glyphs in x11perf are always overlapping. Nothing is ever easy. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-24 | uxa: translate the region in line for composites | Chris Wilson | 1 | -19/+14
When compositing, we need to convert the box into a rect and so the advantages of using REGION_TRANSLATE are lost. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-23 | uxa: Spans! OMG! | Chris Wilson | 3 | -27/+187
Use composite rather than solid blits in order to bring performance on a par with the CPU when using GEM and relocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-16 | uxa: Replace solid planemask [0xffffffff] with FB_ALLONES | Chris Wilson | 1 | -4/+4
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-16 | uxa: Tidy uxa_solid_rects() | Chris Wilson | 1 | -5/+5
Move the operator reduction after a few fallbacks, closer to its use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-16 | uxa: Patterns are acquired at 0,0 | Chris Wilson | 1 | -4/+4
Set the correct offset for the gradient patterns after rendering to a local Picture. Fixes cairo/test/huge-radial and friends. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Force an alpha channel when rendering source fallbacks | Chris Wilson | 1 | -0/+6
As the source may not cover the extents, we need to represent those areas as transparent in the fallback picture, ergo we need an alpha channel. We could be smarter and force a format conversion when necessary, and we could let the backend choose the most appropriate format. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Apply clip for solid rectangles. | Chris Wilson | 1 | -41/+118
References: Bug 28120 - Tint2's tooltip borders end up at 0,0 and do not disappear https://bugs.freedesktop.org/show_bug.cgi?id=28120 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Avoid using blits when with PictFilterConvolution | Chris Wilson | 1 | -2/+6
References: Bug 28098 Compiz renders shadows wrong, garbage line of pixels along left and top edge of windows https://bugs.freedesktop.org/show_bug.cgi?id=28098 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Check the w-scaling component is 1 for an translation matrix | Chris Wilson | 1 | -1/+2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Fix order of conditionals to only run fill_region for SRC or opaque | Chris Wilson | 1 | -64/+64
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Expand the range of compatible formats to cover all bpp. | Chris Wilson | 1 | -7/+9
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Only use 1x1R as a solid with an opaque format or SRC | Chris Wilson | 1 | -1/+2
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-15 | uxa: Call check_solid before running the solid blitter. | Chris Wilson | 1 | -4/+8
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-14 | uxa: Disable compatible src xrgb and dst argb | Chris Wilson | 1 | -1/+3
I'm seeing garbage alpha for rendercheck blend: x8r8g8b8a 10x10 SRC ar8g8b8a so disable blitting until I work out if we can fast-path it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-14 | uxa: Parse BGRA pixel formats. | Chris Wilson | 1 | -8/+45
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-14 | Split the prepare blitter functions into check + prepare. | Chris Wilson | 3 | -9/+36
Allow us to check whether we can handle the operation using the blitter prior to doing any work. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
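The check + prepare split can be sketched as below. The function names echo the commit subjects, but the signatures, the acceptance conditions and the `prepare_calls` counter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the (notionally expensive) setup actually ran. */
static int prepare_calls;

/* Cheap, side-effect-free test: can the blitter take this fill at all?
 * The conditions are illustrative assumptions, not the driver's rules. */
static bool check_solid(int bpp, unsigned long planemask)
{
    if (bpp != 8 && bpp != 16 && bpp != 32)
        return false;
    return (planemask & 0xffffffffUL) == 0xffffffffUL;
}

/* State setup; callers must have passed check_solid() first. */
static bool prepare_solid(int bpp, unsigned long planemask,
                          unsigned long color)
{
    assert(check_solid(bpp, planemask));
    (void)color;
    prepare_calls++;
    return true;
}

/* Reject unsupported fills before doing any work. */
static bool try_solid_fill(int bpp, unsigned long planemask,
                           unsigned long color)
{
    if (!check_solid(bpp, planemask))
        return false;
    return prepare_solid(bpp, planemask, color);
}
```

The assertion inside `prepare_solid()` documents the ordering contract that the later "Call check_solid before running the solid blitter" and "Fix prepare_solid being called without check_solid first" commits enforce at the call sites.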
2010-05-14 | uxa: enable solid rects for backends that require pixmaps | Chris Wilson | 1 | -12/+19
Convert the color into a (cached) pixmap if the backend cannot handle the SolidFill natively. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-13 | uxa: Convert 1x1R back to solid_fill | Chris Wilson | 1 | -76/+75
In the change to prevent blitting between incompatible sources, we also prevented 1x1R pixmaps from being used for solid fills. Reorder the sequence of conditions to enable this fast path again.
2010-05-13 | uxa: Only use solid_fill for SRC. | Chris Wilson | 1 | -6/+10
2010-05-13 | uxa: Replace source for CLEAR with a transparent solid | Chris Wilson | 1 | -6/+29
This means that we will hit the faster try_solid_fill path instead.
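Since the Render CLEAR operator zeroes the destination whatever the source is, it can be rewritten as SRC with a fully transparent solid. A small sketch of that rewrite, using hypothetical names (`enum op`, `struct color16`, `reduce_clear()`):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum op { OP_CLEAR, OP_SRC, OP_OVER };

/* Hypothetical 16-bit-per-channel solid colour. */
struct color16 {
    uint16_t red, green, blue, alpha;
};

/* CLEAR ignores the source and writes zeros, which is exactly what SRC
 * with transparent black produces. Rewriting it this way lets the
 * operation take the solid-fill path. */
static enum op reduce_clear(enum op op, struct color16 *src)
{
    assert(src != NULL);
    if (op == OP_CLEAR) {
        src->red = src->green = src->blue = src->alpha = 0;
        return OP_SRC;
    }
    return op;
}
```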
2010-05-13 | uxa: Fallback early if compositing with alphaMaps | Chris Wilson | 1 | -12/+9
2010-05-12 | uxa: Avoid glyph ping-pong with !offscreen destination | Chris Wilson | 1 | -0/+119
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-12 | uxa: Avoid ping-pong with !offscreen destination and traps | Chris Wilson | 1 | -18/+41
If we are destined to target an !offscreen drawable, then uploading the trapezoid mask to a bo is the last thing we actually want to do... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-12 | uxa: Fallback when compositing to a !offscreen destination | Chris Wilson | 1 | -0/+3
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-12 | uxa: Use accelerated PutImage for uploading pixman images. | Chris Wilson | 1 | -47/+57
Short-circuits the current use of PutImage from CopyArea, bypassing all the temporary allocations.
2010-05-12 | uxa: solid rects | Chris Wilson | 3 | -0/+81
The cost of performing relocations outweighs the advantages of using the blitter for solids with lots of rectangles.
References: Bug 22127 - [UXA] 50% performance regression for XRenderFillRectangles https://bugs.freedesktop.org/show_bug.cgi?id=22127
By using the 3D pipeline we improve our performance by around 4x on i945, measured by the jxbench microbenchmark, and by a factor of 10x by short-cutting to the 3D pipeline for blended rectangles.
Before, on an i945GME:
  19982.412060 Ops/s; rects (!); 15x15
  9599.131693 Ops/s; rects (!); 75x75
  3803.654743 Ops/s; rects (!); 250x250
  6836.743772 Ops/s; rects blended; 15x15
  1443.750000 Ops/s; rects blended; 75x75
  495.335821 Ops/s; rects blended; 250x250
  23247.933884 Ops/s; rects composition (!); 15x15
  10993.073048 Ops/s; rects composition (!); 75x75
  3595.905172 Ops/s; rects composition (!); 250x250
After:
  87271.145975 Ops/s; rects (!); 15x15
  32347.744361 Ops/s; rects (!); 75x75
  5884.177215 Ops/s; rects (!); 250x250
  73500.000000 Ops/s; rects blended; 15x15
  33580.882353 Ops/s; rects blended; 75x75
  5858.811749 Ops/s; rects blended; 250x250
  25582.317073 Ops/s; rects composition (!); 15x15
  6664.728682 Ops/s; rects composition (!); 75x75
  14965.909091 Ops/s; rects composition (!); 250x250 [suspicious]
This has no impact on Cairo, but I have a suspicion from watching xtrace that Qt likes to blit thousands of 1x1 rectangles with the same colour. However, we are still around 2-3x slower than the reported figures for EXA!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-12 | debug: Add names for operators | Chris Wilson | 1 | -14/+73
Most useful for confirming my worst fears: unwarranted use of OutReverse + Add. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-11 | uxa: Recheck texture after acquiring pattern. | Chris Wilson | 1 | -49/+57
As the first step to handling unsupported texture formats, double check that the converted pattern can be used as a texture by the card. Fixes: rendercheck -t repeat Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 | uxa: Protect against valid SourcePict in uxa_acquire_mask() | Chris Wilson | 1 | -2/+7
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 | uxa,i915: Handle SourcePict through uxa_composite() | Chris Wilson | 1 | -14/+37
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-10 | uxa: Rearrange checking and preparing of composite textures. | Chris Wilson | 2 | -75/+68
x11perf regression caused by 2D driver https://bugs.freedesktop.org/show_bug.cgi?id=28047 caused by commit a7b800513fcc94e063dfd68d2f63b6bab7fae47d uxa: Extract sub-region from in-memory buffers.
The issue is that as we extract the region prior to checking whether the composite can in fact be accelerated, we perform expensive surplus operations. This is particularly noticeable for ComponentAlpha text, such as rgb10text. The solution here is to move check_composite() ahead of acquiring the sources, and to extract the subregion only if the render path cannot actually handle the texture.
Performance (on PineView):
  a7b800513^: aa=68600 glyphs/s, rgb=29900 glyphs/s
  a7b800513: aa=65700 glyphs/s, rgb=13200 glyphs/s
  now: aa=66800 glyphs/s, rgb=28800 glyphs/s
The residual lossage seems to be from the extra function call and dixPrivate lookups. Hmm. More worrying is the extremely low performance; however, the results are consistent so the improvement looks real...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-05-08 | uxa: Transform composites with a simple translation into a blit | Chris Wilson | 1 | -11/+15
We can also convert a composite with an integer translation into a blit, so long as the sample extents remain within the source. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
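The two conditions, an integer translation and sample extents that stay inside the source, can be sketched as below. The 3x3 matrix of 16.16 fixed-point values mirrors how Render transforms are usually represented, but `struct transform` and both function names are hypothetical. The w-component test corresponds to the later "Check the w-scaling component is 1" fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t fixed_t;                 /* 16.16 fixed point */
#define FIXED_1 ((fixed_t)0x10000)

/* Hypothetical 3x3 projective transform, row-major. */
struct transform { fixed_t m[3][3]; };

/* True if `t` is a pure translation by whole pixels; the offsets are
 * written to *tx and *ty. */
static bool is_integer_translation(const struct transform *t,
                                   int *tx, int *ty)
{
    assert(t && tx && ty);

    /* Identity scale/rotation part, and a w component of exactly 1. */
    if (t->m[0][0] != FIXED_1 || t->m[0][1] != 0 ||
        t->m[1][0] != 0 || t->m[1][1] != FIXED_1 ||
        t->m[2][0] != 0 || t->m[2][1] != 0 || t->m[2][2] != FIXED_1)
        return false;

    /* A fractional offset would need filtering, so no blit. */
    if (t->m[0][2] % FIXED_1 != 0 || t->m[1][2] % FIXED_1 != 0)
        return false;

    *tx = t->m[0][2] / FIXED_1;
    *ty = t->m[1][2] / FIXED_1;
    return true;
}

/* True if the translated w x h sample area lies inside the source. */
static bool sample_within_source(int x, int y, int w, int h,
                                 int tx, int ty, int src_w, int src_h)
{
    int sx = x + tx, sy = y + ty;
    return sx >= 0 && sy >= 0 && sx + w <= src_w && sy + h <= src_h;
}
```

Only when both checks pass is it safe to treat the composite as a plain copy; otherwise pixels outside the source would have to be synthesised according to the repeat mode.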
2010-05-08 | uxa: Extract sub-region from in-memory buffers. | Chris Wilson | 1 | -23/+183
If the buffer is too large or not suitable for a GPU operation, we currently fall back and perform the composite on the CPU. An alternative is to extract the small region out of the source (as usually the sample extents are much smaller than the actual surface size) and try the composite with the new surface. The effect is particularly noticeable on pathological websites that use very large background images. For example, http://www.woodtv.com/ uses a 1299x15000 pattern that is obscured by another opaque pattern. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2010-04-14 | Revert "Revert "uxa: Try using put_image when copying from a memory buffer."" | Chris Wilson | 1 | -8/+43
This reverts commit 6d50553e8f70d8f2142efdfd6c90bc27a599d0bc. Now we have taught the fallback path not to infinitely recurse, re-enable the accelerated path for ShmPutImage and friends.
2010-04-12 | Revert "uxa: Try using put_image when copying from a memory buffer." | Eric Anholt | 1 | -43/+8
This reverts commit 27195d7dba0f3ff08b92f3fd916cdf5113cbef58. put_image often calls copy_area, which calls put_image. Exhaustion of the stack follows.
2010-04-12 | Revert "uxa: Add fallback warnings for PutImage." | Chris Wilson | 1 | -23/+3
This reverts commit 299b0338d0811192dc4f8eae5d79453e9882c5d1. A debugging patch, it was never intended to go into master.
2010-04-10 | uxa: Try using put_image when copying from a memory buffer. | Chris Wilson | 1 | -8/+43
Often, for example in the fallback for ShmPutImage, we will attempt to use uxa_copy_area() copying to a normal pixmap from a memory buffer. This triggers a fallback, and maps the destination pixmap back into the GTT. The accelerated put_image path will attempt to stream a blit to the destination pixmap if it is currently active, avoiding the stall.
2010-04-10 | uxa: Add fallback warnings for PutImage. | Chris Wilson | 1 | -3/+23
2010-03-25 | uxa make: remove unused XORG_INCS and DIX_CFLAGS variables | Gaetan Nadon | 1 | -4/+1
Most likely copied from xserver makefile. Acked-by: Dan Nicholson <dbn.lists@gmail.com> Signed-off-by: Gaetan Nadon <memsize@videotron.ca>