~gongzg/glamor

Age	Commit message	Author	Files	Lines
2012-08-10	Bump to version 0.5. (HEAD, v0.5, master)	Zhigang Gong	1	-1/+1
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-08-10	Increase vbo size to 64K verts.	Zhigang Gong	2	-2/+1
	This commit benefits vertex-heavy cases such as aa10text/rgb10text, with a performance gain of about 15%. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Acked-by: Junyan <junyan.he@linux.intel.com>
2012-08-10	Silence compilation warnings.	Zhigang Gong	17	-337/+280
	After the upgrade to gcc 4.7, the compiler reports more warnings; fix them. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com> Tested-by: Junyan He <junyan.he@linux.intel.com>
2012-08-08	glamor_largepixmap: Fixed a bug in repeat clipping.	Zhigang Gong	1	-18/+6
	If the repeat direction has only one block, we need to set dx/dy to cover the whole extent. This commit also silences some compilation warnings. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-08-08	Prefer KHR_surfaceless_context EGL extension over KHR_surfaceless_opengl/gles2.	Michel Dänzer	1	-3/+10
	Current Mesa Git only advertises the former instead of the latter. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-08-08	Print space between name of missing EGL extension and 'required'.	Michel Dänzer	1	-1/+1
	Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-08-07	Fallback to pixman when trapezoid mask is big.	Junyan He	2	-31/+75
	Generating a trapezoid in the shader is relatively slow when the trapezoid area is big, so we fall back when the trapezoid's width and height are big enough. A large number of traps also slows down rendering because of the VBO size, so we fall back if ntrap > 256. Signed-off-by: Junyan He <junyan.he@linux.intel.com> Reviewed-By: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-08-02	glamor_glyphs: When the dst arg points to a NULL buffer, don't flush.	Zhigang Gong	1	-0/+3
	This is a corner case. When we render glyphs via the mask cache and need to upload new glyphs to the cache, we need to flush both the mask and dest buffers. But the dest arg may point to a NULL buffer at that time, so we need to check it first. If the dest buffer is NULL, we don't need to flush either the dest or the mask buffer. This commit fixes a potential crash. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-08-02	glamor_trapezoid: workaround a glsl like problem.	Zhigang Gong	1	-2/+10
	It seems that the following statement cannot run as expected on SNB. bool trap_left_vertical = (abs(trap_left_vertical_f - 1.0) <= 0.0001); We have to rewrite it in another style so that trapezoids with a vertical edge are rendered correctly. Reviewed-by: Junyan He <junyan.he@linux.intel.com> Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-23	Fix the problem of VBO leak.	Junyan He	2	-6/+6
	In some cases we allocate the VBO but have no vertex to emit, which causes the VBO not to be released. Fix it. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-07-23	Just use the shader to generate trapezoid if PolyMode == Imprecise	Junyan He	1	-9/+14
	The precise mode of trapezoid rendering needs to sample the trapezoid at the centre points of a (2n+1)x(2n-1) subpixel grid. This is computationally expensive in a shader, so we use the inside-area ratio instead. The result differs somewhat, so we use it only if polymode == Imprecise. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-07-23	Change the trapezoid render to use VBO.	Junyan He	2	-173/+330
	Because some uniform variables need to be set for every trapezoid, we could not use a VBO to render multiple trapezoids at once, which caused a big performance loss. We now add attributes that carry the same values to bypass the uniform variable problem. The uniform value for one trapezoid is set as an attribute, with the same value on every vertex of that trapezoid, so in the FS it is still a constant. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
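The attribute trick described above can be sketched on the CPU side as follows. This is a minimal illustration, not the glamor code: the `trap_vertex` layout, the `trap_params` struct, and `emit_trap_vertices` are hypothetical names, and only two per-trapezoid constants are carried for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex layout: position plus per-trapezoid constants that
 * would otherwise have been uniforms. */
typedef struct {
    float x, y;         /* vertex position */
    float top, bottom;  /* same value on all four vertices of a trapezoid */
} trap_vertex;

/* Hypothetical per-trapezoid parameters (bounding box only). */
typedef struct {
    float left, right, top, bottom;
} trap_params;

/* Append four corner vertices per trapezoid to `out`, copying the
 * per-trapezoid constants into every vertex. Returns vertices written. */
static size_t
emit_trap_vertices(const trap_params *traps, size_t ntraps,
                   trap_vertex *out, size_t out_capacity)
{
    size_t n = 0;

    for (size_t i = 0; i < ntraps; i++) {
        const trap_params *t = &traps[i];
        const float xs[4] = { t->left, t->right, t->right, t->left };
        const float ys[4] = { t->top, t->top, t->bottom, t->bottom };

        assert(n + 4 <= out_capacity);
        for (int v = 0; v < 4; v++) {
            out[n].x = xs[v];
            out[n].y = ys[v];
            out[n].top = t->top;       /* constant across the trapezoid */
            out[n].bottom = t->bottom; /* constant across the trapezoid */
            n++;
        }
    }
    return n;
}
```

Because every vertex of one trapezoid carries identical values, interpolation across the primitive leaves them unchanged, so many trapezoids can share one vertex buffer and one draw call.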
2012-07-16	Added the missed header file for xorg 1.13 compat.	Zhigang Gong	1	-0/+107
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-16	Synch with xorg 1.13 change.	Zhigang Gong	5	-10/+23
	xorg 1.13 changed the scrn interfaces and removed those global arrays. Some of the API changes break our build. Now fix it. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-13	gles2: Fixed the compilation problem and some bugs.	Zhigang Gong	4	-4/+7
	The previous patch doesn't set the offset to zero for the GLESv2 path. Now fix it. This patch also fixes a minor problem in pixmap uploading preparation: if the revert is not REVERT_NORMAL, we don't need to prepare an fbo for it. As the current mesa i965 gles2 driver doesn't support setting an A8 texture as an fbo target, we must fix this problem. As some A1/A8 pictures need to be uploaded, this is the only place an A8 texture may be attached to an fbo. This patch also enables the shader gradient for GLESv2. We disabled it before because some glsl linkers don't support linking different objects that have cross references. Now we don't have that problem. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-12	Stream vertex data to VBOs.	Michel Dänzer	1	-15/+15
	Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-11	Fix translation of clip region for composite fallback.	Michel Dänzer	1	-2/+2
	Fixes incorrectly clipped rendering. E.g. the cursor in Evolution composer windows became invisible. Signed-off-by: Michel Daenzer <michel.daenzer@amd.com> Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-11	glamor_glyphs: Don't merge extents for different lists.	Zhigang Gong	1	-39/+71
	If we merge all lists' extents together, some overlap checks may fail wrongly. Here is a simple example: A E B F C D The first list has the vertical "ABCD", and the second list has two chars, "EF". When checking E, we correctly find that it doesn't overlap with the previous glyphs. But after that, the original code merges the previous extent with E's extent, so the extent covers "F", and when checking F it is treated as overlapped. We can solve this simply by not merging extents from different lists. We union the different lists' extents into a global region and then do the intersection check between that region and the current glyph's extent, which avoids that false result. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
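The false overlap described above can be reproduced with plain bounding boxes. The sketch below is illustrative only: the `box` type and helper names are hypothetical, and the "global region" is modelled as a simple array of separate boxes rather than a real X region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open box: [x1, x2) x [y1, y2). Hypothetical stand-in for an extent. */
typedef struct { int x1, y1, x2, y2; } box;

static bool box_intersects(box a, box b)
{
    return a.x1 < b.x2 && b.x1 < a.x2 && a.y1 < b.y2 && b.y1 < a.y2;
}

/* Merging two extents into one bounding box also covers the empty space
 * between them, which is what produces the false overlap. */
static box box_union(box a, box b)
{
    box r = {
        a.x1 < b.x1 ? a.x1 : b.x1,
        a.y1 < b.y1 ? a.y1 : b.y1,
        a.x2 > b.x2 ? a.x2 : b.x2,
        a.y2 > b.y2 ? a.y2 : b.y2,
    };
    return r;
}

/* Test a glyph against a region kept as separate boxes, one per list. */
static bool region_intersects(const box *boxes, size_t n, box glyph)
{
    assert(boxes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (box_intersects(boxes[i], glyph))
            return true;
    return false;
}
```

With made-up coordinates, a column extent `{0, 0, 10, 40}` for "ABCD", `{20, 0, 30, 10}` for E, and `{12, 10, 18, 20}` for F, the merged bounding box reports F as overlapping, while the per-list region check does not.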
2012-07-11	glamor_copyarea: Use blitcopy if current state is not render.	Zhigang Gong	6	-44/+24
	Practically, for pure 2D blit, the blit copy is much faster than the textured copy. For x11perf copywinwin100, it's about 3x faster. But if we have heavy rendering/compositing, the textured copy gets much better (>30%) performance in most cases. So we simply add a data element to track the current state. In the rendering state we use textured copy; otherwise, we use blit copy. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
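The state tracking described above amounts to a small decision function. The following is a sketch under assumptions: the enum values, the `screen_priv` struct, and the function names are hypothetical and do not mirror glamor's actual data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what kind of work the GPU did last. */
typedef enum { STATE_NONE, STATE_BLIT, STATE_RENDER } gpu_state;
typedef enum { COPY_BLIT, COPY_TEXTURED } copy_path;

/* Hypothetical per-screen private data holding the tracked state. */
typedef struct { gpu_state state; } screen_priv;

/* Called by render/composite paths to record the current state. */
static void note_state(screen_priv *priv, gpu_state s)
{
    assert(priv != NULL);
    priv->state = s;
}

/* Textured copy while rendering is active, blit copy otherwise. */
static copy_path choose_copy_path(const screen_priv *priv)
{
    assert(priv != NULL);
    return priv->state == STATE_RENDER ? COPY_TEXTURED : COPY_BLIT;
}
```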
2012-07-11	glamor_glyphs: Use cache picture to store mask picture if possible.	Zhigang Gong	1	-215/+830
	By default, the mask picture is newly created, and each time we need to clear the whole mask picture, composite the glyphs to the mask picture, and then composite the mask picture to the destination. Testing shows that filling the mask picture takes a big portion of the rendering time. We don't really need to clear the whole region; we only need to clear the region that really overlaps. This commit solves this issue. We split a large glyph list into several lists, each of which is either non-overlapped or overlapped. For most cases this reduces the number of overlapped glyphs that go through glyphs_via_mask to 2 or 3 at a time. That lets us allocate a small portion of the corresponding cache directly as the mask picture. We can then render the glyphs to this mask picture, and later merge the second step, compositing the mask to the dest, with the rendering of the other non-overlapped glyphs. It also leads us to implement a batched clearing algorithm for mask cache blocks, to avoid clearing small regions too frequently. If there is no overlapping at all, this method gives no performance gain. If there is some overlapping, this algorithm gets about a 15% performance gain. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_compositerects: Implement optimized version.	Zhigang Gong	6	-49/+439
	Don't call miCompositeRects. Use glamor_composite_clipped_region to render those boxes at once. Also add a new function glamor_solid_boxes to fill boxes at once. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	optimize: Use likely and unlikely.	Zhigang Gong	3	-19/+78
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	create_pixmap: use texture for large glyphs.	Zhigang Gong	1	-1/+1
	As we only cache glyphs smaller than 64x64, we need to use a texture for the large glyphs. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_copyarea: Fixed a bug introduced by 996194...	Zhigang Gong	1	-2/+6
	Default return value should be FALSE. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_glyphs: Slightly performance tuning.	Zhigang Gong	2	-56/+37
	As glamor_glyphs never falls back, we don't need to keep the underlying glyphs routines, just override the ps->glys Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_render: Don't allocate buffer for vbo each time.	Zhigang Gong	1	-5/+16
	We can reuse the last one if it is big enough to contain the current vertex data. In the meantime, use MapBufferRange instead of MapBuffer. Testing shows that this patch brings some benefit for aa10text/rgb10text. Not too much, but it is indeed faster. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
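The reuse policy described above can be sketched without any GL calls. In this illustration heap memory stands in for the GL buffer storage, and the `vertex_buffer` struct and `vb_reserve` function are hypothetical; the real code would allocate and map a buffer object instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VBO: heap memory replaces GL storage. */
typedef struct {
    void    *data;      /* backing storage */
    size_t   capacity;  /* bytes currently allocated */
    unsigned reallocs;  /* how often storage had to be (re)allocated */
} vertex_buffer;

/* Return storage for `needed` bytes. The previous allocation is reused
 * whenever it is already big enough; it only grows on demand. */
static void *vb_reserve(vertex_buffer *vb, size_t needed)
{
    assert(vb != NULL);
    if (needed > vb->capacity) {
        void *p = realloc(vb->data, needed);
        if (p == NULL)
            return NULL;   /* old storage stays valid on failure */
        vb->data = p;
        vb->capacity = needed;
        vb->reallocs++;
    }
    return vb->data;
}
```

A request smaller than the current capacity returns the existing storage and leaves `reallocs` unchanged, which is the saving the commit describes.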
2012-07-03	glamor_largepixmap: Workaround for large texture's upload.	Zhigang Gong	2	-1/+6
	I met a problem when uploading a large texture (larger than 7000x7000) on the SNB platform. The map_gtt returns a mapped VA without error, but writing to that virtual address triggers a BUS error. This workaround avoids that direct uploading. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_render: Optimize the two pass ca rendering.	Zhigang Gong	3	-111/+169
	For componentAlpha with PictOpOver, we use two-pass rendering. The previous implementation called glamor_composite_... twice, independently, which is very inefficient. Now we change the control flow to do the two passes internally and avoid duplicate work. For x11perf -rgb10text, this optimization gets about a 30% improvement. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_composite_glyph: Optimize glyphs with non-solid pattern.	Zhigang Gong	1	-10/+78
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-07-03	glamor_glyphs: Detect fake or real glyphs overlap.	Zhigang Gong	4	-103/+356
	We split a glyph's extent region into three sub-boxes: a left box of 2 x h, a center box of (w-4) x h, and a right box of 2 x h. Take a simple glyph A as an example: * __* __ **** * * ~~ ~~ The left and right boxes are both 2 x 2. The center box is 2 x 4. The left box has two bitmaps, 0001'b and 0010'b, to indicate the real inked area. The right box also has two bitmaps, 0010'b and 0001'b. We can then check the inked area in the left and right boxes against the previous glyph. If the direction is from left to right, we need to check the previous right bitmap against the current left bitmap. If we find that the center box overlaps, or that we overlap with more than just the previous glyph, we treat it as a real overlap and render the glyphs via mask. If we only intersect with the previous glyph on the left/right edge, we further compute the real overlapped bits. We set a loose check criterion here: if fewer than two pixels overlap, we treat it as non-overlapping. With this patch, aa10text is boosted from 1660000 to 320000. Almost double the performance! And the cairo test result is the same as without this patch. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-21	glamor_render: Don't fallback when rendering glyphs with OpOver.	Zhigang Gong	1	-3/+25
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-21	glamor_create_pixmap: Allocate glyphs pixmap in memory.	Zhigang Gong	1	-0/+1
	As we have the glyphs atlas cache, we don't need to hold each glyph on the GPU. And for the subsequent optimization, we need to store the original glyph pixmaps in system memory. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-21	glamor_fbo: fix a memory leak for large pixmap.	Zhigang Gong	1	-1/+2
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-15	Fix a bug for trapezoid clip	Junyan He	1	-8/+107
	We find that in some cases the trapezoid is rendered as a triangle, and the left edge and right edge cross each other just below the top or above the bottom. The distance between the cross point and the top or bottom is less than pixman_fixed_1_minus_e, so after the fixed value is converted to int, the cross point has the same value as the top or bottom and the triangle should not be affected. But in our clip logic, the cross point is clipped out. So add logic to fix this problem. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-15	gles2_largepixmap: force clip for a non-large pixmap.	Zhigang Gong	2	-8/+32
	In one case we need to force a clip when downloading/uploading a drm_texture pixmap. Actually, this is only meaningful for testing purposes. We may set max_fbo_size to a very small value, and the drm texture may exceed this value even though the drm texture pixmap is not a largepixmap. This is not a problem with OpenGL. But for GLES2, we may need to call glamor_es2_pixmap_read_prepare to create a temporary fbo to do the color conversion. Then we have to force-clip the drm pixmap here to avoid large pixmap handling in glamor_es2_pixmap_read_prepare. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-15	glamor_emit_composite_vert: Optimize to avoid copying verts twice.	Zhigang Gong	3	-137/+216
	We change some macros to put the verts into the vertex buffer directly when we calculate them. This way, we get about a 4% performance gain. This commit also fixes one RepeatPad bug: when we RepeatPad an fbo that is not the exact size, we need to calculate the edge. The edge should be 1.0 - half point, not 1.0. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-15	glamor_glyphs: A flush is needed before uploading to the cache.	Zhigang Gong	1	-87/+139
	When we can't get a cache hit and have to evict one cache entry to upload a new picture, we need to flush the previous buffer. Otherwise, we may get corrupt glyph rendering results. This is the reason why user-font-proxy may fail sometimes. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-15	copyarea: Cleanup the error handling logic.	Zhigang Gong	1	-6/+8
	Should use ok rather than mixed ok or ret. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-14	trapezoid: Fallback to sw-rasterize for largepixmap.	Zhigang Gong	1	-4/+13
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-12	Use the direct render path for A1	Junyan He	2	-8/+27
	When the mask depth is 1 there is no anti-aliasing at all, so in this case the direct render works well and is faster. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-12	Add the trapezoid direct render logic	Junyan He	3	-0/+806
	We first get the render area by clipping the trapezoid with the clip rect, then split the clipped area into small triangles and use the composite logic to generate the result directly. This approach is fast, but some GL implementations do not implement anti-aliasing for triangle fills, so the edge is sometimes jagged. That is not acceptable when trapezoids are used to approximate circles and wide lines. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-12	Modify the composite logic to two phases	Junyan He	4	-81/+134
	We separate the composite into two phases: first we select the shader according to the source type and logic op and set the right parameters; then we emit the vertex array to generate the dest result. The reason is that the shader may be used to composite not only rects; the trapezoid and triangle render functions can also use it to render triangles and polygons. The old function glamor_composite_with_shader does the whole two-phase work and cannot meet the new requirement. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-12	Add macro of vertex setting for triangle strip	Junyan He	3	-54/+81
	Add macro of vertex setting for triangle strip draw, and make the code clearer. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-12	Use shader to generate the temp trapezoid mask	RobinHe	3	-47/+657
	The old way of rendering trapezoids uses pixman to generate a mask pixmap and upload it to the GPU, which hurts performance. We now use a shader to generate the temp trapezoid mask to avoid uploading this pixmap. We implement anti-aliasing in the shader following pixman: for every pixel, it calculates the area inside the trapezoid divided by the total area and assigns the result to the alpha value of that pixel. Pixman uses an int-to-fixed approximation but the shader uses float, so the results may differ somewhat. Because arrays in the shader have an optimization problem, we need to emit the vertices of every trapezoid every time, which hurts performance a lot. Need to improve it. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-12	Create the file glamor_triangles.c	RobinHe	4	-149/+189
	Create the file glamor_trapezoid.c, extract the logic relating to trapezoid from glamor_render.c to this file. Signed-off-by: Junyan He <junyan.he@linux.intel.com>
2012-06-11	Enable large pixmap by default. (for_large_pixmap)	Zhigang Gong	1	-2/+2
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-11	largepixmap: Fix the selfcopy issue.	Zhigang Gong	6	-44/+86
	If the source and destination are the same pixmap/fbo and we need to split the copy into small pieces, we need to consider the order of the pieces when the copy areas overlap. This commit takes reverse/upsidedown into account in the clipping function, so it generates the correct order and avoids corruption when self-copying. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
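The ordering problem described above is the same one that arises in any overlapping copy within a single buffer. The sketch below uses rows of a byte buffer as a stand-in for the small pieces of a large pixmap; `self_copy` and its parameters are hypothetical, and the real code orders fbo blocks rather than rows.

```c
#include <assert.h>
#include <string.h>

/* Copy a w x h block inside one row-major buffer from (sx, sy) to
 * (dx, dy), one row ("piece") at a time. When the destination lies below
 * the source, rows are processed bottom-up so that no source row is
 * overwritten before it has been read. */
static void self_copy(unsigned char *buf, int stride,
                      int sx, int sy, int dx, int dy, int w, int h)
{
    assert(buf != NULL && w >= 0 && h >= 0);
    int upsidedown = dy > sy;   /* destination is below the source */

    for (int i = 0; i < h; i++) {
        int row = upsidedown ? h - 1 - i : i;
        /* memmove copes with horizontal overlap inside a single row */
        memmove(buf + (dy + row) * stride + dx,
                buf + (sy + row) * stride + sx, (size_t)w);
    }
}
```

Processing the rows top-down in the `dy > sy` case would copy the first source row over the second before the second had been read, which is the kind of corruption the commit avoids.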
2012-06-11	largepixmap: Support self composite for large pixmap.	Zhigang Gong	3	-45/+57
	The simplest way to support large pixmap self-compositing is to just clone the pixmap private data structure and change the fbo and box to point to the correct positions. We don't need to copy a new box. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-11	largepixmap: Add transform/repeat/reflect/pad support.	Zhigang Gong	3	-75/+981
	This commit implements almost all the functions needed for large pixmap support. It's almost complete. Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>
2012-06-11	glamor_getimage: should call miGetImage if it fails to get the sub-image.	Zhigang Gong	1	-1/+3
	Signed-off-by: Zhigang Gong <zhigang.gong@linux.intel.com>