1Mesa 21.1.0 Release Notes / 2021-05-05
2======================================
3
4Mesa 21.1.0 is a new development release. People who are concerned
5with stability and reliability should stick with a previous release or
6wait for Mesa 21.1.1.
7
8Mesa 21.1.0 implements the OpenGL 4.6 API, but the version reported by
9glGetString(GL_VERSION) or glGetIntegerv(GL_MAJOR_VERSION) /
10glGetIntegerv(GL_MINOR_VERSION) depends on the particular driver being used.
11Some drivers don't support all the features required in OpenGL 4.6. OpenGL
124.6 is **only** available if requested at context creation.
13Compatibility contexts may report a lower version depending on each driver.
14
15Mesa 21.1.0 implements the Vulkan 1.2 API, but the version reported by
16the apiVersion property of the VkPhysicalDeviceProperties struct
17depends on the particular driver being used.
18
19SHA256 checksum
20---------------
21
22::
23
24    0128f10e22970d3aed3d1034003731f94623015cd9797c07151417649c1b1ff8  mesa-21.1.0.tar.xz
25
26
27New features
28------------
29
30- VK_KHR_workgroup_memory_explicit_layout on Intel, RADV
31
32- DRM format modifiers for AMD.
33
34- VK_KHR_zero_initialize_workgroup_memory on Intel, RADV
35
36- Zink exposes GL 4.6 and ES 3.1
37
38- GL_EXT_depth_bounds_test on softpipe, zink
39
40- GL_EXT_texture_filter_minmax on nvc0 (gm200+)
41
42- GL_ARB_texture_filter_minmax on nvc0 (gm200+)
43
44- GL_ARB_post_depth_coverage on zink
45
46- VK_KHR_copy_commands2 on lavapipe
47
48- lavapipe exposes Vulkan 1.1
49
50- VRS attachment on RADV
51
52- None
53
54
55Bug fixes
56---------
57
58- No sRGB capable visuals/fbconfigs reported in glx
59- Graphics corruption and GPU hang with RADV/LLVM
60- old kernels (4.19) support in radv
61- Elite Dangerous: Odyssey alpha crashes GPU on launch
62- CSGO: Some default variables can cause problems with trust mode
63- mesa git started to break wine + UnrealTournament.exe (old dx6 game)
64- SuperTuxKart artifacting on RK3399
65- [amdgpu]: Golf With Your Friends (431240): ERROR Waiting for fences timed out
66- Strange results when trying to read from VK_FORMAT_R64_SFLOAT in compute shader
67- anv: dEQP-VK.binding_model.buffer_device_address.set3.depth3.basessbo.convertcheck* slow
68- Iris doesn't support INTEL_performance_query anymore
69- RADV: TRUNC_COORD breaks gather operations
70- [RADV] corruption in avatar after dying in Heroes of the Storm
71- Metro Exodus crashing due to memory overflow
72- Sauerbraten shader rendering broken on RV530 (r300g)
73- texture glitches on CS:GO on Tiger Lake
74- Build fail due to "parameter name omitted" on Gallium Nine
75- Non-DRI builds broken by recent cleanups in Mesa core
76- Cinnamon core dump after installing latest oibaf mesa build (165a69d2)
77- yuv sampler lowering regression
78- anv: anv_descriptor_set_binding_layout::array_size overflows u16
79- RADV - Vertex explosion in DIRT 5 on RDNA2
80- ci: Use renderdoc from debian
81- ci: Use debian apitrace in x86 images
82- SIGSEV in v3d_emit_gl_shader_state
83- Xorg crash due to assertion failure after GPU soft reset
84- AMD hevc_vaapi ffmpeg encoding = wrong image width (48px black bar on the right)
85- panfrost: Page fault in glamor when running GIMP with X11 on Mali T860
86- gallium: python trace scripts need updating
87- EGL context creation fails when EGL_KHR_create_context_no_error is mentioned for OpenGL ES 1.1.
88- [spirv-fuzz] NIR validation failed after spirv_to_nir: error: nir_block_dominates
89- [bisected][regression][i965,iris] dEQP-VK.clipping.user_defined.clip_cull_distance.* failing on multiple platforms
90- No Mans Sky GPU hang on Radeon ACO
91- radeonsi: prusa-slicer crashes on mesa 21
92- anv: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec3_requiredsubgroupsize16 fails on ICL, TGL
93- [radeonsi] Rendering of Firefox UI and website content corrupts randomly and after window geometry changes
94- max_binding inconsistency in vulkan descriptor set drivers
95- anv: conservative rasterization ext question mark
96- Unigine Tropics MSAA failure
97- warning: xnack 'Off' was requested for a processor that does not support it! [AMD VEGAM with LLVM 12.0.0]
98- Compiling some ARB assembly shaders leads to memory corruption
99- Specifying an explicit location for an array output messes up transform feedback
100- Request for VK_EXT_conservative_rasterization support on Anvil Intel driver for newest DXVK..
101- radv: dEQP regressions after addrlib update
102- Up to 30% performance drop (GLBenchmark, GfxBench)
103- DOTA 2 don't no longer starts since commit ad241b15a9e517dd4c4e8d7b1d5dab7c3a74b37c
104- Clover doesn't work for kmsro drivers
105- aco_tests isel.sparse.clause fails with llvm-12
106- util cpu detection breaks on 128-core AMD machines
107- util cpu detection breaks on 128-core AMD machines
108- Default GL_MAX_TEXTURE_BUFFER_SIZE very small
109- intel_nullhw.c:41:38: error: field ‘vtable’ has incomplete type
110- ACO error with GCN 1 GPU
111- kmsro advertises EGL_MESA_device_software
112- d3d12: Use ID3D12Device9::CreateCommandQueue1 when available
113- [RADV] Halo: The Master Chief Collection: Crash in Halo Reach Firefight
114- freedreno: use SAMPLE_COUNT to autotune sysmem vs gmem
115- freedreno: draw_vbo optimizations
116- [Bisected][RadeonSI] Mesa crashes when rendering with Eevee in Blender
117- subgroupBallotFindMSB() broken in RADV/ACO 20.3.4
118- nir_print: util_cpu_detect() is not called prior to _mesa_half_to_float()
119- turnip: buffer overflow read on dEQP-VK.ycbcr.query.levels.tess_eval.r8g8b8a8_unorm
120- RuneScape crashes GLOn12
121- d3d12: Surfaces need to use shareable descriptors
122- [RADV][RDNA2] Red Dead Redemption 2 image glitches during menu/overlay menu transitions
123- "unknown intrinsic" assertion triggered by multiview shader in non-multiview renderpass in Vulkan on intel
124- [i965][g965,ilk,g33,g45][bisected] dEQP-GLES2.functional.fbo.completeness.attachment_combinations.* failures
125- radv: VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT exposed for VK_FORMAT_R64_SFLOAT
126- anv: android building error after commit 4fb6c05
127- Compiling mesa with -Dtools=all throws deprecation warnings for intel tools
128- DXVK is broken in latest master
129- OpenCLOn12: Affinity Photo CL kernels produce invalid DXIL phis
130- nir -> tgsi conversion problem
131- [i965][g965,ilk,g33][bisected] fp16 enablement causes deqp test failures
132- mesa/st: Uniforms are not updated after lowering alpha test
133- [i965][bisected][regression] piglit failing primitive-restart-vbo_combined_vertex_and_index on multiple platforms
134- [RADV] Nioh 2 - The Complete Edition: "Bloom" on lights
135- [RADV] Oblivion: Poor Performance while MSAA Is Enabled
136- lima context state bugs with shader compile
137- [RADV][BISECTED] The Surge 2 (644830) - In-game assets do not render correctly since 20.3.4.
138- amd clang cannot convert ‘llvm::AtomicOrdering’ to ‘llvm::MaybeAlign’ build failure
139- [iris][icl,tgl][bisected][regression] failure on piglit.spec.arb_separate_shader_objects.programuniform coverage
140- opencl build fail
141- anv: dEQP-VK.glsl.builtin_var.fragcoord_msaa.* fails
142- Request - depth format feature SAMPLED_IMAGE_FILTER_LINEAR
143- "radeonsi: Check pitch and offset for validity." is a bad commit
144- Add OpenCL information to docs/features.txt
145- [regression] [bisected] piglit.spec.arb_framebuffer_object.fbo-drawbuffers-none gldrawpixels fails
146- RADV: robustBufferAccessUpdateAfterBind is not exposed
147- debug build compilation failed: inlining failed in call to ‘always_inline’ ‘_nir_visit_dest_indirect’: indirect function call with a yet undetermined callee
148- [RADV/DXVK] Shadow artifacts with different games
149- glxgears segfaults with classic i915
150- mesa_glthread=true Black Mesa
151- freedreno: rendering corruption in dead cells
152- ac/rgp: Android building error after commit 12515d6
153- d3d12: Assert failures & crashes on latest master
154- RADV/ACO - DCC causing garbled output on RX570
155- draw.c:121: _mesa_set_draw_vao: Assertion \`vao->_EnabledWithMapMode == _mesa_vao_enable_to_vp_inputs(vao->_AttributeMapMode, vao->Enabled)' failed.
156- ANV: Weird jitter in Witcher 1
157- RADV - Path of Exile: Shimmering outlines where water and other objects meet
158- ANV: Weird jitter in Witcher 1
159- ANV: Weird jitter in Witcher 1
160- meson: meson-built libraries have inconsistent compatability / current versions compared to older autotools-built libraries
161- device select layer breaks other layers
162- RADV: Extreme overhead in vkQueueSubmit
163- Graphical glitch of popupping missing texture on Mesa version >18.0.5 (Padoka Stable + Unstable/Oibaf/ubuntu-x-swat PPAs)
164- [regression] [bisected] dEQP-GLES2.functional.fbo.render.stencil_clear.rbo_rgb5_a1_stencil_index8 fails
165- occasional corruption issue with RADV in multiple games, disappears after using amdvlk
166- panfrost T860 regression
167- OpenGL on GMA4500MHD
168- piglit-replay: JUnit file contains wrong links to the tracie dashboard
169- R8 texture upload / corruption bug on Radeon RX 5700 XT
170- Ambient Occlusion in Two Point Hospital shows black spot artifacts
171- freedreno: async background shader compile
172- AMD VAAPI encoding has ceased to work
173- Rage 2: Visual corruption on in-game menu with ACO.
174- ACO doesn't correctly render map in Borderlands 3 vs. LLVM on 5700 XT
175- Invalid shader under panfrost/wayland
176- Strange Brigade refuses to load correctly since some recent commits
177- GLonD3D12: Crashes and suboptimal fallback
178- GLonD3D12: Crashes and suboptimal fallback
179- GLonD3D12: Crashes and suboptimal fallback
180- [RADV][REGRESSION][BISECTED] radv_GetMemoryFdPropertiesKHR returns no valid memory types for vaapi drmbuf
181- anv: vkQueueSubmit with waitSemaphore value of 0 hangs CPU
182- ttn: invalid base/range triggering nir_validate assertion
183- Sampling with mipmapped HiZ behaves unexpectedly on Gen9
184- zink: ARB_map_buffers issues on CI
185- u_upload_mgr: assert failure for large uploads
186- [RADV][ACO] Overwatch game crash: amd/compiler/aco_insert_exec_mask.cpp: Failed Assertion
187- PRIME render offloading broken
188- Use out encoding for float immediates
189- [RADV] Severe performance drop when exceeding VRAM compared to AMDVLK
190- LIBGL_ALWAYS_SOFTWARE=1 picks zink over actual software rasterizers
191- crash/assert in fd_set_viewport_states
192- RADV: Occlusion query hangs Big Navi GPU
193- "mesa: don't allocate matrices with malloc" cause eglCreateContext problem on android 7.
194- Metal Gear Solid V: The Phantom Pain: texture issues and vertex stretches
195- [iris and Navi 10] piglit.spec.arb_multi_draw_indirect.arb_draw_elements_base_vertex-multidrawelements -indirect regression
196- miscompiled compute shader loop on llvmpipe (and Iris)
197- ci: minio caching of arm64 artifacts for bare-metal
198- Graphics glitches after upgrade to mesa 20.3 on Khadas VIM3 Pro (Mali G52 GPU)
199- glthread crash in _mesa_glthread_upload
200- freedreno piglit flakes
201- RADV: NonUniform OpArrayLength on SSBO ignores NonUniform.
202- Iris driver causing graphics glitch in QEMU spice egl DMA-BUF
203- [RADV/ACO] Death Stranding cause a GPU hung (\*ERROR* Waiting for fences timed out!)
204- [TGL] Elder Scrolls Online misrenders
205- [ANV] System hang with GRVK demos
206- ci: Fractional deqp runs with valgrind enabled.
207- Regression: Segfault in cso_destroy_context() regression in 20.2
208- Rendering artifacts in Barn Finders specifically on Radeon Vega
209- Graphics regression in Assassins Creed Odyssey
210- [ANV] Compilation warnings
211- regression in !8152
212- [bdw][icl][iris] fails new test \`clearbuffer-depth-cs-probe`
213- ci: new traces runner needs dashboard links in the job log and junit
214- zink: car model corruption with game TORCS
215- glGetInternalformati64v(GL_TEXTURE_2D, GL_SR8_EXT, GL_COLOR_ENCODING) returns GL_NONE
216- Windows: 32-bit build is broken hard
217- ANV: Not handling separate stencil layouts properly
218- [Regression][Intel][OpenGL][Bisected] Copying whole 2D array texture failed on latest driver
219- turnip: dEQP-VK.tessellation.invariance.outer_triangle_set.quads_fractional_odd_spacing failure
220- i915 regressions bisected to "vbo/dlist: use a shared index buffer"
221- intel: Chrome OS "hatch" (cometlake) fails on dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_cubemap
222- radv: dEQP-VK.sparse_resources.* failures on GFX9
223- freedreno: rendering corruption in War Robots
224- radv: dEQP-VK.sparse_resources.* failures on GFX9
225- Mesa 20.3.x crashes pidgin on AMD RX480
226- timespec_get used unconditionally / build fails when targeting macOS 10.14 or earlier
227- libunwind not located / used on macOS
228- libunwind not located / used on macOS
229- meson fails to locate libexpat on macOS
230- CarX Drift Racing 2 fails to start
231- Some games using FNA framework show blank screen
232- Intel Vulkan regression of angle_end2end_tests
233- Defer lavapipe warning to queue / command / swapchain buffer creation
234
235
236Changes
237-------
238
239Aaron Watry (2):
240
241- clover: implement CL_IMAGE_ELEMENT_SIZE
242- clover: implement CL_IMAGE_NUM_MIP_LEVELS and CL_IMAGE_NUM_SAMPLES
243
244Abhishek Kumar (2):
245
246- intel: change urb max shader geometry for CML GT1
247- intel: change urb max shader geometry for KBL GT1
248
249Adam Jackson (66):
250
251- docs: Mark some non-core zink extensions complete
252- docs: Mark some ES3 zink features complete
253- egl: Fix error string returned by eglQueryDeviceAttribEXT
254- zink: Factor out instance setup a bit more
255- osmesa: Pacify MSVC in the test code
256- glx: Fix GLX_SGI_video_sync for the no-current-drawable case
257- nir: Silence a warning at -Og
258- softfloat: Silence a warning at -Og
259- glsl: Silence a warning at -Og
260- glsl: Silence some warnings at -Og
261- glsl: Silence a warning at -Og
262- loader: Silence a warning at -Og
263- gallivm: Silence a warning at -Og
264- nir/ttn: Silence some warnings at -Og
265- vl: Silence a warning at -Og
266- gallivm: Silence a warning at -Og
267- nouveau: Silence some warnings at -Og
268- nouveau: Silence a warning at -Og
269- xlib: Fix build regression since 99e25d183d9
270- gallium/xlib: Partial fix for glXCopySubBufferMESA
271- mesa: Store depth bounds test bounds as GLclampd
272- softpipe: Fix depth comparison with float Z formats
273- softpipe: Implement GL_EXT_depth_bounds_test
274- docs: Document GL_EXT_depth_bounds_test
275- zink: Enable GL_EXT_depth_bounds_test
276- zink: more and better debug printfs
277- zink: Fix a thinko in instance setup
278- zink: Wire up ARB_post_depth_coverage
279- glx: Pull use_x_font out of the context vtable
280- glx: Pull get_proc_address out of the context vtable
281- glx: Remove windows' stub {bind,release}_text_image context hooks
282- glx/drisw: Implement WaitX and WaitGL
283- dri: Explicitly handle all the config attributes
284- dri: Fold attribMap into the code
285- mesa: Remove misc pbuffer attributes from struct gl_config
286- mesa: Remove the texture-from-pixmap state from struct gl_config
287- mesa: Remove transparency state from struct gl_config
288- mesa: Remove unused gl_config::level
289- mesa: Remove the pretense of aux buffer support
290- mesa: Stop tracking visual rating in gl_config
291- mesa: Remove redundant gl_config::sampleBuffers
292- ci: Bump the llvmpipe test timeout to 240 seconds
293- mesa/st: Remove unused ST_ATTACHMENT_SAMPLE
294- mesa/st: Check for successful framebuffer allocation in st_api_make_current
295- gallium: Remove curious st_visual::no_config
296- radeon: Exchange one curious idiom for another in radeonMakeCurrent
297- mesa: Remove unused _mesa_create_framebuffer
298- mesa: Make _mesa_initialize_visual return void
299- mesa: Remove unused gl_config::mutableRenderBuffer
300- mesa: gl_config::rgbBits should count alphaBits too
301- dri: Don't tie the accum buffer's alpha-ness to the color buffer's
302- glx: Stop pretending to validate the pbuffer fbconfig attributes
303- glx: Don't downgrade the visual caveat from the server
304- glx: Downgrade aux-buffer-ful fbconfigs
305- glx: Downgrade tfp mipmap-capable fbconfigs
306- glx: Downgrade sRGB-ful fbconfigs
307- dri: Use __DRI_BUFFER_COUNT consistently internally
308- glx: Default sRGBCapable in the same place as the other config attribs
309- glx: Clean up fbconfig attribute handling
310- glx: Remove some #if 0'd DRI config attribute fetch
311- glx: Don't pointlesly add -D_REENTRANT to libGL's cflags
312- glx: Move {Bind,Release}TexImage from context to screen vtable
313- glx: Be more robust against null fbconfigs
314- glx: Lift sending the MakeCurrent request to top-level code
315- Revert "glx: Lift sending the MakeCurrent request to top-level code"
316- gallium/xlib: Fix for recent gl_config changes
317
318Adrian Ratiu (1):
319
320- docs: docker: minor stale documentation fix
321
322Alejandro Piñeiro (36):
323
324- v3dv/pipeline: enable lower_add_sat NIR option
325- v3d/compiler: enable lower_add_sat NIR option
326- v3dv/descriptor: assert CrateDescriptorPool receives valid count values
327- v3dv: drop v3dv_resource definition
328- v3dv: properly handle two different binding points for cmd_buffers
329- v3dv: move to subclassing instance/physical device
330- v3dv: remove reference to v3dv_instance on v3dv_physical_device
331- v3dv: port to using common dispatch code.
332- v3dv: support for depthBiasClamp
333- v3dv/device: clarify that we can't expose textureCompressionBC
334- v3dv/formats: expose support for BC1-3 compressed formats
335- v3dv/meta_copy: get tlb compatible BC compressed formats for copies
336- v3dv/descriptor_set: don't free individual set if not allowed
337- v3dv: avoid some maybe-uninitialized warnings
338- v3dv/pipeline_cache: add more details when dumping debug info
339- v3dv/pipeline: remove pregenerate_variant
340- v3dv/pipeline: remove pipeline->use_push_constants
341- broadcom/compiler: add local_size in v3d_compute_prog_data
342- broadcom/compiler: add driver_location_map at vs prog data
343- v3dv/pipeline: use driver_location_map instead of nir utilities
344- v3dv/pipeline: move topology to pipeline
345- v3dv/pipeline: remove compiled_variant_count field
346- v3dv/pipeline: remove v3d_key from shader_variant and pipeline stage
347- v3dv: define broadcom shader stages
348- v3dv/pipeline: use broadcom_shader_stage as pipeline/variant stage type
349- v3dv/pipeline: try to get the shader variant directly from the cache
350- v3dv/pipeline: don't create a variant if compilation failed
351- v3dv/pipeline: compute sha1 for no-op fragment shaders correctly
352- v3dv/device: avoid unused-result warning with asprintf
353- v3dv: Add support for the on-disk shader cache
354- v3dv/cmd_buffer: return early for draw commands if there is nothing to draw
355- v3dv: define a default attribute values with float type
356- vulkan: track number of bindings instead of max binding for CreateDescriptorSetLayout
357- v3dv/device: do not compute per-pipeline limits multiplying per-stage
358- v3dv/device: fix and cleanup v3dv limits
359- v3dv/pipeline: reduce descriptor_map size
360
361Alexander Kapshuk (1):
362
363- frontends/va/image: Eliminate repetitive code on error paths
364
365Alexander Shi (1):
366
367- mesa: texparam: Add a clamping macro to handle out-of-range floats returned as integers.
368
369Alexander von Gluck IV (1):
370
371- egl/haiku: Fix ConfigID naming inline with mesa
372
373Alyssa Rosenzweig (345):
374
375- pan/bi: Fix assertion
376- pan/bi: Pipe scratch_size in from NIR
377- pan/bi: Fix 64-bit SSBO addresses
378- pan/bi: Fix RA of node 0
379- pan/bi: Fix printing of node 0
380- panfrost: Fix TLS sizing if cores are missing
381- panfrost: Allow waiting on slots 6/7 during preload
382- pan/bi: Add internal debug flag
383- pan/bi: Validate format 12 tuple count in disasm
384- pan/bi: Print FAU index in verbose mode
385- pan/bi: Refactor PC-relative printing
386- pan/bi: Lint for infinite loops
387- pan/bi: Print disasm/stats with DEBUG=internal
388- pan/bi: Fix IDLE register mode packing
389- pan/bi: Fix staging register packing
390- pan/bi: Fix dependency wait calculation
391- pan/bi: Fix M1/M2 decoding in disassembler
392- pan/bi: Pull out bi_count_read_registers helper
393- pan/bi: Move bi_next_clause to bir.c
394- pan/bi: Pass through wait_{6, 7} flags
395- pan/bi: Add dead branch elimination pass
396- pan/bi: Add "soft" mode to DCE
397- pan/bi: Add bi_{before,after}_clause cursors
398- pan/bi: Add bi_foreach_clause_in_block_rev
399- pan/bi: Add bi_foreach_instr_in_tuple helper
400- pan/bi: Add bi_foreach_instr_in_clause iterators
401- pan/bi: Add destination iterator macro
402- pan/bi: Don't open code bi_foreach_dest
403- pan/bi: Permit multiple destinations in RA
404- pan/bi: Add interference per clause
405- pan/bi: Implement spilling at the clause-level
406- pan/bi: Don't fill garbage
407- pan/bi: Add CUBEFACE pseudoinstruction
408- pan/bi: Print multiple destinations if needed
409- pan/bi: Move init_builder to common code
410- pan/bi: Add "word equivalence" relation for index
411- pan/bi: Stub out scheduler unit test
412- pan/bi: Factor nir_function_impl out of the context
413- pan/bi: Add bi_can_{fma, add} predicates
414- pan/bi: Annotate ISA.xml with 'last' parameter
415- pan/bi: Pipe last flag into opcode tables
416- pan/bi: Add bi_must_last predicate
417- pan/bi: Add bi_must_message predicate
418- pan/bi: Label table instructions
419- pan/bi: Emit branch and table bits in opcode table
420- pan/bi: Add various read predicates
421- pan/bi: Unit test bi_can_{fma, add}
422- pan/bi: Test bi_must_last
423- pan/bi: Test bi_must_message
424- pan/bi: Test read predicates
425- pan/bi: Move bi_constants to bifrost.h
426- pan/bi: Use canonical terminology for tuple
427- pan/bi: Use enum bifrost_message_type
428- pan/bi: Clarify tuple comment
429- pan/bi: Amend misleading comment
430- pan/bi: Pack multiple tuples in-memory
431- pan/bi: Add clause encodings as a table
432- pan/bi: Move bi_packed_tuple to compiler.h
433- pan/bi: Add bi_pack_literal
434- pan/bi: Add bi_pack_upper
435- pan/bi: Add bi_pack_tuple_bits
436- pan/bi: Add bi_pack_sync
437- pan/bi: Add tuple/embedded constant pack
438- pan/bi: Add subword 5/6 pack
439- pan/bi: Add subword 4 or 7 pack
440- pan/bi: Add pack_format helper
441- pan/bi: Calculate pos for constant packing
442- pan/bi: Pack multiple tuples per clause
443- pan/bi: Add packing unit test group
444- pan/bi: Test pack_literal
445- pan/bi: Test pack_upper
446- pan/bi: Test pack_tuple_bits
447- pan/bi: Test pack_sync
448- pan/bi: Add packing format tests
449- pan/decode: Be explicit when printing invocations
450- pan/decode: Remove tiler size checks
451- pan/decode: Remove dependency of decoder on the encoder
452- pan/decode: Deduplicate SFBD blend printing
453- pan/decode: Deduplicate shader property printing
454- pan/decode: Remove unused MEMORY_PROP macro
455- pan/decode: Simplify tiler printing
456- pan/decode: Remove pandecode_prop
457- pan/decode: Remove unused disasm stats
458- pan/decode: Remove mesa header dependencies
459- pan/mdg: Drop unused stage parameter to disassembler
460- pan/decode: Remove tile range validation
461- pan/decode: Prefer sizeof to ARRAY_SIZE for char
462- nir/lower_io: Fix grammar errors
463- pan/bi: Fix NULL deref with empty shader
464- pan/bi: Add side_effects helper
465- pan/bi: Respect side effects in DCE
466- pan/bi: Implement AXCHG
467- pan/bi: Implement ACMPXCHG
468- pan/bi: Add bi_fmul_f32 convenience method
469- pan/bi: Fix FLOG_TABLE modifier handling
470- pan/bi: Lower frcp to Newton-Raphson
471- pan/bi: Lower frsq to Newton-Raphson
472- pan/bi: Lower FEXP2 with a table
473- pan/bi: Lower flog2 to a table and polynomial
474- pan/bi: Rename NO_FP32_TRANSCENDENTALS quirk
475- pan/bi: Fix bi quirks detection
476- pan/bi: Lower FP32 transcendentals where required
477- pan/bi: Lower transcendentals on G71
478- pan/bi: Print program size in shader-db
479- pan/bi: Cleanup terminal block check
480- pan/bi: Dead code eliminate per-channel
481- pan/bi: Include ATEST datum in the instruction
482- pan/bi: Add scheduler data structures
483- pan/bi: Add cubeface lowering
484- pan/bi: Flatten block lists
485- pan/bi: Stub worklist routines
486- pan/bi: Add constant count estimates to scheduler
487- pan/bi: Add FAU update helper
488- pan/bi: Add bi_tuple_is_new_src
489- pan/bi: Add bi_count_succ_reads helper
490- pan/bi: Validate reads_t
491- pan/bi: Add T0/T1 constraint check
492- pan/bi: Add writes_reg predicate
493- pan/bi: Add bi_instr_schedulable predicate
494- pan/bi: Choose instructions to schedule
495- pan/bi: Destructively schedule a single instruction
496- pan/bi: Add passthrough register rewriting helper
497- pan/bi: Extract bi_ec0_packed helper
498- pan/bi: Add bi_foreach_instr_and_src_in_tuple
499- pan/bi: Move bi_constant_field to bifrost.h
500- pan/bi: Add pcrel_idx to bi_clause
501- pan/bi: Derive M0 from pcrel_idx while packing
502- pan/bi: Add trivial rewrite helpers
503- pan/bi: Add constant to passthrough rewrite
504- pan/bi: Add constant state constructor
505- pan/bi: Add constant merging routines
506- pan/bi: Add constant modifier handling
507- pan/bi: Schedule blocks
508- pan/bi: Switch to new scheduler
509- pan/bi: Remove old FAU assignment code
510- pan/bi: Remove older cube map lowering
511- pan/bi: Add nosched debug option
512- pan/bi: Fix 'last tuple' for terminal-NOP clauses
513- pan/bi: Fix 2-write pseudo op scheduling
514- pan/mdg: Fix multithreaded shader-db
515- pan/mdg: Add MIDGARD_MESA_DEBUG=inorder option
516- pan/mdg: Optimize UBO offset calculations
517- pan/mdg: Set lower_uniforms_to_ubo
518- panfrost: Fix race condition in UBO mapping to CPU
519- panfrost: Respect buffer_offset when mapping to CPU
520- panfrost: Move sysvals to dedicated UBO
521- panfrost: Don't truncate uniform_count
522- panfrost: Add UBO push data structure
523- panfrost: Push uniforms required by the program
524- panfrost: Set FAU count based on program->push
525- panfrost: Don't store uniform_count on Midgard
526- pan/mdg: Update UBO promotion comment
527- pan/mdg: Push uniforms based on UBO analysis
528- pan/bi: Fix multithreaded shader-db
529- pan/bi: Add bi_replace_index helper
530- pan/bi: Add bi_is_ssa helper
531- pan/bi: Print FAU uniforms in IR
532- pan/bi: Generalize bi_update_fau with fast zero
533- pan/bi: Handle modifiers in rewrite_fau_to_pass
534- pan/bi: Rework FAU lowering
535- pan/bi: Simplify derivative lowering
536- pan/bi: Add SSA-based scalar copy propagation
537- pan/bi: Push UBOs on Bifrost
538- panfrost: Enable ES3 conformant floating-point
539- compiler, nir: Add and set barrier metadata
540- panfrost: Set barriers flag for compute shaders
541- panfrost: Pass is_blit flag around
542- pan/bi: Skip ATEST for colour blit shaders
543- panfrost: Fake shader images for bifrost+deqp
544- pan/bi: Fix jumps to terminal block again
545- pan/bi: Fix empty shader handling
546- nir: Add sample_positions_pan intrinsic
547- pan/decode: Cleanup sample locations decode
548- pan/decode: Only print local storage for vertex jobs
549- panfrost: Preload sample mask if needed
550- panfrost: Add sample positions sysval
551- panfrost: Push sample positions sysval for Midgard
552- panfrost: Refactor sample shading state
553- panfrost: Respect info.fs.uses_sample_shading
554- panfrost: Add panfrost_sample_pattern helper
555- panfrost: Set tiler descriptor sampler pattern
556- panfrost: Generalize MSAA handling
557- panfrost: Don't set REQ_MSAA in pan_mfbd
558- panfrost: Don't use PAN_REQ_MSAA in SFBD
559- panfrost: Remove PAN_REQ_MSAA
560- panfrost: Remove PAN_REQ_DEPTH_WRITE
561- panfrost: Remove batch_is_scanout
562- panfrost: Set sample count/pattern for tiler FBD
563- panfrost: Upload sample positions on device init
564- panfrost: Use sample location LUT
565- panfrost: Ensure open_device has pandecode initialized
566- panfrost: Advertise MSAA 8x and 16x
567- panfrost: Implement get_sample_position
568- panfrost: Overhaul sysval handling
569- panfrost: Add MULTISAMPLED sysval
570- pan/mdg: Assert on bad 64-bit swizzle in disassembly
571- pan/mdg: Remove unused pack_unorm_4x8 lowering
572- pan/mdg: Lower bitfield instructions
573- pan/mdg: Rename bitcount8 to popcnt, fixing the unit
574- pan/mdg: Implement uclz
575- pan/mdg: Lower ufind_msb, poorly
576- pan/mdg: Stub load_barycentric_sample
577- pan/mdg: Lower stores from helpers
578- pan/bi: Remove redundant TEXC opcode check
579- pan/bi: Implement texture gathers
580- pan/bi: Lower bitfield inserts/extracts
581- pan/bi: Implement bitfield_reverse
582- pan/bi: Implement ufind_msb
583- pan/bi: Lower ifind_msb
584- pan/bi: Fix gl_SampleID read
585- pan/bi: Implement load_sample_mask_in
586- pan/bi: Implement nir_intrinsic_load_sample_positions_pan
587- pan/bi: Decouple sysval loading from NIR
588- pan/bi: Implement coverage mask updates
589- pan/{mdg, bi}: Lower load_helper_invocation
590- pan/{mdg, bi}: Lower load_sample_pos
591- panfrost: Simplify bind_compute_state
592- panfrost: Remove stale TODOs
593- panfrost: Assert on indirect compute shaders
594- panfrost: Advertise SAMPLE_SHADING
595- panfrost: Bump max SSBO count
596- panfrost: Bump advertised ESSL feature level
597- panfrost: Advertise OES_standard_derivatives
598- docs/features: Mark more TBO exts done on panfrost
599- docs/features: Mark some ES3.1 done on Panfrost
600- docs/features: Mark sample shading done on Panfrost
601- docs/features: gl_HelperInvocation on Panfrost
602- nir: Fix grammar error
603- panfrost: Fix uniform_count on Midgard
604- pan/bi: Stub scoreboarding
605- pan/bi: Implement barriers
606- pan/decode: Fix tiler printing on Bifrost
607- pan/decode: Pretty print 22-bit pixel formats
608- pan/decode: Disambiguate border colours
609- panfrost: Label groups in GenXML
610- panfrost: Track coverage, early fragment tests
611- panfrost: Flesh out pixel kill / zs update
612- panfrost: Handle PIPE_FORMAT_NONE as 'constant'
613- panfrost: Flesh out allow_forward_pixel_to_kill check
614- panfrost: Export bifrost_get_blend_desc with type size
615- panfrost: Add RT conversion sysval
616- panfrost: Fix NULL deref for an empty frag shader
617- panfrost: Spell fix
618- panfrost: Take panfrost_dev for AFBC selection
619- panfrost: Set border colour on Bifrost sampler
620- panfrost: Remove useless check
621- pan/bi: Fix RA of node 0 again
622- pan/bi: Don't inline 64-bit constants
623- pan/bi: Fix LD_GCLK staging count
624- pan/bi: Don't read alpha out of bounds
625- pan/bi: Allow @rNULL with tied operands
626- pan/bi: Add SEG_ADD.i64 pseudoinstruction
627- pan/bi: Add 32-bit atomic pseudoinstruction
628- pan/bi: Lower atomic pseudo-instructions
629- pan/bi: Extract bi_atom_opc from NIR intrinsic
630- pan/bi: Add ATOM_C1 promotion check
631- pan/bi: Handle computational atomics
632- pan/bi: Materialize \*DTSEL_IMM in the scheduler
633- pan/bi: Implement image_atomic_exchange
634- pan/bi: Implement image_atomic_comp_swap
635- pan/bi: Implement shader_clock intrinsic
636- pan/bi: Fix blend shaders using LD_TILE with MRT
637- pan/bi: Flesh out LD_TILE emit
638- pan/bi: Elucidate slot 6/7 operation
639- pan/bi: Preload sample ID for sample shading
640- pan/bi: Implement interpolateAtSample
641- pan/bi: Add imm_f16 helper
642- pan/bi: Implement interpolateAtOffset
643- pan/bi: Allow dynamically uniform tex indices
644- pan/bi: Use explicit move even for RT#0 of MRT
645- panfrost: Comment on state of ARB_shader_clock
646- panfrost: Advertise FRAMEBUFFER_NO_ATTACHMENTS
647- docs/features: Mark atomics/images done on Bifrost
648- panfrost/ci: Mark flaky test
649- gallium/tessellator: Remove XBOX 360 code
650- gallium/tessellator: Remove unused includes
651- gallium/tessellator: Rename D3D11 defines
652- pan/bi: Fix more jumps to terminal blocks
653- pan/bi: Optimize out redundant jumps to #0x0
654- pan/bi: Fix elimination of repeated branches
655- panfrost: Fix infinite loop spilling
656- panfrost: Fix NULL dereference adding cbuf to batch
657- panfrost: Remove redundant NULL check
658- panfrost: Fix NULL deref in pan_mfbd.c
659- panfrost: Fix NULL derefs in pan_cmdstream.c
660- panfrost: Fix NULL deref in pan_sfbd
661- panfrost: Raise TEXTURE_BUFFER_OFFSET_ALIGNMENT
662- panfrost: Hide MSAA 8x/16x support
663- panfrost: Fix UNORM 16 rendering
664- panfrost: Reinterpret format for reload blits
665- panfrost: Fix typo in midgard.xml
666- panfrost: Don't advertise OES_copy_image
667- pan/bi: Use nir_opt_sink/move for constants
668- pan/bi: Reduce liveness calculations in DCE
669- pan/bi: Inline \`bytemask of read components`
670- pan/bi: Mark branches as having side effects
671- pan/bi: Mark DISCARD as having side effects
672- pan/bi: Make bi_writemask take a destination
673- pan/bi: Allow spilling with multiple destinations
674- pan/bi: Annotate instructions by destination count
675- pan/bi: Adapt builder to dest count
676- pan/bi: Remove unused definitions
677- pan/bi: Do copyprop in linear-time
678- panfrost/lcra: Fix constraint counting
679- pan/bi: Use replace_index in more places
680- pan/bi: Allow negating constants
681- pan/bi: Implement fsin/fcos
682- pan/bi: Mark message-passing sources/dests live
683- pan/bi: Set clause_state.message conservatively
684- pan/bi: Treat +DISCARD.f32 as message-passing
685- nir/lower_viewport_transform: Allow geom/tess
686- pan/bi: Implement u{add, sub}_sat
687- nir: Unify memory atomics
688- meson: Remove kmsro from gallium-drivers
689- pan/bi: Document register conventions
690- pan/bi: Add bi_foreach_block_rev
691- pan/bi: Handle 16-bit blend sr_count
692- pan/bi: Only run copyprop once
693- pan/bi: Only run DCE once
694- pan/bi: Remove TODO: RA warnings
695- pan/bi: Remove stale todo/assert
696- panfrost: Deduplicate UBO count assignment
697- panfrost: Deduplicate Bifrost fau_count
698- panfrost: Only check blend work count on midgard
699- vulkan: Deduplicate mesa stage conversion
700- pan/bi: Enable all nir_opt_move/sink optimizations
701- pan/bi: Split writemasks for memory stores
702- pan/bi: Lower large arrays to scratch
703- pan/bi: Add bi_swz_16 helper
704- pan/bi: Optimize MKVEC.v2i16 generation
705- pan/bi: Lower swizzles
706- pan/bi: Fill in some more conversions
707- pan/bi: Generalize f2i16, f2u16
708- pan/bi: Remove conversion lowering
709- nir: Add nir_type_convert
710- nir: Add {i2f, u2f, f2i, f2u} helpers
711- nir/lower_idiv: Convert to lower_instructions
712- nir/lower_idiv: Factor out numer/denom load
713- nir/lower_idiv: Add 8-bit and 16-bit lowering path
714- pan/bi: Determine block successors correctly
715- panfrost: Fix AFBC body_size for shared resources
716- panfrost: Enable AFBC buffer sharing
717- nir: Add varying precision linking helper (v2)
718- docs: Add some notes on building for macOS
719- panfrost: Fix formats converting uninit from AFBC
720
721Andreas Bergmeier (1):
722
723- v3dv: Output a message if file open fails in physical_device_init
724
725Andres Gomez (29):
726
727- ci: recover tracie dashboard URLs for failing traces
728- ci: correct the trace image URLs in the piglit summary
729- ci: make piglit runner less noisy and show a better failure message
730- ci: clean paths used in the piglit runner
731- ci: correct piglit's HTML summary location for artifacts upload
732- ci: make sure piglit's artifacts are not overwritten
733- ci: correct artifacts location for piglit's runner messages
734- ci: tracie dashboard URLs only in the failure after the testcase
735- ci: piglit runner colors diff output on failures
736- ci: remove pytest since we don't need it any more
737- ci: only install piglit dependencies when installing piglit
738- ci: build gfxreconstruct v0.9.5
739- ci: add libdrm to the x86_test-vk container
740- .mailmap: colapse duplicates for Timothy Arceri
741- .mailmap: resolve duplicates for Icecream95
742- .mailmap: resolve duplicates for Christopher Li
743- .mailmap: resolve duplicates for Emmanuel Vadot
744- .mailmap: resolve duplicates for Indrajit Das
745- .mailmap: resolve duplicates for James Xiong
746- .mailmap: resolve duplicates for Jan Zielinski
747- .mailmap: resolve duplicates for Lin Johnson
748- .mailmap: resolve duplicates for Mark Menzynski
749- .mailmap: resolve duplicates for Matthias Hopf
750- .mailmap: resolve duplicates for Matthias Lorenz
751- .mailmap: resolve duplicates for Maya Rashish
752- .mailmap: resolve duplicates for Mun Gwan-gyeong
753- .mailmap: resolve duplicates for Satyeshwar Singh
754- .mailmap: resolve duplicates for Yogesh Mohan Marimuthu
755- .mailmap: add an alias for Eleni Maria Stea
756
757Andrew McMahon (1):
758
759- util: add mesa_glthread for Half Life 2 and Black Mesa.
760
761Andrii Simiklit (7):
762
763- st/mesa: fix pbo upload/download for arrays of textures with only 1 layer
764- iris: don't emit IRIS_DIRTY_VF depending on trash in restart_index
765- mesa: ensure parameter list capacity before associating uniform storage
766- glsl/linker: Fix xfb stride alignment for buffers containing 64bit types
767- gitlab-ci: remove fixed tests
768- spirv: repair ssa defs for switchs with only default case
769- nir/spirv: remove unused fields from \`vtn_builder`
770
771Antonio Caggiano (4):
772
773- zink: check shader stencil output
774- zink: support stencil-export
775- zink: fix destroy batch
776- ci: Use lock file to build deqp-runner
777
778Anuj Phogat (32):
779
780- intel/anv: Fix condition to set MipModeFilter for YUV surface
781- intel/anv: Fix condition for planar yuv surface
782- intel: Rename files with gen\_ prefix in common code to intel\_
783- intel: Rename "gen\_" prefix used in common code to "intel\_"
784- intel: Fix broken alignment due to gen\_ prefix renaming
785- intel: Rename "GEN\_" prefix used in common code to "INTEL\_"
786- i965: Remove blank line at EOF
787- i965: Rename files with "intel\_" prefix to "brw\_"
788- intel/isl: Drop intel\_ prefix in function names
789- anv: Remove redundant #if checks
790- intel: Remove GEN_IS_HASWELL macro
791- intel: Simplify version checks involving haswell
792- intel: Remove GEN_IS_G4X macro
793- intel: Simplify few version checks involving G4X
794- intel: Rename GEN_VERSIONx10 macro to GFX_VERx10
795- intel: Rename GEN_GEN macro to GFX_VER
796- intel: Rename ISL_DEV_GEN to ISL_GFX_VER
797- intel: Rename genx10 field in gen_device_info struct to verx10
798- intel: Rename gen field in gen_device_info struct to ver
799- intel: Rename genx keyword in filenames to gfxx
800- intel: Rename GENx prefix in macros to GFXx in build files
801- intel: Rename GENx prefix in macros to GFXx in source files
802- intel: Rename genx keyword to gfxx in build files
803- intel: Rename genx keyword to gfxx in source files
804- intel: Rename Genx keyword to Gfxx
805- intel: Rename GENx keyword to GFXx
806- intel: Rename IS_GEN* macros to IS_GFX_VER*
807- intel: Make line wrapping changes due to IS_GFX_VER_BETWEEN
808- intel: Remove unused MAKE_GEN macro
809- intel: Rename GEN_{ALL, LT, ..} macros to GFX_{ALL, LT, ..}
810- intel: Rename GEN:BUG:### to Wa_###
811- intel: Rename WA_### to Wa_###
812
813Arcady Goldmints-Orlov (14):
814
815- v3dv: Fix uninitialized variable warnings
816- nir: add more intrinsics to divergence analysis
817- nir: handle v3d intrinsics in divergence analysis
818- nir: store the results of divergence analysis on loops
819- broadcom/compiler: Use ANYA for branches in uniform ifs
820- broadcom/compiler: Emit uniform loops using uniform control flow
821- broadcom/compiler: Enable PER_QUAD TMU access only in uniform control flow
822- v3dv: Only lower local arrays of size up to 2 to if-chains
823- broadcom/compiler: improve generation of if conditions
824- Revert "broadcom/compiler: improve generation of if conditions"
825- v3dv: initialize render_fd at the top of physical_device_init
826- broadcom/compiler: Add a v3d_compile argument to vir_set_[pu]f
827- broadcom/compiler: Skip bool_to_cond where possible
828- broadcom/compiler: Merge instructions more efficiently
829
830Arno Messiaen (1):
831
832- lima/ppir: increase usage of pipeline regs
833
834Axel Davy (76):
835
836- st/nine: Reduce system memory allocated by D3DUSAGE_AUTOGENMIPMAP
837- st/nine: Do not allow depth buffer render targets
838- st/nine: Clamp GetAvailableTextureMem
839- st/nine: Unmap buffers after full unlock
840- st/nine: Track formats compatible with FETCH4
841- st/nine: Implement experimental FETCH4
842- st/nine: Enable DF24 support
843- st/nine: Add new debug and error checks
844- st/nine: Refactor ht_guid_delete
845- st/nine: Protect \*PrivateData also for Volumes
846- st/nine: Fix leak at device destruction
847- driconf: Rename csmt_int back to csmt_force
848- st/nine: Simplify checks for driconf options
849- st/nine: Add new function to know if we are the worker
850- st/nine: Add RAM memory manager for textures
851- st/nine: Use the texture memory helper
852- st/nine: Control the memfd virtual limit
853- st/nine: Add driconf option to limit texture memory
854- st/nine: Set default dynamic_texture_workaround to true
855- st/nine: Check memfd_create support
856- st/nine: Fix compilation issue in nine_debug
857- st/nine: Optimize EndScene
858- st/nine: Implement SYSTEMMEM buffers same as MANAGED
859- st/nine: Refactor DrawPrimitiveUp
860- st/nine: Optimize DrawPrimitiveUp
861- st/nine: Use correct bind flag at buffer creation
862- gallium/util: Add new u_box helpers
863- st/nine: Track pending MANAGED buffer uploads
864- st/nine: Optimize dynamic systemmem buffers
865- st/nine: Force DYNAMIC SYSTEMMEM for sw vertex processing
866- st/nine: Always use DYNAMIC with SYSTEMMEM
867- st/nine: Use stream_uploader for bad cases of systemmem
868- st/nine: detect worker threads syncs for systemmem
869- radeonsi: Limit the size of the in-memory shader cache
870- radeonsi: fix leak when the in-memory cache is full
871- st/nine: Disable fpu exceptions during init
872- st/nine: Fix crash on texture creation failure
873- st/nine: Fix cubetexture early destruction
874- st/nine: Add missing breaks
875- st/nine: Fix invalid NULL check
876- st/nine: Prevent use after free on dtor
877- st/nine: Fix reading invalid pointer
878- st/nine: Fix compilation warnings
879- st/nine: Fix read outside bounds for some textures
880- st/nine: Fix value of pipe_draw_info's max_index vertex
881- st/nine: Prevent negative reference count
882- st/nine: Improve Surface GetContainer
883- st/nine: Fix alpha to coverage states
884- st/nine: Enable multisampling also without depth buffer
885- st/nine: Handle D3DFMT_NULL multisampling
886- st/nine: Remove errors on unsupported lock flags
887- st/nine: Increase number of constants of vs1_sw
888- st/nine: Ignore swizzle on samplers
889- st/nine: Clamp max_anisotropy
890- st/nine: Refuse depth buffers as rendertargets
891- st/nine: Fix ps ff BLENDTEXTUREALPHA
892- st/nine: Fix ff has_aNrm computation
893- st/nine: Catch redundant scissor and viewport settings
894- st/nine: Pseudo implement set/getClipstatus
895- st/nine: Improve Reset on Ex devices
896- st/nine: Pseudo implement Create*Ex functions
897- st/nine: Complete \*Ex stubs
898- st/nine: Add logging to Ex function
899- st/nine: Have NOOVERWRITE win over DISCARD
900- st/nine: Do not memset buffers twice
901- st/nine: Add fallback for YUV formats
902- st/nine: Use PIPE_MAP_ONCE for persistent buffers
903- st/nine: Disable buffer_upload when csmt is off
904- st/nine: Allow to override the vram size
905- st/nine: Make it optional to use a sw renderer
906- st/nine: Lower texture_memory_limit default
907- st/nine: Bump num of backbuffers for tearfree thread_submit
908- st/nine: Improve performance with thread_submit
909- st/nine: Default thread_submit to true
910- st/nine: Default tearfree_discard to true
911- st/nine: Fix compilation error on non-x86 platforms
912
913Bas Nieuwenhuizen (87):
914
915- ac/surface: Fix GFX9 sparse mip info.
916- radv: Do not use a pipe offset for aliased sparse images.
917- radv: Add a trivial implementation of VK_KHR_deferred_host_operation
918- radv: Use stricter HW resolve swizzle compat check.
919- radv: Expose VK_KHR_workgroup_memory_explicit_layout.
920- radv: Do not hash vk_object_base in descriptor set layout.
921- amd/common: Add modifier size helper.
922- radv: Extract DCC format support handling.
923- radv: Use the surface offset from ac_surface instead of a plane offset.
924- radv: Don't relayout images with modifiers.
925- radv: Add format modifier format queries.
926- radv: Add drm format modifier queries.
927- radv: Add image layout with drm format modifiers.
928- radv: Enable DRM format modifiers on GFX9+.
929- radv: Enable modifiers with the WSI.
930- radv: Add modifier fails for CTS bug.
931- radv: Fix assert.
932- radv: Implement VK_KHR_zero_initialize_workgroup_memory.
933- radv: Improve spilling on discrete GPUs.
934- radv: Fix vram override with fully visible VRAM.
935- radv: Remove custom icd json generation.
936- radv: Define supported extensions in C.
937- radv: Ignore WC flags for VRAM.
938- radv: Determine swizzles correctly.
939- radv: Add plane width/height helpers.
940- radv: Use u_format helpers when possible.
941- radv: Remove VK_SWIZZLE_*.
942- radv: Do no use vk_format for getting divisors.
943- radv: Do not use generated table for plane formats.
944- radv: Stop checking for MULTIPLANE layout.
945- radv: Stop using plane_count.
946- radv: Only support format with a PIPE_FORMAT.
947- radv: Start using util_format_description for everything.
948- radv: Remove the format table.
949- radv: Remove vk_format_has_stencil/depth helpers.
950- radv: Properly handle modifier import failure.
951- radv: Do pipe misalignment check per plane.
952- radv: Don't use dedicated memory info to indicate sharing.
953- vulkan/device_select: Stop using device properties 2.
954- amd/common: constify ac_surface_set_umd_metata.
955- radv: Handle UMD metadata on import.
956- radv: Use shared code for setting opaque metadata.
957- amd/common: Add retile map size helper.
958- radv: Implement initialization of displayable DCC.
959- radv: Implement displayable DCC retiling.
960- radv: Add DCC info to the metadata.
961- radv: Use ac_surface DCC settings for shareable images.
962- radv: Enable displayable DCC.
963- radv: Disable displayable DCC for GFX8 properly.
964- ac/rgp: Only report double the prims per clock on GFX10.
965- radv: Expose robustBufferAccessUpdateAfterBind correctly.
966- frontends/va: Use correct size for secondary planes.
967- radv: Enable linear sampling for depth textures.
968- radv: Add sam option.
969- radv: Add nodisplaydcc option.
970- radv: Use correct DCC compressed block size for sampling.
971- radv: Dedupe winsyses per device.
972- radv: Allow extra planes for DCC.
973- radv: Enable sharing with DCC with modifiers.
974- radv: Ensure we never decompress or FCE read-only textures.
975- radv: Allow DCC for images with modifiers that are read-only.
976- radv: Use 8x8 meta compute workgroups.
977- radv: Enable DCC for image stores on GFX10.
978- radv: Only set WRITE_COMPRESS_ENABLE on supported HW.
979- vulkan: Fix descriptor set creation with zero bindings.
980- lavapipe: Free sorted descriptor array.
981- zink: Remove initialization of some arrays
982- zink: Only set the needed number of scissors.
983- radv: Flush caches for shader read operations.
984- nir: Fix shader calls with nir_opt_dead_write_vars.
985- nir: Extract shader_info->cs.shared_size out of union.
986- nir: Remove nir_shader->shared_size.
987- nir: Do not reset shared_size in nir_lower_io.
988- radv: Support DCC without a fast clear value.
989- radv: Support DCC without DCC/FCE predicates.
990- radv: Add retiling for foreign queues.
991- radv: Support DCC modifiers fully.
992- radv: Add clang-format for AMD code.
993- radv: Format.
994- radv: Update editorconfig.
995- radv: Re-enable retiling.
996- radv: Refactor cs_domain to be a winsys function.
997- radv: Use VRAM cmdbuffers in more situations.
998- radv/winsys: Remove use_local_bos
999- radv: Fix memory leak on descriptor pool reset with  layout_size=0.
1000- amd/common: Use cap to test kernel modifier support.
1001- radv: Only require DRM 3.23.
1002
1003Bastian Beranek (1):
1004
1005- glx: Assign unique serial number to GLXBadFBConfig error
1006
1007Ben Niu (1):
1008
1009- util: When building 'ARM64EC', don't use x64 intrinsics which need to be emulated
1010
1011Benjamin Tissoires (3):
1012
1013- CI: windows: augment the timeout of building the windows container
1014- CI: windows: split the layers to meet new registry requirements
1015- CI: windows: Force using LLVM 12
1016
1017BillKristiansen (1):
1018
1019- d3d12: fix for upside-down multisample stencil blit
1020
1021Boris Brezillon (91):
1022
1023- panfrost: Don't skip the test with a 4k shader
1024- panfrost: Fix tiler job injection (again)
1025- panfrost: Get rid of IS_BIFROST
1026- panfrost: Don't memset the last attribute buffer entry twice
1027- panfrost: Only allocate the extra attribute buffer entry on Bifrost
1028- panfrost: Set attribs and attrib_bufs to NULL when attrib_count = 0
1029- panfrost: Rename and move pan_render_condition_check()
1030- panfrost: Use dev->arch where appropriate
1031- panfrost: Add a panfrost_compile_shader() helper
1032- panfrost: Update ctx->batch when a fresh batch is requested
1033- panfrost: Fix a polygon list corruption in the multi-context case
1034- panfrost: Don't add the tiler BO when it's not accessed
1035- pan/bi: Add an is_terminal_block() helper
1036- pan/bi: Make sure we never branch to an non-existing clause
1037- pan/bi: Add uclz() support
1038- pan/bi: Support bit_count()
1039- panfrost: Use panfrost_get_shader_options() in panfrost_build_blit_shader()
1040- panfrost: Hide backend compiler internals
1041- panfrost: Prefix shader related helpers with pan_shader\_
1042- panfrost: Move sysval_to_id out of panfrost_sysvals
1043- panfrost: Keep the compiler inputs in the context
1044- panfrost: Move the shader compilation logic out of the gallium driver
1045- panfrost: Provide a helper to prepare the shader related parts of an RSD
1046- panfrost: Use the pan_shader_prepare_rsd() helper
1047- panfrost: Rename pan_blend.h into pan_blend_cso.h
1048- panfrost: Move the blend lowering code out of the gallium driver
1049- panfrost: Move the blend logic out of the gallium driver
1050- Revert "pan/bi: Optimize out redundant jumps to #0x0"
1051- pan/bi: Move int64 lowering before idiv lowering
1052- panfrost: Split the direct and indirect draw logic
1053- panfrost: Add a parameter to suppress next job prefetching
1054- panfrost: Allow passing an explicit global dependency when queuing a job
1055- panfrost: Add a pan_section_offset() helper
1056- panfrost: Move pan_special_varying definition to pan_encoder.h
1057- pan/bi: Extend the bi_builder to support type variants correctly
1058- panfrost: Add a knob to disable the UBO -> push constants optimization
1059- panfrost: Allow passing an explicit UBO index for the sysval UBO
1060- panfrost: Print the correct UBO size when dumping UBO information
1061- panfrost: Don't count the special vertex/instance ID attributes on Bifrost
1062- panfrost: Split the sampler and texture count
1063- panfrost: Expose panfrost_modifier_to_layout()
1064- pan/gen_pack: Parse alignment requirements
1065- panfrost: Specify descriptor alignment requirements
1066- panfrost: Provide various helpers to simplify descriptor allocation
1067- panfrost: Define the Surface and Surface-with-stride descriptors
1068- panfrost: Emit surface descriptors with pan_pack()
1069- panfrost: Use the descriptor allocators where appropriate
1070- panfrost: Get rid of panfrost_pool_alloc()
1071- panfrost: Move the blend shader cache at the device level
1072- panfrost: Use the blend shader cache attached to the device
1073- panfrost: Don't reserve space in the color buffer for disabled RTs
1074- panfrost: Skip disabled RTs when selecting a RT for transaction elimination
1075- panfrost: Stop including pan_device.h from pan_bo.h
1076- panfrost: Add helpers to support indirect draws
1077- panfrost: Prepare things for indirect draws
1078- panfrost: Hook up indirect draw support
1079- panfrost: s/panfrost_slice/pan_image_slice_layout/
1080- panfrost: Move image states out of pan_image_layout
1081- panfrost: Add a format field to pan_image_layout
1082- panfrost: Stop passing a depth > 1 when creating 2D textures
1083- panfrost: Add extra info to the pan_image_layout struct
1084- panfrost: Split pan_image in two
1085- panfrost: Add an offset field so we can attach a sub-buffer to an image
1086- panfrost: Move out-of-band CRC info to pan_image
1087- panfrost: Move special Z32_S8X24 case out of panfrost_setup_layout()
1088- panfrost: Add a pan_image_layout_init() helper
1089- panfrost: Patch the gallium driver to use pan_image_layout_init()
1090- panfrost: Pass an image view to panfrost_new_texture()
1091- panfrost: Provide a helper to calculate the polygon list size
1092- panfrost: Provide a helper to retrieve image surface pointers
1093- panfrost: Pass a const device to panfrost_sample_positions()
1094- pan/midg: Use the sampler index passed to the texture instruction
1095- panfrost: Add various helpers to simplify FB desc emission
1096- panfrost: Add an helper to emit fragment jobs
1097- panfrost: Add align info to the draw and draw padding definitions
1098- panfrost: Add the early ZS pre frame mode
1099- panfrost: s/pandecode_vertex_tiler_postfix_pre/pandecode_dcd/
1100- panfrost: Decode pre/post frame DCDs
1101- panfrost: Extend pan_fb_info to allow passing a tile enable map
1102- panfrost: Extend pan_fb_info to allow passing pre/post frame DCDs
1103- panfrost: Always pass a non-NULL screen to set_damage_region()
1104- panfrost: Create a blitter library to replace the existing preload helpers
1105- panfrost: Fix partial update
1106- panfrost: Use the generic preload and FB helpers in the gallium driver
1107- panfrost: Kill the old tile-buffer preload logic
1108- panfrost: Pass a tile enable map to avoid reloading untouched tiles
1109- panfrost: Fix pan_blitter_get_blit_shader()
1110- panfrost: Don't advertise AFBC mods when the format is not supported
1111- panfrost: Reserve thread storage descriptor in panfrost_launch_grid()
1112- panfrost: Fix indirect draws
1113- panfrost: Fix ZS reloading on Bifrost v6
1114
1115Boyuan Zhang (2):
1116
1117- frontend/va/image: add pipe flush for vlVaPutImage
1118- frontends/omx/h265: search entire dpb list
1119
1120Caio Marcelo de Oliveira Filho (43):
1121
1122- intel/fs: Separate SLM size calculation from encoding
1123- nir: Add a data pointer to the callback in nir_remove_dead_variables
1124- spirv: Don't remove variables used by resource indexing intrinsics
1125- nir/linking: Remove system_value handling from helper
1126- compiler: Use util/bitset.h for system_values_read
1127- ci: Add nouveau chipset 162 to shader-db runs
1128- vulkan: Update XML and headers to 1.2.168
1129- spirv: Update headers and metadata from latest Khronos commit
1130- nir: Two shared memory \*blocks* may alias each other
1131- spirv: Implement SPV_KHR_workgroup_memory_explicit_layout
1132- anv: Implement VK_KHR_workgroup_memory_explicit_layout
1133- spirv: Don't bother counting num_images/num_textures
1134- spirv: Don't remove dead variables in \`create_library` mode
1135- spirv: Store SPIR-V version of the module
1136- spirv: Refactor variable initializer code
1137- spirv: Recognize zero initializers in Workgroup variables
1138- nir: Add nir_zero_initialize_shared_memory
1139- anv: Implement VK_KHR_zero_initialize_workgroup_memory
1140- spirv: Fail when parsing invalid Initializers
1141- spirv: Use OpEntryPoint to identify valid I/O variables
1142- spirv: Count variables \*after* unused ones are removed
1143- spirv: Skip creating unused variables in SPIR-V >= 1.4
1144- spirv: Allow variable pointers pointing to an array of blocks
1145- intel/compiler: Use gl_varying_slot_name_for_stage()
1146- freedreno/ir3: Use gl_varying_slot_name_for_stage()
1147- etnaviv: Use gl_varying_slot_name_for_stage()
1148- st/atifs: Use gl_varying_slot_name_for_stage()
1149- compiler: Drop now unused gl_varying_slot_name()
1150- spirv: Reuse nir_is_per_vertex_io()
1151- spirv: Explicitly break when finished handling SpvDecorationBuiltIn
1152- spirv: Update a couple of comments in variable handling
1153- anv: Lower ViewIndex to zero when multiview is disabled
1154- spirv: Update headers and metadata from latest Khronos commit
1155- nir: Handle deref_atomic_fadd in a couple of passes
1156- intel/compiler: Make vue_map parameter const for brw_compile_fs
1157- intel/compiler: Use a struct for brw_compile_fs parameters
1158- intel/compiler: Use a struct for brw_compile_vs parameters
1159- intel/compiler: Refactor the shader INTEL_DEBUG checks
1160- intel/compiler: Make brw_postprocess_nir take debug_enabled as a parameter
1161- intel/compiler: Make vec4 generator take debug_enabled as a parameter
1162- intel/compiler: Make visitors take debug_enabled as a parameter
1163- intel/compiler: Use INTEL_DEBUG=blorp to dump blorp shaders
1164- intel/compiler: Use a struct for brw_compile_cs parameters
1165
1166Chad Versace (30):
1167
1168- anv/image: Replace bo_is_owned with from_gralloc (v2)
1169- anv/image: Rename anv_image_plane::surface -> primary_surface
1170- anv/image: Move vkGetImageMemoryRequirements
1171- anv/image: Drop duplicate 'format' in anv_image_create()
1172- anv/image: Fix interpretation of 'disjoint'
1173- anv/android: Fix size check for imported gralloc bo
1174- anv: Add anv_surface_is_valid()
1175- anv/image: Clean up anv_GetImageMemoryRequirements2
1176- anv: Refactor anv_image_get_compression_state_addr
1177- anv/image: Add anv_image_address()
1178- blorp/gen12: Don't use aux address if implicit CCS
1179- anv/image: Make memory layout more explicit
1180- vulkan: Track dependencies of Python imports
1181- anv/image: Simplify assertions in anv_image_from_swapchain()
1182- anv/image: Fix tiling if VkImageSwapchainCreateInfoKHR
1183- anv/image: In vkCreateDmaBufImageINTEL use modifiers
1184- anv/image: Check that anv_image is compatible with its modifier
1185- anv/image: Refactor check_memory_bindings()
1186- anv/image: Fix cleanup of failed image creation
1187- anv/image: Add ANV_IMAGE_MEMORY_BINDING_PRIVATE
1188- anv/image: Fix Vk*ImagePlaneMemory*Info for modifier images
1189- anv: Move assert in vkGetImageSubresourceLayout
1190- anv/image: Fix vkGetImageSubresourceLayout for modifier images
1191- anv: Implement image acquire/release of modifier images
1192- anv: Declare anv_layout_to_* as pure functions
1193- anv/image: Add 'offset' param to add_surface()
1194- anv/image: Support VkImageDrmFormatModifierExplicitCreateInfoEXT
1195- anv: Enable VK_EXT_image_drm_format_modifier
1196- anv: Remove vkCreateDmaBufINTEL (v4)
1197- anv: Drop unused anv_image_create_info::stride
1198
1199Charmaine Lee (1):
1200
1201- gallivm: increase size of texture target enum bitfield
1202
1203Chia-I Wu (38):
1204
1205- virgl: update headers
1206- virgl: add support for VIRGL_CAP_V2_UNTYPED_RESOURCE
1207- targets/libgl-xlib: add support for virgl
1208- virgl: update headers from virglrenderer
1209- venus: add driver skeleton
1210- venus: add generated venus-protocol headers
1211- venus: add experimental renderers
1212- venus: add a CS encoder/decoder
1213- venus: add a ring buffer
1214- venus: initial support for vkCreateInstance
1215- venus: initial support for VkPhysicalDevice commands
1216- venus: initial support for VkDevice commands
1217- venus: initial support for queue/fence/semaphore
1218- venus: initial support for VkDeviceMemory commands
1219- venus: initial support for buffers/images/samplers
1220- venus: initial support for descriptor sets
1221- venus: initial support for render pass and fb
1222- venus: initial support for events and queries
1223- venus: initial support for module and pipelines
1224- venus: initial support for command buffers
1225- venus: advertise extensions promoted to 1.1
1226- venus: advertise extensions promoted to 1.2
1227- venus: initial support for transform feedback
1228- venus: initial support for WSI
1229- venus: update venus-protocol headers
1230- venus: prepare for splitting vn_device.[ch]
1231- venus: split out vn_command_buffer.[ch]
1232- venus: split out vn_pipeline.[ch]
1233- venus: split out vn_query_pool.[ch]
1234- venus: split out vn_render_pass.[ch]
1235- venus: split out vn_descriptor_set.[ch]
1236- venus: split out vn_buffer.[ch]
1237- venus: split out vn_image.[ch]
1238- venus: split out vn_device_memory.[ch]
1239- venus: split out vn_queue.[ch]
1240- venus: include individual venus-protcol headers
1241- ci: enable venus in some meson build jobs
1242- venus: check vn_renderer_info::vk_xml_version
1243
1244Christian Gmeiner (26):
1245
1246- etnaviv: handle NULL views in set_sampler_views
1247- vc4: add drm-shim
1248- ci: Update baremetal kernel to 5.11 plus patches
1249- nir: add load_texture_rect_scaling
1250- nir: add has_txs flag
1251- nir/lower_tex: 'txs free' tex_rect lowering
1252- nir/lower_tex: wider usage of nir_tex_instr_src_index(..)
1253- gallium: add PIPE_CAP_TEXRECT
1254- gallium/st: lower rectangle textures if not supported
1255- ttn: lower rectangle textures if not supported
1256- etnaviv: nir: support nir_intrinsic_load_texture_rect_scaling
1257- etnaviv: let st lower rect tex
1258- vc4: let st lower rect tex
1259- etnaviv: nir: add ubo lowering pass
1260- etnaviv: use nir_lower_uniforms_to_ubo(..)
1261- etnaviv: fix etna_nir_lower_ubo_to_uniform pass
1262- etnaviv: extend lower ubo tests
1263- gallium: call util_cpu_detect()
1264- etnaviv: use nir_lower_idiv(..) before opt loop
1265- ci/bare-metal: fix fastboot
1266- etnaviv: etnaviv: put sampler limit determination into own function
1267- etnaviv: factor out TS state emitting
1268- etnaviv: add support for NTE
1269- etnaviv: rename struct members
1270- ci/bare-metal: no need to use tee
1271- etnaviv: tell the truth if alpha-test is supported
1272
1273Connor Abbott (61):
1274
1275- nir/lower_tex: Handle sized tex destination types
1276- freedreno/ir3: Handle sized tex destination types
1277- ntt: Handle sized tex destination types
1278- nir/lower_bool: Rewrite dest_type for boolean destinations
1279- brw/vec4: Don't convert tex dest type to glsl_type
1280- radv/meta: Use sized types for nir_tex_instr::dest_type
1281- v3dv/meta: Use sized types for nir_tex_instr::dest_type
1282- intel/blorp: Use sized types for nir_tex_instr::dest_type
1283- anv: Use sized types for nir_tex_instr::dest_type
1284- dxil: Use sized types for nir_tex_instr::dest_type
1285- panfrost/blit: Use sized types for nir_tex_instr::dest_type
1286- d3d12/blit: Use sized types for nir_tex_instr::dest_type
1287- nir: Use sized types for nir_tex_instr::dest_type
1288- st/mesa: Use sized types for nir_tex_instr::dest_type
1289- gallium/nir: Use sized types for nir_tex_instr::dest_type
1290- ttn: Use sized types for nir_tex_instr::dest_type
1291- st/atifs: Use sized types for nir_tex_instr::dest_type
1292- glsl/nir: Use sized types for nir_tex_instr::dest_type
1293- vtn: Use sized types for nir_tex_instr::dest_type
1294- ptn: Use sized types for nir_tex_instr::dest_type
1295- nir: Validate nir_tex_instr::dest_type bitsize
1296- nir/lower_tex: Assume that nir_tex_instr::dest_type is sized
1297- panfrost: Assume that nir_tex_instr::dest_type is sized
1298- ir3: Assume that nir_tex_instr::dest_type is sized
1299- ntt: Assume that nir_tex_instr::dest_type is sized
1300- freedreno/a6xx: Document threadsize-related fields
1301- freedreno/cffdec: Use rb trees for tracking buffers
1302- ir3/parser: Fix parsing of "0.0" in @const line
1303- freedreno/computerator: Fix example assembly
1304- ir3/parser: Support labels
1305- ir3/parser: Add ability to specify branchstack
1306- freedreno/computerator: Add branching example
1307- freedreno/computerator: Fix thrsz type
1308- freedreno/a6xx: Fix compute threadsize type
1309- freedreno/registers: Handle typed registers with fields
1310- freedreno/a6xx: Cleanup SP_XS_CTRL_REG0 definitions
1311- freedreno: Add local_size to ir3_shader_variant
1312- ir3: Calcuate max_waves and threadsize
1313- turnip: Use threadsize calculated by ir3
1314- freedreno: Use threadsize calculated by ir3
1315- freedreno/computerator: Use threadsize calculated by ir3
1316- freedreno: Report max_waves in shaderdb output
1317- freedreno/computerator: Add script for finding reg file size
1318- util/bitset: Avoid out-of-bounds reads
1319- freedreno/a3xx: Fix SP_FS_CTRL_REG1_INITIALOUTSTANDING
1320- ir3/legalize: Fix last input (ss) insertion
1321- ir3: Fix valid flags for STIB
1322- ir3/cp_postsched: Set address of uses for relative mov's
1323- ir3: Don't copy propagate arrays in ir3_cp
1324- ir3/postsched: Make sure to schedule inputs before kill
1325- vtn: Handle ZeroExtend/SignExtend image operands
1326- tu: Expose VK_KHR_spirv_1_4 and VK_EXT_scalar_block_layout
1327- tu: Expose VK_KHR_relaxed_block_layout
1328- ir3/sched: Don't penalize uses of already-waited tex/SFU
1329- ir3/sched: Don't schedule too many tex/SFU instructions
1330- ir3: Fix list corruption in legalize_block()
1331- tu: Correctly preserve old push descriptor contents
1332- ir3: Prevent oob writes to inputs/outputs array
1333- nir/lower_clip_disable: Fix store writemask
1334- tu: Fix SP_GS_PRIM_SIZE for large sizes
1335- ir3/postsched: Fix dependencies for a0.x/p0.x
1336
1337Corentin Noël (1):
1338
1339- ci: Use lavacli from master
1340
1341Daniel Schürmann (67):
1342
1343- aco: fix VOP3P assembly, VN and validation
1344- aco/RA: fix subdword operands on VOP3P instructions
1345- aco: allow constants/literals on every src position for VOP3P
1346- aco: allow SGPRs on every src position for VOP3P
1347- aco: change usesModifiers() considering opsel_hi on packed instructions
1348- aco: create helpers to emit vop3p instructions
1349- aco: emit packed 16bit instructions
1350- radv: vectorize 16bit instructions
1351- aco: simplify multiply-add combining
1352- aco: optimize packed mul+add to v_pk_fma_f16
1353- aco: optimize packed clamp
1354- aco: optimize packed fneg
1355- aco: optimize v_pk_fma_f16 -> v_pk_fmac_f16 on GFX10
1356- aco: propagate swizzles when optimizing packed clamp & fma
1357- aco: remove divergent branches which only jump over very few instructions
1358- aco/optimizer: don't copy-prop logical phis
1359- aco/optimizer: don't propagate subdword temps of different size
1360- aco: generalize subdword constant copy lowering
1361- aco/validate: validate that p_create_vector operands are aligned unless they are subdword operands
1362- aco/validate: ensure that Operand and Definition size matches for parallelcopies
1363- aco/validate: relax subdword restrictions
1364- aco: propagate temporaries into PSEUDO instructions if it can take it
1365- aco/optimizer: expand subdword vectors with SGPRs on all generations
1366- aco/optimizer: convert extract_vector with index 0 into parallelcopies if possible
1367- radv: don't vectorize shift operations
1368- aco: fix VCC hint on boolean subgroup operations
1369- aco: fix nir_intrinsic_ballot with wave32
1370- aco: fix shared VGPR allocation on RDNA2
1371- aco: change gpr_alloc_granule to full alignment
1372- aco: refactor GPR limit calculation
1373- aco: don't decrease the vgpr_limit when encountering bpermute
1374- aco: also consider VCC in get_reg_specified()
1375- aco: check get_reg_specified() on register hints
1376- aco: don't abort() if disassembly fails
1377- aco: use VCC as regular SGPR pair on GFX10
1378- aco: don't create unnecessary exec phi on merge blocks
1379- aco: handle non-temp phi definitions and operands
1380- aco: make all exec accesses non-temporaries
1381- aco: remove dead code for the handling of exec temporaries
1382- aco: fix assertion in insert_exec_mask pass
1383- nir: lower load_helper to is_helper if the shader uses demote()
1384- nir: lower is/load_helper to zero if no helper lanes are needed
1385- aco: remove special handling of load_helper_invocation
1386- aco: don't rematerialize exec
1387- aco: value number VOPC instructions with different exec masks
1388- aco/value_numbering: use can_eliminate() function to avoid unnecessary hashmap lookups
1389- aco/optimizer: set VCC hint on new v_cmp_* definitions
1390- aco/ra: allow VCC on SMEM sbase operand on GFX10+
1391- .mailmap: fix email for Daniel Schürmann
1392- aco/ra: split affinity creation into separate function
1393- aco/ra: split register_file initialization into separate function
1394- aco/ra: refactor SSA repairing during register allocation
1395- aco/ra: iterate backwards when coalescing phis
1396- aco/ra: allow m0 in get_reg_specified()
1397- aco/ra: remove exec handling for phis
1398- aco/spill: refactor spill decision taking
1399- aco/spill: reload spilled exec masks directly to exec
1400- aco/spill: spill phi constants and exec directly to VGPR
1401- aco/spill: don't count phis as variable access
1402- aco/spill: refactor some more spill decision taking
1403- aco/spill: refactor live-in registerDemand calculation
1404- aco/spill: use correct next_use_distances at loop header
1405- aco: lower p_spill with constants correctly
1406- aco: fix kill flags on phi operands
1407- aco: add new reindex_ssa() pass
1408- aco/cssa: rewrite lower_to_cssa pass
1409- aco/cssa: don't create parallelcopies for constants and exec
1410
1411Daniel Stone (3):
1412
1413- CI: Try really hard to get updated Windows TLS certs
1414- CI: Trigger Windows builds for llvmpipe & Vulkan changes
1415- CI: Change LAVA job visibility
1416
1417Danylo Piliaiev (50):
1418
1419- turnip/ir3: handle image load/stores produced by AtomicLoad/Store
1420- turnip: make GS use correct varyings size from previous stage
1421- ir3: add debug option to override shader assembly
1422- freedreno/ir3/parser: add cat7 support
1423- turnip: don't emit tess consts if they are not used
1424- freedreno: clamp scissor bounds
1425- freedreno/a2xx: fix scissors clamp bounds
1426- turnip: enable inheritedQueries
1427- turnip: consider HW limit on number of views when apply multipos opt
1428- turnip: consider tile_max_h when calculating tiling config
1429- turnip,freedreno/a6xx: tell hw the size of shared mem used by CS
1430- turnip/ir3: check for bindless IBOs in atomic dests fixup
1431- turnip: fix leak of tu_shader object during compute pipeline creation
1432- ir3: prevent duplication of instruction's dependencies
1433- ir3: make mark_kill_path exit early if instr is already seen
1434- ir3: disallow moving memory writes over discard
1435- freedreno/hw: fix populating branch targets in isa_decode pre-pass
1436- turnip: fix SP_HS_WAVE_INPUT_SIZE value
1437- freedreno/a5xx: port handling of PIPE_BUFFER textures from a6xx
1438- ir3: use OPC_GETBUF to get size of sampler buffers
1439- turnip: lower device index to zero
1440- turnip: fill VkMemoryDedicatedRequirements
1441- turnip: set zmode to A6XX_EARLY_Z if FS forces early fragment test
1442- turnip: implement intrinsic_vulkan_resource_reindex
1443- ci/freedreno: run freedreno jobs on any change in src/freedreno/
1444- ir3: fix oob access to regs array for getbuf,getinfo,rgetinfo
1445- ir3/isa,parser: fix encoding and parsing of bindless s2en SAM
1446- ir3: match mova1 mnemonic when writing to A1
1447- freedreno/isa: assert if field's range is out of bitset's range
1448- ir3: disallow .sat on SEL instructions
1449- ir3: update info about applicability of saturation modifier
1450- turnip: expose several already implemented extensions
1451- nir: add nir_shader_as_str function
1452- turnip: implement VK_KHR_pipeline_executable_properties
1453- turnip: clamp to zero negative upper left corner of viewport
1454- turnip,ir3: account for dispatch group offsets
1455- freedreno/a6xx: copy full 64bit of primitive counter
1456- freedreno/a6xx: fix primitive counters debug output
1457- ir3/isa: account for randomly set by blob lowest bit of ibo atomics
1458- glsl/linker: Fix attempts to split up 64bit varyings between slots
1459- glsl/linker: Fix xfb with explicit locations and 64bit types
1460- ir3: nir_op_f2f16 should round to even
1461- ir3: convert shift amount to 16b for 16b shifts
1462- turnip: enable infinities for f16 math and document the register
1463- turnip: enable VK_KHR_16bit_storage on A650
1464- turnip: handle format list for compressed formats
1465- docs: mark float_controls,float16_int8,16bit_storage as done on Turnip
1466- turnip: fix alignment of non-32b types in workgroup memory
1467- turnip: implement variableMultisampleRate
1468- turnip: support copying both aspects of D32_SFLOAT_S8_UINT
1469
1470Dave Airlie (163):
1471
1472- device-select-layer: update for vulkan 1.2
1473- lavapipe: fix missing piece of VK_KHR_get_physical_device_properties2
1474- vk-device-select: add device group support
1475- lavapipe: refactor image surface creation
1476- lavapipe: rewrite attachment clearing for conditional rendering.
1477- gallium: add a cond rendering hook for vulkan.
1478- llvmpipe: handle vulkan conditional rendering
1479- lavapipe: add VK_EXT_conditional_rendering support.
1480- CI: add lavapipe to llvmpipe rules.
1481- lavapipe: add support for external memory/fd/sempahore extensions
1482- llvmpipe: handle firstvertex for vulkan draw parameters
1483- lavapipe: handle shader draw parameters
1484- lavapipe: add missing loader interface negoitation
1485- lavapipe: move to subclassing instance/physical device.
1486- lavapipe: add missing wsi entrypoint.
1487- lavapipe: sort extensions in proper order.
1488- lavapipe: use common dispatch layer.
1489- radv: move queue object to a common base object
1490- radv: remove all entrypoint enabled debug option
1491- radv: move to subclassed instance/physical_device structs
1492- radv: port to using common dispatch code.
1493- zink: don't pick a cpu device ever.
1494- llvmpipe: add a mutex around debug resource tracking
1495- llvmpipe: fix use after free with fs variant cleanup
1496- lavapipe: reset shader constant buffers after execution
1497- glsl: fix leak in gl_nir_link_uniform_blocks
1498- llvmpipe: enable GL spir-v support
1499- util/format: add helper to check if a format is scaled.
1500- llvmpipe: don't support scaled formats outside vertex buffers
1501- lavapipe: add support for 2/10/10/10 scaled formats.
1502- lavapipe: add support for missing 10/10/10/2 formats.
1503- lavapipe: add reference counting to descriptor set layout
1504- lavapipe: avoid pointer to pipeline layout in execution
1505- lavapipe: set viewport state dirty on first execute
1506- lavapipe: implement physical device group enumeration
1507- lavapipe/meson: drop megadrivers build req
1508- lavapipe: fix some void ptr arithmetic
1509- lavapipe: use msvc compatible 0 init
1510- lavepipe: some misc msvc fixes
1511- lavapipe: make OPT macro MSVC compatible
1512- lavapipe: use os_time for timing related things
1513- vulkan/util: add api to reset object magic + private data.
1514- radv: reset object base on recycled command buffers
1515- tu: reset object base on recycled command buffers
1516- lavapipe: reset object base on recycled command buffers
1517- util: add optimised memset64
1518- u_surface: use optimised memset64
1519- llvmpipe: zs clear use 64-bit memset
1520- lavapipe: use clear interface for renderpass clears
1521- glx: proposed fix for setSwapInterval
1522- zink: use extensioned draw indirect functions.
1523- zink/ci: update results now that we are testing zink/lavapipe
1524- lavapipe: add calibrated timestamp support
1525- zink/ci: update results for GL 3.3 testing enables
1526- zink/ci: disable arb_timer_query tests
1527- lavapipe: use the common icd generator
1528- lavapipe: fix msvc initialiser
1529- lavapipe: add dll definition file instead of using PUBLIC
1530- lavapipe: fix icd generation for windows
1531- meson/llvm: add native for gallium swrast
1532- lavapipe: handle tessellation domain winding
1533- lavapipe: enable KHR_maintenance2
1534- lavapipe: enable KHR_maintenace3
1535- lavapipe: fix descriptor set layout freeing.
1536- lavapipe: fix depth texturing swizzle
1537- lavapipe: use null probe path on win32
1538- ci: try building lavapipe on windows
1539- zink/instance: work with vulkan 1.0 and later loader.
1540- lavapipe: expose a 1.0 vulkan API for now.
1541- lavapipe: Define supported extensions in C
1542- lavapipe: VK_EXT_extended_dynamic_state support
1543- lavapipe: reorder descriptor set stages to get correct binding
1544- lavapipe: sort bindings before creating descriptor set
1545- clover: fix array images view creation
1546- lavapipe: fix pipeline vp/scissor mixup.
1547- lavapipe: fix dynamic viewport/scissor pipeline emission
1548- draw: fix uses viewport index for tess eval shader
1549- draw/prim_assembler: write correct decomposed primitive lengths
1550- llvmpipe: add support for shader viewport layer
1551- lavapipe: enable EXT_shader_viewport_index_layer
1552- zink/ci: update results after layer extensions enabled in lavapipe
1553- util/panfrost/glsl: rename BITSET_LAST_BIT to BITSET_LAST_BIT_SIZED
1554- util/bitset: add a new last bit api
1555- shader_info: convert textures_used to a bitset.
1556- gallium: add a sampler reduction cap + settings
1557- gallium: add a view mask to the draw command
1558- gallivm: mark subpass input attachments as 2d arrays
1559- gallivm: add support for load_view_index intrinsic
1560- draw: add interface to notify renderer of the current view index
1561- draw: refactor out the instances drawing code
1562- draw: add view_mask rendering support
1563- draw: pass the view index to the render driver
1564- draw/vs: pass the view index to the vertex shader
1565- draw: add tess/gs support for multiview index
1566- llvmpipe: add the view index callback from draw
1567- llvmpipe: add view index support to rasterizer
1568- lavapipe: add clear support for multiview
1569- lavapipe: add draw support for multiview
1570- lavapipe: add input attachment support for multiview
1571- lavapipe: add render pass support for multiview
1572- lavapipe: enable KHR_multiview
1573- llvmpipe: add reduction mode support
1574- lavapipe: add EXT_sampler_filter_minmax support
1575- lavapipe: add support for VK_KHR_create_renderpass2
1576- lavapipe: move queue to base object
1577- lavapipe: move to the common casting interfaces
1578- lavapipe: move to common create render pass code
1579- lavapipe: add single ssbo variable pointer support.
1580- docs: update lavapipe features.txt
1581- lavapipe: enable KHR_uniform_buffer_standard_layout
1582- lavapipe: enable EXT_scalar_block_layout
1583- lavapipe: add missing break
1584- lavapipe: fix writing availability for queries.
1585- lavapipe: add host query reset
1586- gallivm: convert packing to uint64 not double
1587- lavapipe: only init immutable samplers for correct types.
1588- lavapipe: add support for KHR_buffer_device_address.
1589- lavapipe: bump maxMemoryAllocationCount
1590- lavapipe: fix image format properties
1591- lavapipe: add missing sampler minmax properties
1592- lavapipe: add missing device group api
1593- lavapipe: drop unused vk_format in image struct
1594- lavapipe: fix templated descriptor updates
1595- gallivm: fix non-32bit ubo loads
1596- gallivm/nir: handle bool registers.
1597- nir: port fp16 casting code from dxil
1598- nir: lower 64-bit floats to 32-bit first.
1599- gallivm: use fp16 casts lowering
1600- lavapipe: enable 8/16-bit storage extensions
1601- llvmpipe: fix cube image size query
1602- st/glthread: allow for invalid L3 cache id.
1603- util: rework AMD cpu L3 cache affinity code.
1604- gallivm: add 64-bit atomic support for ssbo/shared.
1605- gallivm: add 64-bit atomic global support
1606- lavapipe: enable KHR_shader_atomic_int64
1607- lavapipe: only reference pCounterBuffers if non-NULL
1608- lavapipe: fail out if spirv->nir fails
1609- lavapipe: fix only clearing depth or stencil paths.
1610- zink/ci: update results after lavapipe clear fixes
1611- lavapipe: add support for KHR_imageless_framebuffer
1612- drisw: move zink down the list below the sw drivers.
1613- zink/ci: handle getting correct drisw driver.
1614- llvmpipe: when depth clamp is disable clamp to 0.0/1.0
1615- llvmpipe: always take depth clamping from state tracker
1616- ci: update zink/virgl results for depth clamping fixes
1617- lavapipe: add vulkan 1.1 properties/features apis
1618- lavapipe: fix missing protected memory properties
1619- gallivm: add subgroup vote 64-bit and feq support.
1620- gallivm: move get_flt_bld to header.
1621- gallivm: add subgroup system values support
1622- gallivm: add subgroup elect intrinsic support.
1623- gallivm: add subgroup reduction + in/ex scan support
1624- gallivm: add subgroup ballot support
1625- gallivm: add subgroup read invocation support
1626- gallivm: add subgroup lowering support
1627- gallivm: add compute shader subgroup system values support
1628- lavapipe: enable subgroups features
1629- lavapipe: enable correct workgroup sizing
1630- lavapipe: enable Vulkan 1.1 support
1631- docs: update lavapipe bits for 1.1
1632- lavapipe: add vk1.1 image swapchain support
1633- lavapipe: add dummy sampler ycbcr conversion
1634- lavapipe: fix mipmapped resolves.
1635
1636David McFarland (1):
1637
1638- radv: fix divide by zero with no tesselation params
1639
1640Douglas Anderson (1):
1641
1642- gallium/indices: Use "__restrict" to help the compiler
1643
1644Drew Davenport (1):
1645
1646- radeonsi: Report multi-plane formats as unsupported
1647
1648Dylan Baker (33):
1649
1650- VERSION: bump for 21.1.0 cycle
1651- docs: add release notes for 20.3.3
1652- docs: Add sha256sum for 20.3.3
1653- docs: update calendar and link releases notes for 20.3.3
1654- docs: update calendar for 21.0.0-rc1
1655- bin/post_version: convert the csv.reader into a concrete list
1656- docs: add release notes for 20.3.4
1657- docs: Add sha256sum for 20.3.4
1658- docs: update calendar and link releases notes for 20.3.4
1659- docs: update calendar for 21.0.0-rc2
1660- docs: update calendar for 21.0.0-rc3
1661- Scons: check for timespec_get on windows as well as unices
1662- docs: Remove 21.0 features from features_new.txt
1663- docs: add release notes for 21.0.0
1664- docs: update calendar and link releases notes for 21.0.0
1665- docs: Add calendar entries for 21.0 release.
1666- docs: Extend calendar entries for 21.0 by 1 releases.
1667- docs: Add calendar entries for 21.1 release candidates.
1668- docs: add release notes for 20.3.5
1669- docs: Add hashes for 20.3.5
1670- docs: update calendar and link releases notes for 20.3.5
1671- docs: add release notes for 21.0.1
1672- docs: Add 21.0.1 hashes
1673- docs: update calendar and link releases notes for 21.0.1
1674- docs: add release notes for 21.0.2
1675- relnotes: Add sha256sum for 21.0.2
1676- docs: update calendar and link releases notes for 21.0.2
1677- meson: OpenMP is supposed to be optional
1678- .pick_status.json: Update to ee9b744cb5d1466960e78b1de44ad345590e348c
1679- VERSION: bump for 21.1.0-rc3
1680- .pick_status.json: Update to cbd6e5f2e592a9834a03004a473537f25aea4336
1681- .pick_status.json: Update to ede0b3c643279f4126fb10552a2f1d00be27f16d
1682- .pick_status.json: Update to b80720acb13e1014aea89e6bd25f22d43df85356
1683
1684Edward O'Callaghan (1):
1685
1686- clover: Implement CL_MEM_OBJECT_IMAGE1D
1687
1688Eleni Maria Stea (7):
1689
1690- anv: Added the VK_EXT_sample_locations extension to the anv_extensions list
1691- anv: Implement physical device properties for VK_EXT_sample_locations
1692- anv/state: Take explicit sample locations in emit helpers
1693- anv: Add support for sample locations
1694- anv: Removed unused header file
1695- anv: Enabled the VK_EXT_sample_locations extension
1696- iris: fix in fences backend for ext_external_objects edge case
1697
1698Ella-0 (1):
1699
1700- glsl: build without bison
1701
1702Enrico Galli (2):
1703
1704- microsoft/spirv_to_dxil: Add support for load_vulkan_descriptor
1705- microsoft/spirv_to_dxil: Use non-zero exit code on failed compilations
1706
1707Eric Anholt (207):
1708
1709- gallium/ttn: Add support for TGSI_OPCODE_I64NEG/ABS.
1710- gallium/ntt: Stop lowering integer source mods.
1711- gallium/tgsi: Assert that we don't see integer abs modifiers.
1712- gallium/tgsi: Remove support for f64 src modifiers.
1713- gallium/tgsi: Rewrite the docs on source modifiers.
1714- gallium/tgsi: garbage collect unused TGSI_UTIL_SIGN_MODE.
1715- mesa/st: Make a single helper for the NIR-to-TGSI transfer.
1716- mesa/st: Lower shader images before handing off to NIR-to-TGSI.
1717- mesa/st: Dump nir-to-tgsi output when ST_DEBUG=tgsi or nir is set.
1718- gallium/ntt: Don't vectorize IBFE/UBFE/BFI.
1719- gallium/ntt: Add support for store_per_vertex_output.
1720- gallium/ntt: Avoid referencing undefined channels of system values.
1721- ci/freedreno: Mark some a5xx separate_shader tests as xfails.
1722- ci/freedreno: Fix up the xfail/flake handling of a3xx texture functions.
1723- ci/freedreno: Remove a bunch of stale flakes from a3xx.
1724- ci/freedreno: Drop some long-unseen a5xx flakes.
1725- ci/freedreno: Drop skip list stuff from a5xx flakes.
1726- ci/freedreno: Remove some long-unseen a6xx known flakes.
1727- util/format: Fix pack/unpack of A1R5G5B5_UINT.
1728- mesa: Add some little unit tests showing format unpack behavior.
1729- mesa: Drop incorrect statement about Z unpack behavior.
1730- mesa: Replace the float[4] unpack code with util/format's.
1731- mesa: Make _mesa_unpack_rgba_block() use the u_format pack/unpack.
1732- util: Move most of src/mesa/main/format_utils.h to util/format/
1733- util: Fix UBSan failure on _mesa_unorm_to_unorm.
1734- util: Fix rounding of unpack_unorm8 from small unorm formats.
1735- mesa: Reuse util_format's unpack_8unorm.
1736- mesa: Reuse util_format's unpack functions for pure integer formats.
1737- util: Give a reasonable answer when unpacking z32unorm from floats.
1738- mesa: Use a bunch of util functions for Z/S unpacking.
1739- mesa: Move the rest of format_unpack.py out of code generation.
1740- util/format: Simplify the generated unpack code.
1741- swrast: Use util_format_write_4/4ub for the scattered pixel writes.
1742- mesa/main: Replace float pack function with util_format_pack_rgba().
1743- mesa/main: Replace the uint format packing code with util/format's.
1744- ci/piglit: Upgrade to a newer piglit in our containers.
1745- ci/freedreno: Fix xfail setup for sampler3d_float_vertex.
1746- nir_to_tgsi: Store directly to TGSI outputs when possible.
1747- r300,i915g: Report no shader buffers or images on non-TCL HW.
1748- nir_to_tgsi: Fix buffer overflow in atomic image compswap.
1749- swr: Don't report support for shader images.
1750- panfrost: Stub out set_shader_images().
1751- gallium: Fix leak of shader images on context destruction.
1752- util/bitset: Avoid dereferencing the bitset for size == 0.
1753- ci: Add a fractional deqp run of softpipe with asan enabled.
1754- freedreno/a6xx: Skip the body of emit_state if we're clean.
1755- freedreno: Move blend gmem checks to a blend dirty state check.
1756- freedreno: Move framebuffer state checks under a ctx->dirty flag.
1757- freedreno: Skip some batch dependency tracking if !ctx->dirty.
1758- ci/freedreno: Detect cheza HFI errors and restart the run.
1759- ci/freedreno: Ban vs-clip-vertex-enables which flakes in CI.
1760- ci/freedreno: Ban more flaky clip-enables tests.
1761- ci/freedreno: Make a630 piglit_shader run a manual run, too.
1762- freedreno: Use a real type instead of void * for the fd_batch->key.
1763- freedreno: Early-out from the resource write path when we're the writer.
1764- freedreno: Remove duplicate bc invalidate on flush_write_batch().
1765- ci: Update baremetal kernel to 5.11-rc5 plus patches.
1766- mesa/st: Allocate the gl_context with 16-byte alignment.
1767- ci/freedreno: Drop pointless GIT_STRATEGY setting for a630.
1768- ci/freedreno: Use the new nginx cache for trace downloads.
1769- ci/freedreno: Use the http cache for artifacts downloads, too.
1770- ci/docs: Update CI farm requirements suggestions.
1771- docs/ci: Document setting up the http cache for traces.
1772- ci/lava+baremetal: Add an xserver to the root fs.
1773- ci/freedreno: Do our piglit runs against Xorg.
1774- ci/freedreno: Add Valve games and other traces now that we have GLX.
1775- freedreno: Make sure that queries are disabled during shadow blits.
1776- freedreno: rename batch->active_providers to query_providers_used.
1777- freedreno: Backport a5xx/a6xx fix for active query handling.
1778- freedreno: Drop pointless clear of used providers.
1779- freedreno/a6xx: Skip guessing VSC size with indirect TF draw counts.
1780- docs: Document PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME.
1781- freedreno/a6xx: Add support for glDrawTransformFeedback().
1782- ci/deqp: Bump runner to 0.5.1 for recent runtime perf improvements.
1783- ci/freedreno: bump VK coverage to 1/4 of the CTS.
1784- ci/freedreno: Run a3xx gles3 in parallel and increase coverage.
1785- ci/virgl: Fix GLES31 testing on desktop GL.
1786- freedreno: Force updating active queries on batch reordering.
1787- freedreno: Remove FD_STAGE_* in favor of a "disable_all" flag.
1788- freedreno/a5xx: Don't forget to count our custom blits against queries.
1789- mesa/st: Always precompile the first shader variant.
1790- mesa/st: Assume that the default variant is always first in the list.
1791- vc4: Remove vestiges of alpha test lowering.
1792- vc4: Stop advertising support for PIPE_CAP_TWO_SIDED_COLOR.
1793- vc4: Stop advertising support for VS color clamping.
1794- v3d: Clean up vestiges of alpha test lowering.
1795- v3d: Stop advertising support for PIPE_CAP_TWO_SIDED_COLOR.
1796- v3d: Stop advertising support for PIPE_CAP_*_COLOR_CLAMPED.
1797- v3d: Stop advertising support for flat shading.
1798- gallium: Document behavior of more lowering pipe caps.
1799- freedreno: Add missing dep on freedreno tracepoints.
1800- gallium: Flip the default value of PIPE_CAP_SHAREABLE_SHADERS.
1801- virgl: Drop a context dependency from part of the shader compile path.
1802- v3d/qpu: Avoid leaking memory in the QPU disasm test.
1803- mesa/st: Make sure to unbind cb0 on transition away from gs/tess shaders.
1804- ci: Allow better customization of the name of the artifacts for minio.
1805- ci/freedreno: Add a fractional gles31 run with asan enabled.
1806- ci/freedreno: Drop the "arm64" in front of job names.
1807- ci: Move specific driver testing to separate files in separate dirs.
1808- ci/freedreno: Fix a5xx piglit runs.
1809- ci/freedreno: Remove stray BM_DTB definition.
1810- ci/bare-metal: Use an upstream kernel for db820c.
1811- ci/a5xx: Update the piglit expectations.
1812- ci/a5xx: Increase our dEQP GLES3 fraction by 4x.
1813- ci: Move the dEQP and traces expectations to the per-driver CI dirs.
1814- ci: Move the piglit expectations lists to the per-driver CI dirs.
1815- ci/zink: Add tests of gles2, gles3, and gl33 on lavapipe.
1816- zink: Use mesa_loge() for should-never-be-reached initialization errors.
1817- zink: Remove NULL checks after GET_PROC_ADDR_INSTANCE().
1818- softpipe: Fix the const buffer overflow check.
1819- mesa: Get the FXT1 compressor/decompressor off of GL types.
1820- mesa: Move the FXT1 compressor/decompressor to util/
1821- llvmpipe: Enable FXT1 texture decompression.
1822- v3d: Replace driver lowering of GL_CLAMP with mesa/st's.
1823- ci/piglit: Stop including the test counts at the end of expectations.
1824- ci/iris: Move the traces yml file to the driver-specific dir.
1825- mesa: Always make sure uniform storage doesn't get reallocated.
1826- freedreno: Remove uniform variables after finalizing NIR.
1827- freedreno: Drop custom driver lowering of two-sided color.
1828- freedreno: Drop custom driver lowering of GL's color clamping.
1829- freedreno: Use the mesa/st frontend lowering of GL_CLAMP.
1830- freedreno/a5xx+: Stop recompiling on texture samples changes.
1831- freedreno/a5xx+: Drop the unused no_decode_srgb flag.
1832- freedreno/a5xx: Fix cube image load/stores.
1833- nir: Add a nir_src_is_undef() helper, like nir_src_is_const().
1834- nir/vec_to_movs: Don't generate MOVs for undef channels.
1835- ci: Move deqp-default-skips.txt back to .gitlab-ci/
1836- ci/lava: Move the per-driver gitlab-ci.yml to each driver.
1837- ci/lava: Move the driver expectation files to the per-driver CI dir.
1838- tgsi_exec: Roll the loops for condmask handling.
1839- tgsi_exec: Jump over entirely non-taken THEN or ELSE branches.
1840- ci/freedreno: Also retest when only CI configuration changes.
1841- ci/freedreno: Switch the fastboot boards to using nfsroot.
1842- ci/a5xx: Run all of gles2 in one job.
1843- ci/a3xx: Run all of GLES3 dEQP.
1844- ci/a5xx: Increase the gles3/31 coverage.
1845- ci/a5xx: Update piglit expectations.
1846- ci/zink: Add another primitive restart flake.
1847- ci/turnip: Mark a flaky WSI test.
1848- lima: stop encoding the texture format in the shader key
1849- lima: don't look at dirty bits for setup of FS key
1850- lima: upload the shader to a BO at shader creation
1851- lima: avoid stomping over bound shader state when creating new shaders
1852- nir-to-tgsi: Fix handling of partial writemasks on SSA/REG decls.
1853- docs: Add some documentation of game GL buffer object mapping behavior.
1854- freedreno/a5xx: Introduce an event write helper like a6xx has.
1855- freedreno/a5xx: Flush depth at the end of sysmem, like a6xx does.
1856- ci/freedreno: Mark another a5xx TF flake.
1857- u_format: Mark the generated pack/unpack src/dst args as restrict.
1858- mesa/st: Unify st_get_vp_variant() and st_get_common_variant().
1859- mesa/st: Add perf debug for draw-time variant compiles.
1860- mesa/st: Fix precompile misses on compat GL VSes writing to color outputs.
1861- virgl: Update GLES expectations.
1862- ci/freedreno: Add three more a5xx flakes from the last day.
1863- freedreno/a5xx: Fix the texel buffer alignment requirement.
1864- freedreno/a5xx: Fix the max texture buffer size.
1865- ci/panfrost: Disable t860/radeonsi testing while the runners are struggling.
1866- ci: Bump deqp-runner to v0.6.3.
1867- ci/freedreno: Switch the piglit testing to the new piglit runner.
1868- ci/bare-metal: Restart a run on intermittent kernel lockups.
1869- ci/freedreno: Mark an a630 piglit flake from async shader compiling.
1870- ci/freedreno: Mark the rest of the glx_arb_sync_control@timing as flakes.
1871- nir_to_tgsi: Respect PIPE_SHADER_CAP_TGSI_SQRT_SUPPORTED.
1872- freedreno/a5xx: Fix stream-output binning handling.
1873- freedreno/ir3: Demote centroid usage to pixel on non-msaa.
1874- ci/freedreno: Mark all of dEQP TF as flaky.
1875- ci/bare-metal: Move the db820c lockup detect to the right boot script.
1876- ci/freedreno: Mark glx-swap-copy as a flake on a630.
1877- freedreno/a6xx: Rename the RB_BLIT_INFO.INTEGER field to SAMPLE_0.
1878- freedreno/a6xx: Disable sample averaging on depth/stencil resolves.
1879- freedreno: Move the ir3 linked shader cache to the context.
1880- freedreno/a3xx: Switch to using ir3_cache for looking up our VS/FS.
1881- freedreno/a4xx: Switch to using ir3_cache for looking up our VS/FS
1882- freedreno/a5xx: Switch to using ir3_cache for looking up our VS/FS
1883- turnip: Fix KGSL build since common dispatch rework.
1884- broadcom: Disbale CLIF dumping when libexpat isn't available.
1885- ci/android: Make sure we don't detect system libexpat.
1886- ci/android: Build the v3dv driver.
1887- iris: Flag for resolves when stencil enable changes, too.
1888- freedreno: Assert that TF prims generated can ignore active_queries.
1889- freedreno/ir3: Move max-tf-vtx calculation to a .c file
1890- freedreno: Move max-tf-vtx calculation to just the HW that needs it.
1891- freedreno: Move the primitives generated/written updates after the draw.
1892- freedreno: Don't count SW TF queries on a6xx.
1893- freedreno: Clamp TF prims written to buffer size pre-a6xx.
1894- ci/freedreno: Mark a630 as flaky on arb_draw_indirect-transform-feedback
1895- nir: Update clip_distance_array_size in clip lowering.
1896- freedreno/a6xx: Use the frontend userclip lowering.
1897- freedreno/a5xx: Add support for clip distances and use them for userclip.
1898- freedreno/a5xx: Use VALIDREG/CONDREG like a6xx do.
1899- ci/freedreno: Demote a630-asan to a manual test for now.
1900- ci: Drop the custom db820c kernel/dtb from the kernel+rootfs.
1901- ci/freedreno: Add more new traces for a630 (minetest, TDM, pioneer, glyphy).
1902- ci/freedreno: Rename a306-test and a530-test to drop "arm64" from the name.
1903- ci/freedreno: Add trace testing on a3xx, a5xx.
1904- freedreno/a5xx: Fix alpha test vs early Z bugs.
1905- freedreno/a6xx: Fix alpha tests.
1906- ci/freedreno: Switch to the trimmed glxgears trace.
1907- ci/freedreno: Fix up the a5xx border color flake annotation.
1908- ci: Uprev deqp runner to 0.6.5.
1909- ci: Uprev piglit to 6a4be9e9946d ("piglit: NOTE! Default branch is now main")
1910- ci: bump bare-metal kernel to bring in an a530 stability fix
1911- freedreno: Fix YUV sampler regression.
1912- nir_to_tgsi: Use ARL instead of UARL in the !native_integers case.
1913- nir: Generate load_ubo_vec4 directly for !PIPE_CAP_NATIVE_INTEGERS
1914- freedreno/a6xx: Don't try to do Z-as-RGBA blits for mismatched formats.
1915- ci/virgl: Mark a couple of new Crash tests as flakes.
1916
1917Eric Engestrom (12):
1918
1919- VERSION: bump to 21.1.0-rc1
1920- .pick_status.json: Update to c74d93cf0187e07cdfacc448a947a8cae485eb41
1921- .pick_status.json: Update to 95d9d811c91076d50385b2fbd330335b68688c69
1922- .pick_status.json: Update to fcb5ba58165cd407408f8dd9a102f0c5e16a9956
1923- VERSION: bump for 21.1.0-rc2
1924- .pick_status.json: Mark 8acf361db4190aa5f7c788019d1e42d1df031b81 as denominated
1925- .pick_status.json: Update to 35a28e038107410bb6a733c51cbd267aa79a4b20
1926- .pick_status.json: Update to 7e905bd00f32b4fa48689a8e6266b145662cfc48
1927- .pick_status.json: Update to 72eca47c660b6c6051be5a5a80660ae765ecbaa5
1928- .pick_status.json: Update to f3d2fade82c168a7ffffa4bd7bf22585c45c711b
1929- .pick_status.json: Update to f5d6a1b916fb163ee72e6a6f356937b1fbac53e0
1930- .pick_status.json: Update to 1d418e79b8a0f4270775277b7115b88ac4c77113
1931
1932Erico Nunes (15):
1933
1934- lima: introduce fs and vs shader cache
1935- lima/ppir: fix creation of mov node for non-ssa tex dest
1936- lima: set yuv formats as external_only
1937- lima: enable r and rg pixel formats again
1938- lima: always set stride in texture descriptor
1939- lima: implement GL_EXT_texture_swizzle
1940- docs/features: add lima features
1941- lima: fix max sampler views
1942- lima: run nir dce after nir_lower_vec_to_movs
1943- lima/ppir: remove liveness info from blocks
1944- lima/ppir: remove use of live_out
1945- lima/ppir: rework liveness data structures to bitset
1946- lima: fix half float render
1947- lima: enable rg formats for fp16 render
1948- lima: increase epsilon for depthrange near == far
1949
1950Erik Faye-Lund (158):
1951
1952- zink: handle NULL views in zink_set_sampler_views
1953- zink: fix vertex-stride wrangling
1954- docs: fix sphinx-warnings due to lacking escaping
1955- docs: fix broken link
1956- docs: turn non-code into comment
1957- docs/features: add missing features for zink
1958- docs/features: remove a few redundant zink mentions
1959- zink: always expose linear float textures
1960- zink: respect feature-cap for robust buffer access
1961- zink: respect feature-cap for independent blending
1962- zink: respect feature-cap for sample-shading
1963- zink: respect feature-cap for multi-draw indirect
1964- zink: check for extension instead of function
1965- zink: require vulkan memory model for tesselation
1966- zink: make all xfb caps depend on extension
1967- zink: respect fragment-shader depth-layout
1968- zink: clone shader before lowering clip_halfz
1969- docs/zink: add missing colon
1970- docs/zink: add two missing required features
1971- docs/zink: document the independentBlend requirement for GL3
1972- docs/zink: fix phrasing of GL 3.3 requirements
1973- docs/zink: add GL 4.0 requirements
1974- docs/zink: add GL 4.1 requirements
1975- docs/zink: add GL 4.2 requirements
1976- docs/features: mark off two more extensions for zink
1977- docs/zink: correct vk version for GL 4.2
1978- mesa/main: remove leftover bumpmap code
1979- compiler/nir: add texcoord replace lowering pass
1980- gallium/st: lower point-sprites if not supported
1981- zink: request texcoord replace lowering
1982- docs/features: mark ssbos as done for zink
1983- zink: remove stale TODO
1984- zink: be more careful about limits when unsupported
1985- zink: correct return-type for function
1986- zink: only emit SpvCapabilityDerivativeControl when needed
1987- zink: only emit cap when needed
1988- zik: correct spir-v caps for textures and images
1989- zink: do not insist shaders come from glsl
1990- zink: add a get_primitive_mode-helper
1991- zink: add a get_spacing-helper
1992- zink: refactor vertex-order emitting
1993- zink: wrap some long lines
1994- docs: fix invalid rst syntax
1995- zink: check for error when calling vkEnumeratePhysicalDevices
1996- zink: explicitly check for VK_NULL_HANDLE
1997- zink: support using lavapipe
1998- CI: always expose docs artifacts
1999- ci: make sure all lava-builders have libvulkan
2000- ci: run piglit on zink with lavapipe
2001- lavapipe: report correct value for minMemoryMapAlignment
2002- ci: document arm oddity in build-rules
2003- zink: correctly handle 64 valid timestamp bits
2004- zink: enable excluded test
2005- ci: enable max texture size tests for zink
2006- lavapipe: handle null-buffers for xfb
2007- ci: disable sporadically failing test
2008- zink: drop extra set of parens
2009- zink: do not use extra staging resource unless needed
2010- zink: don't always require linear display-targets
2011- zink: limit host-visible bind-flags
2012- zink: ignore irrelevant bind-flags
2013- zink: use gallium api to copy to display-target
2014- zink: add X32_S8X24 format
2015- zink: correct inaccurate comment
2016- lavapipe: fix primitive-restart for uint8 indices
2017- zink: fix emulation of no mipfilter
2018- zink: fix free of ralloced pointer
2019- gallium/st: fix shader_has_one_variant
2020- gallium/st: fix shader_has_one_variant
2021- gallium/st: reserve space in default uniform block for lowered constants
2022- docs: remove stray newline
2023- docs: remove excessive wrapping
2024- docs: remove excessive quoting
2025- docs: document zink GL 4.3 requirements
2026- docs: document zink GL 4.4 requirements
2027- docs: document zink GL 4.5 requirements
2028- docs: document zink GL 4.6 requirements
2029- docs: simplify format requirements
2030- zink: factor out interpolation to helper
2031- zink: emit all interpolation modes
2032- zink: check for pipeline statistics feature
2033- zink: check for depth-bias-clamp feature
2034- zink: check for stores and atomics features
2035- zink: add missing required feature
2036- zink: check for mirror-clamp extension
2037- zink: fix vector comparison
2038- zink: drop bool attempt in float vector compares
2039- zink: do not open-code vector-compares
2040- zink: follow spir-v 1.0 spec
2041- docs: Add 21.0.0 hashes
2042- zink: tighten emitted image spir-v caps
2043- zink: remove no-longer-needed clipdist1 patching
2044- frontends/va: correct check for invalid format
2045- zink: handle errors in nir_to_spirv
2046- zink: pre-populate locations in variables
2047- zink: do not depend on shader_slots_reserved for xfb
2048- zink: use pre-populated shader-locations
2049- lavapipe: report correct value for maxTexelBufferElements
2050- docs: do not try to copy missing file
2051- compiler/glsl: avoid null-pointer deref
2052- docs: remove bogus zink-requirement
2053- docs: remove zink incorrect requirement
2054- zink: do not enable unused extension
2055- docs: clarify VK_KHR_external_memory requirement
2056- zink: check base-requirements
2057- zink: assert that pstage is within range
2058- zink: simplify shader-removal
2059- zink: document why we're calling pipe_shader_type_from_mesa
2060- docs: appling -> applying
2061- docs: sytem -> system
2062- docs: ie. -> i.e.
2063- docs: vulkan -> Vulkan
2064- zink: do not request scoped memory barriers
2065- docs: optimisation -> optimization
2066- docs: opencl -> OpenCL
2067- docs: Xorg -> X.Org
2068- docs: nops -> NOPs
2069- docs: lod -> LOD
2070- docs: lex / yacc -> Lex / Yacc
2071- docs: dfsm -> DFSM
2072- docs: fix incorrect possessive form
2073- docs: fix invalid rst
2074- docs: fix rst-quoting issues in release-notes
2075- docs: spell out full name of gitlab instance
2076- docs: spell out development
2077- docs: spell out environment
2078- docs: spell out freedesktop.org
2079- docs: no-op'd -> disabled
2080- docs: fix release notes for 20.3.5
2081- ci: turn sphinx-build warnings into errors
2082- bin/gen_release_notes.py: more robust rST escaping
2083- compiler/glsl: correct the number of string-arguments
2084- compiler/glsl: fix volatile string
2085- compiler/glsl: clean up output
2086- glsl: fix is_integer_16_32
2087- glsl: fix int16 type
2088- glsl: tolerate int16 loop counters
2089- gallium/st: correct range for float16
2090- gallium/st: correct range for int16
2091- zink: document scalarBlockLayout requirement
2092- zink: fix typo in function name
2093- compiler/glsl: drop rogue argument to _mesa_glsl_error
2094- compiler/glsl: do not cast struct to string
2095- lavapipe: do not subtract 8 from enum
2096- lavapipe: check all vertex-stages
2097- lavapipe: check all graphics stages
2098- lavapipe: ask pipe-driver for int16 support
2099- zink: do not clear on cpu
2100- zink: fall back from cached to non-cached memory
2101- zink: do not dereference NULL pointer
2102- zink: verify that src/dst support blitting
2103- zink: verify that source-format support linear-filter
2104- zink: fix stencil-export cap emission
2105- gallivm: handle 16-bit input in i2b32
2106- zink: do not read outside of array
2107- zink: do not require vulkan memory model for shader-images
2108- zink: correct image cap checks
2109- zink: fix shader-image requirements
2110
2111Fan Yugang (1):
2112
2113- intel/tools: Show unknown instructions in decoded state.
2114
2115Francisco Jerez (9):
2116
2117- intel/gen12: Fix memory corruption issues in fused Gen12 parts.
2118- intel/genxml: Fix pixel hashing 3DSTATE_3D_MODE field definitions for Gen12 and Gen12.5.
2119- intel/genxml: Define 3DSTATE_SUBSLICE_HASH_TABLE command for Gen12 and Gen12.5.
2120- intel/dev: Implement pixel pipe subslice counting for Gen12+.
2121- iris/gen11+: Calculate pixel hashing tables instead of hardcoding.
2122- iris/gen12: Implement programming of pixel pipe hashing tables.
2123- anv/gen11+: Calculate pixel hashing tables instead of hardcoding.
2124- anv/gen12: Implement programming of pixel pipe hashing tables.
2125- iris/gen12: Work around push constant corruption on context switch.
2126
2127Georg Lehmann (1):
2128
2129- vulkan/device_select: Only call vkGetPhysicalDeviceProperties2 if the device supports it.
2130
2131Gert Wollny (89):
2132
2133- r600/nir: clone shader before first query to shader key
2134- r600/sfn: fix use of b32all/and
2135- r600: Add flags to INTERP_X and INTERP_Z two-slot ops
2136- r600/sb: Add support for INTERP_X and INTERP_Z ops
2137- r600/nir: pass array info to r600_shader for sb
2138- r600/sfn: update shader array info
2139- r600/sfn: Keep array registers alive for the whole shader
2140- r600/sb: fix boundary assert for mem-instruction decoding
2141- r600/sb: fall back to un-optimized byte code when ra_init fails
2142- r600: Enable sb also for NIR
2143- compiler/nir: Add support for lowering stores with nir_lower_instruction
2144- r600/sfn: Fix use of cnde_int for bcsel
2145- r600/sfn: Set unnormalized flag for z-coordinate when fetching from array
2146- r600/sfn: Add the position input as varying
2147- r600/sfn: Fix FS inputs when reading from the same position
2148- r600/sfn: Fix dual source blend lowered to FRAG_DATA
2149- r600/sfn: Use the constant buffer ID when given
2150- gallium/tgsi_to_nir: Handle SAMPLE_MASK output in FS
2151- gallium/tgsi-to-nir: Take property NUM_CLIPDIST_ENABLED into account
2152- r600/sfn: Handle memory_barrier_atomic_counters
2153- r600/sfn: Fix indirect_file flag for IMAGES
2154- r600/sfn: remove duplicate barriers
2155- r600/sfn: Base instr lowering class on nir_lower_instruction code
2156- nir: Add flag to tex instruction to indicate lowering cube to array
2157- nir: Add r600 specific CUBE opcode to evaluate cube texture coords and face
2158- r600/sfn: Add support for cube_r600 instruction
2159- r600/sfn: add lowering pass for cube textures
2160- r600/sfn: fix gather with cube lowering
2161- r600/sfn: use lower bool to int32 and lower int_tg4 only on shader clone
2162- r600/sfn: use lowering pass for cube textures
2163- r600/sfn: remove old cube texturing code
2164- r600/sfn: Lower FS inputs to temps late and, and lower interpolate at
2165- r600/sfn: set info about using helper_invocation to skip sb
2166- r600/sfn: lower isign and iabs in nir
2167- r600/sfn: Allow any channel for the helper invocation evaluation
2168- r600: unify nir shader options evaluation
2169- r600/sfn: remove code for nir_op_fsign since it is lowered
2170- r600/sfn: remove unused emit_alu_op2_split_src_mods
2171- r600/sfn: remove some old debug output
2172- r600/sfn: encode component in address for local IO
2173- nir: disaallow reordering for r600 shared load and remove component field
2174- r600/sfn: handle querying the number of layers in cube arrays
2175- r600/sfn: Fix loading TES gl_PatchVerticesIn
2176- r600: Don't optimize using source modifiers on literals
2177- r600: Enable GLSL 450 for nir shaders.
2178- r600/sfn: Update status
2179- nir: Add r600 specific intrinsic for loading the tesselation coords
2180- r600/sfn: lower intrinsic_load_tess_coord to driver version
2181- r600/sfn: eliminate loading unused component loads from shared memory
2182- virgl: implement support for  PIPE_CAP_STRING_MARKER
2183- r600/sfn: sort alu opcodes in switch statememt
2184- r600/sfn: remove unused code
2185- r600/sfn: fix buffer offset for ssbo writes
2186- r600/sfn: Fix including/not including c++ parts of header
2187- r600/sfn: lower bitfield_extract and bitfield_insert in NIR
2188- r600/sfn: lower idiv, imod, etc in nir
2189- r600/sfn remove some leftover debug output
2190- nir: add opcodes for \*find_msb_rev and lowering
2191- nir: Add opcodes for fused comp + csel and optimizations
2192- nir: Add r600 specific sin and cos variants
2193- r600/sfn: Add algebraic lowering for fsin and  fcos
2194- r600/sfn: optimize comp+csel using fused ops
2195- r600/sfn: lower find_msb variants to find_msb_rev
2196- r60/sfn: don't lower scomp
2197- r600: Handle negate of second operator in TGSI_OPCODE_UADD
2198- r600/sfn: Make some value pool functions private
2199- r600/sfn: Add skelton for visitor pattern
2200- r600/sfn: remove extra parameter from alu assemebly emission
2201- r600/sfn: fix some formatting
2202- r600/sfn: switch assembler creation to use visitor
2203- r600/sfn: drop the local register mep
2204- r600/sfn: lower VS IO and drop old deref code
2205- r600/sfn: lowered FS output IO
2206- r600/sfn: remove old deref code
2207- r600/sfn: force dual source blend output handling in some cases
2208- r600/sfn: remove find_msb lowering in driver
2209- r600/sfn: clean up multi-sample texture load
2210- r600/sfn: clean up value pool interface usage in emit_stream
2211- r600/sfn: use get_temp_vec4 directly when fetching
2212- r600/sfn: make allocate_temp_register private to valuepool
2213- r600: Fix texture buffer offset alignment
2214- r600: don't set an index_bias for indirect draw calls
2215- r600: Don't advertise support for scaled int16 vertex formats
2216- r600/sfn: allocate number of VS outputs based on max driver_location
2217- r600/sfn: Only fetch the constants that are needed in fdd*
2218- r600: Enable sb for nir only on specific request
2219- Revert "r600: don't set an index_bias for indirect draw calls"
2220- Revert "r600: Don't advertise support for scaled int16 vertex formats"
2221- r600: don't set an index_bias for indirect draw calls
2222
2223Giovanni Mascellani (2):
2224
2225- disk_cache: Fail creation when cannot inizialize queue.
2226- anv: Allow null handle in DestroyDescriptorUpdateTemplate.
2227
2228Greg V (1):
2229
2230- meson: Add missing lavapipe dep
2231
2232Gustavo Padovan (4):
2233
2234- gitlab-ci: extend x86_64 kernel config to suport Intel devices
2235- gitlab-ci: build the iris gallium driver as well
2236- gitlab-ci: add intel APL and GLK devices with manual triggers
2237- traces-iris: fix expectation for Intel GLK
2238
2239Hans-Kristian Arntzen (2):
2240
2241- radv: Take image alignment into account when allocating MUTABLE pool.
2242- radv: Allocate buffer list for MUTABLE descriptor types as well.
2243
2244Hoe Hao Cheng (19):
2245
2246- zink/codegen: add some new attributes to Extension
2247- zink/codegen: make 'struct' optional in Version
2248- zink/codegen: enable instance extension unconditionally if promoted
2249- zink/codegen: codegen-ize load_instance_extensions()
2250- zink/codegen: adding/fixing comments and copyright notice
2251- zink/codegen: find promotion version using vulkan registry
2252- zink: remove excessive checks for loader version
2253- zink: expose PIPE_CAP_ANISOTROPIC_FILTER
2254- zink: VK_KHR_draw_indirect_count is a device extension
2255- zink/codegen: introduce ExtensionRegistry
2256- zink/codegen: more validation in zink_instance
2257- zink/codegen: introduce notion of non-standard extensions
2258- zink/codegen: make zink_device_info accept vk.xml
2259- zink/codegen: perform basic validation in zink_device_info
2260- zink/codegen: validate has_properties and has_features
2261- zink/codegen: fix type annotations
2262- zink/codegen: do not enable extensions that are now core
2263- zink: enable KHR_shader_draw_parameters on Vulkan <1.2
2264- zink: fix detection of KHR_maintenance1/2
2265
2266Hyunjun Ko (5):
2267
2268- turnip: Return correct value of tu6_load_state_size
2269- nir: Set access at lower_ubo_vec4
2270- ir3: fix has_src() to return correctly in ir3_nir_lower_tex_prefetch
2271- ir3: Add nonuniform encodings to ir3 encoder and parser
2272- turnip: Enable nonuniform descriptor indexing
2273
2274Iago Toral Quiroga (93):
2275
2276- v3dv: only update uniforms for dirty descriptors if stage has descriptors
2277- v3dv: disable early Z writes if Z writes are disabled
2278- v3dv: don't wait for idle on occlusion query pool resets
2279- v3dv: use PIPE_TIMEOUT_INFINITE
2280- v3dv: refactor checks for subpass attachment clears
2281- v3dv: refactor checks for subpass attachment loading
2282- v3dv: refactor checks for subpass attachment stores
2283- v3dv: do not emit full tile buffers clears to handle Z/S clears
2284- v3dv: enable early Z/S clears
2285- v3dv: fix disabling Early Z for the whole frame
2286- broadcom/compiler: fix indentation with TABs
2287- broadcom/compiler: prepare TMU spilling code to account for TMU pipelining
2288- broadcom/compiler: implement pipelining for general TMU operations
2289- broadcom/compiler: support pipelining of tex instructions
2290- broadcom/compiler: refactor image load/store TMU emission code
2291- broadcom/compiler: support pipelining of image load/store instructions
2292- broadcom/compiler: disable TMU pipelining if we fail to register allocate
2293- broadcom/compiler: disallow spilling if TMU pipelining was enabled
2294- broadcom/compiler: log spilling shaders to perf output
2295- broadcom/compiler: let QPUs stall on TMU input/config overflows
2296- v3dv: handle D/S buffer to image copies with the texel buffer path
2297- v3dv: batch copies in the copy_buffer_to_image_blit path
2298- v3dv: allow a component swizzle in copy_buffer_to_image_shader
2299- v3d/common: use spaces instead of TABs
2300- v3dv: serialize pipeline compilation when debugging shaders
2301- v3dv: add a perf trace when a device is created with robust buffer access
2302- v3d/compiler: fix QPU scheduler TMU sequence shuffling
2303- broadcom/compiler: add V3D_QPU_WADDR_UNIFA
2304- broadcom/compiler: pass a devinfo to check if an instruction writes to TMU
2305- broadcom/compiler: name registers correctly based on V3D version
2306- broadcom/compiler: don't check for GFXH-1633 on V3D 4.2.x
2307- broadcom/compiler: add a helper to check if an instruction writes unifa
2308- broadcom/compiler: disallow unifa overlap with thread switch/end
2309- broadcom/compiler: preserve ordering of unifa/ldunifa sequences
2310- broadcom/compiler: ensure 3-slot delay between unifa and ldunifa
2311- broadcom/compiler: disallow reading two uniforms in the same instruction
2312- broadcom/compiler: do not DCE ldunifa
2313- broadcom/compiler: emit ldunifarf when needed
2314- broadcom/compiler: use unifa for UBO loads from uniform addresses
2315- broadcom/compiler: don't emit redundant ldunif
2316- broadcom/compiler: use a helper function to decide on TMU spilling
2317- broadcom/compiler: don't dump shader-db stats for failed shaders
2318- broadcom/compiler: fix ldunif optimization
2319- broadcom/compiler: allow dead code elimination of unused trailing ldunifa
2320- broadcom/compiler: remove unused leading ldunifa
2321- broadcom/compiler: add a constant alu optimization pass
2322- broadcom/compiler: skip unnecessary unifa writes
2323- broadcom/compiler: use nir_opt_sink
2324- v3dv: fix branching to large secondaries with more than one BCL buffer.
2325- broadcom/compiler: fix DAG pre-remove for merged instructions
2326- broadcom/compiler: fix indentation style
2327- broadcom/compiler: track pipelineable ldvary sequences
2328- broadcom/compiler: pipeline smooth ldvary sequences
2329- broadcom/compiler: allow pipelining of flat and noperspective varyings
2330- broadcom/compiler: ldvary pipelining tracking and documentation clean-ups
2331- broadcom/compiler: drop the destination for unused ldunifa
2332- broadcom/compiler: be more aggressive skipping unifa writes
2333- broadcom/compiler: always restart ldvary pipelining when scheduling ldvary
2334- broadcom/compiler: ldvary doesn't implicitly write to r3 since V3D 4.1
2335- broadcom/compiler: fix flags check for ldvary merge
2336- broadcom/compiler: add an additional sanity check assert to the ldvary fixup
2337- broadcom/compiler: move code block around
2338- broadcom/compiler: simplify ldvary pipelining
2339- broadcom/compiler: disallow ldunif during ldvary sequences if possible
2340- v3dv: call util_cpu_detect() when initializing the instance
2341- broadcom/compiler: flag wrtmuc with a read dependency on last_tmu_config
2342- broadcom/compiler: be more flexible scheduling TMU writes
2343- vulkan/util: call glsl_type_singleton_init_or_ref from vk_instance_init
2344- compiler/glsl: call util_cpu_detect from glsl_type_singleton_init_or_ref
2345- broadcom/compiler: fix end of tmu sequence detection
2346- broadcom/compiler: use nir_opt_load_store_vectorize
2347- broadcom/compiler: use nir_lower_wrmasks to simplify TMU general stores
2348- broadcom/compiler: handle implicit uniform loads when optimizing constant alu
2349- broadcom/compiler: optimize constant vfpack
2350- broadcom/compiler: use nir_lower_undef_to_zero
2351- v3dv/pipeline_cache: fix assert
2352- broadcom/compiler: convert add to mul when possible to allow merge
2353- broadcom/compiler: add a v3d_qpu_writes_accum helper
2354- broadcom/compiler: try to fill up delay slots after a thrsw
2355- broadcom/compiler: flag TMU read dependencies against last TMU config
2356- broadcom/compiler: flag TMU reads with a read dependency on last TMU config
2357- broadcom/compiler: dump instruction index when failing to pack instructions
2358- broadcom/compiler: add a NOP count stat to shader-db
2359- broadcom/compiler: try to fill up delay slots after a branch instruction
2360- broadcom/compiler: try to fill up delay slots after unconditional branch
2361- broadcom/compiler: implement restriction for branch after setmsf
2362- broadcom/compiler: optimize branch emission for uniform break/continue
2363- v3dv: fix index buffer binding
2364- broadcom/compiler: add a definition for the unifa skip distance
2365- broadcom/compiler: allow compilation strategies to limit minimum thread count
2366- broadcom/compiler: sort constant UBO loads by index and offset
2367- broadcom/compiler: rename unifa tracking fields
2368- v3dv: fix descriptor set limits
2369
2370Ian Romanick (33):
2371
2372- i965: Don't advertise OpenGL 3.3+ if driconf disables GL_ARB_blend_func_extended
2373- i965: Use allow_higher_compat_version option during screen initialization
2374- i965: Don't parse driconf again
2375- nir/algebraic: Fix a >> #b << #b for sizes other than 32-bit
2376- nir/algebraic: add patterns for a >> #b << #b and a << #b >> #b
2377- nir/algebraic: Partially revert 3f782cdd2591
2378- intel/eu/validate: Add some checks for CMP and CMPN
2379- intel/compiler: Enable the ability to emit CMPN instructions
2380- intel/compiler: Make the CMPN builder work like the CMP builder
2381- intel/compiler: Use CMPN for min / max on Gen4 and Gen5
2382- nir/algebraic: Fix some min/max of b2f replacements
2383- nir/algebraic: Remove some redundant b2f logic-op reduction patterns
2384- nir/algebraic: Add some max/min optimizations with 3 variables
2385- nir/range-analysis: C++ linkage
2386- nir/range_analysis: Handle vectors better in ssa_def_bits_used
2387- intel/compiler: Silence unused parameter warnings in files that include brw_eu.h
2388- intel: Silence unused parameter warnings in files that include gen_device_info.h
2389- intel: Silence unused parameter warnings in files that include genX_pack.h
2390- intel/compiler: Relax some conditions in try_copy_propagate
2391- gallium/dri: Remove dri2_format_mapping::cpp
2392- nir/search: Constify instruction parameter to search helpers
2393- nir/algebraic: Apply addition property of equality more conservatively
2394- nir/algebraic: Apply addition property of equality to the other ordering too
2395- nir/range_analysis: Refactor fsat handling
2396- nir/range_analysis: Add "is finite" range analysis tracking
2397- nir/range_analysis: Add "is a number" range analysis tracking
2398- nir/range_analysis: Fix analysis of fmin, fmax, or fsat with NaN source
2399- nir/search: Use range analysis for is_finite
2400- nir/range_analysis: Simplify analysis of bcsel
2401- mesa: Add anything dynamically indexed before any non-dynamically indexed
2402- mesa: Clean up _mesa_layout_parameters after previous commit
2403- tgsi_exec: Fix NaN behavior of saturate
2404- tgsi_exec: Fix NaN behavior of min and max
2405
2406Icecream95 (60):
2407
2408- pan/bi: Lower 64-bit integers
2409- pan/bi: Handle 64-bit pack and unpack operations
2410- pan/bi: Add some compute intrinsic loads
2411- pan/bi: Set compute lowering options
2412- pan/bi: Improve interoperability of the command-line disassembler
2413- pan/bi: Implement load/store intrinsics
2414- pan/bi: Implement load_kernel_input
2415- panfrost: Set bifrost_props for compute shaders
2416- pan/bi: Improve unknown intrinsic error
2417- panfrost: Use the correct NIR options for OpenCL on Bifrost
2418- pan/bi: Use pan_nir_lower_64bit_intrin
2419- panfrost: Add a sysval for local_group_size
2420- panfrost: Add a sysval for local_work_dim
2421- panfrost: Assert on sysval overflow
2422- pan/mdg: Limit int64 vectorization
2423- pan/mdg: Don't reorder loads/stores past each other
2424- pan/mdg: Allow 64-bit src_bitsize for comparison operations
2425- pan/bi: Add w0 to the 'h01' swizzle bucket
2426- pan/bi: Lower umul_high
2427- panfrost: Set TLS for compute jobs
2428- pan/bi: Implement saturated add/sub operations
2429- pan/bi: Implement ihadd/irhadd operations
2430- pan/bi: Implement packing ops between 32-bit vec1 and 16-bit vec2
2431- pan/mdg: Fix spilling when scratch memory is used
2432- pan/bi: Iterate from zero when setting RA interference
2433- panfrost: Add a function to determine if a resource is 2D
2434- panfrost: Only checksum resources when it makes sense to
2435- panfrost: Add a debug flag to disable checksumming
2436- panfrost: Transaction elimination support
2437- panfrost: Fix the tile size assertion
2438- pan/decode: Free mapped memory objects on BO unreference
2439- panfrost: Add support for INTEL_blackhole_render
2440- panfrost: Use normal malloc/free instead of ralloc for surfaces
2441- panfrost: Add the tiler heap to fragment jobs
2442- pan/bi: Return the size of the last clause from bi_pack
2443- pan/bi: Fix shader prefetch size
2444- panfrost: Fix clear color packing for 12-byte formats
2445- pan/bi: Don't check liveness unless the index is valid
2446- pan/bi: Use the correct size for UBO loads
2447- pan/bi: Remove check for first_ubo_is_default_ubo
2448- pan/bi: Implement image load/store
2449- pan/bi: nir_intrinsic_image_size support
2450- st/mesa: Update constants on alpha test change if it's lowered
2451- panfrost: Disable early-z when alpha test is used
2452- pan/mdg: Rename load/store operations
2453- pan/mdg: Use appropriate sizes for global loads/stores
2454- pipe-loader,gallium/drm: Fix the kmsro pipe_loader target
2455- pipe-loader: Stop trying to use kmsro for vgem
2456- panfrost: Implement panfrost_set_global_binding
2457- panfrost: Flush output after disassembling shaders
2458- panfrost: Only do point coord replacement for PIPE_PRIM_POINTS
2459- panfrost: Only add resource checksum BOs to the batch once
2460- panfrost: Align BO size to 4096 bytes
2461- panfrost: Add fast path for graphics work group computation
2462- panfrost: Unset shared/scanout binding flags for staging resources
2463- pan/bi: Skip nir_opt_move/sink for blend shaders
2464- panfrost: Fix shader texture count
2465- pan/decode: Allow frame shader DCDs to be in another BO than the FBD
2466- pan/mdg: Fix calculation of available work registers
2467- panfrost: Fix viewport scissor for preload draws
2468
2469Ilia Mirkin (55):
2470
2471- nv50/ir: ignore FS_BLEND_EQUATION_ADVANCED
2472- nv50,nvc0: explicitly list recently-added caps
2473- st/mesa: fix broken moves for u2i64 and related ops
2474- nv50/ir: clear dnz flag when converting mul/mad to simpler ops
2475- glsl: only expose int64 atomics when extension is enabled
2476- cso: set index_bounds_valid = true for arrays draws
2477- nvc0: index_bias is now only set for indexed draws
2478- nvc0/ir: add fixup to deal with interpolateAtSample with non-MSAA
2479- nv50,nvc0: clear internal vbo masks based on the trailing slots
2480- ci: remove nouveau from shader-db runs
2481- nouveau: reinstate fencing on screen destroy
2482- nv50: add PIPE_CAP_NIR_IMAGES_AS_DEREF to unsupported list
2483- nv50,nvc0: add scissored clear support
2484- st/mesa: do scissored clears on depth/stencil as well when supported
2485- i965: support GL_EXT_color_buffer_half_float
2486- mesa: fix conditions for fp16 render format eligibility
2487- mesa: fix fbo attachment size check for RBs, make it trigger in ES2
2488- mesa: add tracking of reduction mode
2489- st/mesa: add EXT_texture_filter_minmax support
2490- nvc0: enable minmax reductions on gm200+
2491- docs: add notes about nvc0 support of ARB/EXT_texture_filter_minmax
2492- mesa: only report INCOMPLETE_FORMATS for GLES1 / desktop
2493- gallium,st: add missing viewport swizzles
2494- nv50: initialize target for blit source surfaces
2495- nv50,nvc0: remove explicit target argument from view creation
2496- nv50: add appropriate space check before adding new pushbuffer
2497- nvc0: ensure sufficient push space for indirect data
2498- nvc0: fix reported driver queries for Pascal and later GPUs
2499- mesa: fix restoring of texture attributes for msaa binding points
2500- nv50: adapt texture and constbuf paths for compute shaders
2501- nv50: add resource tracking for shader images and buffers
2502- nv50: implement memory barrier handling
2503- nv50: add texture, constbuf, image, buffer validation
2504- nv50: pass in third axis via user param
2505- nv50/ir: retrieve (n)ctaid.z from first user param
2506- nv50/ir: force shared memory indirect to be an address
2507- nv50/ir: do not use inline offsets for global, ensure indirect access
2508- nv50/ir: fix emission of RED
2509- nv50/ir: lower buffer to global
2510- nv50/ir: fix emitting movs from imm to short registers
2511- nv50/ir: fix emission of cvt with half-reg destinations
2512- nv50/ir: fix emission of logic ops on half-regs
2513- nv50/ir: fix emission of shifts on half-regs
2514- nv50/ir: logic ops on half-regs can't take an immediate
2515- nv50/ir: add support for 16-bit immediates
2516- nv50/ir: fix emission of 16-bit add
2517- nv50/ir: fix emission of cas without a destination
2518- nv50: fix expression for ucp offset
2519- nv50/ir: avoid inlining results of a locked load
2520- nv50/ir: fix emission of ld/st lock/unlock
2521- st/mesa: adapt for the case where buffers are not supported in frag
2522- nv50/ir: fix texture size for msaa textures
2523- nv50: emulate indirect draws
2524- nv50/ir: fake SV_THREAD_KILL support
2525- nv50: enable ARB_framebuffer_no_attachments
2526
2527Italo Nicola (15):
2528
2529- panfrost: fix attribute continuation decoding
2530- panfrost: add 3d attribute buffer continuation to XML
2531- panfrost: decode 3d attribute continuation
2532- panfrost: add resource modifier conversion
2533- panfrost: implement gallium->set_shader_images
2534- panfrost: emit shader image attribute descriptors
2535- panfrost: implement image_size sysval
2536- pan/mdg: create nir pass to lower image coord bitsize
2537- pan/mdg: enable image bitsize lowering pass
2538- pan/mdg: add ld_image opcodes
2539- pan/mdg: rename st_image opcodes and add float16 versions
2540- pan/mdg: implement shader image instructions
2541- pan/mdg: implement nir_intrinsic_image_size
2542- panfrost: advertise images for midgard
2543- pan/mdg: prevent csel_v from being scheduled alongside writeout
2544
2545Iván Briano (4):
2546
2547- anv: don't advertise mipmaps for linear 3D surfaces on BDW
2548- anv: move buffer size alignment into helper function
2549- anv: use helper function to get the buffer size
2550- intel, anv: propagate robustness setting to nir_opt_load_store_vectorize
2551
2552James Jones (4):
2553
2554- nouveau: Stash supported sector layout in screen
2555- nouveau: Use DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D
2556- nouveau: no modifier != the invalid modifier
2557- nouveau: Use format modifiers in buffer allocation
2558
2559James Park (22):
2560
2561- radv: Use linker pragma to simulate weak functions
2562- radv: Remove unnecessary headers
2563- radv: Stub sections that don't have _WIN32 support
2564- radv: Modify radv_descriptor_set for MSVC legality
2565- radv: Pointer arithmetic on char/uint8_t, not void
2566- radv: Disable calibrated timestamps on Windows
2567- radv: Use typed outarray API
2568- radv: Fix struct initialization for MSVC
2569- gallium/tessellator: Fix warning suppression
2570- microsoft: Fix comma in variadic macro for MSVC
2571- ac: Remove unnecessary header
2572- radv: Use typed outarray API
2573- amd: Disable surface modifier test on Windows
2574- radv: Pointer arithmetic without void*
2575- radv: Update JSON generator if Windows
2576- vulkan: Use typed outarray API
2577- vulkan: Fix source list for vulkan_wsi on Windows
2578- vulkan: Update dispatch table gen for Windows
2579- vulkan/util: Use util_bitcount
2580- ac/rgp: BSD elf library compatibility
2581- amd: Hide amdgpu_drm.h on Windows
2582- amd: Hide drm_fourcc.h on Windows
2583
2584James Zhu (1):
2585
2586- amd: add Aldebaran chip enum
2587
2588Jan Beich (2):
2589
2590- ac: prefer system EM_AMDGPU definition
2591- ac/rgp: define EM_AMDGPU if missing for compatibility
2592
2593Jason Ekstrand (195):
2594
2595- intel/tools: Decode COMPUTE_WALKER
2596- intel/fs: Allow compute dispatch without a pushed subgroup ID on Gen12-HP
2597- anv: Add a general state pool
2598- intel/fs: Emit code for Gen12-HP indirect compute data
2599- anv: Enable push constants on gen12-hp
2600- intel/genxml,anv,iris: Drop the legacy compute path from gen125.xml
2601- anv: Add a trivial implementation of VK_KHR_deferred_host_operation
2602- anv: Exit early from cmd_buffer_apply_pipe_flushes
2603- anv: Take the set of stages to flush in flush_descriptor_sets
2604- anv: Only flush descriptors used by the pipeline
2605- anv: Early-exit from cmd_buffer_flush_state
2606- nir/lower_int64: Add a level of wrapper functions
2607- nir/lower_int64: Fix lowering of f2[ui]64 for 16-bit float
2608- nir/lower_int64: Add lowering for some 64-bit subgroup ops
2609- nir/lower_int64: Add lowering for 64-bit iadd shuffle/reduce
2610- nir/lower_int64: Lower 64-bit vote_ieq
2611- intel/compiler: Return 1 for immediates in regs_read
2612- intel/compiler: Move brw_reg_type_for_bit_size to brw_reg_type.h
2613- intel/reg,fs: Handle immediates properly in subscript()
2614- intel/fs: QUAD_SWIZZLE requires packed data
2615- intel/fs: Support 64-bit SEL_EXEC on Gen11+
2616- intel/fs: Support 64-bit SHUFFLE on Gen11+
2617- intel/fs: Support 64-bit CLUSTER_BROADCAST on Gen11+
2618- intel/fs: Properly lower 64-bit MUL on 64-bit-incapable platforms
2619- intel/fs: Refactor our shuffle emit code
2620- intel/fs: Implement umin/umax shuffle
2621- anv: Advertise shaderInt64 on Gen11+
2622- anv: Break SAMPLE_PATTERN and MULTISAMPLE emit into helpers
2623- intel/fs: Add an ex_desc field to fs_inst for SHADER_OPCODE_SEND
2624- anv: Drop anv_dump
2625- anv: Fix an old parameter name in GetDeviceQueue
2626- anv: Refactor anv_queue_finish()
2627- anv: Add an anv_queue_family struct
2628- nir/from_ssa: Respect and populate divergence information
2629- vulkan/meson: Add missing dependencise on vk_extensions_gen.py
2630- anv: Clean up the vk_device on the CreateDevice error path
2631- radv: Properly clean up vk_device
2632- turnip: Properly clean up vk_device
2633- v3dv: Properly clean up vk_device
2634- lavapipe: Properly clean up vk_device
2635- vulkan: Move vk_device to its own file
2636- vulkan: Add a return code to vk_device_init
2637- vulkan: Add common extension tables
2638- anv: Use the common extension table struct
2639- vulkan: Add common dispatch table generation
2640- vulkan: Add dispatch table loading helpers
2641- vulkan-overlay-layer: Use the new dispatch tables
2642- vulkan: Add dispatch table lookup helpers
2643- vulkan: Add common instance and physical device structs
2644- vulkan: Add generators for instance trampoline functions
2645- vulkan: Add entrypoint tables and related helpers
2646- vulkan: Add common Get*ProcAddr implementations
2647- vulkan: Add a common entrypoint table generator
2648- anv: Add a single anv_genX switch macro
2649- anv: Use the common dispatch framework
2650- vulkan: Add framework for common entrypoints
2651- vulkan,anv: Move GetDeviceProcAddr to common code
2652- vulkan,anv: Add common entrypoints for VK_EXT_private_data
2653- anv: Make Get*MemoryRequirements a wrapper
2654- vulkan,anv: Move a bunch of trivial wrappers to common code
2655- vulkan,anv: Move VK_KHR_copy_commands2 wrappers to common code
2656- vulkan: Add a truly common VK_EXT_debug_report implementation
2657- anv: Switch to the common VK_EXT_debug_report
2658- turnip: Use the common dispatch framework
2659- turnip: Use common entrypoints for VK_EXT_private_data
2660- turnip: Drop some legacy wrappers in favor of common code
2661- turnip: Switch to the common VK_EXT_debug_report
2662- lavapipe: Drop some wrappers in favor of common code
2663- v3dv: Drop v3dv_instance::app_info
2664- v3dv: Use common entrypoints for VK_EXT_private_data
2665- v3dv: Switch to the common VK_EXT_debug_report
2666- radv: Use common entrypoints for VK_EXT_private_data
2667- radv: Switch to the common VK_EXT_debug_report
2668- vulkan: Make vk_debug_report_callback derive from vk_object_base
2669- anv: Use vk_object_base::type for debug_report
2670- vulkan: Use vk_object_base::type for debug_report
2671- vulkan: Make the debug_report implementation internal
2672- anv,radv: Use common entrypoints for VK_KHR_deferred_operation
2673- vulkan: Rework vk_device_init and friends
2674- vulkan: Drop the type_prefix parameter from gen_extensions
2675- nir: Add some ssa-only fast-paths for nir_src rewrite
2676- nir: Drop the lower_mem_constant_vars declaration
2677- vulkan: Add a common helper for enumerating instance extension properties
2678- vulkan: Rework extension disabling on Android
2679- anv: Pull the patch version from the XML
2680- anv: Make anv_icd.py more generic and independent
2681- anv,vulkan: Move anv_icd.py to a common location
2682- anv: Move extension tables to anv_device.c
2683- anv: Add fake graphics-only and compute-only queue families
2684- nir: Add a couple helpers for phis and cursors
2685- nir/lower_bit_size: Support phi instructions
2686- intel/nir: Lower 8-bit phis on Gen11+
2687- nir: Add some range analysis for used bits
2688- nir/algebraic: Clean up up-cast of down-cast when we can
2689- nir/algebraic: Covert up-cast of down-cast to extract on Intel
2690- spirv: Store the nir_function in vtn_function
2691- spirv: Delete the impl for prototype-only functions
2692- nir: Don't optimize bcsel-of-shuffle across blocks
2693- nir: Fix parameter order in the bcsel-of-shuffle optimization
2694- nir/opt_large_constants: Handle generic pointers
2695- intel/fs: Shuffle can't handle source modifiers
2696- anv/formats: Advertise linear sampling on depth formats
2697- anv/android: Re-implement AcquireImageANDROID
2698- intel/mi_builder: Create a context in the tests
2699- intel/mi_builder: Delete a bogus comment
2700- intel/mi_builder: Fix a misleading comment
2701- intel/mi_builder: Short-circuit shifts in more cases
2702- intel/mi_builder: Add constant folding
2703- intel/mi_builder: Rewrite unit tests in terms of constant folding
2704- intel/mi_builder: Add tests for gen_mi_z and gen_mi_nz
2705- intel: Rename gen_mi_builder.h to mi_builder.h
2706- intel/mi_builder: Drop the gen\_ prefix
2707- intel/mi_builder: Use AddCSMMIOStartOffset for LRI
2708- intel/mi_builder: Add ieq/ine helpers
2709- intel/mi_builder: Support inverted values in mi_store
2710- intel/mi_builder: Add load/store_offest on GFX 12.5+
2711- genxml: Clean up MI_SET_PREDICATE
2712- intel/batch_decoder: Don't follow predicated MI_BATCH_BUFFER_START
2713- intel/mi_builder: Use softpin for tests on gen8+
2714- intel/mi_builder: Return an address from __gen_get_batch_address
2715- intel/mi_builder: Add control-flow support
2716- nir: Add and use a new nir_ssa_def_rewrite_uses_src helper
2717- nir: Make nir_ssa_def_rewrite_uses take an SSA value
2718- nir: Make nir_ssa_def_rewrite_uses_after take an SSA value
2719- intel/mi_builder: Fix some indentation
2720- intel/mi_builder: Fix a couple of #ifs
2721- anv: Drop anv_extensions.py
2722- turnip: Move the CreateRenderPass wrapper to common code
2723- anv: Move multialloc to common code
2724- vulkan: Use VK_MULTIALLOC in CreateRenderPass
2725- anv: Move vk_format helpers to common code
2726- vulkan: Use correct aspectMask in CreateRenderPass
2727- vulkan: Add some asserts and checks for multiview in CreateRenderPass
2728- vulkan: Preserve preserve attachments in CreateRenderPass
2729- anv: Drop CreateRenderPass
2730- radv/meta: Use CreateRenderPass2
2731- radv: Drop CreateRenderPass
2732- intel/fs: Use INTEL_MASK for pushish constant address masking
2733- intel/fs: Handle payload node interference in destinations
2734- vulkan: Use ALWAYS_INLINE for multialloc
2735- vk/alloc: Handle zero sizes better in vk_multialloc_add
2736- vulkan/alloc: Add VK_MULTIALLOC_DECL macros
2737- vulkan/util: Add a type parameter to vk_multialloc_add
2738- vulkan/alloc: Use char * for pointer arithmetic
2739- anv,genxml: Handle L3SQCREG1_SQGHPCI in GenXML
2740- anv: Add an anv_batch_write_reg macro
2741- iris: Add an iris_write_reg macro
2742- genxml: Make 1-bit L3$ config register fields bool on Gen7
2743- intel/fs,rt: Add a predicate to load_global_const_block
2744- anv: Use load_global_constant for shader constants
2745- anv: Use nir_shader_instructions_pass in apply_pipeline_layout
2746- anv/apply_pipeline_layout: Refactor descriptor chasing code
2747- anv/apply_pipeline_layout: Rework the early pass index/offset helpers
2748- anv/apply_pipeline_layout: Lower UBO loads in the early pass
2749- anv/apply_pipeline_layout: Run DCE between the early and late passes
2750- anv/apply_pipeline_layout: Move bounds checking later for index/offset
2751- anv/apply_pipeline_layout: Plumb through a UBO address format
2752- anv/apply_pipeline_layout: Add some switch statements
2753- nir: Add a new 64+32-bit address format
2754- anv: Use 64bit_global_32bit_offset for SSBOs
2755- anv: Rework the 64bit_bounded_global resource index format
2756- anv: Zero out the last dword of UBO/SSBO descriptors in the shader
2757- anv/apply_pipeline_layout: Apply dynamic offsets in load_ssbo_descriptor
2758- anv/apply_pipeline_layout: Refactor all our descriptor address builders
2759- anv/apply_pipeline_layout: Rework the desc_addr_format helper
2760- anv/apply_pipeline_layout: Use the new helpers for early lowering
2761- anv/apply_pipeline_layout: Use the new helpers for images
2762- nir/lower_io: Support global addresses for UBOs in nir_lower_explicit_io
2763- anv: Add a pass for lowering A64 UBO access
2764- anv: Do UBO loads with global addresses for bindless
2765- anv/apply_pipeline_layout: Add support for A64 descriptor access
2766- nir: Add image atomic_fmin/fmax intrinsics
2767- spirv: Add support for SPV_EXT_shader_atomic_float_min_max
2768- intel/fs: Add support for 16-bit A64 float and integer atomics
2769- intel/genxml: Binding table pointers are 15 bits on GFX version 12.5+
2770- intel/tools: Handle milti-LRI in the batch decoder
2771- intel/tools: Handle GT_MODE in the batch decoder
2772- intel/genxml: Make BindingTablePoolEnable a bool
2773- intel/tools: Handle BINDING_TABLE_POOL_ALLOC in batch decoding
2774- anv: Align inline uniform data to ANV_UBO_ALIGNMENT
2775- anv: Implement VK_EXT_conservative_rasterization
2776- anv: Fix coverage masks for VK_EXT_conservative_rasterization
2777- intel: Drop gen_device_info::has_resource_streamer
2778- anv: Clean up anv_device_memory::base on failure
2779- anv: Refactor framebuffer creation
2780- anv: Clean up anv_descriptor_pool::base on the error path
2781- anv: Clean up anv_semaphore::base on the error path
2782- vulkan: Add a vk_object_multialloc helper
2783- anv: Use vk_object_alloc/free
2784- anv: Make memory type and queue family pointers const
2785- intel: fix querying mip levels on null surfaces on SKL and prior
2786- intel/compiler: Don't insert barriers for NULL sources
2787- anv: Use the same re-order mode for streamout as for GS
2788- intel/isl: Fix isl_color_value_unpack to match the prototype
2789- intel/nir: Set lower txs with non-zero LOD
2790
2791Jeremy Huddleston (5):
2792
2793- darwin: Use the system libexpat
2794- util: Fix pointer to integer conversion error when using libunwind
2795- darwin: Use the system libunwind
2796- Fall back on clock_gettime when timespec_get() is unavailable
2797- Adjust dylib compatibility versions to match what was set by mesa-18.3's autotools-based builds
2798
2799Jesse Natalie (76):
2800
2801- nir: Work around MSVC x86 internal compiler error
2802- main: Undefine MemoryBarrier for Windows
2803- glapi: Undefine MemoryBarrier
2804- mapi: Undefine MemoryBarrier
2805- drisw: Disable automatic use of layered drivers with LIBGL_ALWAYS_SOFTWARE
2806- wgl: Refactor screen creation to a function
2807- wgl: Add a loop for screen creation with an ordered list of fallbacks
2808- d3d12: Fail screen creation if a shader validator is needed and can't be created
2809- wgl: Disable automatic use of layered drivers with LIBGL_ALWAYS_SOFTWARE
2810- CI: Use a sha for the Windows SPIRV-LLVM-Translator dependency
2811- microsoft/clc: Add -fgnu89-inline to clang args
2812- microsoft/clc: Add test with inline function
2813- clover: Add -fgnu89-inline to Clang command line
2814- microsoft/clc: Only apply float scaling to 32bit fdiv
2815- microsoft/clc: Let lower_vars_to_explicit_types fill kernel input driver_location
2816- microsoft/clc: Fix wrap modes for inline samplers for integer textures
2817- microsoft/clc: Move inline samplers to the end of the variable list
2818- microsoft/clc: Use driver_location for metadata instead of re-computing offsets
2819- microsoft/clc: Re-order dead variable removal after uniform vars_to_explicit_types
2820- microsoft/clc: Add a test with an unused kernel arg
2821- glapi: Support "ELF" TLS on Windows
2822- docs: Document USE_ELF_TLS can work on Windows too
2823- meson/gallium: Add an option to not use LLVM for gallium draw module
2824- d3d12: Handle null constant buffers
2825- nir: Add a nir_after_instr_and_phis helper
2826- microsoft/compiler: Don't separate phis while inserting upcasts
2827- d3d12: Move descriptor pools to screen, and add lock
2828- d3d12: Handle is_new_style_shadow comparison filtering
2829- d3d12: Really handle null constant buffers
2830- u_format: Add restrict to fn pointer and manual format pack/unpack/fetch
2831- panfrost: Add a Meson dependency on bi_opcodes.h for bifrost_compiler
2832- meson, util: Make zlib optional again
2833- nir: Temporarily disable optimizations for MSVC ARM64
2834- wgl: Fix wglCreatePbufferARB pixel format lookup
2835- d3d12: Use ID3D12Device9::CreateCommandQueue1 when available
2836- d3d12: Use CreateDXGIFactory2 and use the debug flag when appropriate
2837- wgl: Add unit test infrastructure for OpenGL32.dll on Windows
2838- wgl: Add a context to framebuffer destruction
2839- d3d12: Add a constant for num_buffers
2840- d3d12: Clean up swapchains on framebuffer destruction
2841- wgl, d3d12: Add a d3d12-specific test for swapchain leaks
2842- microsoft/compiler: Move blob_init earlier to prevent crash on failure
2843- microsoft/compiler: Add copy_prop_vars to optimization loop
2844- microsoft/compiler: Add a lowering pass to split clip/cull distance compact arrays
2845- microsoft/compiler: Enable dxil_nir.h to be included from C++
2846- microsoft/compiler: Support compact arrays for clip/cull in nir_to_dxil
2847- d3d12: Use compact arrays for clip/cull distance
2848- microsoft/spirv_to_dxil: Implement TODO for removing dead functions
2849- spirv_to_dxil: Handle clip/cull distance
2850- microsoft/compiler: Fix barrier flag for shared memory
2851- microsoft/spirv_to_dxil: Lower globals to function_temp
2852- microsoft/spirv_to_dxil: Lower io arrays
2853- microsoft/compiler: Support fp16 i/o vars
2854- nir: Add a new opcode for [un]packing doubles
2855- microsoft/compiler: Add a lowering pass to emit double [un]pack instructions
2856- microsoft/compiler: Implement new double pack/unpack alu ops
2857- microsoft/spirv_to_dxil: Support doubles
2858- microsoft/compiler: Add some more float16 support
2859- meson: Refuse to build lavapipe without llvmpipe
2860- vtn: Don't warn about linkage capability if we're creating a NIR library
2861- vtn: Add a cap for CL drivers to support read-write images
2862- microsoft/clc: Update unit test to always use COMMON state for buffers
2863- meson: For MSVC, suppress warnings generated by useless delayloads
2864- driconf: Remove default values from string driconf entries
2865- CI: Enable -werror for Windows
2866- vtn: Support scoped control barriers for OpenCL too
2867- nir_opt_deref: ptr_as_array(deref_cast<T*>(x))[0] isn't the same as x[0] if the cast has alignment
2868- nir: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2869- d3d12: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2870- microsoft/clc: Fix MSVC unreferenced variable warnings
2871- microsoft/clc: Fix undeclared function warning
2872- microsoft/compiler: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2873- shader_enums: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2874- gallium/aux: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2875- llvmpipe: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2876- xmlconfig: Fix MSVC warning C4334 (32bit shift cast to 64bit)
2877
2878Jesse Schwartzentruber (1):
2879
2880- glcpp: Fix undefined behaviour in glcpp
2881
2882Joel Linn (2):
2883
2884- freedreno/a2xx: fix/add RBBM perfcounter
2885- freedreno/a2xx: add RB perfcounter 1-3
2886
2887Jonathan Marek (13):
2888
2889- turnip: fix logicOp
2890- turnip: delete unused vk_format_parse.py file
2891- turnip: use vk_format_is_int to disable COLOR_ATTACHMENT_BLEND_BIT
2892- turnip: IMAGE_FILTER_{LINEAR,CUBIC}_BIT only for non-integer formats
2893- turnip: don't always use 3d ops for blit_image
2894- turnip: add missing register write to disable dithering
2895- freedreno/registers: use macro instead of inline function for array regs
2896- freedreno/a6xx: update perfcntr registers (declare as arrays)
2897- freedreno/a6xx: always use reg64 for address registers (no LO/HI)
2898- freedreno/a6xx: update some registers
2899- freedreno/a6xx: set SP_PERFCTR_ENABLE in computerator
2900- turnip: implement VK_KHR_shader_float_controls
2901- turnip: enable VK_KHR_shader_float16_int8
2902
2903Jordan Justen (28):
2904
2905- intel/genxml/gen125: Add CFE_STATE and COMPUTE_WALKER
2906- intel/compiler: Disable push constants on gen12-hp
2907- anv: Emit CFE_STATE for gen12-hp
2908- anv: Don't use MEDIA_INTERFACE_DESCRIPTOR_LOAD for gen12-hp
2909- anv: Use COMPUTE_WALKER for gen12-hp
2910- iris: Add support for COMPUTE_WALKER
2911- iris: Fix android build due to missing link to libmesa_iris_gen125
2912- anv: Add exec_flags to anv_queue
2913- anv: Turn device->queue into an array
2914- anv: Print queue number with INTEL_DEBUG=bat
2915- anv: Support i915 query (DRM_IOCTL_I915_QUERY) from Linux v4.17
2916- anv: Gather engine info from i915 if available
2917- anv: Add anv_gem_count_engines
2918- anv: Support multiple engines with DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT
2919- anv: Add ANV_QUEUE_OVERRIDE env-var to override advertised queues
2920- i965: Call util_cpu_detect() early in screen creation
2921- intel: Use GEN_VERSIONx10 in more places
2922- intel/dev: Add devinfo genx10 field
2923- intel: Use devinfo genx10 field
2924- anv: Restructure mem heap/type init code
2925- anv: Put cache memory type first on non-llc platforms
2926- anv: Add mem heap/type support for local-mem
2927- anv: Drop has_slm in emit_l3_config for gen11+
2928- anv: Use fallback paths if DRM_I915_QUERY_ENGINE_INFO fails
2929- i965/gen11: Fix must-be-ones bit positions in 3D_MODE
2930- genxml/gen12: 3D_MODE bits 31:16 are no longer must-be-one
2931- Revert "intel/compiler: Silence unused parameter warning in update_inst_scoreboard"
2932- intel/compiler: Fix INTEL_DEBUG=hex
2933
2934Jose Maria Casanova Crespo (4):
2935
2936- v3d: Enables DRM_FORMAT_MOD_BROADCOM_SAND128 support
2937- v3d: YUV formats at is_dmabuf_modifier_supported are external_only
2938- v3d: YUV formats at query_dmabuf_modifiers are external_only
2939- v3d: DRM_FORMAT_MOD_BROADCOM_SAND128 only available for NV12 format.
2940
2941Joshua Ashton (3):
2942
2943- lavapipe: handle NULL pStrides in CmdBindVertexBuffers2EXT
2944- lavapipe: implement CmdBindVertexBuffers with CmdBindVertexBuffers2EXT
2945- lavapipe: enable KHR_image_format_list
2946
2947José Fonseca (6):
2948
2949- scons: Add u_format_pack.h include path.
2950- wgl: Match opengl32.dll export ordinals.
2951- util: Always use timespec_get on Windows.
2952- appveyor: Remove integration.
2953- scons: Remove.
2954- gitlab-ci: Build meson-mingw32-x86_64 w/o zlib.
2955
2956Juan A. Suarez Romero (40):
2957
2958- v3d: fix dest offset in TFU setup
2959- v3d: use a compatible supported format for TFU-based blit
2960- vc4/ci: Replace expect script by python script
2961- ci/vc4: allow custom timeout values for activity
2962- ci/vc4: rename stage to Broadcom
2963- ci/vc4: Add piglit job
2964- ci: Bump deqp to current vulkan-cts-1.2.5.1
2965- ci: add option to overwrite CPU arch
2966- ci/v3d: Add V3D and V3DV testing
2967- ci/v3d: Update expected resuls for piglit
2968- ci/piglit: allow parallel piglit jobs
2969- ci/vc4/v3d: Parallelize piglit jobs
2970- ci/piglit: fix parallel piglit results
2971- ci/baremetal: highlight message errors
2972- ci/broadcom: retry always when serial log timeout
2973- ci: Bump deqp to vk-gl-cts 1.2.5.2
2974- ci/broadcom: allow custom kernels
2975- vc4: destroy renderonly object if present
2976- ci/armXX: add libgl1-mesa-dev dependency
2977- ci/v3dv: add flaky test in the skip list
2978- ci/vc4/v3d: run piglit testsuite against Xorg
2979- ci/broadcom: use new piglit runner
2980- ci/broadcom: update piglit expected results
2981- ci/v3d: run full GLES3 and GLES31 testsuite
2982- broadcom/compiler: fix unused value
2983- v3dv: fix unused value
2984- ci/v3dv: update flaky tests
2985- broadcom/cle: do not leak spec
2986- ci/broadcom: update expected list
2987- v3d: use uint type in _gen_unpack_uint
2988- broadcom/compiler: fix first_component assertion
2989- broadcom/compiler: use signed pointers for packed condition
2990- ci/broadcom: use SNMP to turn on/off devices
2991- broadcom/compiler: use VPM offsets in GS load_per_vertex input
2992- v3d: use GS_BIN inputs as VS_BIN outputs
2993- v3dv: fix assertion
2994- ci: Update VK-GL-CTS to 1.2.6.0
2995- v3d: do not emit attribute if has no resource
2996- ci/v3dv: skip Vulkan waiver tests
2997- util/hash_table: do not leak u64 struct key
2998
2999Jérôme Glisse (3):
3000
3001- gallium: add support for SVM (Share Virtual Memory) migrate
3002- clover: implement clEnqueueSVMMigrateMem
3003- nouveau: add support for SVM migrate
3004
3005Karol Herbst (9):
3006
3007- clover: track allocated svm pointers
3008- clover/api: make use of validate_mem_migration_flags in clEnqueueMigrateMemObjects
3009- nouveau: print warning about unhandled cap only once
3010- clover: simplify image arguments
3011- clover: rework quering image max sizes
3012- clover: Fix build with llvm-12.
3013- clover: Add missing include for llvm-12 build fix
3014- tegra/context: fix regression in tegra_draw_vbo
3015- tegra/context: unwrap indirect_draw_count as well
3016
3017Keith Packard (2):
3018
3019- glx: Provide glvnd wrapper for glXSwapIntervalEXT
3020- wsi/x11: Fix type of target_msc argument to x11_present_to_x11_dri3
3021
3022Kenneth Graunke (62):
3023
3024- vbo: Don't set node->min_index = max_index = indices_offset when merging
3025- vbo: Only mark merged line strips as lines when actually converting them
3026- tnl: Try not to botch index buffer munging when start > 0.
3027- tnl: Respect \`start` when converting indices to GLuint
3028- tnl: Reset nr_bos to 0 between map/unmap cycles.
3029- Revert "mesa: allow half float textures based on ARB_half_float_pixel"
3030- iris: Consider resolves after changing a resource's aux state
3031- iris: Drop find_existing_assembly optimization from program cache
3032- iris: Drop iris_print_program_cache().
3033- iris: Refactor iris_debug_recompile interface to take a shader.
3034- intel: Produce a "constrained" output from gen_get_urb_config()
3035- iris: Reconfigure the URB only if it's necessary or possibly useful
3036- iris: Move the URB size checks into iris_update_compiled_xs
3037- iris: Properly handle new unbind_num_trailing_slots parameters
3038- iris: Use shader_info rather than vs_prog_data for draw parameter checks
3039- iris: Minor code restyling in iris_bind_vs_state
3040- iris: Move VS draw parameter dirty flagging to iris_bind_vs_state
3041- iris: Refcount shader variants
3042- iris: Store a list of shader variants in the shader itself
3043- iris: Enable PIPE_CAP_SHAREABLE_SHADERS.
3044- iris: add hooks to call INTEL_MEASURE
3045- iris: Fill out scratch base address dynamically
3046- iris: Remove context from iris_debug_recompile
3047- iris: Remove context from iris_upload_shader()
3048- iris: Remove context from iris_compile_vs and friends
3049- iris: Remove context from iris_create_uncompiled_shader
3050- iris: Remove context from iris_disk_cache_retrieve
3051- iris: Make a pin_scratch_space() helper
3052- iris: Reference the shader variant for last_vue_map as well
3053- iris: Pin surface state buffers after possibly updating the clear color
3054- i965: Rename use_intel_mipree_map_blit to use_blitter_to_map
3055- i965: Rename intel_batchbuffer_* to brw_batch_*.
3056- i965: Rename intel_screen to brw_screen
3057- i965: Rename intel_texture_{object,image} to brw_texture_{object,image}
3058- i965: Rename intel_renderbuffer to brw_renderbuffer
3059- i965: Rename intel_mip* to brw_mip*.
3060- i965: Use __func__ in blorp perf_debug macros
3061- i965: Rename intel_buffer_object to brw_buffer_object
3062- i965: Rename intel_image_format and intel_buffer to brw_*
3063- i965: Rename the rest of intel_* functions to brw_*
3064- i965: Rename intelInit and brwInit camel-case functions to brw_*
3065- i965: Rename some camel-case local variables
3066- i965: Rename more camel-case functions to brw and underscore style
3067- i965: Rename DRI extension structs to be "brw" instead of "intel"
3068- i965: Eliminate all tabs except in brw_defines.h
3069- tnl: Call _mesa_matrix_analyse to make sure the inverse MVP is updated
3070- glsl/float64: Bump #version to 400
3071- iris: Defer uploading of surface states
3072- iris: Defer stream output target space allocation until set time
3073- iris: Rework zeroing of stream output buffer offsets
3074- iris: Support rebinding of stream output targets
3075- iris: Use different shader uploaders for precompile vs. draw time
3076- iris: Make various classes inherit from u_threaded_context base classes
3077- iris: Use thread safe slab allocators in transfer_map handling
3078- iris: Enable u_threaded_context
3079- vbo: Fix vbo_sw_primitive_restart for start > 0
3080- intel/genxml: Add a partial GT_MODE definition for Gen11+.
3081- iris: Delete stale comment in iris_lost_context_state
3082- intel: Fix release build breakage
3083- Half-revert "gallium/dri2: Pass the resource that corresponds to the plane"
3084- intel: Mark an otherwise unused variable in intel_dump_gpu as ASSERTED
3085- ci: Enable iris testing in meson-release
3086
3087Kristian Høgsberg (1):
3088
3089- macros: Add thread-safety annotation macros
3090
3091Leo Liu (12):
3092
3093- radeon/vcn: clean the message buffers and their indexes logic
3094- radeon/vcn: add dynamic dpb interface
3095- radeon/vcn: add dynamic dpb buffer Tier1 support
3096- radeon/vcn: enable dynamic dpb Tier1 support
3097- radeon/vcn: add dynamic dpb Tier2 message buffer interface
3098- radeon/vcn: implement dynamic dpb Tier2 support
3099- radeon/vcn: enable dynamic dpb Tier2 support
3100- meson: bump drm amdgpu version to 2.4.105
3101- ci: Fix meson-i386 build failed after libdrm bump version
3102- include/drm-uapi: bump AMDGPU headers
3103- ac: add function for querying video capabilities
3104- radeonsi: replace the hard coded video decode and encode caps
3105
3106Lepton Wu (3):
3107
3108- virgl: Don't destroy resource while it's in use.
3109- virgl: Use atomic operation directly.
3110- virgl: move new added field to the end.
3111
3112Lionel Landwerlin (96):
3113
3114- anv: add transfer usage for color/depth/stencil attachments
3115- anv: don't disable KHR_performance_query in debug mode
3116- intel/mi_builder: optimize 64bit immediate register loads & memory stores
3117- intel/mi_builder: fix self modifying batches
3118- intel/perf: restructure i915 perf version checks
3119- intel/perf: add definition for generic perf counters
3120- intel/perf: link queries back to the gen_perf_config object
3121- intel/perf: move gt_frequency to results
3122- anv: Fix stencil layout in render passes
3123- intel: silence unused var warnings in release builds
3124- anv: fix invalid programming of BLEND_STATE
3125- intel/common: store sample position in plain arrays
3126- anv: pass context to reset stats helper
3127- anv: store queue creation flags on anv_queue
3128- genxml: PERFCNT registers are available since HSW
3129- intel/perf: prep work to enable new perf counters
3130- intel/perf: query register descriptions
3131- intel/perf: add performance query layout using MI_SRM
3132- intel/perf: switch query code to use query layout
3133- anv: fix layout comment
3134- anv: remove unused query pool field
3135- intel/perf: rename lkf into ehl
3136- intel/perf: add reorder script
3137- intel/perf: reorder xml files
3138- intel/perf: remove reordering script
3139- intel/perf: update files from IGT
3140- intel/perf: small ICL equation refactor
3141- intel/perf: add async compute metrics
3142- intel/dev: identify tigerlake
3143- intel/perf: break TGL perf configs in GT1/2
3144- intel/dev: identify rocketlake
3145- intel/perf: add RKL support
3146- intel/perf: add DG1 support
3147- intel/perf: drop the special READ_REG operator
3148- anv: compute commands required to implement perf queries
3149- anv: switch khr perf query code to use query layout
3150- anv: switch intel perf queries to query layout
3151- anv: add a comment describing has_relocs field
3152- anv: break up internal queueing function
3153- anv: only signal wsi fence BO on last command buffer
3154- drm-shim: report support for timeline semaphores
3155- intel/stub: plug some gaps in our ioctl faking
3156- anv: print out perf permission warning only once
3157- anv: discard all timeline wait/signal value=0
3158- vulkan: document flags choice for vkGetDeviceQueue
3159- genxml: add MI_SET_APPID on Gen12+
3160- genxml: Add PIPE_CONTROL protected memory bits
3161- isl: add external parameter to isl_mocs()
3162- anv: track command buffer pool flags
3163- anv: track buffer creation flags
3164- intel/dev: identify alderlake
3165- intel/perf: Add Alderlake metrics
3166- intel/perf: fix roll over PERF_CNT counter accumulation
3167- anv: reset binary syncobj to be signaled before submission
3168- anv: don't wait for completion of work on vkQueuePresent()
3169- anv: Fix wait_count missing increment
3170- anv: make use of new helper function directly in anv_QueueSubmit()
3171- anv: track the end of the command buffers
3172- anv: end command buffer with a potential jump
3173- anv: allow multiple command buffers in anv_queue_submit
3174- anv: group as many command buffers into a single execbuf
3175- anv: fix missing general state pool in validation list
3176- anv: implement INTEL_DEBUG=submit
3177- anv: fix MI_PREDICATE_RESULT write
3178- intel/tools: fix meson warning
3179- intel/dev: add helpers to compute subslice/eu total
3180- intel/dev: add warning on missing kernel uAPI for Gen8+
3181- iris: use gen_device_info helper to get subslice total
3182- i965: stop using get_param for things queried by gen_device_info
3183- anv: stop using get_param for things queried by gen_device_info
3184- intel/dev: switch over to mesa log infrastructure
3185- anv: move L3 config emission to genX_state.c
3186- anv: move L3 initialization to device init on Gen11+
3187- intel: install intel_device_info
3188- intel/fs/vec4: add missing dependency in write-on-write fixed GRFs
3189- intel/dev: store size of CS prefetch
3190- intel/mi_builder: use device info to use the right CS prefetch size
3191- anv: use the device size of CS prefetch to pad secondary buffer calls
3192- meson: switch vulkan layer to list of choices
3193- intel: Add null hw layer
3194- gitlab-ci: fix vulkan build layer enabling
3195- intel/nullhw: fix build
3196- etnaviv/drm: only print out fence error on non timeout
3197- intel/fs/copy_prop: check stride constraints with actual final type
3198- intel/fs: implement another copy propagation restriction
3199- intel/compiler: lower bit sizes in NIR postprocessing
3200- anv: put correct number of BT prefetch for compute on XeHP+
3201- intel/fs: limit OW reads to 8 owords on XeHP+
3202- microsoft: fixup clc_log() define
3203- anv: bump internal descriptor index fields to 32bits
3204- anv: fix 3DSTATE_MULTISAMPLE emission on gen8+
3205- anv: disable baked in pipeline bits from dynamic emission path
3206- spirv: fix uToAccelerationStructure handling
3207- spirv: fixup pointer_to/from_ssa with acceleration structures
3208- vulkan/wsi/display: don't report support if there is no drm fd
3209- i965/bufmgr: fix invalid assertion
3210
3211Lucas Stach (5):
3212
3213- renderonly: remove layering violations
3214- renderonly: close the gpu fd when destroying renderonly
3215- etnaviv: don't try to copy PIPE_BUFFER with the 3D engine
3216- etnaviv: remove stale comment in etna_resource_copy_region
3217- Revert remaining half of "gallium/dri2: Pass the resource that corresponds to the plane"
3218
3219Lukas Feller (2):
3220
3221- v3dv: fix assertion in job_compute_frame_tiling
3222- v3dv: fix stride in buffer copy
3223
3224Marcin Ślusarz (20):
3225
3226- intel/perf: export information about units of performance counters
3227- intel/compiler: cache computed register pressure benefit
3228- intel/tools/aub: print better error message when mmap fails
3229- intel/tools/aub: handle truncated input file
3230- intel/tools/aub: remove superfluous new line from error messages
3231- intel/dump_gpu: mark bo as unmapped if its address changes
3232- anv: fix memory allocation error handling
3233- iris: fix decode_get_bo
3234- i965: fix decode_get_bo
3235- intel/batch_decoder: catch invalid sampler state pointer
3236- intel/batch_decoder: drop bogus check
3237- intel/batch_decoder: fix decoding of sampler states
3238- intel/batch_decoder: assert on invalid sampler pointer
3239- intel/aub_viewer: catch invalid sampler state pointer
3240- intel/aub_viewer: drop bogus check
3241- intel/aub_viewer: fix decoding of sampler states
3242- gallium: add PIPE_CAP_ALLOW_DYNAMIC_VAO_FASTPATH
3243- iris: disable dynamic VAO fastpath on GFX version 9
3244- gallium/u_threaded: implement INTEL_performance_query hooks
3245- gallium/u_threaded: offload begin/end_intel_perf_query
3246
3247Marek Olšák (406):
3248
3249- mesa: always set valid index bounds for non-indexed draws for classic drivers
3250- st/nine: stop using cso_set_sampler_views
3251- st/xa: stop using cso_set_sampler_views
3252- gallium/tests: stop using cso_set_sampler_views
3253- gallium/api: add state invalidate interface as alternative to cso_save/restore
3254- gallium/hud: don't use cso_context to restore VBs, constbuf 0 and sampler views
3255- gallium/pp: don't use cso_context to restore VBs, constbuf 0 and sampler views
3256- st/mesa: don't use cso_context to restore VBs, sampler views for glBitmap
3257- st/mesa: don't use cso_context to restore VBs for glClear
3258- st/mesa: don't use cso_context to restore VBs, sampler views for glDrawPixels
3259- st/mesa: don't use cso_context to restore VBs, sampler views for glDrawTex*OES
3260- st/mesa: don't use cso_context to restore VBs, etc. for PBO glReadPixels
3261- st/mesa: don't use cso_context to restore VBs, etc. for PBO glTexSubImage
3262- st/mesa: don't use cso_context to set const bufs, sampler views and images
3263- st/mesa: replace st->pipe with pipe in a few places
3264- cso_context: remove ability to restore VBs, const bufs, sampler views, images
3265- st/mesa: unbind sampler views, images, and vertex buffers after meta ops
3266- st/mesa: optimize binding and unbinding shader images
3267- radeonsi: constant buffer cleanups
3268- radeonsi: don't clear unaligned bits when unbinding vertex buffers
3269- radeonsi: move emit_cache_flush functions into si_gfx_cs.c
3270- radeonsi: don't pass pipe_draw_info into si_emit_vs_state
3271- radeonsi: don't pass pipe_draw_info into si_emit_ia_multi_vgt_param
3272- radeonsi: translate pipe_prim_type only when it changes
3273- radeonsi: don't pass pipe_draw_info into si_emit_derived_tess_state
3274- radeonsi: don't compute average vertex count in si_draw_vbo
3275- radeonsi: fix si_num_prims_for_vertices for PIPE_PRIM_POLYGON
3276- radeonsi: make cik_emit_prefetch_L2 templated and move it to si_state_draw.cpp
3277- radeonsi: add a specialized function for CP DMA L2 prefetch
3278- radeonsi: make sctx->vertex_elements always non-NULL
3279- radeonsi: remove MRT-draw-calls, spill-draw-calls, spill-compute-calls
3280- radeonsi: get out of si_emit_vs_state early for blit vertex shaders
3281- radeonsi: rearrange condition for streamout workaround on gfx7 and gfx8
3282- radeonsi: don't use si_get_vs_state in most places
3283- radeonsi: trim the size of si_vgt_param_key and si_vgt_stages_key
3284- mesa: fix alpha channel of ETC2_SRGB8 decompression for !bgra
3285- radeonsi: unify uploaders on APUs too
3286- radeonsi: don't pass pipe_draw_info into si_emit_draw_registers
3287- radeonsi: don't set context_roll for non-gfx9 in templated functions
3288- radeonsi: add si_get_user_data_base selecting user data registers
3289- radeonsi: evaluate sh_base in si_emit_vs_state at compile time
3290- radeonsi: inline the last use of si_get_vs_state
3291- radeonsi: evaluate si_get_vs in si_draw_vbo at compile time
3292- radeonsi: enable the GS tri strip adj workaround with primitive_restart
3293- radeonsi: clear dirty_atoms and dirty_states only if we entered the emit loop
3294- radeonsi: move variables closer to their use in most draw state functions
3295- radeonsi: don't validate inlinable uniforms at draw time
3296- radeonsi: allow instance_count == 0 on chips that handle it correctly
3297- glthread: remove marshal="draw" because it doesn't do much
3298- glthread: don't sync with NV_half_float vertex attrib functions
3299- glthread: add specialized versions of unmarshal_Draw funcs without user buffers
3300- glthread: track all matrix stack depths
3301- glthread: implement glGetIntegerv for states that glthread tracks
3302- glthread: rename inside_dlist to ListMode for future use
3303- glthread: remove if (COMPAT) conditions from functions that are GL-compat-only
3304- mesa: add _mesa_get_list helper
3305- glthread: add display list support to fix state tracking with display lists
3306- mesa: remove _mesa_initialize_exec_dispatch from draw.c by autogenerating it
3307- mesa: remove redundant glRect functions for display lists
3308- mesa: optimize glCallLists by using loops inside a switch
3309- mesa: simplify handling OPCODE_CONTINUE for display lists
3310- mesa: simplify terminating display list loops
3311- mesa: remove STATE_INTERNAL
3312- mesa: combine STATE_ENV, STATE_LOCAL enums with STATE_xxx_PROGRAM
3313- mesa: flatten STATE_MATERIAL and STATE_LIGHTPROD tokens
3314- mesa: eliminate the switch statement for STATE_TEXGEN
3315- glsl: remove unused internal builtin gl_CurrentAttribVertMESA
3316- glsl: split gl_CurrentAttribFragMESA into elements
3317- mesa: skip memmove in optimize_state_parameters if it's no-op
3318- mesa: rename STATE_LIGHT_ATTRIBS -> STATE_LIGHT_ARRAY for consistency
3319- mesa: optimize get_local_param_pointer and program_local_parameters4fv
3320- mesa: don't allocate local parameters in fetch_state
3321- mesa: merge local and env program parameters for faster uploads
3322- mesa: sort state vars with constant indexing for ARB programs
3323- mesa: add upper bound to limit program state var iterations
3324- mesa: compute gl_program_parameter_list::UniformBytes accurately
3325- mesa: don't handle STATE_* enums in fetch_state that don't do anything
3326- mesa: sort and tightly pack STATE_* enums to generate better switch code
3327- mesa: merge equivalent switch cases in prog_statevars.c
3328- st/mesa: enable state var merging to improve fetch_state performance
3329- radeonsi: add new possibly faster command submission helpers
3330- radeonsi: clear dirty_states if si_pm4_bind_state is unbinding or no-op
3331- radeonsi: don't mark NULL states as dirty in si_pm4_reset_emitted
3332- radeonsi: optimize translating index_size to index_type
3333- radeonsi: don't use rasterizer_discard to validate draws, only check ps_shader
3334- radeonsi: add internal blitter_running flag
3335- radeonsi: simplify determining whether render condition is enabled at draw time
3336- radeonsi: inline si_blend_color and si_clip_state structures
3337- radeonsi: move y_inverted out of si_viewports
3338- radeonsi: don't set vertex buffer dirty flags when they don't do anything
3339- radeonsi: move if (sctx->vertex_buffers_dirty) into the upload function
3340- radeonsi: rename SI_SGPR_RW_BUFFERS to SI_SGPR_INTERNAL_BINDINGS
3341- radeonsi: skip some code for ALLOW_PRIM_DISCARD_CS if tess or GS is enabled
3342- radeonsi: enable accidentally disabled fast launch with non-indexed tri strips
3343- radeonsi: iterate from draw 1 for total/min_direct_count computation
3344- st/mesa: don't enable smoothing if multisampling is enabled
3345- Revert "gallium/u_upload_mgr: allow use of FLUSH_EXPLICIT with persistent mappings"
3346- gallium: add take_ownership param into set_constant_buffer to eliminate atomics
3347- gallium: add unbind_num_trailing_slots to set_vertex_buffers
3348- gallium: add unbind_num_trailing_slots to set_shader_images
3349- gallium: add unbind_num_trailing_slots to set_sampler_views
3350- gallium: add take_ownership param into set_vertex_buffers to eliminate atomics
3351- cso_context,u_vbuf: add take_ownership param into set_vertex_buffers
3352- st/mesa: eliminate all atomic ops when setting vertex buffers
3353- st/mesa: skip atomics when binding UBOs
3354- gallium/u_upload_mgr: eliminate all atomics for the upload buffer
3355- gallium/u_threaded: add a null constant buffer codepath
3356- gallium/u_threaded: unify user and non-user codepaths in set_constant_buffer
3357- gallium/util: optimize pipe_vertex_buffer_reference binding the same buffer
3358- gallium,u_threaded: add pipe_draw_info::take_index_buffer_ownership
3359- st/mesa: set take_index_buffer_ownership to skip an atomic in u_threaded
3360- gallium/u_vbuf: skip draws with 0 vertices
3361- radeonsi: fix centroid with VRS coarse shading
3362- glthread: fix interpreting vertex size == GL_BGRA for vertex attribs
3363- glthread: fix glVertexAttribDivisor calls not being tracked by non-VBO uploads
3364- glapi: guard against invalid XML definitions for glthread
3365- ac,radeonsi: track memory usage in KB to reduce types from uint64 to uint32
3366- radeonsi: optimize no-op cases in si_upload_shader_descriptors
3367- radeonsi: mark shader_pointers dirty once outside the upload descriptors loop
3368- radeonsi: move si_pm4_delete_state logic into si_pm4_free_state
3369- radeonsi: delete si_pm4_delete_state
3370- radeonsi: don't check for redundant and NULL states in si_emit_all_states
3371- radeonsi: optimize si_emit_prefetch_L2 when it's split
3372- radeonsi: reorganize si_draw_vbo for lower register pressure (part 1)
3373- radeonsi: reorganize si_draw_vbo for lower register pressure (part 2)
3374- radeonsi: set VB user SGPRs in si_upload_vertex_buffer_descriptors
3375- radeonsi: prefetch VB descriptors right after uploading
3376- radeonsi: precompute NGG cull flags in si_create_rs_state
3377- mesa: remove/replace FLUSH_VERTICES when it doesn't do anything
3378- mesa: optimize most _mesa_ActiveTexture calls in glPopAttrib
3379- mesa: optimize glPopAttrib(GL_VIEWPORT_BIT)
3380- mesa: don't push/pop gl_texture_object::Target
3381- mesa: fix glPopAttrib for many texture fields
3382- mesa: flush glBegin/End before changing GL_DEPTH_STENCIL_TEXTURE_MODE
3383- mesa: for every state change, remember states we changed for glPopAttrib
3384- mesa: fix trivial bugs in glPopAttrib
3385- mesa: optimize out _NEW_ALL in glPopAttrib(GL_ENABLE_BIT)
3386- mesa: only pop states in glPopAttrib that have been changed since glPushAttrib
3387- mesa: partially skip glPush/PopAttrib for MSAA textures and texture buffers
3388- mesa: don't save gl_shared_state in glPushAttrib
3389- mesa: don't push/pop default texture attributes redundantly
3390- mesa: pop all textures up to NumCurrentTexUsed, not just MaxTextureUnits
3391- mesa: don't count buffer references for the context that created them
3392- radeonsi: set current_rast_prim at bind time for tess and GS
3393- radeonsi: simplify the NGG culling condition in si_draw_vbo
3394- radeonsi: tune NGG shader culling vertex threshold for each chip
3395- radeon: decrease the size of radeon_cmdbuf by switching prev fields to uint16
3396- Revert "gallium/u_vbuf: skip draws with 0 vertices"
3397- gallium/u_vbuf: skip non-indirect draws with 0 vertices
3398- winsys/amdgpu,radeonsi: add HUD counters for how much memory is wasted by slabs
3399- winsys/amdgpu: clean up slab alignment code, handle small buffers better
3400- winsys/amdgpu,pb_slab: add slabs with 3/4 of power of two sizes to save memory
3401- winsys/amdgpu: expand the slab allocation range to [256 B, 1 MB]
3402- winsys/amdgpu: optimize out conditionals in amdgpu_lookup_buffer
3403- winsys/amdgpu: remove amdgpu_winsys_bo::num_cs_references to remove atomics
3404- winsys/amdgpu: pack amdgpu_winsys_bo::is_shared and protect it by a mutex
3405- winsys/amdgpu: move amdgpu_winsys_bo::cpu_ptr into the u.real union
3406- winsys/amdgpu: move amdgpu_winsys_bo::is_shared to the u.real union
3407- winsys/amdgpu: move amdgpu_winsys_bo::is_user_ptr to the u.real union
3408- winsys/amdgpu: move amdgpu_winsys_bo::use_reusable_pool to the u.real union
3409- winsys/amdgpu: don't inc/dec num_active_ioctls for backing BOs of sparse BOs
3410- winsys/amdgpu: don't set unused usage for backing BOs of sparse BOs
3411- winsys/amdgpu: try not to skip any code with RADEON_NOOP=1 to test CPU perf
3412- tgsi_to_nir: translate SAMPLEID
3413- tgsi_to_nir: translate FBFETCH
3414- gallium/u_tests: test no-op fragment shader instead of NULL fragment shader
3415- winsys/amdgpu: disallow pb_cache for backing buffers of sparse buffers
3416- ac/gpu_info: print use_late_alloc
3417- ac/gpu_info: rename num_tcc_blocks -> max_tcc_blocks
3418- ac/gpu_info: add radeon_info::num_tcc_blocks
3419- ac/gpu_info: remove redundant radeon_info::num_sdp_interfaces
3420- ac/gpu_info: inline get_l2_cache_size and set cache sizes farther down
3421- ac/gpu_info: conceal L2 cache sizes
3422- amd: sort chip enums based on hw revision
3423- radeonsi: skip s_sendmsg(gs_alloc_req) for NGG passthrough on new chips
3424- radeonsi: add debug options nodisplaytiling and nodisplaydcc
3425- amd: update addrlib
3426- mesa: optimize draw index type checking
3427- mesa: precompute all valid primitive types at context creation
3428- mesa: precompute draw time prim validation during state changes
3429- mesa: move check_valid_to_render call into _mesa_valid_prim_mode
3430- mesa: fold most of check_valid_to_render into _mesa_update_valid_to_render_state
3431- mesa: inline check_valid_to_render
3432- mesa: add skeleton code for DrawPixels/CopyPixels/Bitmap precomputed validation
3433- mesa: don't report 1 for GL_VALIDATE_STATUS if user didn't validate pipeline
3434- mesa: move shader pipeline validation from draws to state changes
3435- mesa: move sampler uniform validation from draws to state changes
3436- mesa: move some uniform debug code from draws to state changes
3437- mesa: move FBO completeness checking from draws to state changes
3438- mesa: move ARB program and integer FBO validation from draws to state changes
3439- mesa: move GL_FILL_RECTANGLE validation from draws to state changes
3440- mesa: move blending validation from draws to state changes
3441- mesa: inline _mesa_valid_to_render now that it doesn't do validation
3442- mesa: optimize the dual source blend error checking using a bitmask
3443- mesa: remove VERBOSE_DRAW
3444- mesa: remove optional draw validation code to increase performance
3445- mesa: call _mesa_update_state() before validation
3446- mesa: remove an optional GL error about mapped buffers during execution
3447- mesa: skip MultiDrawArrays with primcount == 0
3448- mesa: don't skip draws with count == 0 or numInstances == 0
3449- mesa: add a separate valid primitive mask just for glDrawElements
3450- mesa: move disallowed TFB in DrawElements on GLES from draws to state changes
3451- mesa: validate numInstances in common functions to unify code
3452- mesa: optimize set_varying_vp_inputs by precomputing the conditions
3453- mesa: move gl_context::varying_vp_inputs into ctx->VertexProgram._VaryingInputs
3454- mesa: set _DrawVAOEnabledAttribs only when it changes
3455- mesa: precompute _mesa_get_vao_vp_inputs
3456- mesa: precompute draw time determination of enabled vertex arrays
3457- mesa: gather errors and call _mesa_error only once in validate_Draw
3458- mesa: inline _mesa_set_draw_vao and set_varying_vp_inputs for draw calls
3459- mesa: inline draw validate functions
3460- mesa: add debug code to catch missing _mesa_update_valid_to_render_state calls
3461- ac/surface: use family_id so as not to crash with SI_FORCE_FAMILY in addrlib
3462- radeonsi: for tess, determine the minimum num_patches before optimizing tg size
3463- radeonsi: improve comments in si_emit_derived_tess_state
3464- radeonsi: allocate filled_size for streamout targets in set_streamout_buffers
3465- radeonsi: do late NIR optimizations after uniform inlining
3466- radeonsi: fix the value of uses_bindless_samplers
3467- radeonsi: gather info about bindless images and memory stores with strstr(intr)
3468- radeonsi: gather shader info about indirect UBO/SSBO/samplers/images
3469- radeonsi: gather shader info about VMEM usage for MEM_ORDERED
3470- radeonsi: set MEM_ORDERED optimally
3471- glthread: assume all parameters are fixed if marshal_sync is present
3472- glthread: don't declare pointers with const in unmarshal functions
3473- glthread: don't sync when using pixel buffer objects
3474- glthread: ignore the return value of glUnmapBuffer, don't sync, and return true
3475- i915: use align_calloc for the context to fix m32 crashes
3476- radeon,r200: use align_calloc for the context to fix m32 crashes
3477- nouveau_vieux: use align_calloc for the context to fix m32 crashes
3478- mesa: remove unnecessary NewState flagging for glPopAttrib(GL_ENABLE_BIT)
3479- mesa: move fixed-func-related _mesa_update_state code closer together
3480- mesa: split _NEW_LIGHT into 3 flags: _NEW_LIGHT_(FF_PROGRAM|CONSTANTS|STATE)
3481- mesa: rework _MESA_NEW_NEED_EYE_COORDS to reduce fixed-func program updates
3482- mesa: don't compute the inverted projection matrix if not used
3483- mesa: don't compute the ModelView * Projection matrix if not used
3484- mesa: add _NEW_MATERIAL to reduce the weight of _NEW_LIGHT_CONSTANTS
3485- mesa: don't update derived material values in _mesa_update_state and elsewhere
3486- mesa: remove _NEW_VARYING_VP_INPUTS in favor of _NEW_FF_(VERT|FRAG)_PROGRAM
3487- mesa: remove _NEW_LIGHT_FF_PROGRAM in favor of _NEW_FF_(VERT|FRAG)_PROGRAM
3488- mesa: don't push/pop ctx->Texture._* derived states
3489- mesa: remove the fixed-func vert prog dependency on all texture states
3490- mesa: sort state parameters for ffvp to enable better parameter merging
3491- mesa: merge STATE_LIGHTPROD parameters
3492- mesa: merge STATE_LIGHT_ATTENUATION and STATE_LIGHT_POSITION_* parameters
3493- vbo: optimize copy_to_current functions
3494- vbo: don't call update_color_material in copy_to_current if it's a no-op
3495- mesa: be precise about when to set _NEW_CURRENT_ATTRIB and _NEW_MATERIAL
3496- mesa: move _mesa_update_pixel out of _mesa_update_state
3497- mesa: only update fixed-func programs on texture matrix enablement changes
3498- mesa: don't update fixed-func vert prog on irrelevant _NEW_TRANSFORM changes
3499- mesa: don't update fixed-func programs on irrelevant _NEW_POINT changes
3500- mesa: don't update fixed-func programs on irrelevant _NEW_FOG changes
3501- mesa: don't update fixed-func programs on irrelevant _NEW_RENDER_MODE changes
3502- mesa: don't update the fixed-func frag prog on irrelevant _NEW_COLOR changes
3503- mesa: don't update tnl spaces on irrelevant _NEW_POINT/TEXTURE_STATE changes
3504- mesa: skip a subset of _mesa_update_state if no relevant flags are set
3505- radeonsi: don't index si_context::shaders with enum gl_shader_stage
3506- ac/llvm: fix ac_build_atomic_rmw with LLVM 13
3507- radeonsi: don't crash on NULL images in si_check_needs_implicit_sync
3508- ac/llvm: add support for 16-bit source operands for samplers
3509- ac/llvm: implement 16-bit and 64-bit fpow correctly
3510- ac/llvm: fix visit_load_ubo_buffer to use SMEM for 16 bits instead of VMEM
3511- ac/llvm: add type parameter into ac_build_buffer_load to fix 16-bit TES inputs
3512- ac/llvm: open code fpow on LLVM 12 using fmul.legacy
3513- driconf: add performance tweaks for viewperf
3514- ac/surface: select best swizzle mode for 3D sampler performance
3515- ac,radeonsi: add sampler changes for Aldebaran
3516- ac: set the TCC line size for Aldebaran
3517- ac/llvm: unpack thread IDs on Aldebaran
3518- ac: handle bigger instruction prefetch for Aldebaran
3519- ac,radeonsi: use correct VGPR granularity on Aldebaran
3520- ac: remove switch cases for pc_lines for compute-only chips
3521- radeonsi: enable RGP on gfx10.3
3522- gallium/u_threaded: don't sync in create_stream_output_target
3523- gallium: add pipe_screen::num_contexts for skipping mutex locking in util_range
3524- radeonsi: update pipe_screen::num_contexts
3525- ac/llvm: handle demote in LLVM 13 that just added support for it
3526- ac/gpu_info: fix more non-coherent RB and GL2 combinations
3527- radeonsi: use pipe_sampler_state::border_color_is_integer to simplify stuff
3528- mesa: fix Blender crash due to optimizations in buffer reference counting
3529- mesa: add assertions for buffer reference counts
3530- mesa: fix a oldNum typo in reallocation in _mesa_reserve_parameter_storage
3531- mesa: don't overallocate ParameterValues 4 times (v2)
3532- mesa: clear reserved parameter storage because it's stored in the shader cache
3533- mesa: fix parameter reservation size
3534- st/mesa: add a driconf option to transcode ETC2 to DXTC
3535- util: add most missing formats with reversed RGB channel order
3536- util: fail assertion in util_format_linear if it can't translate SRGB format
3537- util: add new helper util_format_rgb_to_bgr
3538- radeonsi: select the optimal micro tile mode at clear regardless of fast clear
3539- radeonsi: add a fast path for MSAA resolving with RGB -> BGR swizzling
3540- amd/addrlib: add back the incorrect original DCC checking
3541- amd/addrlib: prevent defining regparm differently
3542- amd/addrlib: define endianess differently
3543- amd: update addrlib
3544- ac/llvm: don't set unsupported xnack options to fix LLVM crashes on gfx6-8
3545- radeonsi: disable sparse buffers on gfx7-8
3546- radeonsi: set the clear/copy cache policy based on L2 cache size
3547- radeonsi: don't insert start/stop pipeline stat events if it has no effect
3548- radeonsi: never set DISABLE_WR_CONFIRM for CP DMA clears and copies
3549- radeonsi: rename internal compute sync flags
3550- radeonsi: remove unused SI_CP_DMA_SKIP_* definitions
3551- radeonsi: merge CP DMA flags with internal compute flags
3552- radeonsi: inline clear_buffer in si_screen_clear_buffer
3553- radeonsi: set compute/cpdma sync flags in the outermost caller
3554- radeonsi: reduce syncing in si_dcc_decompress
3555- radeonsi: reduce syncing for initializing new buffers
3556- radeonsi: reduce syncing in si_compute_expand_fmask when it's already idle
3557- radeonsi: don't do an L2 flush in compute_do_clear_or_copy if we're not syncing
3558- radeonsi: rename and apply SI_OP_CPDMA_SKIP_CACHE_FLUSH to compute as well
3559- radeonsi: use the optimal packet order before draw packets for VS flushes too
3560- radeonsi: add SI_CONTEXT_PFP_SYNC_ME to skip syncing PFP for image operations
3561- radeonsi: return false from si_is_format_supported instead of NULL
3562- radeonsi: don't use constbuf and set cache policy for 12-byte clear shader
3563- radeonsi: don't use a constant buffer for the copy_image compute shader
3564- radeonsi: decrease the maximum variable block size
3565- radeonsi: pack the variable block size in one SGPR, 10 bits per component
3566- amd: fix parsing the last dword of DMA_DATA packets
3567- ac/surface: add CMASK info for level 0
3568- radeonsi: determine accurately whether the framebuffer state has DCC MSAA
3569- radeonsi: remove si_screen::dcc_msaa_allowed
3570- radeonsi: parallelize CMASK and DCC clears
3571- radeonsi: return success/failure from si_alloc_separate_cmask
3572- radeonsi: add num_layers variable into si_do_fast_color_clear
3573- radeonsi: group and parallelize all clears in si_texture_create_object
3574- radeonsi: set better default depth clear value
3575- radeonsi: enable HTILE with mipmapping on gfx9+
3576- radeonsi: unset PIPE_CLEAR_* flags for non-existent buffers
3577- radeonsi: turn the loops over color buffers into while loops in si_clear
3578- radeonsi: don't use CP DMA for clears/copies except for very small ones
3579- ac/surface/tests: move shareable code into ac_surface_test_common.h
3580- radeonsi: fix si_compute_copy_image if DCC decompression happens before a copy
3581- gallium/pb: pass an optional winsys pointer to the buffer destroy function
3582- winsys/radeon: rename radeon_bo_reference -> radeon_ws_bo_reference
3583- radeon_winsys.h: add a new function radeon_bo_reference that takes a winsys
3584- radeon_winsys.h: add a winsys parameter to most winsys buffer functions
3585- winsys/amdgpu: remove amdgpu_winsys_bo::ws
3586- winsys/amdgpu: add amdgpu_cs::ws to reduce dereferences
3587- gallium/pb: change pb_buffer::alignment to alignment_log2
3588- gallium/pb: remove 8 bytes from pb_buffer by packing variables
3589- winsys/amdgpu: remove another 8 bytes from amdgpu_winsys_bo by packing better
3590- ac/surface: split dcc level info from surface_info to save space
3591- ac/surface: overlap color and Z/S fields using a union in legacy_surf_layout
3592- ac/surface: change legacy_surf_level::offset to 32-bit offset_256B shifted by 8
3593- ac/surface: inline and reorder gfx9_surf_flags for better packing
3594- ac/surface: pack gfx9_surf_meta_flags better
3595- ac/surface: pack gfx9_surf_layout:resource_type better to save 8 bytes
3596- ac/surface: pack radeon_surf::num_htile_levels better
3597- ac/surface: pack alignments by storing log2 in radeon_surf
3598- ac/surface: overlap color and Z/S fields using a union in gfx9_surf_layout
3599- ac/surface: pack radeon_surf better
3600- ac/surface: unify htile_levels and dcc_levels as meta_levels
3601- ac/surface: unify htile_* and dcc_* fields as meta_* fields
3602- ac/surface: use named "color and "zs" structures in unions
3603- radeonsi: don't cache FMASK transactions from CB in L2
3604- radeonsi: restructure DCC disablement into a switch
3605- radeonsi: allow trivial DCC clears for shared textures with DCC constant encode
3606- radeonsi: implement per-level DCC and CMASK fast clears for gfx10+
3607- radeonsi: implement Z/S fast clear for non-zero mipmap levels
3608- radeonsi: parallelize Z/S conversion into TC-compatible with fast color clears
3609- radeonsi: clean up some mess around htile_stencil_disabled
3610- radeonsi: add si_can_fast_clear_depth/stencil helpers
3611- radeonsi: indent the code for TC-compatibility HTILE transition
3612- radeonsi: implement fast Z/S clears using clear_buffer on HTILE
3613- radeonsi: enable DCC fast clears for non-zero mipmap levels and 0/1 clear values
3614- radeonsi: when transitioning to TC-compat HTILE, try to do a proper clear
3615- radeonsi: do Z-only or S-only HTILE clear using a compute shader doing RMW
3616- radeonsi: refine fast clears for small buffers, always use them for large HTILE
3617- radeonsi: try to fix DCC coherency issues with DCC decompression
3618- radeonsi: allow DCC_DECOMPRESS via CB with MSAA textures
3619- ac/surface: only apply the 3D swizzle mode tuning to gfx10+
3620- ac/surface/tests: test Sienna Cichlid and Navy Flounder
3621- ac/surface/tests: fix a random segfault in the modifier test
3622- amd/addrlib: expose DCC address equations to drivers
3623- meson: add an optional OpenMP dependency for AMD tests
3624- ac/surface: add a test of DccAddrFromCoord prototype outside of addrlib
3625- ac/surface: limit the number of swizzle modes that can have displayable DCC
3626- ac,radeonsi: rewrite DCC retiling without the DCC retile map
3627- radeonsi: fix and enable full DCC with MSAA 2x on gfx9
3628- radeonsi: implement DCC MSAA 4x/8x fast clear using DCC equations on gfx9
3629- radeonsi: enable DCC for MSAA 4x and 8x on gfx9
3630- radeonsi: move binding the internal compute shader into si_launch_grid_internal
3631- radeonsi: unify internal compute with SSBOs in si_launch_grid_internal_ssbos
3632- compiler: move TEXTURE_COORD/VERTEX_GENERIC_ATTRIB limits into shader_enums.h
3633- nir: add src and dest types to all IO loads and stores for mediump
3634- nir: add new VARYING_SLOTs and shader info for packed 16-bit varyings
3635- nir: add many passes that lower and optimize 16-bit input/outputs and samplers
3636- glsl: pack 16-bit uniforms in the NIR linker
3637- mesa: implement glUniform for packed FP16 uniforms
3638- mesa: implement glGetUniform for FP16 uniforms
3639- mesa: implement glGetActiveUniform for FP16 uniforms
3640- glsl: lower mediump uniforms to FP16 based on an option
3641- gallium: add PIPE_SHADER_CAP_FP16_CONST_BUFFERS for FP16 uniforms
3642- st/mesa: fix nir_lower_io if it's done right after IO vectorization
3643- ac/llvm: implement 16-bit packed VS outputs and FS inputs
3644- radeonsi: implement 16-bit VS->PS varyings
3645- radeonsi: implement 16-bit VS inputs
3646- radeonsi: optimize and legalize 16-bit samplers
3647- radeonsi: kill 16-bit VS outputs if PS doesn't use them or doing Z-only draw
3648- radeonsi: enable FP16 for mediump on gfx9+ if radeonsi_fp16=true
3649- nir: handle mediump varyings in varying compaction helpers
3650- radeonsi: don't decompress DCC for float formats in si_compute_copy_image
3651- radeonsi: fix automatic DCC retiling after DCC clear and DCC decompression
3652- radeonsi: fix automatic DCC retiling after compute image stores
3653- radeonsi: make the gfx9 DCC MSAA clear shader depend on the number of samples
3654- util: fix (re-enable) L3 cache pinning
3655
3656Marek Vasut (2):
3657
3658- compiler/nir: Increment shader input count and mark as used when adding new gl_PointCoord
3659- etnaviv: Fix point sprite Z,W coordinate replacement
3660
3661Mark Janes (12):
3662
3663- intel: Print GPU timing data based on INTEL_MEASURE
3664- anv: enable timestamp for INTEL_MEASURE
3665- anv: implement anv layer of INTEL_MEASURE
3666- blorp: add hook for INTEL_MEASURE
3667- anv: add hooks to call INTEL_MEASURE
3668- iris: implement iris layer of INTEL_MEASURE
3669- iris: add a iris_context reference to iris_batch
3670- intel: stop tracking submission state in INTEL_MEASURE
3671- intel: support secondary command buffers in INTEL_MEASURE
3672- intel: combine common gather routines in INTEL_MEASURE
3673- intel: check setuid before writing output file in INTEL_MEASURE
3674- Revert "blorp/gen12: Don't use aux address if implicit CCS"
3675
3676Matt Turner (8):
3677
3678- docs/freedreno: Fix a few typos
3679- turnip: Remove unused TU_DEBUG_IR3 flag
3680- docs: Mark VK_KHR_maintenance1 as done on turnip
3681- ci: Use CI_PROJECT_ROOT_NAMESPACE
3682- tu: Skip tu_tiling_config_update_tile_layout() if not using gmem
3683- ci: Disable panfrost g52
3684- Remove Scons leftovers
3685- ir3: Don't count (nopX) towards the wrong category
3686
3687Matti Hamalainen (2):
3688
3689- gallium: Fix broken trace XML output
3690- gallium/tools: update trace scripts to Python 3
3691
3692Mauro Rossi (29):
3693
3694- android: r600/sfn: add sfn_nir_lower_64bit.cpp to Makefile.sources
3695- android: freedreno/hw/isa: Add description of ir3 ISA
3696- android: freedreno/ir3: Switch over to new encoder/decoder
3697- android: pan/mdg: create nir pass to lower image coord bitsize
3698- android: intel: Print GPU timing data based on INTEL_MEASURE
3699- android: anv: implement anv layer of INTEL_MEASURE
3700- android: iris: implement iris layer of INTEL_MEASURE
3701- android: radv: port to using common dispatch code.
3702- android: radv: fix building error in radv_android.c
3703- android: util/fossilize_db: add missing sources to Makefile.sources
3704- android: ac/rgp: fix building error
3705- android: mesa: Move the FXT1 compressor/decompressor to util/
3706- android: pan/bi: reorder static dependencies in gallium/dri
3707- driconf: avoid Non-ASCII character error in driconf_static.py
3708- android: driconf: Generate a static table when no xmlconfig
3709- android: i965: Rename files with "intel\_" prefix to "brw\_"
3710- android: util: create some standalone compression helpers
3711- android: anv: add libcutils shared dependency
3712- android: r600/sfn: fix sfn_nir_algebraic.c gen rules
3713- android: vulkan/util: add vk_descriptors.{c,h} to Makefile.sources
3714- android: amd/addrlib: define endianess to build
3715- android: panfrost: Use the blend shader cache attached to the device
3716- vulkan/util: Fix implicit declaration of ffs for Android build
3717- android: anv: Remove anv_intel.c from Makefile.sources
3718- android: anv: fix build error in anv_android.c
3719- compiler/glsl: fix include for Android build
3720- android: panfrost/lib: add pan_cs.c to Makefile.sources
3721- android: gallium/radeonsi: add nir include path
3722- android: amd/common: add nir include path
3723
3724Michael Tang (5):
3725
3726- microsoft/compiler: Make resource_state_manager only build with_gallium_d3d12
3727- util: Make os_read_file use O_BINARY on Windows
3728- microsoft/spirv_to_dxil: Fix spirv2dxil I/O to use binary mode
3729- microsoft/spirv_to_dxil: Add lowering pass to handle gl_PerVertex
3730- microsoft/spirv_to_dxil: Add extra lowering functions according to the docs on nir_inline_functions
3731
3732Michel Dänzer (53):
3733
3734- ci: Remove .gitlab-ci/meson-build.bat
3735- ci: Use meson test directly instead of ninja test
3736- wsi/x11: Use get_screen_resources_current in wsi_x11_detect_xwayland
3737- ci: Enable process isolation for softpipe & freedreno piglit jobs
3738- ci: Use GNU time as meson test wrapper
3739- ci: Run 'time' in the background and propagate signals to test process
3740- ci: Fix MESA_TEMPLATES_COMMIT value
3741- ci: Update to newer ci-fairy
3742- ci: Set GALLIVM_PERF=no_filter_hacks for llvmpipe-piglit-quick_shader
3743- ci: Set GALLIVM_PERF=no_filter_hacks for llvmpipe-piglit-quick_gl
3744- ci: Set GALLIVM_PERF=nopt,no_filter_hacks for llvmpipe-gles2
3745- ci: Use MESA\_ namespace for image variables in Windows jobs
3746- ci: Use MESA_IMAGE_TAG everywhere
3747- ci: Move FDO_DISTRIBUTION_TAG assignment to template
3748- ci: Add and use .set-image template to construct docker image name
3749- ci: Incorporate base image tag into dependent image tags
3750- ci: Append build image tag to LAVA tag used for minio path
3751- ci: Add trailing slash to path for documentation preview
3752- ci: Restrict meson-gallium job to gstreamer runners
3753- ci: Disable scons-win64 job
3754- ci: Move meson-build.sh to meson/build.sh
3755- ci: Drop SIGINT handling from meson test wrapper script
3756- ci: Move /usr/bin/time check from meson test wrapper to build script
3757- aco/tests: Use _exit in child process
3758- ci: Add strace to the x86_build docker image
3759- ci: Run meson tests in strace if it's available and can be used
3760- ci: Don't run meson tests in strace for meson-mingw32-x86_64 job
3761- intel/tools: Use subprocess.Popen to read output directly from a pipe
3762- Revert "ci: Restrict meson-gallium job to gstreamer runners"
3763- glcpp: Fully initialize struct gl_context
3764- ci: Disable valgrind in some build jobs
3765- glsl/tests: Bump glcpp valgrind test timeout to 240 seconds
3766- glsl/tests: Don't use tempfiles
3767- glsl/tests: Use exit code 126 to detect valgrind errors
3768- Revert "ci: disable glcpp tests for now"
3769- Revert "meson: add enable-glcpp-tests option"
3770- Revert "glsl/test: Don't run whitespace tests in parallel"
3771- ci: Remove INCLUDE_PIGLIT
3772- ci: Build ARM baremetal rootfs in native container
3773- ci: Merge ARM testing docker images to a single arm_test one
3774- wsi/x11: Wait for fences with IMMEDIATE on Xwayland
3775- ci: Fix HTML summary path for piglit OpenCL job artifacts
3776- intel/blorp: Initialize texture_data[0]
3777- ci: Do not install armhf LLVM packages
3778- ci: Bump LLVM/clang from 10 to 11
3779- ci: Move docker images from Debian buster to bullseye
3780- ci: Install librenderdoc from Debian bullseye
3781- ci: Install spirv-tools from Debian bullseye
3782- ci: Install llvm-spirv from Debian bullseye
3783- ci: Install GLVND from Debian bullseye
3784- ci: Install Rust & cargo from Debian for x86_test* images
3785- ci: Do not append ci-templates commit hash to Windows docker image tag
3786- ci: Update to latest ci-templates
3787
3788Michel Zou (25):
3789
3790- vulkan/lavapipe: add missing VKAPI_ATTR/CALL
3791- vulkan: Fix windows api conflict
3792- zink: Fix win32 build
3793- vulkan: Fix windows api conflict
3794- meson: invalid keyword argument dependencies
3795- zink: fix win32 build
3796- util: fix gcc vsnprintf overflow
3797- glapi: keep declspec(thread) msvc-specific
3798- vulkan: implement wsi_win32 backend
3799- lavapipe: add mingw32 def file
3800- lavapipe: set empty dll prefix
3801- gallium: remove DROP_PIPE_LOADER_MISC
3802- meson/xmlconfig: win32 regex fallback
3803- meson: detect winflex/bison only on native win32
3804- turnip: update features.txt
3805- lavapipe: update features.txt
3806- vulkan: fix CreateRenderPass prototype
3807- swr: extern declaration for win32 intrinsics
3808- swr: fix win32 intrinsics
3809- swr: Fix SWR_CONTEXT pre-declaration
3810- swr: fix unused SplitString warning
3811- swr: fix deprecated llvm 11 declaration warning
3812- swr: fix array-bounds warning
3813- lavapipe: Fix type narrowing
3814- docs: missing lvp win32surface ext in features.txt
3815
3816Mike Blumenkrantz (775):
3817
3818- zink: clamp sampler+samplerview limits
3819- util/hash_table: optimize rehash for empty table and no-func clears
3820- util/set: optimize rehash for empty table and no-func clears
3821- util/set: add the found param to search_or_add
3822- util/set: split off create() into an init() function
3823- zink: optimize renderpass hash table
3824- nir/lower_uniforms_to_ubo: set explicit_binding on uniform_0
3825- zink: add spirv builder function for runtime array type
3826- zink: add util function for emitting ntv atomic ops
3827- zink: add set_shader_buffers pipe_context method
3828- zink: hook up ssbo shader bindings
3829- zink: emit ssbo variables in ntv
3830- zink: modify ubo loading in ntv to work for ssbos
3831- zink: start supporting atomic shader ops
3832- zink: split UBOs and samplers into 'read' batch references during draw
3833- zink: flag ssbo buffer resources as having pending writes on batch
3834- zink: add more usage bits for buffer types
3835- zink: partially enable SSBO pipe cap
3836- zink: only emit streamout targets during draw if we have them
3837- zink: rework framebuffer state
3838- zink: add batch flag for checking renderpass state
3839- zink: remove renderpass refcounting
3840- zink: ralloc zink_framebuffer structs
3841- zink: rename param in zink_create_framebuffer
3842- zink: use 'fb' variable name for zink_framebuffer objects in zink_framebuffer.c
3843- zink: decouple renderpass from framebuffer state
3844- zink: move zink_clear to zink_clear.c
3845- zink: start to refactor clearing
3846- zink: handle clears with scissor regions
3847- zink: break out scissor region testing for clear functions
3848- zink: break out color/zs no_rp clear into separate functions
3849- zink: break out some of the u_blitter setup into util function
3850- zink: add a pipe_context::clear_texture hook
3851- zink: enable PIPE_CAP_CLEAR_TEXTURE
3852- zink: reduce blendfactor when alpha_to_one is set
3853- zink: tweak xfb slot mapping in ntv
3854- zink: process ubos with location values set as long as they're actually ubos
3855- zink: add VK_KHR_driver_properties
3856- zink: enable WSI-faking for RADV too
3857- zink: rename zink_context::\*image_views -> sampler_views
3858- zink: add ntv util function for getting image type
3859- zink: rewrite image/sampler glsl -> vk type functions for robustness
3860- zink: add spirv_builder function for hexops
3861- zink: add spirv builder functions for image ops
3862- zink: add ntv function for emitting variable access decorations
3863- zink: verify format caps and add storage image usage when possible in creation
3864- zink: add 'has_draw' flag to batch struct
3865- zink: add a pipe_context::memory_barrier hook
3866- zink: add shader image support to zink_binding()
3867- zink: add new 'sampler_types' variable to ntv_context struct
3868- zink: handle image variable types in ntv
3869- zink: handle more atomic ops in ntv
3870- zink: handle nir_intrinsic_memory_barrier in ntv
3871- zink: add nir_var_uniform case to get_storage_class()
3872- zink: expand ntv array derefs to track image derefs
3873- zink: add handling for all basic image ops in ntv
3874- zink: enable early frag test execution in ntv when necessary
3875- zink: enable image caps in ntv when a shader has images
3876- zink: handle image descriptors during zink_shader creation
3877- zink: break out bufferview creation into separate function
3878- zink: add a pipe_context::set_shader_images hook
3879- zink: handle shader image descriptor updates during draw
3880- zink: check if multisample support exists for shader image formats
3881- zink: export shader image caps using features
3882- zink: GLSL 420
3883- docs/features: mark off GL 4.2 for zink
3884- zink: set PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATS
3885- zink: force per-sample interpolation
3886- zink: set PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT
3887- zink: set PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR
3888- zink: support VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL usage
3889- zink: add layout member to barrier setup in draw
3890- zink: support PIPE_FORMAT_X24S8_UINT
3891- zink: check correct caps for PIPE_CAP_IMAGE_LOAD_FORMATTED
3892- zink: enable PIPE_CAP_SAMPLER_VIEW_TARGET
3893- gallium/u_inlines: add helper for simplifying pipe_context::resource_copy_region
3894- zink: add function for waiting on a specific batch's fence
3895- zink: don't force a renderpass start when setting framebuffer state
3896- spirv: handle NoContraction in GLSL450 alu ops
3897- zink: fix streamout for clipdistance
3898- zink: add a VkExternalMemoryImageCreateInfo for PIPE_BIND_SHARED images
3899- zink: set lower_mul_2x32_64 when 64bit int support is available
3900- zink: enable PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE
3901- zink: flag gfx pipeline dirty using newer mechanism
3902- zink: guarantee surface lifetimes for shader images
3903- st/bitmap: use GL_CLAMP_TO_EDGE for bitmap samplers
3904- st/drawpixels: use GL_CLAMP_TO_EDGE instead of GL_CLAMP for samplers
3905- zink: don't export PIPE_CAP_MAX_COMBINED_SHADER_BUFFERS value
3906- zink: add spirv builder function for OpAtomicStore
3907- zink: flag ssbo buffer resources as having pending writes per stage
3908- zink: handle null ssbo attachments without crashing
3909- zink: handle more ssbo ops in ntv
3910- zink: rework ssbo indexing and binding
3911- zink: support nir_intrinsic_store_ssbo
3912- zink: implement get_ssbo_size nir intrinsic
3913- zink: flatten out ssbo/ubo variable decls in ntv
3914- zink: export ssbo caps
3915- Revert "glcpp: disable 'windows' tests"
3916- meson: add enable-glcpp-tests option
3917- ci: disable glcpp tests for now
3918- zink: add barrier helper for buffer resources
3919- zink: add a stage param for buffer resource barriers
3920- zink: add helper function for checking if access flags include write access
3921- zink: improve barrier helper for buffer resources and add check for barrier need
3922- zink: flag previous vertex stages as dirty when toggling a later stage
3923- zink: add shader key for vs shaders
3924- zink: flag shaders as needing update when clip_halfz changes
3925- zink: move tess/geom shader info to vs shader key
3926- glsl: support 64bit integer loop iterators
3927- radv: print image array size in debug mode
3928- zink: move maintenance2 extension to right file
3929- zink: unify shader image unbind codepath
3930- zink: be a little more precise about query types in one conditional
3931- radv: null bo list pointer for null descriptors on update
3932- radv: zero the bo descriptor array when allocating a new set
3933- zink: force 4 component formats for samplerview/render textures
3934- zink: support nir_intrinsic_memory_barrier_buffer
3935- zink: add defines for compute batch and gfx batch count
3936- zink: bump resource usage flags to allow 5 batches
3937- zink: make get_resource_usage() public
3938- zink: make zink_batch_reference_resource_rw return usage info
3939- zink: wait on compute batch when necessary during transfer map
3940- zink: add spirv_builder function for emitting a 3word literal exec mode
3941- zink: handle COMPUTE bindings in compiler/ntv
3942- zink: handle COMPUTE setup in ntv
3943- zink: handle COMPUTE glsl variables
3944- zink: implement shared load/store nir ops in ntv
3945- zink: add handling for shared atomic ops in ntv
3946- zink: handle nir_intrinsic_memory_barrier_shared in ntv
3947- zink: ignore compute batch when starting/ending batches
3948- zink: take a pipe_reference param in zink_batch_reference_program
3949- zink: refactor batch creation
3950- zink: make allocate_descriptor_set() take more params instead of a gfx_program
3951- zink: explicitly get shader stage from shader during binding setup in draw
3952- zink: rename pipeline_cache_entry -> gfx_pipeline_cache_entry
3953- zink: add compute programs and pipelines
3954- zink: break out descriptor updating into separate function
3955- zink: setup compute batch and add handling
3956- zink: handle memory barriers for compute batch
3957- zink: handle descriptor set updates for compute operations
3958- zink: flush gfx/compute batches when the other pipeline needs resource sync
3959- zink: add launch_grid pipe_context hook for compute handling
3960- zink: export compute-specific shader/compute caps
3961- zink: enable compute
3962- zink: GLSL 430
3963- features: mark off GL 4.3 for zink
3964- zink: add spirv_builder wrapper for vote intrinsics
3965- zink: handle vote intrinsics in ntv
3966- zink: rework viewport handling
3967- zink: handle nir_texop_texture_samples
3968- zink: add a texture barrier hook
3969- zink: use = and not \|= for VkMemoryPropertyFlags during resource creation
3970- zink: set HOST_COHERENT bit for coherent resource creation
3971- zink: track persistent, non-coherent, writable transfer map count for resources
3972- zink: slightly refactor batch resource referencing in update_descriptors()
3973- zink: flush all resources with persistent maps on work batch before draw/compute
3974- zink: enable PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT
3975- zink: rename zink_context::dummy_buffer -> dummy_vertex_buffer
3976- zink: create dummy xfb buffer
3977- zink: handle null xfb buffers
3978- zink: use better mapping for PIPE_FORMAT_X24S8_UINT
3979- zink: handle nir_intrinsic_load_helper_invocation
3980- zink: fix xfb buffer refcounting
3981- zink: add PIPE_BIND_QUERY_BUFFER to the all-purpose resource creation path
3982- zink: add a get_query_result_resource hook
3983- zink: enable PIPE_CAP_TGSI_ARRAY_COMPONENTS
3984- zink: enable PIPE_CAP_QUERY_BUFFER_OBJECT
3985- zink: GLSL 440
3986- zink: enable PIPE_CAP_CONDITIONAL_RENDER_INVERTED
3987- zink: enable PIPE_CAP_CLIP_HALFZ
3988- zink: enable PIPE_CAP_TGSI_TXQS
3989- zink: enable PIPE_CAP_TEXTURE_BARRIER
3990- zink: GLSL 450
3991- features: mark off GL 4.5 for zink
3992- zink: add spirv interfaces for bo and image/sampler/push variables
3993- zink: lower flrp64 and ffma64 when in softfp64 mode
3994- zink: always use query->type for starting/stopping xfb queries
3995- zink: make the xfb_query_pool into an array
3996- zink: break out cpu query reading for qbos into separate function
3997- zink: put SO_OVERFLOW queries on the primgen list
3998- zink: support SO_OVERFLOW pipe query types
3999- zink: fix streamout for tess stage
4000- zink: flag exact alu op results in ntv with NoContraction
4001- zink: unset generated TCS if its parent TESS is unset
4002- zink: hook up cs invocation queries to the compute batch
4003- zink: add support for pipeline statistics queries
4004- zink: fix slot mapping for legacy gl io with tess stages
4005- zink: handle 1bit undef values in ntv
4006- zink: add handling for ARB_shader_draw_parameters variables in ntv
4007- zink: create a struct for tracking push constant layout
4008- zink: rework tcs injection to be more compatible with new push const struct
4009- zink: add push constant value to indicate whether the current draw is indexed
4010- zink: wrap shader gl_BaseVertex access with a bcsel based on push constant state
4011- zink: add a draw_id param to vs push constants
4012- zink: add a vs shader key for rewriting gl_DrawID
4013- zink: break out push constant creation in compiler and add drawid value
4014- zink: rewrite drawid based on shader key value
4015- zink: add util function for submitting the compute batch
4016- zink: enable PIPE_CAP_TGSI_VOTE
4017- zink: enable PIPE_CAP_DRAW_PARAMETERS
4018- zink: enable PIPE_CAP_POLYGON_OFFSET_CLAMP
4019- zink: enable PIPE_CAP_QUERY_SO_OVERFLOW
4020- zink: enable pipeline statistics cap
4021- zink: PIPE_CAP_GL_SPIRV
4022- zink: GLSL 460
4023- features: mark off GL 4.6 and ES 3.1 for zink
4024- zink: support nir_intrinsic_group_memory_barrier
4025- zink: fix device codegen extension detection
4026- zink: add nir_intrinsic_memory_barrier_image handling
4027- zink: use nir_shader_instructions_pass for draw params pass
4028- zink: add flag for no-oping fence finish
4029- zink: hook up valid_buffer_range for buffer resources using util_range
4030- zink: create a VkPipelineCache object on the screen and use it
4031- zink: add a disk cache for pipeline objects
4032- gallium/trace: add a pipe_screen::get_compiler_options method
4033- zink: handle dual blending override from driconf
4034- zink: move command pool to the batch
4035- nir/lower_tex: rewrite tex/txb -> txd/txl before saturating srcs
4036- mesa/st: add pipe_sampler_state::border_color_is_integer
4037- mesa/st: add PIPE_CAP_GL_CLAMP
4038- zink: enable GL_CLAMP cap
4039- gallium/trace: remove transfer_map assert
4040- zink: add helper function for getting pipeline stage from shader stage
4041- zink: set buffer resource barriers for descriptor resources in update_descriptors()
4042- zink: rework xfb counter resource barriers
4043- zink: rework xfb barrier transitions when reusing as vertex inputs
4044- zink: remove aspect param from zink_resource_barrier
4045- zink: add a VkPipelineStageFlags param to zink_resource_barrier()
4046- zink: add helper for image resource barriers and avoid unnecessary barriers
4047- zink: use define for max descriptor array size
4048- zink: add generic wrapper for checking whether a resource needs a barrier
4049- zink: avoid emitting unnecessary pipeline barriers during update_descriptors
4050- zink: break out barrier transitioning in update_descriptors
4051- zink: combine resource barriers where possible during update_descriptors
4052- zink: take struct zink_batch param instead of direct cmdbuf in barrier helpers
4053- zink: assert batch is not in a renderpass when emitting pipeline barrier
4054- zink: add barriers for index and draw param buffers
4055- zink: add access param for image resource barriers
4056- zink: add access info for update_descriptor image barriers
4057- zink: add batch references for resources in clear functions
4058- zink: improve barrier usage for clear functions
4059- zink: zink_resource_barrier -> zink_resource_image_barrier
4060- zink: add general zink_resource_barrier() wrapper
4061- zink: be more explicit with image barriers for copy operations
4062- zink: fix surface creation for cube slices
4063- zink: tag some missing ES features
4064- zink: update relnotes
4065- zink: just call context destructor on creation fail
4066- zink: add buffer barriers for resource_copy_region
4067- zink: break out buffer copying into util function with batch param
4068- zink: just end the current renderpass in zink_batch_no_rp()
4069- zink: break out even more of zink_blit state saving
4070- zink: use vkGetFenceStatus when we're obviously checking for status
4071- zink: fix buffer resource usage flags
4072- zink: break out query result buffer copying into util function
4073- zink: simplify some of the qbo direct buffer write code
4074- zink: better handling for availability queries on qbos when query/resource is busy
4075- zink: improve batch flushing for queries when compute batches are involved
4076- zink: always use 64bit flag for query results
4077- zink: handle scissor+viewport states dynamically if extension is available
4078- zink: remove 'scissors' member of viewport state
4079- zink: always set VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT for non-staging resources
4080- zink: add available|visible masks to all barriers in ntv
4081- zink: set conformant ubo/ssbo size limits
4082- zink: destroy renderpass objects on context destroy
4083- zink: rename 'has_draw' flag on batches and set it when the batch is used
4084- zink: move gfx pipeline creation closer to the bind point
4085- zink: only reset pipeline hash conditionally when updating fb state
4086- zink: simplify barrier usage
4087- zink: beef up zink_transfer_flush_region
4088- zink: only wait on last write-batch for resources during transfer_map
4089- zink: change some transfer_map cases of waiting on cs batch to flushing cs
4090- zink: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE in transfer_map for buffers
4091- zink: update ci results
4092- zink: disable some builtin-gl-sample-mask sample shading tests on ci
4093- zink: actually disable sample mask tests on ci
4094- lavapipe: fix color-only renderpass clears
4095- zink: ralloc the main context
4096- zink: create framebuffer and renderpass objects just before vkCmdBeginRenderPass()
4097- zink: defer pipe_context::clear calls when not currently in a renderpass
4098- zink: also defer fb clears when conditional render is active
4099- zink: break out region overlap testing function into helper
4100- zink: add helper for converting pipe_box -> u_rect
4101- zink: add another helper for checking whether one rect covers another
4102- zink: break out fb clear apply into helper function
4103- zink: add helper for applying/discarding clears based on a rect
4104- zink: discard pending clears during blit/copy if we'll overwrite the data
4105- zink: add yet another clear helper, this time for applying overlap regions
4106- zink: optimize the remaining read cases of applying pending clear calls
4107- zink: move all the clear stuff to zink_clear.h
4108- zink: always do full-fb clears in renderpass begin when possible
4109- zink: ci changes
4110- zink: improve descriptor set oom handling
4111- zink: ci updates
4112- zink: set PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK
4113- zink: force 128 fs input components on intel drivers
4114- zink: add some spirv builder functions for spec constants
4115- zink: support gl_LocalGroupSize
4116- zink: add more ci flakes
4117- util/bitscan: add u_foreach_bit macros
4118- v3dv: remove for_each_bit() macro
4119- radv: for_each_bit -> foreach_bit
4120- freedreno/vulkan: for_each_bit -> foreach_bit
4121- anv: for_each_bit -> foreach_bit
4122- zink: use 0 as default for spec constants
4123- zink: no-op descriptor updating for draws without descriptors
4124- nir/texcoord_replace: add a yinvert param
4125- zink: store prim mode to context during draw
4126- zink: handle point sprite
4127- zink: ci updates
4128- zink: avoid memset during update_descriptors() for resources refs
4129- zink: move samplerview referencing around in update_descriptors()
4130- zink: reorder zink_bind_vertex_buffers()
4131- zink: create a single fence per batch on startup and then reuse
4132- zink: only flush batches in pipe_context::flush if they actually have work
4133- zink: add a define for compute batch count
4134- zink: add util function for returning previous batch
4135- zink: handle PIPE_FLUSH_DEFERRED
4136- zink: handle VK_IMAGE_LAYOUT_PRESENT_SRC_KHR barriers
4137- zink: set VK_IMAGE_LAYOUT_PRESENT_SRC_KHR on fb resources at eof flush
4138- zink: setup CmdBindVertexBuffers2EXT member in screen for dynamic state
4139- zink: make dynamic state usage in pipeline creation more explicit/flexible
4140- zink: use dynamic vertex buffer strides
4141- zink: rename zink_context::buffers -> vertex_buffers (and usage mask)
4142- zink: add zink_program struct as a base class for compute/gfx structs
4143- zink: use zink_program in zink_batch_reference_program()
4144- zink: ralloc zink program structs
4145- zink: unref programs last in batch reset
4146- zink: properly size descriptorset layout binding stack array
4147- zink: increment batch->descs_used during update_descriptors flushing
4148- zink: do batch-program tracking after possibly cycling batch in update_descriptors()
4149- zink: add spirv builder methods for OpImageQueryLevels
4150- zink: hook up nir_texop_query_levels
4151- zink: relax tessellation shader reqs
4152- zink: ci updates
4153- zink: fix dynamic bo lowering for ssbo stores
4154- zink: pre-fetch all format properties during screen init
4155- zink: use pre-fetched format properties everywhere
4156- zink: don't start renderpasses during descriptor update
4157- zink: add more usage bits for buffer resource creation
4158- zink: handle null src for fb refs
4159- zink: track all framebuffers per batch
4160- zink: store total memory size on zink_screen
4161- zink: track resource mem usage per batch
4162- zink: force batch flush if batches are using more than 1/10 total system memory
4163- mesa/st: clamp scissored clear regions to fb size
4164- mesa/st: no-op scissored clear calls with size zero
4165- zink: handle GLSL_SAMPLER_DIM_EXTERNAL in ntv
4166- zink: ci updates
4167- mesa/st: even better no-oping for clears
4168- zink: apply only the pending zs clear bits during deferred clears
4169- zink: enable PIPE_CAP_CLEAR_SCISSORED
4170- zink: export PIPE_CAP_TGSI_VS_LAYER_VIEWPORT
4171- zink: use staging resource for write transfer_map in order to not stall
4172- zink: ci updates
4173- zink: rewrite macro for getting KHR device functions
4174- zink: add vk/spirv caps/extension for shader LAYER variable
4175- zink: remove ntv streamout assert
4176- zink: fix streamout emission for super-enhanced layouts
4177- zink: fix slot mapping for fat io variables
4178- zink: fix location usage for explicit xfb outputs
4179- zink: run more nir passes for tess shaders
4180- zink: stop allocating xfb slot map
4181- zink: handle direct xfb output from output variables
4182- zink: evaluate existing slot map during program init and force new map as needed
4183- zink: rename variable in update_so_info()
4184- zink: use info.has_transform_feedback_varyings to determine xfb enablement
4185- zink: pass so_info directly to update_so_info()
4186- zink: use slightly stricter check for update_so_info() callsite
4187- zink: only export necessary xfb outputs to ntv
4188- zink: don't pass so_info to ntv at all unless it's necessary
4189- zink: unref ctx->framebuffer on context destroy
4190- zink: fix instance/device versioning (for real this time)
4191- zink: simplify some update_descriptor code
4192- zink: move descriptor sets/pools from batches to programs
4193- zink: store and reuse descriptorsets after batch completion
4194- zink: move descriptor set alloc function to zink_program.c
4195- zink: use more precise sizing for descriptor pools
4196- zink: add helper function for cycling a batch
4197- zink: even better handling for descriptor oom
4198- zink: remove flushes for batch descriptor use
4199- zink: add bucket allocating for descriptor sets
4200- zink: add scaling factor for descriptor set bucket allocations
4201- zink: add caching for descriptor sets
4202- zink: add second level cache for descriptor sets
4203- zink: move streamout to draw_vbo
4204- zink: reorder descriptor barrier applying during updating
4205- zink: move surface refs to the end of descriptor updating
4206- zink: split descriptor sets based on usage
4207- zink: use dynamic offsets for first ubo
4208- zink: introduce descriptor states
4209- zink: add a null sampler view descriptor hash to the screen
4210- zink: pre-hash sampler views and states
4211- zink: store last-used descriptor set for each type of set for quick reuse
4212- zink: actually flag all used resources as used during update_descriptors
4213- zink: add program pointer to desc set struct
4214- zink: move descriptor set allocation near the top of update_descriptors
4215- zink: only batch-reference the program in use once per descriptor update
4216- zink: improve descriptor cache invalidation
4217- zink: add flag for recycled descriptor sets
4218- zink: don't double iterate all the per-batch sets on reset
4219- zink: add VkPipelineLayout to zink_program meta struct
4220- zink: split out ubo descriptor updating
4221- zink: break out ssbo descriptor updating
4222- zink: break out sampler descriptor updating
4223- zink: break out image descriptor updating
4224- zink: deduplicate VkWriteDescriptorSet setup
4225- zink: break out descriptor stuff into new files
4226- zink: break out all the descriptor pool/layout stuff into a new struct
4227- zink: change program pointer on struct zink_descriptor_set to pool pointer
4228- zink: track number of sets currently allocated per descriptor pool
4229- zink: move descriptor type to pool object from set
4230- zink: allow reuse of zink_descriptor_pools between programs
4231- zink: remove intermediate func for descriptor set getting
4232- zink: simplify check for knowing whether descriptor updating is needed
4233- zink: pre-size descriptor transition hash table
4234- zink: move descriptor binding out of the update codepath
4235- zink: reuse descriptor barriers across draws
4236- zink: track resource count on descriptor pool object
4237- zink: directly use resource count from pool instead of accumulating every time
4238- zink: remove struct zink_descriptor_resource from descriptor updating
4239- zink: don't create descriptor barrier hash tables for cached descriptor set
4240- zink: always use VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL for sampler bindings
4241- zink: remove image layouts from descriptor states
4242- zink: avoid hashing states without descriptors
4243- zink: fix arrays of texel buffer descriptors
4244- zink: ci updates
4245- zink: move vertex_buffers_enabled_mask to non-hashed part of pipeline state
4246- zink: optimize pipeline hashing
4247- zink: implement an extremely dumb resource memory cache
4248- zink: ci updates
4249- zink: remove debug printf
4250- lavapipe: fix push descriptor set indexing
4251- lavapipe: set more resource bind flags using image/buffer usage bits
4252- zink: move buffer<->image copying to pipe_context::resource_copy_region hook
4253- zink: remove extraneous flush in transfer_map_region_flush
4254- zink: ci updates
4255- zink: optimize resource usage tracking
4256- zink: use _mesa_set_search_and_add() for set management
4257- zink: ralloc screen objects
4258- zink: implement a surface cache
4259- zink: use a safer iteration for fb surfaces during rp init
4260- zink: determine whether the vulkan driver requires mesa flush wsi
4261- zink: force mesa flush implicit fencing on ANV
4262- zink: force explicit fence only on first frame flush
4263- zink: use internal api for first-frame fence
4264- zink: return true from program ref functions upon free
4265- zink: unset ctx->program pointers when an unref destroys the object
4266- zink: stop leaking programs
4267- vk/util: add unified shader module struct/functions
4268- vk/util: add a util macro for initializing stack vk_shader_module structs
4269- lavapipe: use common interfaces for shader modules
4270- radv: use common interfaces for shader modules
4271- v3dv: use common interfaces for shader modules
4272- tu: use common interfaces for shader modules
4273- anv: use common interfaces for shader modules
4274- zink: add batch usage flags for sampler views/states and desc sets
4275- zink: avoid looping for non-ubo descriptor updates based on set usage
4276- zink: break out batch id finding for resource usage into util function
4277- zink: move resource internals to a separate struct
4278- zink: split out backing resource object create/destroy
4279- zink: track resource_object usage instead of resource usage
4280- zink: handle cached descriptor set punting
4281- zink: add some asserts for pipeline barriers to check renderpass state
4282- zink: add util function for checking whether a shader descriptor is a buffer
4283- zink: add util function for checking whether shader descriptor is buffer from program
4284- zink: use an explicit zink_buffer_view struct
4285- zink: explicitly use zink_surface objects for sampler/image view objects
4286- zink: store bufferview hash to bufferview struct
4287- zink: simplify bufferview and imageview descriptor state hashing
4288- zink: add extra batch tracking for sampler views
4289- zink: massively beef up batch tracking for shader images
4290- zink: add helper function for getting a resource for a descriptor
4291- zink: update null descriptor hashes to reflect current descriptor states
4292- zink: improve debug asserts for samplers/images during descriptor updates
4293- zink: properly handle null bufferview descriptor states
4294- zink: cache bufferviews
4295- zink: add missing null check
4296- zink: unset last_vertex_stage_dirty after applying it
4297- zink: run nir_convert_from_ssa last during compile
4298- zink: use intermediate var for glsl non-array type during shader create
4299- zink: break out bo array type construction into ntv util function
4300- zink: also break out whole ntv bo struct pointer construction
4301- zink: add unsized array type to get_glsl_type_element() handling
4302- zink: add debug info about missing atomic ops
4303- zink: add ntv util function for checking if a glsl type is an atomic counter
4304- zink: break out sized uint array construction into util function
4305- zink: flatten binding numbers a bit
4306- zink: directly set nir variable bindings for reuse during ntv
4307- zink: move zink_binding() to compiler.c
4308- zink: apply Delete All The Code methodology to the ubo/ssbo variables
4309- zink: set ntv variable descriptor sets during compile phase
4310- zink: ci updates
4311- ci/panfrost: disable the rest of these jobs temporarily
4312- zink: hook up resource bind history
4313- zink: remove direct samplerview batch-tracking
4314- zink: add a pipe_context::invalidate_resource hook
4315- zink: set valid region for streamout buffers on bind
4316- zink: handle streamout buffer rebinds
4317- zink: invalidate resources on map when discarding range
4318- zink: enable PIPE_CAP_INVALIDATE_BUFFER
4319- zink: switch to deqp-runner for piglit jobs
4320- zink: always use requested format for sampler view creation
4321- zink: ci updates
4322- zink: more consolidation for null sampler/image view hashing
4323- zink: add a pipe_context::fence_server_sync hook
4324- zink: add enum for different queues
4325- zink: refactor resource_sync_writes_from_batch_usage() to manage batch id internally
4326- zink: convert ZINK_RESOURCE_ACCESS defines to enum
4327- zink: abstract zink_get_resource_usage() and move it to be internal
4328- zink: return enum zink_queue from zink_batch_reference_resource_rw()
4329- zink: split out batch resource-set clearing into separate function
4330- zink: move active query pruning to batch reset
4331- zink: move batch init into zink_batch.c
4332- zink: also move batch destructor into zink_batch.c
4333- zink: move other batch-tracking implementations to unified codepath
4334- zink: use macro to streamline batch struct member init
4335- zink: remove query batch-tracking init from begin_query()
4336- zink: move fence reset to zink_fence_init()
4337- zink: clear framebuffer state on context destroy
4338- zink: enable spirv extension for post depth coverage
4339- compiler/spirv: fix image sample queries
4340- zink: handle nir_intrinsic_image_deref_samples
4341- zink: flatten 2d_array surfaces when necessary
4342- lavapipe: support VK_KHR_copy_commands2
4343- lavapipe: rewrite cmdbufs to always do descriptor binds/pushes first
4344- lavapipe: force state updates when beginning queries
4345- llvmpipe/setup: force fs constant updating upon beginning queries
4346- zink: break out surface viewtype clamping into util function
4347- zink: improve surface viewtype clamping
4348- zink: correctly clamp samplerview surface types
4349- ci/lavapipe: split out lavapipe ci into lavapipe dir
4350- llvmpipe/setup: use bigger hammer to force fs constant updating correctly
4351- zink: split off a bunch of batch struct members to new batch state struct
4352- zink: rewrite queue dispatch to use monotonic batch ids instead of hardcoded ones
4353- zink: more accurately check samplecount caps for shader images
4354- zink: make fb ref func return bool on free
4355- zink: add explicit surface/bufferview batch-tracking functions
4356- zink: use surface references for fb attachments
4357- zink: break out surface destroy function into a screen function
4358- zink: use a custom surface referencing function whenever unrefing a surface
4359- zink: implement a global framebuffer cache
4360- vk: consolidate dynamic descriptor binding sorting
4361- ci: update xfails for ppc64le and s390x
4362- zink: break out buffer mapping part of zink_transfer_map
4363- zink: cache transfer maps
4364- zink: unify clear color conversion code
4365- nir: add nir_lower_indirect_builtin_uniform_derefs()
4366- st/glsl_to_nir: lower indirect derefs of builtins in non-packed uniform case
4367- softpipe: ci updates
4368- zink: move 'batch_id' and 'is_compute' members to fence
4369- zink: make batch usage unsetting function public
4370- zink: always reset batch states when finding a new one
4371- zink: move batch-tracked resources to fence object
4372- zink: fix spirv image operand ordering
4373- zink: fix multisampled shader image load/store
4374- zink: force PIPE_SWIZZLE_1 for X channels in samplerviews
4375- zink: handle blitting of color formats with ignored alpha channels
4376- zink: emulate PIPE_FORMAT_R8G8B8X8_UNORM
4377- zink: ci updates
4378- zink: relax unreachable() to debug_printf when waiting on batch
4379- zink: rework public batch flush function to be useful again
4380- zink: move zink_flush_compute() users to zink_flush_queue()
4381- zink: always flag xfb barrier on gfx flush when appropriate
4382- zink: simplify some queue-related query code
4383- zink: refactor clears a little to track a bitfield of enabled clears on the context
4384- zink: trigger pending clears during flush
4385- zink: ci updates
4386- zink: add wrapper to reset batch state structs
4387- zink: call clear() instead of reset() for batch states on context destroy
4388- zink: unify gfx and compute batches
4389- zink: isolate gfx stage bits when updating shader modules
4390- zink: store conditional render predicate to query and split out start/stop
4391- zink: only update conditional render buffer when it needs to be updated
4392- zink: toggle conditional render when beginning/ending a renderpass
4393- zink: ci updates
4394- zink: handle gallium multi draws more effectively
4395- zink: create separate upload mgr for constants
4396- zink: explicitly use stream uploader for staging buffers
4397- zink: add buffer_subdata hook
4398- zink: avoid unnecessary resource refs during descriptor update
4399- zink: remove handling for resource flushing between compute/gfx batches
4400- zink: remove unnecessary flush during image maps
4401- zink: add more rp cache asserts
4402- compiler/spirv: use undefs when extending image coords
4403- zink: don't generate sampled image type for non-sampled images
4404- util/set: stop leaking u32 key sets which pass a mem ctx
4405- lavapipe: fix CmdCopyQueryPoolResults for partial pipeline statistics queries
4406- lavapipe: use the passed offset for CmdCopyQueryPoolResults
4407- lavapipe: stop tracking draw start/count on rendering state
4408- zink: ci updates
4409- lavapipe: ignore templateType when descriptor template isn't for push descriptors
4410- lavapipe: remove lvp_descriptor_update_template::descriptor_set_layout
4411- zink: fix handling for image types in resource_copy_region hook
4412- zink: also fix image buffer layer copying
4413- lavapipe: fix array texture region copies
4414- zink: only do shader updates when relevant stages are dirty
4415- zink: use correct surface ref function for context destroy
4416- zink: stall when we start getting a lot of uncompleted batches
4417- zink: reset all fences when waiting on batch state
4418- zink: fix format support detection for storage texel buffers and shader images
4419- zink: break out image/buffer create info structs into helper funcs
4420- zink: make descriptor state invalidate public
4421- zink: reorder barrier util functions to set up barrier struct before batch
4422- zink: break out barrier struct initializing into helper funcs
4423- zink: create separate vk image/buffer objects for shader image use
4424- zink: incrementally add image usage flags based on device caps
4425- zink: add color output bit and/or use linear tiling for sampled images
4426- zink: check image format props before creating image
4427- zink: toggle between linear/optimal tiling during image creation
4428- zink: flatten out buffer creation usage flags codepath
4429- zink: ralloc shader cache and keys
4430- zink: rework border color handling
4431- zink: clean up query creation failure paths
4432- zink: create result buffers for all query streams
4433- zink: remove flush from query buffer copy
4434- zink: manually handle more bool query types for copying
4435- zink: remove special casing for occlusion qbos
4436- zink: rewrite query internals
4437- zink: bump pools up to 5k queries each
4438- zink: don't use PARTIAL bit for query results with time queries
4439- zink: reorder availability handling for (user) qbos
4440- zink: remove explicit fencing for query results
4441- zink: ci updates
4442- lavapipe: refactor base draw dispatch to handle multidraws
4443- lavapipe: refactor indexed draw dispatch to handle multidraws
4444- aux/draw: stop copying draw params unnecessarily
4445- aux/draw: rewrite PRIM_RESTART_LOOP macro as a function
4446- aux/draw: pass the full draw params through to draw_instances()
4447- aux/draw: pass the full draw params through to draw_pt_arrays_restart()
4448- aux/draw: move draw param sanitization to end of function
4449- aux/draw: track increment_draw_id value from draw info
4450- aux/draw: pass full draw params to draw_pt_arrays()
4451- llvmpipe: stop flattening multidraws
4452- lavapipe: ignore unused clearvalues when beginning renderpass
4453- zink: rework texture_barrier hook
4454- zink: move update_descriptors & related funcs to zink_descriptors.c
4455- zink: move descriptor barrier handling to main update function
4456- zink: simplify some descriptor update function parameters
4457- zink: use GENERAL layout for sampler images that are also bound as shader images
4458- zink: rework some includes
4459- zink: rework memory_barrier hook
4460- zink: add locking for descriptor pools
4461- zink: add locking for resource maps
4462- zink: manually invoke cpu detection during screen init
4463- zink: add locking for batch states
4464- zink: add function for checking whether a batch is done
4465- zink: split fence finish func
4466- zink: add locking for fence resources
4467- zink: explicitly reset a couple more batch state members
4468- zink: assume fence has already completed if a batch state isn't found
4469- zink: rename init_batch_state to get_batch_state
4470- zink: store context to batch state
4471- zink: make a local screen pointer in zink_flush
4472- zink: remove zink_fence_init()
4473- zink: move VkQueue to batch object
4474- zink: break out queue submit into separate functions
4475- zink: also check for device lost reset on flush
4476- zink: remove zink_create_fence()
4477- zink: track coherent resource objects
4478- zink: use cached memory for all resources when possible
4479- radv: stop zeroing radv_draw_info during draw
4480- radv: refactor draw dispatch
4481- radv: track whether gl_BaseInstance is used
4482- radv: simplify vs draw param counting during setup
4483- radv: set gfx pipeline vtx_emit_num to the number of sgprs
4484- radv: track whether drawid is used on the pipeline struct
4485- radv: track whether baseinstance is used on the pipeline struct
4486- radv: break out vertex shader param emission into separate function
4487- radv: make vertex param sgpr count more explicit
4488- radv: reorder vertex shader params
4489- radv: don't emit baseinstance and drawid if neither is used
4490- radv: don't reset vertex state params on pipeline bind if reg layout matches
4491- zink: implement threaded context
4492- zink: ci updates
4493- zink: handle PIPE_MAP_DONTBLOCK for buffer read maps
4494- zink: add set_context_param hook
4495- zink: add batch tracking id for program struct
4496- zink: track last completed batch id to optimize checking states
4497- zink: handle expired deferred fences more reasonably
4498- zink: hook up timeline semaphore signalling during batch submission
4499- zink: add timeline semaphore fastpath for checking/triggering batch completion
4500- zink: optimize batch states for timeline use
4501- zink: enforce device lost status
4502- zink: be more explicit about blit layer/depth usage
4503- zink: use VkSubresourceLayout::depthPitch as layer_stride when mapping 3D imgs
4504- zink: zink_push_constant -> zink_gfx_push_constant
4505- zink: use max_rt to determine number of blend state attachments
4506- zink: emit ImageCubeArray cap when accessing arrayed cube dimension images
4507- zink: fix layercount for array texture blits
4508- zink: add some asserts to avoid zero-sized blit regions
4509- features: mark off ARB_compute_variable_group_size for zink
4510- features: mark off GL_OES_viewport_array for zink
4511- zink: store shader_info to ntv_context struct
4512- zink: only emit SpvCapabilitySampleMaskPostDepthCoverage if the mode is set
4513- zink: enable PIPE_CAP_TGSI_TES_LAYER_VIEWPORT
4514- features: mark off ARB_shader_viewport_layer_array for zink
4515- zink: avoid cached memory allocations when not requested
4516- util/threaded_context: support pipe_context::set_sample_locations
4517- zink: hook up cs push constant for nir_intrinsic_load_work_dim
4518- zink: use better usage flags for staging resources
4519- zink: use vkGetPhysicalDeviceFormatProperties2 when available
4520- zink: use 2 variant to check image format props during create
4521- zink: only use host mem for staging resources with linear tiling
4522- zink: move cmdpool reset to batch state reset
4523- zink: split total_mem off to total_video_mem, use total_mem for tc
4524- zink: relax maybe_flush mem threshold
4525- zink: relax maybe_flush batch count threshold
4526- zink: check last_finished first in fence_finish early out case
4527- zink: defer timestamp query pool resets to end_query
4528- zink: reset queries when suspending if >50% of total pool is used
4529- zink: don't use cached mem for staging resources
4530- zink: flag DYNAMIC resources as coherent
4531- zink: drop VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT from compute path
4532- aux/trace: add a set_inlinable_constants hook
4533- intel: avoid dumping null cs sampler/binding states
4534- zink: emit WorkgroupSize when not using ExecutionModeLocalSize
4535- lavapipe: add some asserts for blit region extents
4536- zink: export PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER
4537- aux/trace: add screen deduplication for zink+lavapipe tracing
4538- aux/trace: add a bunch of methods for lavapipe
4539- util/set: add macro for destructively iterating set entries
4540- util/hash_table: add macro for destructively iterating entries
4541- aux/trace: add GALLIUM_TRACE_TRIGGER mode
4542- zink: add a pipe_screen::finalize_nir hook
4543- zink: implement uniform inlining
4544- zink: add env var to force uniform inlining
4545- zink: remove atomic usage from batch tracking comparisons
4546- zink: bypass separate stencil path in resource_reference_rw when not a zs image
4547- zink: fix conditional when assigning tess variable io
4548- zink: stop unmapping resources
4549- zink: simplify clear-apply on fb state change
4550- zink: use set_foreach_remove()
4551- zink: use explicit subpass deps
4552- zink: hook up EXT_fragment_shader_interlock
4553- zink: support ARB_fragment_shader_interlock
4554- aux/trace: dump all the blend state members
4555- features: mark off ARB_fragment_shader_interlock for zink
4556- gallium/threaded_context: add another rule for buffer mapping
4557- zink: fix CI flakiness in glx-multithread-clearbuffer
4558- zink: make timeline semaphores per-screen
4559- zink: handle checking batch completion from other contexts without timelines
4560- zink: only unmap PIPE_MAP_ONCE in synchronous mode
4561- zink: don't lose existing pNext when using wsi_image_create_info in image creation
4562- anv: fix debugoptimized build compile
4563- zink: move descriptor state management to descriptors.c
4564- zink: make a bunch of descriptor functions static
4565- zink: create separate linear tiling image for scanout
4566- zink: flag anv for mesa image create wsi
4567- zink: disable mutable formats for zs formats and scanout images
4568- aux/trace: enhance trigger mode to dump context states during bind
4569- aux/trace: dump current fb state on trigger-mode draw if it hasn't been seen yet
4570- aux/trace: do deep dumps of fb state for triggered traces
4571- aux/trace: use ralloc_free for ralloc()ed state pointers
4572- zink: compare against screen batch id when determining which semaphore to use
4573- zink: always copy the nir shader before compiling
4574- zink: fix tcs slot map eval for user vars
4575- zink: fix tcs input reservation for user vars
4576- zink: merge copy-to-scanout path into non-deferred flush path
4577- zink: force scanout sync when mapping scanout resource
4578- zink: use undefined layout for first scanout obj transition
4579- zink: move scanout sync to end of batch
4580- zink: add a flag indicating whether scanout object needs updating
4581- zink: move wsi flush info conditional to queue submission
4582- zink: directly set batch->state->flush_res from flush_resource hook
4583- zink: add clear-on-flush mechanic deeper into flush codepath
4584- Revert "zink: force scanout sync when mapping scanout resource"
4585- softpipe: fix render condition checking
4586- softpipe: fix streamout queries
4587- softpipe: ci updates
4588- zink: track persistent resource objects, not resources
4589- zink: restore previous semaphore (prev_sem) handling
4590- zink: use cached memory for staging resources
4591- zink: only reset query on suspend if the query has previously been stopped
4592- zink: when performing an implicit reset, sync qbos
4593
4594Nanley Chery (22):
4595
4596- gallium: Map _DRI_IMAGE_FORMAT_NONE to NULL
4597- gallium: Flush GL API resources in eglCreateImage
4598- iris: Disable aux as needed in iris_flush_resource
4599- blorp: Assert 8x4 alignment for a HiZ op on Gen8-9
4600- i965,iris: Delete misleading HiZ sampling comments
4601- iris: Drop an XXX comment about sampling HiZ arrays
4602- iris: Drop a stale comment about HiZ sampling
4603- iris: Delete redundant assertion in iris_hiz_exec
4604- iris: Drop batch param from iris_resource_prepare_render
4605- iris: Fix the depth aspect aux usage in iris_blit
4606- iris: Keep aux_usage in iris_blorp_surf_for_resource
4607- iris: Fix aux usage of depth buffer prepare/finish
4608- iris: Loosen aux state getter/setter assert on HiZ
4609- iris: Don't avoid aux state getter/setter with HiZ
4610- iris: Drop iris_resource::aux::has_hiz
4611- iris: Call iris_sample_with_depth_aux earlier
4612- iris: Set BO maps to NULL in bo_free
4613- drm-uapi: Update drm_fourcc.h for new TGL modifier
4614- isl: Describe I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC
4615- iris: Support clear color plane imports for RC_CCS_CC
4616- iris: Support RC_CCS_CC modifier in plane queries
4617- iris: Support I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS_CC
4618
4619Neha Bhende (3):
4620
4621- mesa: set states in fast path for restoring light attributes
4622- gallium/u_vbuf: use updated pipe_draw_start_count while using draw_vbo
4623- nir_to_tgsi: Fix indices for CMP in nir_to_tgsi for nir_op_fcsel
4624
4625Philipp Zabel (1):
4626
4627- meson: Fix missing xcb-xrandr dependency for Vulkan X11 WSI
4628
4629Pierre Moreau (4):
4630
4631- docs/features: Add OpenCL status
4632- spirv: Ignore WorkgroupSize in non-compute stages
4633- nv50: Replace hardcoded texture/constbuf count with define
4634- nv50: Update texture indices to match stage indices
4635
4636Pierre-Eric Pelloux-Prayer (55):
4637
4638- ac: add ifdef __cplusplus guard to header
4639- radeonsi: invalidate compute sgprs in si_rebind_buffer
4640- radeonsi: inhibit clockgating when using SQTT
4641- ci: split src/mesa/\**/* matching rule
4642- radeonsi/sqtt: use more event identifier
4643- radeonsi/sqtt: fix SQTT bo size overflow
4644- radeonsi/sqtt: allow AMD_THREAD_TRACE_TRIGGER to be a frame number
4645- radeonsi/sqtt: forward string markers to sqtt
4646- radeonsi: don't use cp_dma prefetch on GFX6
4647- gallium/u_upload_mgr: lower risk of hitting an assert
4648- radeonsi: fix indentation issue in si_texture.c
4649- radeonsi: store si_context::xxx_shader members in union
4650- radeonsi: fix read from compute / write from draw sync
4651- radeonsi: fix si_check_render_feedback
4652- radeonsi: replace force_cp_dma arg of si_clear_buffer by enum
4653- radeonsi: enable dcc image stores on gfx10+
4654- radeonsi: force dcc clear to use compute clear
4655- mesa: update vao _EnabledWithMapMode in copy_array_object
4656- radeonsi: properly set SPI_SHADER_PGM_HI_ES
4657- ac/rgp: make the max gap between shader code a warning
4658- ac/rtld: make ac_rtld_upload returns the code size
4659- ac/rgp: move radv/sqtt functions to ac
4660- radeonsi/sqtt: keep a copy of the uploaded shader code
4661- radeonsi/sqtt: remove duplicate token
4662- radeonsi/sqtt: don't always use WGP 0
4663- radeonsi/sqtt: export shader code to RGP
4664- radeonsi/sqtt: fix user event max size
4665- frontends/va: fix protected slice data buffer read size
4666- mesa/st: fix lower_tex_src_plane in multiple samplers scenario
4667- dlist: remove ListExt feature
4668- mesa: remove 2 recursive lock usages of _mesa_HashTable
4669- mesa/hash: make the mtx non-recursive
4670- mesa/hash: switch to simple_mtx
4671- mesa: make _mesa_HashTable InDeleteAll debug only
4672- vbo/dlist: use DrawGallium(Complex)
4673- nir/lower_tex: ignore texture_index if tex_instr has deref src
4674- mesa/st: fix st_nir_lower_tex_src_plane arguments
4675- mesa/st: ignore texture_index if tex_instr has deref src
4676- gallium/u_threaded: split draws that don't fit in a batch
4677- st/draw: remove st_draw_vbo
4678- vbo: inline vbo_primitive_restart in brw_primitive_restart
4679- radeonsi/rgp: export barriers
4680- radeonsi/rgp: export compute shader programs
4681- gallium/u_threaded: skip refcounting only once
4682- driconf: add workarounds for Teardown
4683- amdgpu,radeon: add needs_reset param to ctx_query_reset_status
4684- radeonsi: submit cs to failed context instead of skipping them
4685- radeonsi: use SI_CONTEXT_FLAG_AUX when recreating the aux context
4686- radeonsi: do not recreate the aux context from the aux context
4687- radeonsi: only recreate the aux_context when soft recovery failed
4688- radeonsi: re-create the aux context in si_create_context
4689- amdgpu,radeon: add full_reset_only param to ctx_query_reset_status
4690- radeonsi: avoid querying gpu state if possible
4691- r600/sb: Use assignments for resetting struct r600_sb::literal
4692- driconf: add workaround for Golf With Friends
4693
4694Qiang Yu (1):
4695
4696- lima: fix xserver page flip fail for full screen client
4697
4698Rhys Perry (141):
4699
4700- nir/loop_unroll: unroll more aggressively if it can improve load scheduling
4701- aco: fix convert_to_SDWA() check in add_subdword_definition()
4702- aco: add test for incorrect convert_to_SDWA() check
4703- radv: fix max_waves estimation on GFX10.3
4704- aco: fix num_waves on GFX10+
4705- aco: have emit_wqm() take Builder instead of isel_context
4706- aco: add emit_mimg() helper
4707- aco: move VADDR to the end of the operand list
4708- aco: use non-sequential addressing
4709- aco: only require texture coordinates to be in WQM if NSA is used
4710- aco: add affinity for non-sequential MIMG operands
4711- radv,aco: don't use MUBUF for multi-channel loads on GFX8 with robustness2
4712- nir/lower_io: fix array_length lowering if buffer is smaller than offset
4713- radv,aco: use deref_buffer_array_length
4714- radv: use nir_opt_access
4715- nir/sink,nir/move: sink/move reorderable load_ssbo
4716- radv: sink load_ssbo
4717- aco: don't consider a phi trivial if same's register doesn't match the def
4718- aco: remove Format::{VOP3A,VOP3B}
4719- aco: add instruction cast and format-check methods
4720- aco: use instruction cast methods
4721- aco: use format-check methods
4722- aco: return references in instruction cast methods
4723- aco: fix WQM for texture instructions with args before the coordinates
4724- nir/opt_uniform_atomics: recognize more complicated invocation comparisons
4725- nir/opt_uniform_atomics: fix elect detection
4726- aco: disable a*1.0 optimization if the instruction is precise
4727- nir/algebraic: optimize out exact a*1.0 if it's used only as a float
4728- aco: optimize a*0.0
4729- aco: optimize out a*1.0 if it's used as a float
4730- nir/algebraic: optimize out exact a+0.0 if it's used only as a float
4731- nir/algebraic: eliminate exact a*0.0 if float execution mode allow it
4732- aco: don't affect isPrecise() after applying output modifiers
4733- nir,spirv: allow non-uniform OpArrayLength
4734- radv,ac/nir: implement non-uniform get_ssbo_size
4735- aco: implement non-uniform get_ssbo_size
4736- radv: round-up num_records division in radv_flush_vertex_descriptors
4737- radv: correctly enable WGP_MODE for NGG and GS
4738- radv: correctly enable WGP_MODE for tessellation control
4739- aco: add fallback algorithm in get_reg()
4740- aco: always set exec_live=false
4741- aco: optimize AC_FETCH_FORMAT_SNORM alpha adjust
4742- aco: do not flag all blocks WQM to ensure we enter all nested loops in WQM
4743- aco: rewrite setting of Exact_Branch
4744- aco: remove loop to flag loop blocks as WQM
4745- aco: fix adjust_vertex_fetch_alpha
4746- radv: use a more relaxed alignment for upload buffer allocations
4747- radv: fix max_lds_per_simd on GFX10
4748- radv: switch MaxWaves statistic to wave32 waves
4749- ac: split lds_granularity into encode and allocation granularities
4750- radv: use lds_{encode,alloc}_granularity
4751- radv: round up max_lds_per_simd / lds_per_wave
4752- aco: fix waves calculation for wave32
4753- aco: add Program::wgp_mode
4754- radv,aco: add radv_nir_compiler_options::wgp_mode
4755- aco: consider that GFX10.3 allocates LDS in 1024 byte blocks
4756- aco: add DeviceInfo
4757- aco: fix transition_to_{WQM,Exact} if exec.back() is not in exec
4758- radv: relax shared alignment requirements in mem_vectorize_callback
4759- radv,aco: allow unaligned LDS access on GFX9+
4760- aco/lower_phis: fix all_preds_uniform with continue_or_break
4761- nir/dce: replace instruction worklist with ssa def bitset
4762- nir: inline nir_foreach_{src,dest}
4763- nir/dce: perform DCE for unlooped instructions in a single pass
4764- aco: calculate all p_as_uniform and v_readfirstlane_b32 sources in WQM
4765- aco: use p_as_uniform for get_sampler_desc and convert_pointer_to_64_bit
4766- nir: fix build at -O1
4767- nir: add nir_ssa_def_is_unused()
4768- nir/copy_prop: remove unused copies
4769- nir/copy_prop: visit copies instead of sources
4770- nir/copy_prop: use nir_{instr,if}_rewrite_{src,condition}_ssa
4771- Revert "radv,aco: allow unaligned LDS access on GFX9+"
4772- aco: add missing usable_read2 check
4773- nir/opt_shrink_vectors: add option to skip shrinking image stores
4774- radv: don't shrink image stores for The Surge 2
4775- radv: don't set sx_blend_opt_epsilon for V_028C70_COLOR_10_11_11
4776- radv,aco: remove aco_compiler_statistics
4777- radv: cache pipeline statistics
4778- aco: set compr for fp16 exports
4779- radv/llvm: fix enabled_channels for compressed exports
4780- aco: simplify loop_nest_depth tracking in isel
4781- aco: track divergent and uniform branch depth
4782- aco: move wait_imm to aco_ir.h
4783- aco: lower p_constaddr into separate instructions earlier
4784- aco: add instruction classes
4785- aco: add latency and inverse throughput statistics
4786- aco: add print option to print program without temporary IDs
4787- aco: add ACO_DEBUG=perfinfo
4788- aco: remove vmem/smem score statistics
4789- aco: fix NSA MIMG followed by MUBUF/MTBUF
4790- aco/tests: add test for NSAToVMEMBug
4791- aco: fix NSA following writelane
4792- aco/tests: add test for waNsaCannotFollowWritelane
4793- nir: Don't update base in vectorize_loads()
4794- aco: implement 64-bit VGPR {u,i}find_msb
4795- aco: use uadd32_sat() helper for nir_op_uadd_sat
4796- aco: use a single instruction for uadd32_sat() on GFX8
4797- aco: implement image_deref_samples
4798- aco: add aco_print_program() flag to print kill flags
4799- aco: add aco_print_program() flags to print live_out and register demand
4800- docs: document ACO_DEBUG=perfinfo
4801- aco: add ACO_DEBUG=liveinfo
4802- radv: lower variables to ssa before nir_propagate_invariant
4803- radv: lower view_index to zero if multiview is disabled
4804- ci: add expected fail for RADV
4805- aco: don't optimize min(a*1.0, ...) to min(a, ...) on GFX8
4806- aco: use -1.0*x and 1.0*|x| for fneg/fabs
4807- aco/tests: add tests for denormal-aware propagation
4808- ac: invalidate metadata after hs_emit_write_tess_factors()
4809- aco/tests: fix isel.sparse.clause for LLVM 12+
4810- lavapipe: fix initialization of pipe_stream_output with unwritten outputs
4811- nir/gather_info: implement partial masking of struct and compact I/O
4812- nir/lower_tex: handle deref casts
4813- nir_to_tgsi: run constant folding after nir_opt_algebraic
4814- aco: fix integer tg4 workaround with unnormalized coordinates
4815- draw: fix pstipple, aaline and aapoint without LLVM
4816- aco: ensure loops nested in a WQM loop are in WQM
4817- nir/gather_info: fix partial masking of compact I/O with location_frac!=0
4818- radv: remove second nir_lower_idiv
4819- nir/lower_idiv: add options to use fp32 for 8-bit division lowering
4820- nir/lower_idiv: make lowered divisions exact
4821- aco: fix 16-bit u2f32
4822- aco: fix 16-bit f2{u8,i8} on GFX6/7
4823- radv: don't use fp16 for 8-bit division lowering before GFX9
4824- nir: add nir_block_get_predecessors_sorted() helper
4825- nir/lcssa: fix nondeterminism in predecessor iteration
4826- nir/loop_unroll: fix is_indirect_load() with load_global
4827- radv: fix conditions for running nir_opt_vectorize
4828- aco/ra: use original names when renaming loop carried phi operands
4829- aco/ra: remove live-in temporary from live_out_per_block when moving it
4830- radv: fix barrier in radv_decompress_dcc_compute shader
4831- radv: fix clearing DCC-compressed e5b9g9r9 images
4832- aco: set TRUNC_COORD=0 for nir_texop_tg4
4833- ac/nir: set TRUNC_COORD=0 for nir_texop_tg4
4834- Revert "radeonsi: set TRUNC_COORD=0 for Total War: WARHAMMER to fix it"
4835- aco: don't update register demand during RA validation
4836- aco: allow SDWA sels smaller than the operand size
4837- radv: disable VK_FORMAT_R64_SFLOAT
4838- vulkan: fix use-after-free in vk_common_DestroyDebugReportCallbackEXT
4839- radv: fix use-after-free upon GS copy shader cache hits
4840- radv,ac/llvm: use a dword alignment for descriptor loads
4841
4842Rob Clark (143):
4843
4844- freedreno/ir3: Fix ldg decoding/parsing
4845- freedreno/ir3: Decouple ir3_info collection from assembler
4846- freedreno/ir3: Add some new "logical" opcodes
4847- freedreno/hw: Add isaspec mechanism for documenting/defining an ISA
4848- freedreno/hw/isa: Add description of ir3 ISA
4849- freedreno/hw/isa: Add expression caching
4850- freedreno/ir3/tests: Switch disasm test over to new decoder
4851- freedreno/ir3: Switch over to new encoder/decoder
4852- freedreno/ir3: Small resinfo disasm tweak
4853- freedreno/ir3: Better sstall estimation
4854- freedreno/ir3: Realign disasm shader stats
4855- freedreno/ir3/decode: Switch over to new disasm
4856- freedreno/ir3: Remove legacy packed-struct encoding
4857- frontend/dri: Expose RGB[AX]_SRGB as well
4858- freedreno/isa: Fix branch/jump offset encoding
4859- freedreno/a6xx: Add r2d support for GMEM resolves
4860- gallium/util: Add helpers to determine if z/s is written
4861- freedreno/a6xx: Don't early-z if there are stencil writes
4862- r300: Use util_writes_depth_stencil() helper
4863- radeonsi: Use util_writes_stencil() helper
4864- freedreno: Add perf_warn() trace helper
4865- freedreno: Add fmt/args macros for pipe_resource
4866- freedreno/a6xx: Add helper to check if UBWC is supported
4867- freedreno: Add perf_warn() for missed UBWC opportunities
4868- ci/freedreno/a6xx: Skip vs-output-array-vec2-index-wr-before-gs
4869- freedreno/a6xx: Fix 3dmark misrendering with unwritten MRTs
4870- mesa: Remove _mesa_destroy_context()
4871- freedreno/decode: Fix overflow
4872- freedreno: Put an upper limit on VSC size
4873- freedreno: Misc cleanup
4874- freedreno/a5xx: Drop fd5_compute_stateobj
4875- freedreno/a6xx: Drop fd6_compute_stateobj
4876- freedreno/ir3+a5xx+a6xx: De-duplicate create_compute_state()
4877- freedreno/ir3: Add ir3_shader_state
4878- freedreno/ir3: Move ir3_compiler_create()
4879- freedreno/ir3: Add ir3_screen_fini()
4880- freedreno/ir3: Reshuffle ir3_shader_create()
4881- freedreno/ir3: Reshuffle compute state creation
4882- freedreno/ir3: Async shader compile
4883- freedreno/ir3: Add missing shader prog cache invalidation
4884- freedreno: Quiet fallthrough warnings
4885- freedreno: Split batch_flush_reset_dependencies()
4886- freedreno: driver-thread annotations
4887- freedreno/ir3/print: More sane ssa src/dst display
4888- freedreno/ir3/print: Improve branch printing
4889- util/fossilize_db: Fix compile error with clang
4890- freedreno: Handle InvalidateBufferData() case
4891- freedreno: Add perf_debug logging for bo stalls
4892- freedreno: Workaround for UNSYNC+DISCARD_RANGE
4893- driconf: Generate a static table when no xmlconfig
4894- xmlconfig: Reshuffle to keep attr processing
4895- xmlconfig: Add static driconfig support
4896- freedreno/ir3: Drop foreach_bit() macro
4897- freedreno: Drop foreach_bit() macro
4898- etnaviv: Drop foreach_bit() macro
4899- v3d: Drop foreach_bit() macro
4900- freedreno: Fix think-o in fd_resource_wait()
4901- freedreno/ir3: Fix initial_variants_synchronous() condition
4902- freedreno: Add FD_DBG() macro
4903- freedreno: Slight perf_debug rework
4904- freedreno: Add macro for duration based warns
4905- util/u_queue: Ensure num_cpu_mask_bits is valid
4906- util: Add accessor for util_cpu_caps
4907- freedreno/a6xx: Always pass ctx to fd6_emit_textures()
4908- freedreno/a6xx: Fix uncompressed resource vs stale CSO
4909- freedreno/ir3: Add comments about shader key/gen
4910- freedreno: Deduplicate fixup_shader_state()
4911- freedreno/a6xx: Fix compile warning
4912- driconf: Add ignore_map_unsynchronized option
4913- freedreno: Remove dead-cells MBR workaround
4914- util: Extract thread-id helpers from u_current
4915- gallium/u_threaded: Add helper to assert driver thread
4916- gallium/u_threaded: use mesa_log for debug msgs
4917- freedreno: Fix u_blitter constant-buffer leak
4918- freedreno: Factor out common fd_resource init
4919- freedreno: Split out batch/resource tracking
4920- freedreno: Restructure transfer_map()
4921- freedreno: Extend threaded_resource
4922- freedreno: Extend threaded_transfer
4923- freedreno: Extract out helper for transfer-map flag munging
4924- freedreno: Add fd_replace_buffer_storage()
4925- freedreno: Add transfer_pool_unsync
4926- freedreno/a6xx: Move UBWC demotion to first sampler view bind
4927- freedreno: Check cb0 in rebind_resource()
4928- freedreno: threaded_context support
4929- freedreno: threaded_context async flush support
4930- freedreno: Fix fd_fence_finish()
4931- freedreno/drm: Avoid unitialized timestamp in submit fail
4932- freedreno/drm: Split softpin "reloc" functions
4933- freedreno/drm: Split 64b vs 32b paths
4934- freedreno/drm: Move emit_reloc_tail to head
4935- freedreno/drm: Inline iova calculation
4936- freedreno/ir3: Precompute whether we need driver-params
4937- freedreno: Add helpers to mark dirty state
4938- freedreno: Add mapping to generation specific dirty state
4939- freedreno/a6xx: Convert to dirty_groups
4940- freedreno: Small dirty flag re-org
4941- freedreno: Add dirty bit for state that needs rsc tracking
4942- freedreno: Don't ignore geom/tess stage resources
4943- freedreno: Split out helper for updating sw stats
4944- freedreno: Only collect sw stats when required
4945- freedreno/a6xx/vsc: Be more tolerate of degenerate prims
4946- freedreno: Drop u_trim_pipe_prim() from fast-paths
4947- u_draw: Add helper to emultate multi-draw
4948- freedreno: Use multi-draw helper
4949- freedreno: Handle multi-draw edge cases
4950- freedreno: Push multi-draw closer to backend
4951- freedreno/a6xx: Emit streamout state on every draw
4952- freedreno: Add draw cost estimation
4953- freedreno/batch: Export key/hash fxns
4954- freedreno/batch: Add a way to clone a batch key
4955- freedreno: Add gmem_reason_mask
4956- freedreno/a6xx: Fix sRGB/snorm vs sysmem clear path
4957- freedreno: Autotune bypass vs GMEM rendering decision
4958- freedreno/a6xx: Fix typo
4959- freedreno: Make headers C++ happy
4960- freedreno/fdperf: Use os_read_file()
4961- freedreno: Split out devicetree helpers
4962- ci: Disable panfrost t760
4963- freedreno/a6xx: Fix indirect+patches draws
4964- freedreno/a6xx: Fix obsolete comment
4965- d3d12: Use util_draw_multi() helper
4966- etnaviv: Use util_draw_multi() helper
4967- i915: Use util_draw_multi() helper
4968- iris: Use util_draw_multi() helper
4969- lima: Use util_draw_multi() helper
4970- llvmpipe: Use util_draw_multi() helper
4971- nouveau: Use util_draw_multi() helper
4972- r300: Use util_draw_multi() helper
4973- r600: Use util_draw_multi() helper
4974- softpipe: Use util_draw_multi() helper
4975- svga: Use util_draw_multi() helper
4976- tegra: Use util_draw_multi() helper
4977- vc4: Use util_draw_multi() helper
4978- v3d: Use util_draw_multi() helper
4979- virgl: Use util_draw_multi() helper
4980- freedreno: Don't handle multi-draw in indirect case
4981- util/primconvert: Handle indirect and multi-draw
4982- freedreno: Add .clang-format
4983- freedreno: Some manual reformatting
4984- freedreno: Re-indent
4985- freedreno: Manual fixups
4986- freedreno: Add missing foreach macros and update indentation
4987
4988Rohan Garg (8):
4989
4990- virgl: Cache depth and stencil buffers
4991- ci: Ensure that jobs inherting the ci-deqp jobs artifact meson logs
4992- intel/genxml: Free resource before exiting
4993- intel/compiler: Free resources on test teardown
4994- virgl: update headers
4995- virgl: Return total video memory if available
4996- virgl: Add support for querying detailed memory info
4997- virgl: Support the ETC1_RGB8 format as virglrenderer supports it
4998
4999Roman Stratiienko (1):
5000
5001- egl: android: use num_planes param in createImageFromDmaBufs()
5002
5003Ruijing Dong (1):
5004
5005- radeon/vcn: release si buffer for encoding at the end.
5006
5007Ryan Neph (1):
5008
5009- Revert "virgl: fix BGRA emulation artifacts during window resize"
5010
5011Sagar Ghuge (7):
5012
5013- anv: Invalidate the correct AUX-TT entry
5014- anv: Skip CCS ambiguate which preceed fast-clears
5015- intel/mi_builder: Added support for command streamer shift operations
5016- anv: Add anv_memregion structure
5017- Revert "Revert "blorp/gen12: Don't use aux address if implicit CCS""
5018- intel/blorp: Fix condition to figure out aux_address
5019- anv: Set correct binding table entry count
5020
5021Samuel Iglesias Gonsálvez (9):
5022
5023- turnip: disable UBWC on Z24_S8 MSAA images on A630
5024- turnip: set sparseAddressSpaceSize to zero
5025- turnip: fix UINT64_MAX size wrapping in tu_GetBufferMemoryRequirements()
5026- turnip: fix resolve MSAA D24_UNORM_S8_UINT image to S8_UINT
5027- turnip: fix resolve MSAA D32_SFLOAT_S8_UINT image to S8_UINT
5028- util: fix parsing of /proc/meminfo MemAvailable value
5029- turnip: keep track of memory heap usage, size and flags
5030- turnip: VK_EXT_memory_budget implementation
5031- turnip: set depth plane control zmode to A6XX_LATE_Z when sample mask is written
5032
5033Samuel Pitoiset (218):
5034
5035- radv: do not invalidate the L2 metadata cache on compute queues
5036- ci: mark some sparse CTS as expected failures on RAVEN
5037- radv: flush L2 metadata as part of CB/DB flush instead of CS_DONE on GFX9
5038- radv: add a comment explaining the micro tile mode resolve
5039- radv: enable TC-compat HTILE with D32S8 and MSAA on GFX9+
5040- radv: enable TC-compat HTILE for D16S8 on GFX9+
5041- radv: restore invalidating the vector cache for internal meta operations
5042- radv: flush L2 for images affected by the pipe misaligned issue on GFX10+
5043- ci: exclude one CTS test that timeout most of the time for RADV CI
5044- radv: remove redundant check in radv_process_depth_stencil()
5045- radv: remove unnecessary radv_image::tc_compatible_htile
5046- radv: remove redundant check in depth_view_can_fast_clear()
5047- radv: fix a sync issue with geometry shader primitives query on GFX10+
5048- radv: fix overflow when computing the SQTT buffer size
5049- radv: inhibit clock gating when tracing with SQTT
5050- ac/rgp: add support for GFX10.3
5051- ac,radv: add SQTT support on GFX10.3
5052- radv: enable SQTT support on GFX10.3
5053- radv: fix separate depth/stencil layout in render pass
5054- radv: add multi-layer support to FMASK color expand
5055- radv: use the range aspect mask in FMASK color expand
5056- radv: use a workgroup size of 8x8 for FMASK color expand
5057- radv: only decompress the depth/stencil aspect that needs to be resolved
5058- radv: enable sparseImageInt64Atomics/sparseImageFloat32Atomics
5059- radv,aco: fix shifting input VGPRs for the LS VGPR init bug on GFX9
5060- radv: synchronize Cmd{Set,Write}Event() using PS_DONE/CS_DONE events
5061- radv: add support for emitting PS_DONE/CS_DONE on GFX6-8
5062- radv: remove radv_util.h
5063- radv: remove stub() macros
5064- radv: remove unused EMPTY constant in radv_descript_set.c
5065- nir/algebraic: mark more optimization with fsat(NaN) as inexact
5066- ac/surface: store HTILE mip info into the surface
5067- radv: use the image view range when fast clearing depth
5068- radv: check if HTILE is enabled per-level instead of the entire image
5069- radv: do not decompress/resummarize levels without HTILE
5070- radv: remove mipmaps related assertions when initializing HTILE
5071- radv: add support for fast clearing levels of the HTILE buffer
5072- radv: teach radv_htile_enabled() about the number of HTILE levels
5073- radv: enable TC-compat HTILE for mipmaps on GFX10+
5074- radv: re-disable TC-compat HTILE for D32S8 on all generations
5075- radv: fix centroid with VRS coarse shading
5076- radv/winsys: move the initial BO domain to radeon_winsys_bo
5077- radv: prefer CP DMA for GTT buffer copies/clears on dGPUs due to slow PCIe
5078- radv: fix waiting on the last enabled RB for occlusion queries
5079- radv/winsys: use an array for the global BO list instead of a list
5080- radv/winsys: remove the radv_amdgpu_winsys_bo::ws indirection
5081- radv/winsys: remove useless continue preamble CS for IBs path
5082- radv/winsys: remove useless is_local check in radv_amdgpu_cs_add_buffer()
5083- radv/winsys: remove unused radeon_bo_usage enum
5084- radv/winsys: simplify the user fence logic for submission
5085- radv/winsys: remove unused fields in radv_amdgpu_cs_request
5086- radv/winsys: stop zeroing radv_amdgpu_cs_request
5087- radv: use less AMDGPU contexts by creating only one per queue priority
5088- radv: add radeon_winsys_bo::use_global_list
5089- radv: stop using VM_ALWAYS_VALID on APUs
5090- radv/winsys: move the debug_all_bos check outside of the add/del helpers
5091- radv/winsys: set use_global_list to avoid adding a BO twice
5092- radv/winsys: add buffer_make_resident() to the API
5093- radv/winsys: add the resident BOs to the list of BOs at submit time
5094- radv/winsys: enable the global BO list unconditionally
5095- radv: use the global BO list from the winsys
5096- radv: fix printing the debug option names
5097- radv: fix double free when creating a fence failed
5098- radv: stop allocating useless ESGS scratch BO on GFX10+
5099- radv: fix memory leaks if a submission fails
5100- radv: do not overallocate the SQTT buffer
5101- radv: adjust an error message related to the SQTT buffer size
5102- radv: add support for resizing the SQTT buffer automatically
5103- ac/rgp: append the number of seconds to the generated RGP file
5104- radv: emit pipeline bind markers for SQTT
5105- radv: only make the WSI images resident if the global BO list is used
5106- radv/winsys: set use_global_list inside the critical section
5107- radv: only apply the MRT output NaN fixup to non-meta shaders
5108- radv: create the start/stop CS for SQTT dynamically
5109- radv: move SQTT parameters initialization to radv_thread_trace_init()
5110- radv: remove an outdated TODO about SQTT cache flushes
5111- radv: make sure to allocate enough space when emitting SQTT userdata
5112- radv: stop emitting pipeline bind markers
5113- radv: do not allow to capture SQTT on the compute queue
5114- radv: add support for user event markers with SQTT
5115- radv: only emit pipeline bind markers for application pipelines
5116- radv: use the pipeline key as hash for pipeline bind markers
5117- radv: set correct value for OFFCHIP_BUFFERING on GFX10+
5118- radv: make the border color BO a resident buffer
5119- radv: make the trace BO a resident buffer
5120- radv: make the TMA/TBA BOs resident buffers
5121- radv: emit the trap handler registers earlier
5122- radv: rework radv_cmd_buffer_resolve_subpass() a bit
5123- radv: emit missing subpass resolve marker for SQTT
5124- ac/rgp: fill CPU info by parsing /proc/cpuinfo
5125- radv: store a pointer to the code in radv_shader_variant
5126- radv: add support for exporting pipelines with RGP
5127- radv: add support for instruction timing with RGP
5128- radv: do not scale the depth bias for D16_UNORM depth surfaces
5129- include/drm-uapi: bump AMDGPU headers
5130- ac/rgp: recognize more memory types
5131- ac/rgp: report LDS size in CU mode on GFX10+
5132- ac/rgp: report the number of memory operations per clock
5133- ac/rgp: report the number of primitives per clock
5134- radv: remove duplicate REG_INCLUDE_CONTEXT setting for SQTT
5135- radv: always select the first active CU when profiling with SQTT
5136- radv: fix exporting SQTT pipelines with LLVM
5137- radv: exclude perf counters for SQTT also on GFX10.3
5138- Revert "radv: do not overallocate the SQTT buffer"
5139- radeonsi,radv: do not overallocate the SQTT buffer size
5140- radv: remove useless decompression of the DS resolve attachment
5141- radv: do not trace inactive shader engines with SQTT
5142- ac/sqtt: fix determining if the trace is complete on GFX10+
5143- radv: double the SQTT buffer size when it is resized
5144- radv: trigger a new SQTT capture automatically after resizing the buffer
5145- radv: bump the initial SQTT buffer size to 32MB per SE
5146- radv: fix RGP barrier layout transition for TC-compatible CMASK images
5147- Revert "radv: stop using VM_ALWAYS_VALID on APUs"
5148- radv: cleanup enabling TC-compat HTILE for depth surfaces
5149- radv: remove useless check about mips+layers for TC-compat HTILE images
5150- radv: skip useless FCE when fast-clearing MSAA images with DCC enabled
5151- radv: re-enable TC-compat HTILE for MSAA D32S8 images on GFX9+
5152- radv: do not declare push constants for DCC decompress on compute
5153- radv: check if dynamic VRS state changed
5154- radv: check if dynamic line stipple state changed
5155- radv: disable sampling with VK_FORMAT_R64_SFLOAT
5156- radv: fix meta save/restore state with non renderable images
5157- radv: fix potential clears with non renderable images on GFX9+
5158- radv: fix initialization of disable_compression when clearing color image
5159- radv: add missing SQTT events for copy_commands2/create_renderpass2
5160- radv: remove useless DCC disable check for 3D images on GFX10+
5161- radv: rework radv_use_dcc_for_image() a bit
5162- vulkan: add missing vk_shader_module.c/h includes to Makefile
5163- radv: use common entrypoints for VK_KHR_copy_commands2
5164- radv: do not enable TC-compat CMASK if the image isn't readable by a shader
5165- radv: remove redundant check when enabling TC-compat CMASK
5166- radv: make sure FMASK is enabled for TC-compat CMASK
5167- radv: only configure the CMASK tiling for TC-compat on GFX8
5168- radv: initialize TC-compat CMASK images with the DCC clear code
5169- radv: enable TC-compat CMASK on GFX10+
5170- radv: add notccompatcmask debug option
5171- radv: extend the dirty bits to 64-bit
5172- ac/surface: init CMASK slice size on GFX9+
5173- radv: fix clearing CMASK layers on GFX9+
5174- radv: initialize CMASK with correct clear codes
5175- radv: restore previous MRT CB_SHADER_MASK logic
5176- radv: gather if the FS uses perspective or linear interpolations
5177- radv: determine if a pipeline is candidate for flat shading
5178- radv: enable VRS 2x2 coarse shading for flat shading on GFX10.3+
5179- radv: add RADV_DEBUG=novrsflatshading option
5180- ci: update list of expected CTS failures for RADV
5181- vulkan: add common entrypoints for VK_KHR_create_renderpass2
5182- radv: use common entrypoints for VK_KHR_create_renderpass2
5183- turnip: use common entrypoints for VK_KHR_create_renderpass2
5184- lavapipe: use common entrypoints for VK_KHR_create_renderpass2
5185- anv: use common entrypoints for VK_KHR_create_renderpass2
5186- radv: report that degenerated triangles are not culled
5187- radv: require DRM 3.35+
5188- ac/surface: do not allocate FMASK or CMASK for stencil-only surfaces on GFX9+
5189- radv: do not fixup DCC after compute color resolves if DCC stores enabled
5190- radv: only set WRITE_COMPRESS_ENABLE for storage image descriptors
5191- radv: use a sampled image descriptor for reads for the MSAA color decompress
5192- radv: compress FMASK for all layouts except GENERAL
5193- radv: cleanup FMASK expand transitions
5194- radv: do not force enable FMASK during MSAA blits
5195- radv: use COLOR_ATTACHMENT_OPTIMAL for fast clear/hw resolve operations
5196- ac: add ac_get_family_name() helper
5197- radv: change RADV_FORCE_FAMILY to use family name instead of LLVM processor name
5198- radv: try to keep HTILE compressed with DEPTH_STENCIL_READ_ONLY_OPTIMAL
5199- radv: clean up fence syncobj code
5200- ac: add ac_gpu_info::has_image_load_dcc_bug
5201- aco: fix get_sampler_desc() for image loads
5202- aco: implement a workaround for the image load DCC hw bug on GFX10.3
5203- radv: allow DCC for storage images on GFX10.3 with RADV_PERFTEST=dccstores
5204- radv: handle implicit subpass dependencies per attachment
5205- radv: init CMASK/FMASK/DCC in parallel
5206- radv: perform MSAA color decompression for storage images with DCC
5207- radv: enable DCC stores with MSAA 4x/8x on GFX10+
5208- radv: simplify a check when enabling DCC for concurrent images
5209- radv: enable DCC for concurrent images on GFX10
5210- radv: make sure FMASK decompress and FCE are performed on gfx queue
5211- radv: add MSAA support to ClearColorImage() on compute queue
5212- radv: do not clamp framebuffer dimensions to the minimum dimension
5213- radv: add MSAA support to CopyImage() on compute queue
5214- radv: use explicit VRS mode when configuring PA_CL_VRS_CNTL
5215- radv: allow to force VRS rates on GFX10.3 with RADV_FORCE_VRS
5216- radv: fix needed dynamic state for VRS
5217- amd/addrlib: expose HTILE address equations to drivers on GFX10+
5218- ac/surface: rename ac_surface_dcc_address_test.c
5219- ac/surface: add a test of HtileAddrFromCoord prototype outside of addrlib
5220- ac/surface: rename gfx9_dcc_equation to gfx9_meta_equation
5221- ac/surface: increase gfx9_meta_equation::gfx10_bits by 4 elements
5222- ac/surface: copy the HTILE equations to the surface
5223- ac/surface: implement HtileAddrFromCoord in NIR
5224- ac/surface: store the HTILE pitch to the surface
5225- radv: expose R8_UINT as the only supported format for VRS attachments
5226- radv: do not allow MSAA with fragment shading rate attachments
5227- radv: do not enable DCC for fragment shading rate attachments
5228- radv: determine if attachment VRS is enabled
5229- radv: configure the VRS HTILE encoding size
5230- radv: do not use the whole HTILE buffer for depth when VRS is used
5231- radv: update the HTILE clear word when VRS is used
5232- radv: allow HTILE for very small images if VRS attachment is used
5233- radv: create an image for VRS if no depth/stencil attachment is bound
5234- radv: handle the VRS attachment subpass
5235- radv: bind our internal depth buffer when not provided by the app
5236- radv: add support for copying VRS rates into HTILE
5237- radv: copy VRS rates to HTILE when beginning a subpass
5238- radv: configure the VRS combiners when an attachment is used
5239- radv: advertise attachmentFragmentShadingRate on GFX10.3
5240- ac: add missing BUF_DATA_FORMAT_10_11_11 vertex format on GFX10+
5241- radv: keep DCC compressed for clears on compute with image stores
5242- aco: fix opquantize2f16 on GFX6-7
5243- radv: fix fast clearing depth-only or stencil-only aspects with HTILE
5244- radv: fix emitting depth bias when beginning a command buffer
5245- radv: fix emitting default depth bounds state on GFX6
5246- radv/winsys: fix allocating the number of CS in the sysmem path
5247- radv/winsys: fix resetting the number of padded IB words
5248- radv: make sure CP DMA is idle before executing secondary command buffers
5249- radv: fix various CMASK regressions on GFX9
5250- radv: fix computation of the number of user SGPRS for NGG GS state
5251- radv: check if DCC is enabled when resolving different levels
5252- radv/winsys: fix executing huge secondary command buffers on GFX6
5253
5254Serge Martin (1):
5255
5256- clover: return CL_INVALID_VALUE when origin or region are NULL
5257
5258Simon Ser (15):
5259
5260- nouveau/nvc0: fix linear buffer alignment for scan-out/cursors
5261- nouveau/nv50: fix linear buffer alignment for scan-out/cursors
5262- frontends/va: extract pipe format to DRM format mapping
5263- frontends/va: add support for VA_EXPORT_SURFACE_COMPOSED_LAYERS
5264- frontends/va: add pipe to DRM format mapping for NV12 and P010
5265- radeonsi/uvd: make format modifiers-aware
5266- egl: use render node for wl_drm if available
5267- gbm: fail early when modifier list only contains INVALID
5268- gbm: remove fprintf calls in gbm_dri_bo_create
5269- egl/wayland: avoid unnecessary roundtrip when authenticated
5270- gbm: add gbm_bo_get_fd_for_plane
5271- egl: fix software flag in _eglAddDevice call on DRM
5272- egl: only take render nodes into account when listing DRM devices
5273- Revert "egl: Don't add hardware device if there is no render node v2."
5274- radv: fix format feature reporting for modifiers
5275
5276Simon Zeni (1):
5277
5278- egl/dri2: enable EGL_WL_bind_wayland_display in EGL device platform
5279
5280Stéphane Marchesin (1):
5281
5282- virgl: Add simple disk cache
5283
5284SureshGuttula (3):
5285
5286- va/picture : Added failure check for stability
5287- frontends/va: Update conditional checks for code stability.
5288- frontends/va : Fix memory leaks incase of error returns
5289
5290Tamara Schmitz (1):
5291
5292- util: add mesa_glthread for Valheim in OpenGL mode.
5293
5294Tapani Pälli (17):
5295
5296- mesa: add GL_SR8_EXT, GL_SRG8_EXT for color/srgb format queries
5297- intel/perf: cleanup, remove duplicate function declaration
5298- intel/perf: introduce additional ralloc context parameter
5299- i965: use aligned malloc for context instead of ralloc
5300- mesa: add check that non base level attachment is mipmap complete
5301- gitlab-ci: bump piglit commit for windows
5302- anv: toggle on sample shading if it is set in the shader
5303- anv/android: fix compilation failure
5304- anv: fix compilation due to missing vk_format_from_android
5305- mesa: check cube completeness for cube fbo attachments
5306- anv/android: fix image creation with external format
5307- android: add some more stub functions for cross compilation
5308- intel/common: disable batch decoder on Android platform
5309- loader: prefer iris on Android
5310- iris: clamp PointWidth in 3DSTATE_SF like i965 does
5311- egl: support no error attribute set to false with ES 1.1
5312- glx: revert "Downgrade sRGB-ful fbconfigs"
5313
5314Thong Thai (2):
5315
5316- frontends/va/config: Fix check for packed header config
5317- radeon: Add cropping to encoded H.265 when padding is used
5318
5319Timothee Chabat (1):
5320
5321- llvmpipe: increase PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE value
5322
5323Timothy Arceri (23):
5324
5325- util/disk_cache: do crc32 check on compressed data for ZSTD
5326- util/disk_cache: move cache path strdup call back into disk_cache.c
5327- util/disk_cache: use a new cache dir for the single file cache feature
5328- util/mesa_sha1: add helper to reconvert sha1 hex strings
5329- util/fossilize_db: add basic fossilize db util to read/write shader caches
5330- util/disk_cache: make use of single file cache when env var set
5331- nir: handle negatives in ffma reassociation optimisation
5332- util/disk_cache: fix crash in fossilize_db
5333- util/disk_cache: move cache tests to the util directory
5334- util/disk_cache: make MESA_DISK_CACHE_READ_ONLY_FOZ_DBS a relative path
5335- Revert "glsl: default to compat shaders in compat profile"
5336- glsl: fix declarations of gl_MaxVaryingFloats
5337- util: create some standalone compression helpers
5338- util/disk_cache: make use of the new compression helpers
5339- util/fossilize_db: remove compression from foz db helper
5340- util/compress: make compression function inputs const
5341- util/disk_cache: separate file reads from cache item validation
5342- util/disk_cache: detangle cache item creation from disk writing
5343- util/disk_cache: add cache item headers to single file cache entries
5344- glsl: add compilation errors for attribute and varying qualifiers
5345- glsl: enforce restrictions on builtin functions moved to compat
5346- mesa: fix incomplete GL_NV_half_float implementation
5347- util: disable glthread in CSGO
5348
5349Timur Kristóf (55):
5350
5351- radv: Only enable sparse features on Polaris and newer.
5352- tgsi_to_nir: Fix uniform ranges.
5353- aco: Fix LDS statistics of tess control shaders.
5354- radv/llvm: Fix reporting LDS stats of tess control shaders.
5355- aco: Disallow LSHS temp-only I/O when VS output is written indirectly.
5356- pan/bi: Use correct enum type for NIR intrinsics.
5357- aco: Use ASSERTED to avoid unused variable warning.
5358- intel/compiler: Use assume() instead of assert() for array bounds.
5359- intel/compiler: Make room for maximum dest size in nir_emit_texture.
5360- anv: Use unreachable() in anv_genX.
5361- anv: Use ASSERTED for results that are only used in asserts.
5362- nir: Add new nir_builder helpers for iadd with no_unsigned_wrap.
5363- nir: Add nir_builder helper for I/O address offset calculations.
5364- nir: Add a few more algebraic optimizations to help address calculation.
5365- nir: Fix unsigned upper bound of local_invocation_index for non-CS stages.
5366- nir: Shrink vectors for load_shared.
5367- nir: Add unsigned upper bound for TCS load_invocation_id.
5368- nir: Add default unsigned upper bound configuration.
5369- nir: Add AMD-specific buffer load/store intrinsics.
5370- nir: Add nir_opt_offsets to fold const adds into load/store offsets.
5371- nir: Add tessellation related AMD-specific intrinsics.
5372- nir: Add AMD-specific Geometry Shader related intrinsics.
5373- aco: Implement new buffer load/store intrinsics.
5374- aco: Implement the new tessellation I/O related NIR intrinsics.
5375- aco: Implement new Geometry Shader intrinsics.
5376- ac/llvm: Implement AMD-specific buffer load/store intrinsics.
5377- ac/llvm: Implement the new tessellation intrinsics.
5378- ac/llvm: Implement new Geometry Shader intrinsics.
5379- ac/llvm: Make shared loads/stores work correctly for non-CS stages.
5380- ac/llvm: Make sure to always emit integer comparison for nir_op_ieq.
5381- ac/llvm: Add constant offset to load/store_shared.
5382- ac/llvm: Emit more efficient code for load_shared.
5383- ac: Add NIR passes to lower VS->TCS->TES I/O to memory accesses.
5384- ac: Add NIR passes to lower ES->GS I/O to memory accesses.
5385- radv: Lower IO and set driver locations earlier.
5386- radv: Save I/O usage data to both shader infos for merged stages.
5387- radv: Calculate tess patches and LDS use outside the backend compilers.
5388- radv: Determine tcs_in_out_eq in radv_pipeline instead of the compiler.
5389- radv: Fill some tess shader info earlier.
5390- radv: Reorder some NIR optimizations in preparation for the I/O changes.
5391- radv: Use new, NIR-based I/O lowering.
5392- radv/llvm: Only store TCS outputs where they are really needed.
5393- radv/llvm: Delete superfluous tess and ESGS I/O code.
5394- aco: Delete superfluous tess and ESGS I/O code.
5395- aco: Fix constant address offset calculation for ds_read2 instructions.
5396- ac/llvm: Fix alignment of shared load intrinsics.
5397- aco: Optimize workgroup exclusive scan to better avoid bank conflicts.
5398- aco: Align NGG scratch size to 16 so a single ds_read can always read it.
5399- aco: Remove useless s_setprio near gs_alloc_req.
5400- aco: Use s_setprio 3 at the beginning of every VS and TES.
5401- aco: Extract ngg_nogs_export_prim_id to a separate function.
5402- aco: Set block_kind_export_end in create_vs/fs_exports.
5403- aco: Emit fewer branches for NGG VS/TES with late primitive export.
5404- aco: Add a simple heuristic to decide early or late primitive export.
5405- aco: Mark VCC clobbered for iadd8 and iadd16 reductions on GFX6-7.
5406
5407Tomeu Vizoso (17):
5408
5409- ci: Fix selection of linker in Android builds
5410- ci: Move container files into their own dir
5411- ci: Move out expect files from .gitlab-ci
5412- ci: Disable two radeonsi jobs
5413- Revert "ci/panfrost: disable the rest of these jobs temporarily"
5414- Revert "ci/panfrost: Disable t860/radeonsi testing while the runners are struggling."
5415- Revert "CI: Disable Panfrost T760"
5416- ci: Fix visibility property of LAVA jobs
5417- ci/fdo: Use trimmed traces for Valve games
5418- gallium/dri2: Pass the resource that corresponds to the plane
5419- ci: Use a single template for LAVA jobs
5420- ci: Set more reasonable timeouts for LAVA jobs
5421- ci: Don't retry failed test runs
5422- ci: Disable t720 LAVA jobs
5423- Revert "ci: Disable t720 LAVA jobs"
5424- Revert "ci: Disable panfrost g52"
5425- Revert "ci: Disable panfrost t760"
5426
5427Tony Wasserka (34):
5428
5429- aco/ra: Update register use bounds before recursing in get_regs_for_copies
5430- aco/ra: Introduce PhysRegInterval helper class
5431- aco/ra: Conservatively refactor existing code to use PhysRegInterval
5432- aco/ra: Remove always-false conditions
5433- aco/ra: Add iterator interface for PhysRegInterval
5434- aco/ra: Use std::find_if(_not) to clean up get_reg_simple
5435- aco/ra: Use std::all_of to simplify a loop
5436- aco/ra: Conservatively refactor get_reg_specified to use PhysRegInterval
5437- aco/ra: Move commonly repeated code to a helper function
5438- aco/ra: Add helpers to test for intersection/containment of reg intervals
5439- aco/ra: Use std::all_of to simplify a loop
5440- aco/ra: Remove unused function parameter
5441- aco/ra: Use PhysReg for member functions of PhysRegInterval
5442- aco/ra: Use PhysReg when indexing into RegisterFile's containers
5443- aco/ra: Use PhysRegInterval for collect_vars parameters
5444- aco/ra: Use PhysRegInterval for count_zero
5445- aco/ra: Fix print_regs using the wrong constant to check for blocked slots
5446- aco/ra: Fix build with print_regs enabled
5447- aco/ra: Remove preprocessor guards for print_regs
5448- aco/ra: Add helper to get a PhysRegInterval for the register demand
5449- aco: Fix vector::reserve() being called with the wrong size
5450- radv: Fix improper max_index_count argument for indexed draws
5451- ac: Add has_zero_index_buffer_bug to ac_gpu_info
5452- radv: Skip 0-sized index buffers only when necessary
5453- aco/ra: Avoid unnecessary copying of std::vectors
5454- aco/isel: Don't emit unsupported i16<->f16 conversion opcodes on GFX6/7
5455- aco/isel: Fix i64/u64->float32 conversion for large inputs
5456- aco/isel: Don't request sign extension when truncating signed integers
5457- aco/isel: Add documentation and asserts for convert_int
5458- aco/isel: Fix large inputs being truncated in int32->f16 conversions
5459- aco/isel: Add documentation for (u)int64->f16 conversion
5460- ci: skip pipeline_barrier tests that currently crash on RADV
5461- gitlab: rename RADV bug report template
5462- aco/spill: Fix improper handling of exec phis
5463
5464Vasily Khoruzhick (10):
5465
5466- lima: add precompile debug flag
5467- lima/ppir: don't use list_length() in loop in regalloc and liveness analysis
5468- lima: update dEQP fails and skips lists
5469- lima: relax checks of imported BO
5470- lima: rename \*_shader_state to \*_compiled_shader
5471- lima: rename lima_{fs,vs}_bind_state to lima_{fs,vs}_uncompiled_shader
5472- lima: implement shader disk cache
5473- lima: compute nir_sha1 for shader key even if disk cache is disabled
5474- lima: use passed surface to get mipmap level for reload, not cbuf
5475- lima: limit number of draws per job
5476
5477Vinson Lee (31):
5478
5479- panfrost: Fix typos.
5480- nouveau: Fix typos.
5481- nv50/ir: Initialize DataArray members in constructor.
5482- r600/sfn: Remove StoreMerger unused member b.
5483- nv50/ir: Add InsertConstraintsPass constructor.
5484- nv50/ir: Initialize CodeEmitter members in constructor.
5485- nv50/ir: Initialize RegAlloc member func in constructor.
5486- clover: Add constructor for global_argument.
5487- lima: Fix typos.
5488- v3dv: Fix assert.
5489- nvc0/ir: Initialize NVC0LoweringPass member gpEmitAddress.
5490- nvc0/ir: Initialize SchedDataCalculator members in constructor.
5491- nv50/ir: Initialize BindArgumentsPass member sub in constructor.
5492- virgl: Convert errno to string.
5493- r600/sfn: Initialize FragmentShaderFromNir member m_pos_input.
5494- etnaviv: Fix memory leak in etna_vertex_elements_state_create.
5495- nv50/ir: Initialize ValueDef member origin in constructors.
5496- nv50/ir: Initialize Instruction members.
5497- aco: Initialize ds_state.front.writeMask.
5498- r600: Fix typos.
5499- llvmpipe: Fix typos.
5500- nir/lower_tex: Change coord type to int.
5501- gv100/ir: Initialize CodeEmitterGV100 members in constructor.
5502- zink: Remove leftover dead code.
5503- nv50/ir: Add constructor for NV50LegalizePostRA.
5504- iris: Fix typos.
5505- clover: Add constructor for sampler_argument.
5506- ac: Fix emit_split_buffer_store modulus operation.
5507- freedreno: Fix file descriptor leak.
5508- glsl: Initialize parcel_out_uniform_storage members.
5509- Remove leftover dead code.
5510
5511Víctor Manuel Jáquez Leal (1):
5512
5513- frontends/va/context: don't set max_references with num_render_targets
5514
5515Witold Baryluk (3):
5516
5517- lavapipe: Defer lavapipe warning to CreateDevice
5518- util: Use explicit relaxed reads for u_queue
5519- radv: memset the alignment hole in cache_entry to 0
5520
5521Xin He (1):
5522
5523- virgl: use atomic operations when increase sub_ctx_id
5524
5525Yannik Marek (1):
5526
5527- turnip: fix alpha to coverage in no color and unused attachment cases
5528
5529Yevhenii Kharchenko (2):
5530
5531- st/mesa: fix PBO download for TEXTURE_1D_ARRAY textures
5532- intel/compiler: remove unused member 'input_vue_map'
5533
5534Yevhenii Kolesnikov (3):
5535
5536- iris: only set point sprite overrides if actually using points
5537- nir/from_ssa: consider defs in sibling blocks
5538- nir/from_ssa: don't check for interference within the same set
5539
5540Yiwei Zhang (3):
5541
5542- venus: properly enable WSI for different platforms
5543- venus: bring up Android support
5544- venus: implement vn_debug_init_once with os_get_option
5545
5546Yogesh Mohan Marimuthu (7):
5547
5548- ac/rgp: add ac_msgpack.h/c
5549- ac/rgp: add rgp co, col, pso data structures
5550- ac/rgp: add helper function to write rgp elf oject
5551- ac/rgp: expose data structure to populate co, col, pso database
5552- ac/rgp,radeonsi,radv: pass struct thread_trace_data to ac_sqtt_dump_data()
5553- ac/rgp: dump co, col, pso database to rgp profile file
5554- ac/rgp: set gfxip in elf_hdr.e_flags
5555
5556chenli (1):
5557
5558- mesa: update oudated members for debug and check
5559
5560cheyang (3):
5561
5562- frontend/dri: fix doesn't support RGBA ordering still expose RGBA in config
5563- glsl: redeclare built-in variable with separate shader
5564- virgl: add astc 2d compressed formats
5565