#
8bc75586 |
| 17-Aug-2023 |
Christian König <christian.koenig@amd.com> |
drm/amdgpu: workaround to avoid SET_Q_MODE packets v2
It turned out that executing the SET_Q_MODE packet on every submission creates to much overhead.
Implement a workaround which allows skipping t
drm/amdgpu: workaround to avoid SET_Q_MODE packets v2
It turned out that executing the SET_Q_MODE packet on every submission creates to much overhead.
Implement a workaround which allows skipping the SET_Q_MODE packet if subsequent submissions all use the same parameters.
v2: add a NULL check for ring_obj
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
c68cbbfd |
| 15-Aug-2023 |
Christian König <christian.koenig@amd.com> |
drm/amdgpu: cleanup conditional execution
First of all calculating the number of dw to patch into a conditional execution is not something HW generation specific. This is just standard ring buffer c
drm/amdgpu: cleanup conditional execution
First of all calculating the number of dw to patch into a conditional execution is not something HW generation specific. This is just standard ring buffer calculations. While at it also reduce the BUG_ON() into WARN_ON().
Then instead of a random bit pattern use 0 as default value for the number of dw skipped, this way it's not mandatory any more to patch the conditional execution.
And last make the address to check a parameter of the conditional execution instead of getting this from the ring.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
9749c868 |
| 05-Jan-2024 |
Ma Jun <Jun.Ma2@amd.com> |
drm/amdgpu: Fix the warning info in mode1 reset
Fix the warning info below during mode1 reset. [ +0.000004] Call Trace: [ +0.000004] <TASK> [ +0.000006] ? show_regs+0x6e/0x80 [ +0.000011] ? _
drm/amdgpu: Fix the warning info in mode1 reset
Fix the warning info below during mode1 reset. [ +0.000004] Call Trace: [ +0.000004] <TASK> [ +0.000006] ? show_regs+0x6e/0x80 [ +0.000011] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000005] ? __warn+0x91/0x150 [ +0.000009] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000006] ? report_bug+0x19d/0x1b0 [ +0.000013] ? handle_bug+0x46/0x80 [ +0.000012] ? exc_invalid_op+0x1d/0x80 [ +0.000011] ? asm_exc_invalid_op+0x1f/0x30 [ +0.000014] ? __flush_work.isra.0+0x2e8/0x390 [ +0.000007] ? __flush_work.isra.0+0x208/0x390 [ +0.000007] ? _prb_read_valid+0x216/0x290 [ +0.000008] __cancel_work_timer+0x11d/0x1a0 [ +0.000007] ? try_to_grab_pending+0xe8/0x190 [ +0.000012] cancel_work_sync+0x14/0x20 [ +0.000008] amddrm_sched_stop+0x3c/0x1d0 [amd_sched] [ +0.000032] amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu]
This warning info was printed after applying the patch "drm/sched: Convert drm scheduler to use a work queue rather than kthread". The root cause is that amdgpu driver tries to use the uninitialized work_struct in the struct drm_gpu_scheduler
v2: - Rename the function to amdgpu_ring_sched_ready and move it to amdgpu_ring.c (Alex) v3: - Fix a few more checks based on Vitaly's patch (Alex) v4: - squash in fix noticed by Bert in https://gitlab.freedesktop.org/drm/amd/-/issues/3139
Fixes: 11b3b9f461c5 ("drm/sched: Check scheduler ready before calling timeout handling") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
1a29f367 |
| 10-May-2023 |
Lang Yu <Lang.Yu@amd.com> |
drm/amdgpu: add UMSCH RING TYPE definition
Add RING TYPE definition for Multi Mdeia User Mode Scheduler.
Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by:
drm/amdgpu: add UMSCH RING TYPE definition
Add RING TYPE definition for Multi Mdeia User Mode Scheduler.
Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
2d6ea3b0 |
| 23-Apr-2022 |
Huang Rui <ray.huang@amd.com> |
drm/amdgpu: add VPE RING TYPE definition
Add RING TYPE for Video Processing Engine.
Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
1b01c010 |
| 02-Aug-2023 |
Ran Sun <sunran001@208suo.com> |
drm/amdgpu: Clean up errors in amdgpu_ring.h
Fix the following errors reported by checkpatch:
ERROR: spaces required around that ':' (ctx:VxW)
Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-
drm/amdgpu: Clean up errors in amdgpu_ring.h
Fix the following errors reported by checkpatch:
ERROR: spaces required around that ':' (ctx:VxW)
Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
b13eb02b |
| 19-Apr-2023 |
Christian König <christian.koenig@amd.com> |
drm/amdgpu: add amdgpu_error_* debugfs file
This allows us to insert some error codes into the bottom of the pipeline on an engine.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewe
drm/amdgpu: add amdgpu_error_* debugfs file
This allows us to insert some error codes into the bottom of the pipeline on an engine.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
8ff865be |
| 25-May-2023 |
Jiadong Zhu <Jiadong.Zhu@amd.com> |
drm/amdgpu: Modify indirect buffer packages for resubmission
When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use
drm/amdgpu: Modify indirect buffer packages for resubmission
When the preempted IB frame resubmitted to cp, we need to modify the frame data including: 1. set PRE_RESUME 1 in CONTEXT_CONTROL. 2. use meta data(DE and CE) read from CSA in WRITE_DATA.
Add functions to save the location the first time IBs emitted and callback to patch the package when resubmission happens.
Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
797a0a14 |
| 15-Aug-2022 |
James Zhu <James.Zhu@amd.com> |
drm/amdgpu: add partition ID track in ring
Keep track partition ID in ring.
Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexa
drm/amdgpu: add partition ID track in ring
Keep track partition ID in ring.
Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
bb0ed57b |
| 16-Mar-2023 |
Le Ma <le.ma@amd.com> |
drm/amdgpu: increase AMDGPU_MAX_RINGS
On newer GPUs, the number of kernel rings are increased.
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by:
drm/amdgpu: increase AMDGPU_MAX_RINGS
On newer GPUs, the number of kernel rings are increased.
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
1bd99ca2 |
| 10-Jan-2023 |
James Zhu <James.Zhu@amd.com> |
drm/amdgpu: increase AMDGPU_MAX_HWIP_RINGS
[WA] Increase AMDGPU_MAX_HWIP_RINGS to 64 to support more compute ring resource. Later need redesign with queue/prirority/scheduler factors to reduce AMDGP
drm/amdgpu: increase AMDGPU_MAX_HWIP_RINGS
[WA] Increase AMDGPU_MAX_HWIP_RINGS to 64 to support more compute ring resource. Later need redesign with queue/prirority/scheduler factors to reduce AMDGPU_MAX_HWIP_RINGS.
Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
45ed97ad |
| 12-Dec-2022 |
James Zhu <James.Zhu@amd.com> |
drm/amdgpu: increase MAX setting to hold more jpeg instances
vcn_v4_0_3 increased jpeg instances, need increasing MAX resources setting accordlingly.
Signed-off-by: James Zhu <James.Zhu@amd.com> Ac
drm/amdgpu: increase MAX setting to hold more jpeg instances
vcn_v4_0_3 increased jpeg instances, need increasing MAX resources setting accordlingly.
Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
386ea27c |
| 23-Feb-2022 |
Le Ma <le.ma@amd.com> |
drm/amdgpu: adjust some basic elements for multiple AID case
add some elements below: - num_aid - aid_id for each sdma instance - num_inst_per_aid for sdma
and extend macro size below: - SDMA_M
drm/amdgpu: adjust some basic elements for multiple AID case
add some elements below: - num_aid - aid_id for each sdma instance - num_inst_per_aid for sdma
and extend macro size below: - SDMA_MAX_INSTANCES to 16 - AMDGPU_MAX_RINGS to 96 - AMDGPU_MAX_HWIP_RINGS to 32
v2: move aid_id from amdgpu_ring to amdgpu_sdma_instance. (Lijo)
Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
b185c318 |
| 21-Mar-2023 |
Alex Deucher <alexander.deucher@amd.com> |
drm/amdgpu: track MQD size for gfx and compute
It varies by generation and we need to know the size to expose this via debugfs.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by
drm/amdgpu: track MQD size for gfx and compute
It varies by generation and we need to know the size to expose this via debugfs.
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
ac928705 |
| 09-Mar-2023 |
Christian König <christian.koenig@amd.com> |
drm/amdgpu: add gfx shadow CS IOCTL support
Add support for submitting the shadow update packet when submitting an IB. Needed for MCBP on GFX11.
v2: update API for CSA (Alex) v3: fix ordering; SET
drm/amdgpu: add gfx shadow CS IOCTL support
Add support for submitting the shadow update packet when submitting an IB. Needed for MCBP on GFX11.
v2: update API for CSA (Alex) v3: fix ordering; SET_Q_PREEMPTION_MODE most come before COND_EXEC Add missing check for AMDGPU_CHUNK_ID_CP_GFX_SHADOW in amdgpu_cs_pass1() Only initialize shadow on first use (Alex) v4: Pass parameters rather than job to new ring callback (Alex) v5: squash in change to call SET_Q_PREEMPTION_MODE/COND_EXEC before RELEASE_MEM to complete the UMDs use of the shadow (Alex)
Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
c30ddcec |
| 13-Apr-2023 |
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> |
drm/amdgpu: Add a max ibs per submission limit.
And ensure each ring supports that many submissions. This makes sure that we don't get surprises after the submission has been scheduled where the rin
drm/amdgpu: Add a max ibs per submission limit.
And ensure each ring supports that many submissions. This makes sure that we don't get surprises after the submission has been scheduled where the ring allocation actually gets rejected.
My calculations on the existing limits: COMPUTE v10: 128 COMPUTE v11: 128 COMPUTE v6: 157 COMPUTE v7: 133 COMPUTE v8: 130 COMPUTE v9: 125 GFX v10: 208 GFX v11: 213 GFX v6: 154 (doubling this in the previous patch) GFX v7: 226 GFX v8: 213 GFX v9: 208 GFX v9 (SW): 208 SDMA CIK: 87 SDMA SI: 97 SDMA v2.4: 74 SDMA v3.0: 74 SDMA v4.0: 72 SDMA v5.0: 51 SDMA v6.0: 52 UVD ENC v6.0: 98 UVD ENC v7.0: 92 UVD v3.1: 124 UVD v4.2: 124 UVD v5.0: 83 UVD v6.0 (VM): 55 UVD v7.0: 51 VCE v2.0: 126 VCE v3.0 (VM): 98 VCE v4.0: 93 VCN DEC v1.0: 49 VCN DEC v2.0: 51 VCN DEC v3.0: 51 VCN ENC v1.0: 58 VCN ENC v2.0: 93 VCN ENC v3.0: 93 VCN ENC v4.0: 93 VCN JPEG v1.0: 17 VCN JPEG v2.0: 16 VCN JPEG v2.5: 17 VCN JPEG v3.0: 17 VCN JPEG v4.0: 17
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2498 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
541372bb |
| 16-Nov-2021 |
Le Ma <le.ma@amd.com> |
drm/amdgpu: add some basic elements for multiple XCD case
Add some basic definitions and structure member. Inscrease MAX_WB slots to 1024 to support the increasing number of rings for multiple parti
drm/amdgpu: add some basic elements for multiple XCD case
Add some basic definitions and structure member. Inscrease MAX_WB slots to 1024 to support the increasing number of rings for multiple partitions.
v2: unify naming style
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
0530553b |
| 19-May-2022 |
Le Ma <le.ma@amd.com> |
drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)
It looks better to place this field in ring structure. Also drop the repeated ring funcs definitions if there's no difference except for vmhub fi
drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)
It looks better to place this field in ring structure. Also drop the repeated ring funcs definitions if there's no difference except for vmhub field.
v2: rename the field to vm_hub like others (Le) v3: apply the changes to new ip blocks (Hawking) v4: fix vcn sw ring (Alex)
Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
c103a23f |
| 24-Feb-2023 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/amd: Convert amdgpu to use suballocation helper.
Now that we have a generic suballocation helper, Use it in amdgpu. For lines that get moved or changed, also fix up pre-existing style issues.
S
drm/amd: Convert amdgpu to use suballocation helper.
Now that we have a generic suballocation helper, Use it in amdgpu. For lines that get moved or changed, also fix up pre-existing style issues.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230224095152.30134-3-thomas.hellstrom@linux.intel.com
show more ...
|
#
6c1a6d0b |
| 08-Feb-2023 |
JesseZhang <Jesse.Zhang@amd.com> |
amd/amdgpu: remove test ib on hw ring
test ib function is not necessary on hw ring, so remove it.
v2: squash in NULL check fix
Signed-off-by: JesseZhang <Jesse.Zhang@amd.com> Acked-by: Christian K
amd/amdgpu: remove test ib on hw ring
test ib function is not necessary on hw ring, so remove it.
v2: squash in NULL check fix
Signed-off-by: JesseZhang <Jesse.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
3f4c175d |
| 07-Sep-2022 |
Jiadong.Zhu <Jiadong.Zhu@amd.com> |
drm/amdgpu: MCBP based on DRM scheduler (v9)
Trigger Mid-Command Buffer Preemption according to the priority of the software rings and the hw fence signalling condition.
The muxer saves the locatio
drm/amdgpu: MCBP based on DRM scheduler (v9)
Trigger Mid-Command Buffer Preemption according to the priority of the software rings and the hw fence signalling condition.
The muxer saves the locations of the indirect buffer frames from the software ring together with the fence sequence number in its fifo queue, and pops out those records when the fences are signalled. The locations are used to resubmit packages in preemption scenarios by coping the chunks from the software ring.
v2: Update comment style. v3: Fix conflict caused by previous modifications. v4: Remove unnecessary prints. v5: Fix corner cases for resubmission cases. v6: Refactor functions for resubmission, calling fence_process in irq handler. v7: Solve conflict for removing amdgpu_sw_ring.c. v8: Add time threshold to judge if preemption request is needed. v9: Correct comment spelling. Set fence emit timestamp before rsu assignment.
Cc: Christian Koenig <Christian.Koenig@amd.com> Cc: Luben Tuikov <Luben.Tuikov@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Michel Dänzer <michel@daenzer.net> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
be254550 |
| 13-Jul-2022 |
Jiadong.Zhu <Jiadong.Zhu@amd.com> |
drm/amdgpu: Modify unmap_queue format for gfx9 (v6)
1. Modify the unmap_queue package on gfx9. Add trailing fence to track the preemption done. 2. Modify emit_ce_meta emit_de_meta functions for t
drm/amdgpu: Modify unmap_queue format for gfx9 (v6)
1. Modify the unmap_queue package on gfx9. Add trailing fence to track the preemption done. 2. Modify emit_ce_meta emit_de_meta functions for the resumed ibs.
v2: Restyle code not to use ternary operator. v3: Modify code format. v4: Enable Mid-Command Buffer Preemption for gfx9 by default. v5: Optimize the flag bit set for emit_fence. v6: Modify log message for preemption timeout.
Cc: Christian Koenig <Christian.Koenig@amd.com> Cc: Michel Dänzer <michel@daenzer.net> Cc: Luben Tuikov <Luben.Tuikov@amd.com> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
0c97a19a |
| 07-Sep-2022 |
Jiadong.Zhu <Jiadong.Zhu@amd.com> |
drm/amdgpu: Add software ring callbacks for gfx9 (v8)
Set ring functions with software ring callbacks on gfx9.
The software ring could be tested by debugfs_test_ib case.
v2: Set sw_ring 2 to enabl
drm/amdgpu: Add software ring callbacks for gfx9 (v8)
Set ring functions with software ring callbacks on gfx9.
The software ring could be tested by debugfs_test_ib case.
v2: Set sw_ring 2 to enable software ring by default. v3: Remove the parameter for software ring enablement. v4: Use amdgpu_ring_init/fini for software rings. v5: Update for code format. Fix conflict. v6: Remove unnecessary checks and enable software ring on gfx9 by default. v7: Use static array for software ring names and priorities. v8: Stop creating software rings if no gfx ring existed.
Cc: Christian Koenig <Christian.Koenig@amd.com> Cc: Luben Tuikov <Luben.Tuikov@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Michel Dänzer <michel@daenzer.net> Cc: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
ded946f3 |
| 07-Sep-2022 |
Jiadong.Zhu <Jiadong.Zhu@amd.com> |
drm/amdgpu: Introduce gfx software ring (v9)
The software ring is created to support priority context while there is only one hardware queue for gfx.
Every software ring has its fence driver and co
drm/amdgpu: Introduce gfx software ring (v9)
The software ring is created to support priority context while there is only one hardware queue for gfx.
Every software ring has its fence driver and could be used as an ordinary ring for the GPU scheduler. Multiple software rings are bound to a real ring with the ring muxer. The packages committed on the software ring are copied to the real ring.
v2: Use array to store software ring entry. v3: Remove unnecessary prints. v4: Remove amdgpu_ring_sw_init/fini functions, using gtt for sw ring buffer for later dma copy optimization. v5: Allocate ring entry dynamically in the muxer. v6: Update comments for the ring muxer. v7: Modify for function naming. v8: Combine software ring functions into amdgpu_ring_mux.c v9: Use kernel-doc comment on the get_rptr function.
Cc: Christian Koenig <Christian.Koenig@amd.com> Cc: Luben Tuikov <Luben.Tuikov@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Michel Dänzer <michel@daenzer.net> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|
#
9e225fb9 |
| 18-Jun-2022 |
Andrey Grodzovsky <andrey.grodzovsky@amd.com> |
drm/amdgpu: Prevent race between late signaled fences and GPU reset.
Problem: After we start handling timed out jobs we assume there fences won't be signaled but we cannot be sure and sometimes they
drm/amdgpu: Prevent race between late signaled fences and GPU reset.
Problem: After we start handling timed out jobs we assume there fences won't be signaled but we cannot be sure and sometimes they fire late. We need to prevent concurrent accesses to fence array from amdgpu_fence_driver_clear_job_fences during GPU reset and amdgpu_fence_process from a late EOP interrupt.
Fix: Before accessing fence array in GPU disable EOP interrupt and flush all pending interrupt handlers for amdgpu device's interrupt line.
v2: Switch from irq_get/put to full enable/disable_irq for amdgpu
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
show more ...
|