#
326e30e4 |
| 20-Mar-2024 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: Drop dead code for pvc
PCI IDs for PVC were never added and platform always marked with force_probe. Drop what's not used and rename some places as needed.
The registers not used anymore
drm/i915: Drop dead code for pvc
PCI IDs for PVC were never added and platform always marked with force_probe. Drop what's not used and rename some places as needed.
The registers not used anymore are also removed.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://patchwork.freedesktop.org/patch/msgid/20240320060543.4034215-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
show more ...
|
#
e4ae85e3 |
| 07-Nov-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Add ability for tracking buffer objects per client
In order to show per client memory usage lets add some infrastructure which enables tracking buffer objects owned by clients.
We add a p
drm/i915: Add ability for tracking buffer objects per client
In order to show per client memory usage lets add some infrastructure which enables tracking buffer objects owned by clients.
We add a per client list protected by a new per client lock and to support delayed destruction (post client exit) we make tracked objects hold references to the owning client.
Also, object memory region teardown is moved to the existing RCU free callback to allow safe dereference from the fdinfo RCU read section.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231107101806.608990-1-tvrtko.ursulin@linux.intel.com
show more ...
|
#
d6c531ab |
| 01-Aug-2023 |
Chris Wilson <chris.p.wilson@linux.intel.com> |
drm/i915: Invalidate the TLBs on each GT
With multi-GT devices, the object may have been bound on each GT. Invalidate the TLBs across all GT before releasing the pages back to the system.
Signed-of
drm/i915: Invalidate the TLBs on each GT
With multi-GT devices, the object may have been bound on each GT. Invalidate the TLBs across all GT before releasing the pages back to the system.
Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Fei Yang <fei.yang@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230801141955.383305-4-andi.shyti@linux.intel.com
show more ...
|
#
72e31c0a |
| 27-Jul-2023 |
Jouni Högander <jouni.hogander@intel.com> |
drm/i915: Add macros to get i915 device from i915_gem_object
We want to stop touching directly i915_gem_object struct members in intel_frontbuffer code. As a part of this we add helper macro to get
drm/i915: Add macros to get i915 device from i915_gem_object
We want to stop touching directly i915_gem_object struct members in intel_frontbuffer code. As a part of this we add helper macro to get i915 device from i915_gem_object.
v2: operate on and return pointer in defined macros
Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230727064142.751976-2-jouni.hogander@intel.com
show more ...
|
#
9275277d |
| 09-May-2023 |
Fei Yang <fei.yang@intel.com> |
drm/i915: use pat_index instead of cache_level
Currently the KMD is using enum i915_cache_level to set caching policy for buffer objects. This is flaky because the PAT index which really controls th
drm/i915: use pat_index instead of cache_level
Currently the KMD is using enum i915_cache_level to set caching policy for buffer objects. This is flaky because the PAT index which really controls the caching behavior in PTE has far more levels than what's defined in the enum. In addition, the PAT index is platform dependent, having to translate between i915_cache_level and PAT index is not reliable, and makes the code more complicated.
From UMD's perspective there is also a necessity to set caching policy for performance fine tuning. It's much easier for the UMD to directly use PAT index because the behavior of each PAT index is clearly defined in Bspec. Having the abstracted i915_cache_level sitting in between would only cause more ambiguity. PAT is expected to work much like MOCS already works today, and by design userspace is expected to select the index that exactly matches the desired behavior described in the hardware specification.
For these reasons this patch replaces i915_cache_level with PAT index. Also note, the cache_level is not completely removed yet, because the KMD still has the need of creating buffer objects with simple cache settings such as cached, uncached, or writethrough. For kernel objects, cache_level is used for simplicity and backward compatibility. For Pre-gen12 platforms PAT can have 1:1 mapping to i915_cache_level, so these two are interchangeable. see the use of LEGACY_CACHELEVEL.
One consequence of this change is that gen8_pte_encode is no longer working for gen12 platforms due to the fact that gen12 platforms has different PAT definitions. In the meantime the mtl_pte_encode introduced specfically for MTL becomes generic for all gen12 platforms. This patch renames the MTL PTE encode function into gen12_pte_encode and apply it to all gen12. Even though this change looks unrelated, but separating them would temporarily break gen12 PTE encoding, thus squash them in one patch.
Special note: this patch changes the way caching behavior is controlled in the sense that some objects are left to be managed by userspace. For such objects we need to be careful not to change the userspace settings.There are kerneldoc and comments added around obj->cache_coherent, cache_dirty, and how to bypass the checkings by i915_gem_object_has_cache_level. For full understanding, these changes need to be looked at together with the two follow-up patches, one disables the {set|get}_caching ioctl's and the other adds set_pat extension to the GEM_CREATE uAPI.
Bspec: 63019
Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-3-fei.yang@intel.com
show more ...
|
#
5e352e32 |
| 09-May-2023 |
Fei Yang <fei.yang@intel.com> |
drm/i915: preparation for using PAT index
This patch is a preparation for replacing enum i915_cache_level with PAT index. Caching policy for buffer objects is set through the PAT index in PTE, the o
drm/i915: preparation for using PAT index
This patch is a preparation for replacing enum i915_cache_level with PAT index. Caching policy for buffer objects is set through the PAT index in PTE, the old i915_cache_level is not sufficient to represent all caching modes supported by the hardware.
Preparing the transition by adding some platform dependent data structures and helper functions to translate the cache_level to pat_index.
cachelevel_to_pat: a platform dependent array mapping cache_level to pat_index.
max_pat_index: the maximum PAT index recommended in hardware specification Needed for validating the PAT index passed in from user space.
i915_gem_get_pat_index: function to convert cache_level to PAT index.
obj_to_i915(obj): macro moved to header file for wider usage.
I915_MAX_CACHE_LEVEL: upper bound of i915_cache_level for the convenience of coding.
Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230509165200.1740-2-fei.yang@intel.com
show more ...
|
#
ddb24fc5 |
| 04-Apr-2023 |
Nirmoy Das <nirmoy.das@intel.com> |
drm/i915/ttm: Add I915_BO_PREALLOC
Add a mechanism to preserve existing data when creating a TTM object with the I915_BO_ALLOC_USER flag. This will be used in the subsequent patch where the I915_BO_
drm/i915/ttm: Add I915_BO_PREALLOC
Add a mechanism to preserve existing data when creating a TTM object with the I915_BO_ALLOC_USER flag. This will be used in the subsequent patch where the I915_BO_ALLOC_USER flag will be applied to the framebuffer object. For a pre-allocated framebuffer without the I915_BO_PREALLOC flag.
TTM would clear the content, which is not desirable. Cc: Matthew Auld <matthew.auld@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230404143100.10452-1-nirmoy.das@intel.com
show more ...
|
#
779cb5ba |
| 20-Mar-2023 |
Ville Syrjälä <ville.syrjala@linux.intel.com> |
drm/i915/dpt: Treat the DPT BO as a framebuffer
Currently i915_gem_object_is_framebuffer() doesn't treat the BO containing the framebuffer's DPT as a framebuffer itself. This means eg. that the shri
drm/i915/dpt: Treat the DPT BO as a framebuffer
Currently i915_gem_object_is_framebuffer() doesn't treat the BO containing the framebuffer's DPT as a framebuffer itself. This means eg. that the shrinker can evict the DPT BO while leaving the actual FB BO bound, when the DPT is allocated from regular shmem.
That causes an immediate oops during hibernate as we try to rewrite the PTEs inside the already evicted DPT obj.
TODO: presumably this might also be the reason for the DPT related display faults under heavy memory pressure, but I'm still not sure how that would happen as the object should be pinned by intel_dpt_pin() while in active use by the display engine...
Cc: stable@vger.kernel.org Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Imre Deak <imre.deak@intel.com> Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320090522.9909-2-ville.syrjala@linux.intel.com Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
show more ...
|
#
ad0fca2d |
| 12-Dec-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: consider CCS for backup objects
It seems we can have one or more framebuffers that are still pinned when suspending lmem, in such a case we end up creating a shmem backup object, inste
drm/i915/ttm: consider CCS for backup objects
It seems we can have one or more framebuffers that are still pinned when suspending lmem, in such a case we end up creating a shmem backup object, instead of evicting the object directly, but this will skip copying the CCS aux state, since we don't allocate the extra storage for the CCS pages as part of the ttm_tt construction. Since we can already deal with pinned objects just fine, it doesn't seem too nasty to just extend to support dealing with the CCS aux state, if the object is a pinned framebuffer. This fixes display corruption (like in gnome-shell) seen on DG2 when returning from suspend.
Fixes: da0595ae91da ("drm/i915/migrate: Evict and restore the flatccs capable lmem obj") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: <stable@vger.kernel.org> # v5.19+ Tested-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212171958.82593-2-matthew.auld@intel.com (cherry picked from commit 95df9cc24bee8a09d39c62bcef4319b984814e18) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
show more ...
|
#
a3185f91 |
| 09-May-2022 |
Christian König <christian.koenig@amd.com> |
drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2
Merge and cleanup the two headers into a single description of the object API. Also move all the documentation to the implementation and drop unnec
drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2
Merge and cleanup the two headers into a single description of the object API. Also move all the documentation to the implementation and drop unnecessary includes from the header.
No functional change.
v2: minimal checkpatch.pl cleanup
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-4-christian.koenig@amd.com
show more ...
|
#
695ddc93 |
| 04-Oct-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: allow control over the flags when migrating
In the next patch we want to move the object (if the current resource is not compatible), to the mappable part of lmem for some display buffers.
drm/i915: allow control over the flags when migrating
In the next patch we want to move the object (if the current resource is not compatible), to the mappable part of lmem for some display buffers. Currently that requires being able to unset the I915_BO_ALLOC_GPU_ONLY hint.
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jianshui Yu <jianshui.yu@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221004131916.233474-3-matthew.auld@intel.com (cherry picked from commit 999f4562077208b683f0519e5f1aa1e5c2fd2191) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
show more ...
|
#
ad74457a |
| 13-Sep-2022 |
Anshuman Gupta <anshuman.gupta@intel.com> |
drm/i915/dgfx: Release mmap on rpm suspend
Release all mmap mapping for all lmem objects which are associated with userfault such that, while pcie function in D3hot, any access to memory mappings wi
drm/i915/dgfx: Release mmap on rpm suspend
Release all mmap mapping for all lmem objects which are associated with userfault such that, while pcie function in D3hot, any access to memory mappings will raise a userfault.
Runtime resume the dgpu(when gem object lies in lmem). This will transition the dgpu graphics function to D0 state if it was in D3 in order to access the mmap memory mappings.
v2: - Squashes the patches. [Matt Auld] - Add adequate locking for lmem_userfault_list addition. [Matt Auld] - Reused obj->userfault_count to avoid double addition. [Matt Auld] - Added i915_gem_object_lock to check i915_gem_object_is_lmem. [Matt Auld]
v3: - Use i915_ttm_cpu_maps_iomem. [Matt Auld] - Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld] - Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld] - Delete the mmaped obj from lmem_userfault_list in obj destruction path. [Matt Auld] - Get a wakeref for object destruction patch. [Matt Auld] - Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
v4: - Avoid using mmo offset to get the vma_node. [Matt Auld] - Added comment to use the lmem_userfault_lock. [Matt Auld] - Get lmem_userfault_lock in i915_gem_object_release_mmap_offset. [Matt Auld] - Fixed kernel test robot generated warning.
v5: - Addressed the cosmetics comments. [Andi] - Changed i915_gem_runtime_pm_object_release_mmap_offset() name to i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
PCIe Specs 5.3.1.4.1
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331 Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-3-anshuman.gupta@intel.com
show more ...
|
#
5d36acb7 |
| 27-Jul-2022 |
Chris Wilson <chris.p.wilson@intel.com> |
drm/i915/gt: Batch TLB invalidations
Invalidate TLB in batches, in order to reduce performance regressions.
Currently, every caller performs a full barrier around a TLB invalidation, ignoring all o
drm/i915/gt: Batch TLB invalidations
Invalidate TLB in batches, in order to reduce performance regressions.
Currently, every caller performs a full barrier around a TLB invalidation, ignoring all other invalidations that may have already removed their PTEs from the cache. As this is a synchronous operation and can be quite slow, we cause multiple threads to contend on the TLB invalidate mutex blocking userspace.
We only need to invalidate the TLB once after replacing our PTE to ensure that there is no possible continued access to the physical address before releasing our pages. By tracking a seqno for each full TLB invalidate we can quickly determine if one has been performed since rewriting the PTE, and only if necessary trigger one for ourselves.
That helps to reduce the performance regression introduced by TLB invalidate logic.
[mchehab: rebased to not require moving the code to a separate file]
Cc: stable@vger.kernel.org Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing store") Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Cc: Fei Yang <fei.yang@intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4e97ef5deb6739cadaaf40aa45620547e9c4ec06.1658924372.git.mchehab@kernel.org
show more ...
|
#
bfe53be2 |
| 29-Jun-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: handle blitter failure on DG2
If the move or clear operation somehow fails, and the memory underneath is not cleared, like when moving to lmem, then we currently fallback to memcpy or
drm/i915/ttm: handle blitter failure on DG2
If the move or clear operation somehow fails, and the memory underneath is not cleared, like when moving to lmem, then we currently fallback to memcpy or memset. However with small-BAR systems this fallback might no longer be possible. For now we use the set_wedged sledgehammer if we ever encounter such a scenario, and mark the object as borked to plug any holes where access to the memory underneath can happen. Add some basic selftests to exercise this.
v2: - In the selftests make sure we grab the runtime pm around the reset. Also make sure we grab the reset lock before checking if the device is wedged, since the wedge might still be in-progress and hence the bit might not be set yet. - Don't wedge or put the object into an unknown state, if the request construction fails (or similar). Just returning an error and skipping the fallback should be safe here. - Make sure we wedge each gt. (Thomas) - Peek at the unknown_state in io_reserve, that way we don't have to export or hand roll the fault_wait_for_idle. (Thomas) - Add the missing read-side barriers for the unknown_state. (Thomas) - Some kernel-doc fixes. (Thomas) v3: - Tweak the ordering of the set_wedged, also add FIXME.
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
show more ...
|
#
ecbf2060 |
| 15-Mar-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: wire up the object offset
For the ttm backend we can use existing placements fpfn and lpfn to force the allocator to place the object at the requested offset, potentially evicting stuf
drm/i915/ttm: wire up the object offset
For the ttm backend we can use existing placements fpfn and lpfn to force the allocator to place the object at the requested offset, potentially evicting stuff if the spot is currently occupied.
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220315181425.576828-5-matthew.auld@intel.com
show more ...
|
#
30b9d1b3 |
| 25-Feb-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: add I915_BO_ALLOC_GPU_ONLY
If the user doesn't require CPU access for the buffer, then ALLOC_GPU_ONLY should be used, in order to prioritise allocating in the non-mappable portion of LMEM,
drm/i915: add I915_BO_ALLOC_GPU_ONLY
If the user doesn't require CPU access for the buffer, then ALLOC_GPU_ONLY should be used, in order to prioritise allocating in the non-mappable portion of LMEM, on devices with small BAR.
v2(Thomas): - The BO_ALLOC_TOPDOWN naming here is poor, since this is pure lies on systems that don't even have small BAR. A better name is GPU_ONLY, which is accurate regardless of the configuration.
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-3-matthew.auld@intel.com
show more ...
|
#
7938d615 |
| 19-Oct-2021 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Flush TLBs before releasing backing store
We need to flush TLBs before releasing backing store otherwise userspace is able to encounter stale entries if a) it is not declaring access to ce
drm/i915: Flush TLBs before releasing backing store
We need to flush TLBs before releasing backing store otherwise userspace is able to encounter stale entries if a) it is not declaring access to certain buffers and b) it races with the backing store release from a such undeclared execution already executing on the GPU in parallel.
The approach taken is to mark any buffer objects which were ever bound to the GPU and to trigger a serialized TLB flush when their backing store is released.
Alternatively the flushing could be done on VMA unbind, at which point we would be able to ascertain whether there is potential a parallel GPU execution (which could race), but essentially it boils down to paying the cost of TLB flushes potentially needlessly at VMA unbind time (when the backing store is not known to be going away so not needed for safety), versus potentially needlessly at backing store relase time (since we at that point cannot tell whether there is anything executing on the GPU which uses that object).
Thereforce simplicity of implementation has been chosen for now with scope to benchmark and refine later as required.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Dave Airlie <airlied@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
39a2bd34 |
| 10-Jan-2022 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915: Use the vma resource as argument for gtt binding / unbinding
When introducing asynchronous unbinding, the vma itself may no longer be alive when the actual binding or unbinding takes place
drm/i915: Use the vma resource as argument for gtt binding / unbinding
When introducing asynchronous unbinding, the vma itself may no longer be alive when the actual binding or unbinding takes place.
Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource instead of a struct i915_vma for the bind_vma() and unbind_vma() ops. Similarly change the insert_entries() op for struct i915_address_space.
Replace a couple of i915_vma_snapshot members with their newly introduced i915_vma_resource counterparts, since they have the same lifetime.
Also make sure to avoid changing the struct i915_vma_flags (in particular the bind flags) async. That should now only be done sync under the vm mutex.
v2: - Update the vma_res::bound_flags when binding to the aliased ggtt v6: - Remove I915_VMA_ALLOC_BIT (Matthew Auld) - Change some members of struct i915_vma_resource from unsigned long to u64 (Matthew Auld) v7: - Fix vma resource size parameters to be u64 rather than unsigned long (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
show more ...
|
#
8ee262ba |
| 06-Jan-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages. Importantly this should now handle the ttm swapping case, and all other places that already cal
drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages. Importantly this should now handle the ttm swapping case, and all other places that already call into i915_ttm_move_notify().
v2: fix up the selftest
Fixes: cf3e3e86d779 ("drm/i915: Use ttm mmap handling for ttm bo's.") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-3-matthew.auld@intel.com (cherry picked from commit 903e0387270eef14a711c0feb23b7bf62d2480df) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
show more ...
|
#
903e0387 |
| 06-Jan-2022 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages. Importantly this should now handle the ttm swapping case, and all other places that already cal
drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages. Importantly this should now handle the ttm swapping case, and all other places that already call into i915_ttm_move_notify().
v2: fix up the selftest
Fixes: cf3e3e86d779 ("drm/i915: Use ttm mmap handling for ttm bo's.") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-3-matthew.auld@intel.com
show more ...
|
#
ffa3fe08 |
| 15-Dec-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: clean up shrinker_release_pages
Add some proper flags for the different modes, and shorten the name to something more snappy.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
drm/i915: clean up shrinker_release_pages
Add some proper flags for the different modes, and shorten the name to something more snappy.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211215110746.865-2-matthew.auld@intel.com
show more ...
|
#
93544177 |
| 15-Dec-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915: remove writeback hook
Ditch the writeback hook and drop i915_gem_object_writeback(). We already support the shrinker_release_pages hook which can just call shmem_writeback directly.
Sugge
drm/i915: remove writeback hook
Ditch the writeback hook and drop i915_gem_object_writeback(). We already support the shrinker_release_pages hook which can just call shmem_writeback directly.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211215110746.865-1-matthew.auld@intel.com
show more ...
|
#
004746e4 |
| 22-Nov-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915/ttm: Correctly handle waiting for gpu when shrinking
With async migration, the shrinker may end up wanting to release the pages of an object while the migration blit is still running, since
drm/i915/ttm: Correctly handle waiting for gpu when shrinking
With async migration, the shrinker may end up wanting to release the pages of an object while the migration blit is still running, since the GT migration code doesn't set up VMAs and the shrinker is thus oblivious to the fact that the GPU is still using the pages.
Add waiting for gpu in the shrinker_release_pages() op and an argument to that function indicating whether the shrinker expects it to not wait for gpu. In the latter case the shrinker_release_pages() op will return -EBUSY if the object is not idle.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-5-thomas.hellstrom@linux.intel.com
show more ...
|
#
cad7109a |
| 01-Nov-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915: Introduce refcounted sg-tables
As we start to introduce asynchronous failsafe object migration, where we update the object state and then submit asynchronous commands we need to record wha
drm/i915: Introduce refcounted sg-tables
As we start to introduce asynchronous failsafe object migration, where we update the object state and then submit asynchronous commands we need to record what memory resources are actually used by various part of the command stream. Initially for three purposes:
1) Error capture. 2) Asynchronous migration error recovery. 3) Asynchronous vma bind.
At the time where these happens, the object state may have been updated to be several migrations ahead and object sg-tables discarded.
In order to make it possible to keep sg-tables with memory resource information for these operations, introduce refcounted sg-tables that aren't freed until the last user is done with them.
The alternative would be to reference information sitting on the corresponding ttm_resources which typically have the same lifetime as these refcountes sg_tables, but that leads to other awkward constructs: Due to the design direction chosen for ttm resource managers that would lead to diamond-style inheritance, the LMEM resources may sometimes be prematurely freed, and finally the subclassed struct ttm_resource would have to bleed into the asynchronous vma bind code.
v3: - Address a number of style issues (Matthew Auld) v4: - Dont check for st->sgl being NULL in i915_ttm_tt__shmem_unpopulate(), that should never happen. (Matthew Auld) v5: - Fix a Potential double-free (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211101122444.114607-1-thomas.hellstrom@linux.intel.com
show more ...
|
#
ebd4a8ec |
| 18-Oct-2021 |
Matthew Auld <matthew.auld@intel.com> |
drm/i915/ttm: move shrinker management into adjust_lru
We currently just evict lmem objects to system memory when under memory pressure. For this case we might lack the usual object mm.pages, which
drm/i915/ttm: move shrinker management into adjust_lru
We currently just evict lmem objects to system memory when under memory pressure. For this case we might lack the usual object mm.pages, which effectively hides the pages from the i915-gem shrinker, until we actually "attach" the TT to the object, or in the case of lmem-only objects it just gets migrated back to lmem when touched again.
For all cases we can just adjust the i915 shrinker LRU each time we also adjust the TTM LRU. The two cases we care about are:
1) When something is moved by TTM, including when initially populating an object. Importantly this covers the case where TTM moves something from lmem <-> smem, outside of the normal get_pages() interface, which should still ensure the shmem pages underneath are reclaimable.
2) When calling into i915_gem_object_unlock(). The unlock should ensure the object is removed from the shinker LRU, if it was indeed swapped out, or just purged, when the shrinker drops the object lock.
v2(Thomas): - Handle managing the shrinker LRU in adjust_lru, where it is always safe to touch the object. v3(Thomas): - Pretty much a re-write. This time piggy back off the shrink_pin stuff, which actually seems to fit quite well for what we want here. v4(Thomas): - Just use a simple boolean for tracking ttm_shrinkable. v5: - Ensure we call adjust_lru when faulting the object, to ensure the pages are visible to the shrinker, if needed. - Add back the adjust_lru when in i915_ttm_move (Thomas) v6(Reported-by: kernel test robot <lkp@intel.com>): - Remove unused i915_tt
Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #v4 Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-6-matthew.auld@intel.com
show more ...
|