fd7726e7 | 02-Apr-2024 |
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> |
accel/ivpu: Fix deadlock in context_xa
ivpu_device->context_xa is locked both in kernel thread and IRQ context. It requires the XA_FLAGS_LOCK_IRQ flag to be passed during initialization; otherwise the lock could be acquired from a thread and then interrupted by an IRQ that takes it a second time, causing a deadlock.
This deadlock was reported by lockdep and observed in internal tests.
Fixes: 35b137630f08 ("accel/ivpu: Introduce a new DRM driver for Intel VPU")
Cc: <stable@vger.kernel.org> # v6.3+
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-9-jacek.lawrynowicz@linux.intel.com
|
0d298e23 | 02-Apr-2024 |
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> |
accel/ivpu: Fix missed error message after VPU rename
Change "VPU" to "NPU" in ivpu_suspend() so it matches all other error messages.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-8-jacek.lawrynowicz@linux.intel.com
|
c52c35e5 | 02-Apr-2024 |
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> |
accel/ivpu: Return max freq for DRM_IVPU_PARAM_CORE_CLOCK_RATE
DRM_IVPU_PARAM_CORE_CLOCK_RATE returns the current NPU frequency, which could be 0 if the device was sleeping. This value isn't really useful to user space, so return the max frequency instead, which can be used to estimate NPU performance.
Fixes: c39dc15191c4 ("accel/ivpu: Read clock rate only if device is up")
Cc: <stable@vger.kernel.org> # v6.7
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-7-jacek.lawrynowicz@linux.intel.com
|
3556f922 | 02-Apr-2024 |
Wachowski, Karol <karol.wachowski@intel.com> |
accel/ivpu: Improve clarity of MMU error messages
This patch improves readability and clarity of MMU error messages. Previously, the error strings were somewhat confusing and could lead to ambiguous interpretations, making it difficult to diagnose issues.
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-6-jacek.lawrynowicz@linux.intel.com
|
875bc9cd | 02-Apr-2024 |
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> |
accel/ivpu: Put NPU back to D3hot after failed resume
Put the NPU in D3hot after ivpu_resume() fails to power up the device. This ensures that a D3->D0 power cycle is performed before the next resume and also minimizes power usage in this corner case.
Fixes: 28083ff18d3f ("accel/ivpu: Fix DevTLB errors on suspend/resume and recovery")
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-5-jacek.lawrynowicz@linux.intel.com
|
3534eacb | 02-Apr-2024 |
Wachowski, Karol <karol.wachowski@intel.com> |
accel/ivpu: Fix PCI D0 state entry in resume
After a failed power up the device is left in PCI D3hot state, making it impossible to access NPU registers on retry. Enter D0 state on retry before proceeding with the power up sequence.
Fixes: 28083ff18d3f ("accel/ivpu: Fix DevTLB errors on suspend/resume and recovery")
Cc: <stable@vger.kernel.org> # v6.8+
Signed-off-by: Wachowski, Karol <karol.wachowski@intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-4-jacek.lawrynowicz@linux.intel.com
|
e3caadf1 | 02-Apr-2024 |
Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> |
accel/ivpu: Remove d3hot_after_power_off WA
Always enter D3hot after entering D0i3 on all platforms. This minimizes power usage.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240402104929.941186-3-jacek.lawrynowicz@linux.intel.com
|
576d7cc5 | 19-Feb-2024 |
Ricardo B. Marliere <ricardo@marliere.net> |
accel: constify the struct device_type usage
Since commit aed65af1cc2f ("drivers: make device_type const"), the driver core can properly handle constant struct device_type. Move the accel_sysfs_device_minor variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
fa58b594 | 12-Feb-2024 |
Ofir Bitton <obitton@habana.ai> |
accel/habanalabs: modify pci health check
Today we read PCI VENDOR-ID in order to make sure PCI link is healthy. Apparently the VENDOR-ID might be stored on host and hence, when we read it we might not access the PCI bus. In order to make sure PCI health check is reliable, we will start checking the DEVICE-ID instead.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
c5170683 | 30-Jan-2024 |
Tomer Tayar <ttayar@habana.ai> |
accel/habanalabs: keep explicit size of reserved memory for FW
The reserved memory for FW is currently saved in an ASIC property in units of MB, just like the value that comes from FW. Besides the unit not being clear from the property's name, this also means that a conversion to the actual size is required everywhere it is used. Modify the property to hold the size in bytes.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
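The unit change described in this commit can be illustrated with a minimal userspace sketch. It is not the driver's code: the struct, the field name, the helper names, and the BYTES_PER_MB constant are all hypothetical, and it assumes that 1 MB means 2^20 bytes. The point is only that converting once, when the FW value is received, lets every later use work directly in bytes.

```c
#include <stdint.h>

/* Hypothetical constant: assumes 1 MB = 2^20 bytes. */
#define BYTES_PER_MB (1024ULL * 1024ULL)

/* Hypothetical property holder; the field name is illustrative only. */
struct asic_props {
	uint64_t fw_reserved_size; /* in bytes, converted once at init */
};

/* Convert the MB value reported by FW a single time and store bytes. */
static void set_fw_reserved(struct asic_props *p, uint32_t fw_value_mb)
{
	p->fw_reserved_size = (uint64_t)fw_value_mb * BYTES_PER_MB;
}

/* Callers use the stored byte count directly, with no unit conversion. */
static uint64_t usable_size(const struct asic_props *p, uint64_t total)
{
	return total > p->fw_reserved_size ? total - p->fw_reserved_size : 0;
}
```

With the value kept in bytes, a call site such as usable_size() has no multiplication to forget or get wrong.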
|
db45bbdd | 29-Jan-2024 |
Tomer Tayar <ttayar@habana.ai> |
accel/habanalabs: handle reserved memory request when working with full FW
Currently the reserved memory request from FW is handled when running with preboot only, but this request is relevant also when running with full FW. Modify to always handle this reservation request.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
5b6658eb | 06-Feb-2024 |
Ofir Bitton <obitton@habana.ai> |
accel/habanalabs/hwmon: rate limit errors user can generate
Fetching sensor data can fail due to various reasons. In order not to pollute the kernel log, those error prints must be rate limited.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
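The idea of rate limiting error prints can be sketched in userspace as an interval/burst limiter. This is only a model of the concept, not the code from this commit; in the kernel, helpers such as dev_err_ratelimited() provide this behavior. The struct and function names below are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit {
	uint64_t interval_ms;  /* length of one window */
	unsigned int burst;    /* messages allowed per window */
	uint64_t window_start; /* start time of the current window */
	unsigned int printed;  /* messages emitted in the current window */
	unsigned int missed;   /* messages suppressed in the current window */
};

/* Return true if a message may be emitted at time now_ms. */
static bool ratelimit_allow(struct ratelimit *rl, uint64_t now_ms)
{
	if (now_ms - rl->window_start >= rl->interval_ms) {
		/* Window expired: start a new one and reset the counters. */
		rl->window_start = now_ms;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	/* Over budget for this window: count the message and drop it. */
	rl->missed++;
	return false;
}
```

A caller would wrap each error print in `if (ratelimit_allow(&rl, now))`, so a user who repeatedly triggers a failing sensor read produces at most `burst` log lines per window.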
|
3bf6ef98 | 05-Feb-2024 |
Ofir Bitton <obitton@habana.ai> |
accel/habanalabs/gaudi2: drain event lacks rd/wr indication
Due to a H/W issue, AXI drain event does not include a read/write indication, hence we remove this print.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
fd8d2fa0 | 05-Feb-2024 |
Dani Liberman <dliberman@habana.ai> |
accel/habanalabs: fix error print
The unmasking applies to an event, which can be an event other than RAZWI.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
c8c062e9 | 31-Jan-2024 |
Tal Risin <trisin@habana.ai> |
accel/habanalabs: initialize maybe-uninitialized variables
Prevent static analysis warning.
Signed-off-by: Tal Risin <trisin@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
0b105a2a | 16-Jan-2024 |
Avri Kehat <akehat@habana.ai> |
accel/habanalabs: fix debugfs files permissions
debugfs files are created with permissions that don't align with the access requirements.
Signed-off-by: Avri Kehat <akehat@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
c1e89ae4 | 18-Jan-2024 |
Tomer Tayar <ttayar@habana.ai> |
accel/habanalabs/gaudi2: check extended errors according to PCIe addr_dec interrupt info
The FW interrupt info for a PCIe addr_dec event is set correctly, so check for either global errors or razwi according to the indications there.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
7159813c | 18-Jan-2024 |
Tomer Tayar <ttayar@habana.ai> |
accel/habanalabs: modify print for skip loading linux FW to debug log
On the currently supported ASICs, skipping the load of a Linux FW image into the device is done for test purposes only. Moreover, for future supported ASICs it is possible that there won't be a need to load such an image. The print is therefore not needed in most cases, so replace the dev_info() call with dev_dbg().
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
9e263c50 | 20-Jan-2024 |
Erick Archer <erick.archer@gmx.com> |
accel/habanalabs: use kcalloc() instead of kzalloc()
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors.
So, use the purpose specific kcalloc() function instead of the argument size * count in the kzalloc() function.
Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1]
Link: https://github.com/KSPP/linux/issues/162
Signed-off-by: Erick Archer <erick.archer@gmx.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
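The overflow risk described in this commit can be demonstrated with a small userspace sketch. It does not use the kernel allocators: open_coded_bytes() stands in for the `count * size` argument passed to kzalloc(), and checked_zalloc_array() is a hypothetical stand-in for the overflow check that an array allocator like kcalloc() performs before allocating. Both names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in for the open-coded pattern: the product is computed in size_t
 * and silently wraps when it does not fit, so the allocator would be asked
 * for fewer bytes than the caller intended.
 */
static size_t open_coded_bytes(size_t count, size_t size)
{
	return count * size;
}

/*
 * Stand-in for an array allocator: reject the request when count * size
 * would overflow size_t, otherwise return zero-initialized memory.
 */
static void *checked_zalloc_array(size_t count, size_t size)
{
	if (size != 0 && count > SIZE_MAX / size)
		return NULL;
	return calloc(count, size);
}
```

For example, with `count = SIZE_MAX / 2 + 1` and `size = 2` the open-coded product wraps to 0, while the checked variant returns NULL and the caller can fail cleanly.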
|
5ae8b6b7 | 06-Jan-2024 |
Colin Ian King <colin.i.king@intel.com> |
accel/habanalabs/goya: remove redundant assignment to pointer 'input'
The pointer input is assigned a value that is never read; it is re-assigned later with the same value. Resolve this by moving the declaration of input into the if block.
Cleans up clang scan build warning: warning: Value stored to 'input' during its initialization is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@intel.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
01f8cd0f | 02-Jan-2024 |
Tomer Tayar <ttayar@habana.ai> |
accel/habanalabs/gaudi2: fail memory memset when failing to copy QM packet to device
gaudi2_memset_memory_chunk_using_edma_qm() calls the access_dev_mem() ASIC function, but ignores its return value. Add this missing check.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|
731d320e | 01-Jan-2024 |
Dani Liberman <dliberman@habana.ai> |
accel/habanalabs: remove call to deprecated function
In newer kernel versions, irq_set_affinity_hint() is deprecated. Instead, use the newer version which is irq_set_affinity_and_hint().
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
|