#
fb4e2b70 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix memory leak when canceling rehash work
The rehash delayed work is rescheduled with a delay if the number of credits at end of the work is not negative as supposedly it
mlxsw: spectrum_acl_tcam: Fix memory leak when canceling rehash work
The rehash delayed work is rescheduled with a delay if the number of credits at end of the work is not negative as supposedly it means that the migration ended. Otherwise, it is rescheduled immediately.
After "mlxsw: spectrum_acl_tcam: Fix possible use-after-free during rehash" the above is no longer accurate as a non-negative number of credits is no longer indicative of the migration being done. It can also happen if the work encountered an error in which case the migration will resume the next time the work is scheduled.
The significance of the above is that it is possible for the work to be pending and associated with hints that were allocated when the migration started. This leads to the hints being leaked [1] when the work is canceled while pending as part of ACL region dismantle.
Fix by freeing the hints if hints are associated with a work that was canceled while pending.
Blame the original commit since the reliance on not having a pending work associated with hints is fragile.
[1] unreferenced object 0xffff88810e7c3000 (size 256): comm "kworker/0:16", pid 176, jiffies 4295460353 hex dump (first 32 bytes): 00 30 95 11 81 88 ff ff 61 00 00 00 00 00 00 80 .0......a....... 00 00 61 00 40 00 00 00 00 00 00 00 04 00 00 00 ..a.@........... backtrace (crc 2544ddb9): [<00000000cf8cfab3>] kmalloc_trace+0x23f/0x2a0 [<000000004d9a1ad9>] objagg_hints_get+0x42/0x390 [<000000000b143cf3>] mlxsw_sp_acl_erp_rehash_hints_get+0xca/0x400 [<0000000059bdb60a>] mlxsw_sp_acl_tcam_vregion_rehash_work+0x868/0x1160 [<00000000e81fd734>] process_one_work+0x59c/0xf20 [<00000000ceee9e81>] worker_thread+0x799/0x12c0 [<00000000bda6fe39>] kthread+0x246/0x300 [<0000000070056d23>] ret_from_fork+0x34/0x70 [<00000000dea2b93e>] ret_from_fork_asm+0x1a/0x30
Fixes: c9c9af91f1d9 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/0cc12ebb07c4d4c41a1265ee2c28b392ff997a86.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
b377add0 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix incorrect list API usage
Both the function that migrates all the chunks within a region and the function that migrates all the entries within a chunk call list_first_en
mlxsw: spectrum_acl_tcam: Fix incorrect list API usage
Both the function that migrates all the chunks within a region and the function that migrates all the entries within a chunk call list_first_entry() on the respective lists without checking that the lists are not empty. This is incorrect usage of the API, which leads to the following warning [1].
Fix by returning if the lists are empty as there is nothing to migrate in this case.
[1] WARNING: CPU: 0 PID: 6437 at drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:1266 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0> Modules linked in: CPU: 0 PID: 6437 Comm: kworker/0:37 Not tainted 6.9.0-rc3-custom-00883-g94a65f079ef6 #39 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0x2c0 [...] Call Trace: <TASK> mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x4a0 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 ret_from_fork_asm+0x1a/0x30 </TASK>
Fixes: 6f9579d4e302 ("mlxsw: spectrum_acl: Remember where to continue rehash migration") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/4628e9a22d1d84818e28310abbbc498e7bc31bc9.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
743edc85 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix warning during rehash
As previously explained, the rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the f
mlxsw: spectrum_acl_tcam: Fix warning during rehash
As previously explained, the rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same priority) in the region and in each chunk iterating over all the filters.
When the work runs out of credits it stores the current chunk and entry as markers in the per-work context so that it would know where to resume the migration from the next time the work is scheduled.
Upon error, the chunk marker is reset to NULL, but without resetting the entry markers despite being relative to it. This can result in migration being resumed from an entry that does not belong to the chunk being migrated. In turn, this will eventually lead to a chunk being iterated over as if it is an entry. Because of how the two structures happen to be defined, this does not lead to KASAN splats, but to warnings such as [1].
Fix by creating a helper that resets all the markers and call it from all the places the currently only reset the chunk marker. For good measures also call it when starting a completely new rehash. Add a warning to avoid future cases.
[1] WARNING: CPU: 7 PID: 1076 at drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:407 mlxsw_afk_encode+0x242/0x2f0 Modules linked in: CPU: 7 PID: 1076 Comm: kworker/7:24 Tainted: G W 6.9.0-rc3-custom-00880-g29e61d91b77b #29 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:mlxsw_afk_encode+0x242/0x2f0 [...] Call Trace: <TASK> mlxsw_sp_acl_atcam_entry_add+0xd9/0x3c0 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x109/0x290 mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x470 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 </TASK>
Fixes: 6f9579d4e302 ("mlxsw: spectrum_acl: Remember where to continue rehash migration") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/cc17eed86b41dd829d39b07906fec074a9ce580e.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
8ca3f7a7 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix memory leak during rehash
The rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same
mlxsw: spectrum_acl_tcam: Fix memory leak during rehash
The rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same priority) in the region and in each chunk iterating over all the filters.
If the migration fails, the code tries to migrate the filters back to the old region. However, the rollback itself can also fail in which case another migration will be erroneously performed. Besides the fact that this ping pong is not a very good idea, it also creates a problem.
Each virtual chunk references two chunks: The currently used one ('vchunk->chunk') and a backup ('vchunk->chunk2'). During migration the first holds the chunk we want to migrate filters to and the second holds the chunk we are migrating filters from.
The code currently assumes - but does not verify - that the backup chunk does not exist (NULL) if the currently used chunk does not reference the target region. This assumption breaks when we are trying to rollback a rollback, resulting in the backup chunk being overwritten and leaked [1].
Fix by not rolling back a failed rollback and add a warning to avoid future cases.
[1] WARNING: CPU: 5 PID: 1063 at lib/parman.c:291 parman_destroy+0x17/0x20 Modules linked in: CPU: 5 PID: 1063 Comm: kworker/5:11 Tainted: G W 6.9.0-rc2-custom-00784-gc6a05c468a0b #14 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:parman_destroy+0x17/0x20 [...] Call Trace: <TASK> mlxsw_sp_acl_atcam_region_fini+0x19/0x60 mlxsw_sp_acl_tcam_region_destroy+0x49/0xf0 mlxsw_sp_acl_tcam_vregion_rehash_work+0x1f1/0x470 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 ret_from_fork_asm+0x1a/0x30 </TASK>
Fixes: 843500518509 ("mlxsw: spectrum_acl: Do rollback as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/d5edd4f4503934186ae5cfe268503b16345b4e0f.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
5bcf9255 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Rate limit error message
In the rare cases when the device resources are exhausted it is likely that the rehash delayed work will fail. An error message will be printed whe
mlxsw: spectrum_acl_tcam: Rate limit error message
In the rare cases when the device resources are exhausted it is likely that the rehash delayed work will fail. An error message will be printed whenever this happens which can be overwhelming considering the fact that the work is per-region and that there can be hundreds of regions.
Fix by rate limiting the error message.
Fixes: e5e7962ee5c2 ("mlxsw: spectrum_acl: Implement region migration according to hints") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/c510763b2ebd25e7990d80183feff91cde593145.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
54225988 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix possible use-after-free during rehash
The rehash delayed work migrates filters from one region to another according to the number of available credits.
The migrated fr
mlxsw: spectrum_acl_tcam: Fix possible use-after-free during rehash
The rehash delayed work migrates filters from one region to another according to the number of available credits.
The migrated from region is destroyed at the end of the work if the number of credits is non-negative as the assumption is that this is indicative of migration being complete. This assumption is incorrect as a non-negative number of credits can also be the result of a failed migration.
The destruction of a region that still has filters referencing it can result in a use-after-free [1].
Fix by not destroying the region if migration failed.
[1] BUG: KASAN: slab-use-after-free in mlxsw_sp_acl_ctcam_region_entry_remove+0x21d/0x230 Read of size 8 at addr ffff8881735319e8 by task kworker/0:31/3858
CPU: 0 PID: 3858 Comm: kworker/0:31 Tainted: G W 6.9.0-rc2-custom-00782-gf2275c2157d8 #5 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xce/0x670 kasan_report+0xd7/0x110 mlxsw_sp_acl_ctcam_region_entry_remove+0x21d/0x230 mlxsw_sp_acl_ctcam_entry_del+0x2e/0x70 mlxsw_sp_acl_atcam_entry_del+0x81/0x210 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x3cd/0xb50 mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 </TASK>
Allocated by task 174: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 __kmalloc+0x19c/0x360 mlxsw_sp_acl_tcam_region_create+0xdf/0x9c0 mlxsw_sp_acl_tcam_vregion_rehash_work+0x954/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30
Freed by task 7: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 poison_slab_object+0x102/0x170 __kasan_slab_free+0x14/0x30 kfree+0xc1/0x290 mlxsw_sp_acl_tcam_region_destroy+0x272/0x310 mlxsw_sp_acl_tcam_vregion_rehash_work+0x731/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30
Fixes: c9c9af91f1d9 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/3e412b5659ec2310c5c615760dfe5eac18dd7ebd.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
79b5b4b1 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix possible use-after-free during activity update
The rule activity update delayed work periodically traverses the list of configured rules and queries their activity from
mlxsw: spectrum_acl_tcam: Fix possible use-after-free during activity update
The rule activity update delayed work periodically traverses the list of configured rules and queries their activity from the device.
As part of this task it accesses the entry pointed by 'ventry->entry', but this entry can be changed concurrently by the rehash delayed work, leading to a use-after-free [1].
Fix by closing the race and perform the activity query under the 'vregion->lock' mutex.
[1] BUG: KASAN: slab-use-after-free in mlxsw_sp_acl_tcam_flower_rule_activity_get+0x121/0x140 Read of size 8 at addr ffff8881054ed808 by task kworker/0:18/181
CPU: 0 PID: 181 Comm: kworker/0:18 Not tainted 6.9.0-rc2-custom-00781-gd5ab772d32f7 #2 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_rule_activity_update_work Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xce/0x670 kasan_report+0xd7/0x110 mlxsw_sp_acl_tcam_flower_rule_activity_get+0x121/0x140 mlxsw_sp_acl_rule_activity_update_work+0x219/0x400 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 </TASK>
Allocated by task 1039: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 __kmalloc+0x19c/0x360 mlxsw_sp_acl_tcam_entry_create+0x7b/0x1f0 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x30d/0xb50 mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30
Freed by task 1039: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 poison_slab_object+0x102/0x170 __kasan_slab_free+0x14/0x30 kfree+0xc1/0x290 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x3d7/0xb50 mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30
Fixes: 2bffc5322fd8 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/1fcce0a60b231ebeb2515d91022284ba7b4ffe7a.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
d90cfe20 |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix race during rehash delayed work
The purpose of the rehash delayed work is to reduce the number of masks (eRPs) used by an ACL region as the eRP bank is a global and lim
mlxsw: spectrum_acl_tcam: Fix race during rehash delayed work
The purpose of the rehash delayed work is to reduce the number of masks (eRPs) used by an ACL region as the eRP bank is a global and limited resource.
This is done in three steps:
1. Creating a new set of masks and a new ACL region which will use the new masks and to which the existing filters will be migrated to. The new region is assigned to 'vregion->region' and the region from which the filters are migrated from is assigned to 'vregion->region2'.
2. Migrating all the filters from the old region to the new region.
3. Destroying the old region and setting 'vregion->region2' to NULL.
Only the second steps is performed under the 'vregion->lock' mutex although its comments says that among other things it "Protects consistency of region, region2 pointers".
This is problematic as the first step can race with filter insertion from user space that uses 'vregion->region', but under the mutex.
Fix by holding the mutex across the entirety of the delayed work and not only during the second step.
Fixes: 2bffc5322fd8 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/1ec1d54edf2bad0a369e6b4fa030aba64e1f124b.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
627f9c1b |
| 22-Apr-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix race in region ID allocation
Region identifiers can be allocated both when user space tries to insert a new tc filter and when filters are migrated from one region to a
mlxsw: spectrum_acl_tcam: Fix race in region ID allocation
Region identifiers can be allocated both when user space tries to insert a new tc filter and when filters are migrated from one region to another as part of the rehash delayed work.
There is no lock protecting the bitmap from which these identifiers are allocated from, which is racy and leads to bad parameter errors from the device's firmware.
Fix by converting the bitmap to IDA which handles its own locking. For consistency, do the same for the group identifiers that are part of the same structure.
Fixes: 2bffc5322fd8 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()") Reported-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/ce494b7940cadfe84f3e18da7785b51ef5f776e3.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
5625ca56 |
| 19-Apr-2024 |
Mateusz Polchlopek <mateusz.polchlopek@intel.com> |
devlink: extend devlink_param *set pointer
Extend devlink_param *set function pointer to take extack as a param. Sometimes it is needed to pass information to the end user from set function. It is m
devlink: extend devlink_param *set pointer
Extend devlink_param *set function pointer to take extack as a param. Sometimes it is needed to pass information to the end user from set function. It is more proper to use for that netlink instead of passing message to dmesg.
Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
show more ...
|
#
1267f722 |
| 26-Jan-2024 |
Amit Cohen <amcohen@nvidia.com> |
mlxsw: Use refcount_t for reference counting
mlxsw driver uses 'unsigned int' for reference counters in several structures. Instead, use refcount_t type which allows us to catch overflow and underfl
mlxsw: Use refcount_t for reference counting
mlxsw driver uses 'unsigned int' for reference counters in several structures. Instead, use refcount_t type which allows us to catch overflow and underflow issues. Change the type of the counters and use the appropriate API.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
show more ...
|
#
483ae90d |
| 17-Jan-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix stack corruption
When tc filters are first added to a net device, the corresponding local port gets bound to an ACL group in the device. The group contains a list of AC
mlxsw: spectrum_acl_tcam: Fix stack corruption
When tc filters are first added to a net device, the corresponding local port gets bound to an ACL group in the device. The group contains a list of ACLs. In turn, each ACL points to a different TCAM region where the filters are stored. During forwarding, the ACLs are sequentially evaluated until a match is found.
One reason to place filters in different regions is when they are added with decreasing priorities and in an alternating order so that two consecutive filters can never fit in the same region because of their key usage.
In Spectrum-2 and newer ASICs the firmware started to report that the maximum number of ACLs in a group is more than 16, but the layout of the register that configures ACL groups (PAGT) was not updated to account for that. It is therefore possible to hit stack corruption [1] in the rare case where more than 16 ACLs in a group are required.
Fix by limiting the maximum ACL group size to the minimum between what the firmware reports and the maximum ACLs that fit in the PAGT register.
Add a test case to make sure the machine does not crash when this condition is hit.
[1] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: mlxsw_sp_acl_tcam_group_update+0x116/0x120 [...] dump_stack_lvl+0x36/0x50 panic+0x305/0x330 __stack_chk_fail+0x15/0x20 mlxsw_sp_acl_tcam_group_update+0x116/0x120 mlxsw_sp_acl_tcam_group_region_attach+0x69/0x110 mlxsw_sp_acl_tcam_vchunk_get+0x492/0xa20 mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0 mlxsw_sp_acl_rule_add+0x47/0x240 mlxsw_sp_flower_replace+0x1a9/0x1d0 tc_setup_cb_add+0xdc/0x1c0 fl_hw_replace_filter+0x146/0x1f0 fl_change+0xc17/0x1360 tc_new_tfilter+0x472/0xb90 rtnetlink_rcv_msg+0x313/0x3b0 netlink_rcv_skb+0x58/0x100 netlink_unicast+0x244/0x390 netlink_sendmsg+0x1e4/0x440 ____sys_sendmsg+0x164/0x260 ___sys_sendmsg+0x9a/0xe0 __sys_sendmsg+0x7a/0xc0 do_syscall_64+0x40/0xe0 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC") Reported-by: Orel Hagag <orelh@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/2d91c89afba59c22587b444994ae419dbea8d876.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
efeb7dfe |
| 17-Jan-2024 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
When calling mlxsw_sp_acl_tcam_region_destroy() from an error path after failing to attach the region to an ACL group, we hit a N
mlxsw: spectrum_acl_tcam: Fix NULL pointer dereference in error path
When calling mlxsw_sp_acl_tcam_region_destroy() from an error path after failing to attach the region to an ACL group, we hit a NULL pointer dereference upon 'region->group->tcam' [1].
Fix by retrieving the 'tcam' pointer using mlxsw_sp_acl_to_tcam().
[1] BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] RIP: 0010:mlxsw_sp_acl_tcam_region_destroy+0xa0/0xd0 [...] Call Trace: mlxsw_sp_acl_tcam_vchunk_get+0x88b/0xa20 mlxsw_sp_acl_tcam_ventry_add+0x25/0xe0 mlxsw_sp_acl_rule_add+0x47/0x240 mlxsw_sp_flower_replace+0x1a9/0x1d0 tc_setup_cb_add+0xdc/0x1c0 fl_hw_replace_filter+0x146/0x1f0 fl_change+0xc17/0x1360 tc_new_tfilter+0x472/0xb90 rtnetlink_rcv_msg+0x313/0x3b0 netlink_rcv_skb+0x58/0x100 netlink_unicast+0x244/0x390 netlink_sendmsg+0x1e4/0x440 ____sys_sendmsg+0x164/0x260 ___sys_sendmsg+0x9a/0xe0 __sys_sendmsg+0x7a/0xc0 do_syscall_64+0x40/0xe0 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Fixes: 22a677661f56 ("mlxsw: spectrum: Introduce ACL core with simple TCAM implementation") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/fb6a4542bbc9fcab5a523802d97059bffbca7126.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
74cbc3c0 |
| 06-Feb-2023 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Move devlink param to TCAM code
Cited commit added 'DEVLINK_CMD_PARAM_DEL' notifications whenever the network namespace of the devlink instance is changed. Specifically, th
mlxsw: spectrum_acl_tcam: Move devlink param to TCAM code
Cited commit added 'DEVLINK_CMD_PARAM_DEL' notifications whenever the network namespace of the devlink instance is changed. Specifically, the notifications are generated after calling reload_down(), but before calling reload_up(). At this stage, the data structures accessed while reading the value of the "acl_region_rehash_interval" devlink parameter are uninitialized, resulting in a use-after-free [1].
Fix by moving the registration and unregistration of the devlink parameter to the TCAM code where it is actually used. This means that the parameter is unregistered during reload_down() and then re-registered during reload_up(), avoiding the use-after-free between these two operations.
Reproducer:
# ip netns add test123 # devlink dev reload pci/0000:06:00.0 netns test123
[1] BUG: KASAN: use-after-free in mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0 Read of size 4 at addr ffff888162fd37d8 by task devlink/1323 [...] Call Trace: <TASK> dump_stack_lvl+0x95/0xbd print_report+0x181/0x4a1 kasan_report+0xdb/0x200 mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0 mlxsw_sp_params_acl_region_rehash_intrvl_get+0x32/0x80 devlink_nl_param_fill.constprop.0+0x29a/0x11e0 devlink_param_notify.constprop.0+0xb9/0x250 devlink_notify_unregister+0xbc/0x470 devlink_reload+0x1aa/0x440 devlink_nl_cmd_reload+0x559/0x11b0 genl_family_rcv_msg_doit.isra.0+0x1f8/0x2e0 genl_rcv_msg+0x558/0x7f0 netlink_rcv_skb+0x170/0x440 genl_rcv+0x2d/0x40 netlink_unicast+0x53f/0x810 netlink_sendmsg+0x961/0xe80 __sys_sendto+0x2a4/0x420 __x64_sys_sendto+0xe5/0x1c0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 7d7e9169a3ec ("devlink: move devlink reload notifications back in between _down() and _up() calls") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
194ab947 |
| 06-Feb-2023 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Reorder functions to avoid forward declarations
Move the initialization and de-initialization code further below in order to avoid forward declarations in the next patch. N
mlxsw: spectrum_acl_tcam: Reorder functions to avoid forward declarations
Move the initialization and de-initialization code further below in order to avoid forward declarations in the next patch. No functional changes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
61fe3b91 |
| 06-Feb-2023 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Make fini symmetric to init
Move mutex_destroy() to the end to make the function symmetric with mlxsw_sp_acl_tcam_init(). No functional changes.
Signed-off-by: Ido Schimme
mlxsw: spectrum_acl_tcam: Make fini symmetric to init
Move mutex_destroy() to the end to make the function symmetric with mlxsw_sp_acl_tcam_init(). No functional changes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
65823e07 |
| 06-Feb-2023 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl_tcam: Add missing mutex_destroy()
Pair mutex_init() with a mutex_destroy() in the error path. Found during code review. No functional changes.
Signed-off-by: Ido Schimmel <idosc
mlxsw: spectrum_acl_tcam: Add missing mutex_destroy()
Pair mutex_init() with a mutex_destroy() in the error path. Found during code review. No functional changes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
d1314096 |
| 04-May-2022 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl: Do not report activity for multicast routes
The driver periodically queries the device for activity of ACL rules in order to report it to tc upon 'FLOW_CLS_STATS'.
In Spectrum-
mlxsw: spectrum_acl: Do not report activity for multicast routes
The driver periodically queries the device for activity of ACL rules in order to report it to tc upon 'FLOW_CLS_STATS'.
In Spectrum-2 and later ASICs, multicast routes are programmed as ACL rules, but unlike rules installed by tc, their activity is of no interest.
Avoid unnecessary activity query for such rules by always reporting them as inactive.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
2c087dfc |
| 24-Oct-2021 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
mlxsw: spectrum: Use 'bitmap_zalloc()' when applicable
Use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments.
Also change the corr
mlxsw: spectrum: Use 'bitmap_zalloc()' when applicable
Use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments.
Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
72865028 |
| 27-Sep-2020 |
Ido Schimmel <idosch@nvidia.com> |
mlxsw: spectrum_acl: Fix mlxsw_sp_acl_tcam_group_add()'s error path
If mlxsw_sp_acl_tcam_group_id_get() fails, the mutex initialized earlier is not destroyed.
Fix this by initializing the mutex aft
mlxsw: spectrum_acl: Fix mlxsw_sp_acl_tcam_group_add()'s error path
If mlxsw_sp_acl_tcam_group_id_get() fails, the mutex initialized earlier is not destroyed.
Fix this by initializing the mutex after calling the function. This is symmetric to mlxsw_sp_acl_tcam_group_del().
Fixes: 5ec2ee28d27b ("mlxsw: spectrum_acl: Introduce a mutex to guard region list updates") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
593bb843 |
| 09-May-2020 |
Jiri Pirko <jiri@mellanox.com> |
mlxsw: spectrum_flower: Expose a function to get min and max rule priority
Introduce an infrastructure that allows to get minimum and maximum rule priority for specified chain. This is going to be u
mlxsw: spectrum_flower: Expose a function to get min and max rule priority
Introduce an infrastructure that allows to get minimum and maximum rule priority for specified chain. This is going to be used by a subsequent patch to enforce ordering between flower and matchall filters.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
6ef4889f |
| 27-Apr-2020 |
Jiri Pirko <jiri@mellanox.com> |
mlxsw: spectrum_acl_tcam: Position vchunk in a vregion list properly
Vregion helpers to get min and max priority depend on the correct ordering of vchunks in the vregion list. However, the current c
mlxsw: spectrum_acl_tcam: Position vchunk in a vregion list properly
Vregion helpers to get min and max priority depend on the correct ordering of vchunks in the vregion list. However, the current code always adds new chunk to the end of the list, no matter what the priority is. Fix this by finding the correct place in the list and put vchunk there.
Fixes: 22a677661f56 ("mlxsw: spectrum: Introduce ACL core with simple TCAM implementation") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
e99f8e7f |
| 18-Feb-2020 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
mlxsw: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare var
mlxsw: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99:
struct foo { int stuff; struct boo array[]; };
By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this change:
"Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
a4e76ba6 |
| 31-Mar-2019 |
Jiri Pirko <jiri@mellanox.com> |
mlxsw: spectrum_acl: Rename rehash_dis trace
The name of the trace is no longer correct, since there is no disable of rehash done. So name it "rehash_rollback_failed".
Signed-off-by: Jiri Pirko <ji
mlxsw: spectrum_acl: Rename rehash_dis trace
The name of the trace is no longer correct, since there is no disable of rehash done. So name it "rehash_rollback_failed".
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7c33c72b |
| 31-Mar-2019 |
Jiri Pirko <jiri@mellanox.com> |
mlxsw: spectrum_acl: Remove failed_rollback dead end
Currently if a rollback ends with error, the vregion is in a zombie state until end of the existence. Instead of that, rather try to continue whe
mlxsw: spectrum_acl: Remove failed_rollback dead end
Currently if a rollback ends with error, the vregion is in a zombie state until end of the existence. Instead of that, rather try to continue where rollback ended later on (after rehash interval).
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|