#
5f60d5f6 |
| 01-Oct-2024 |
Al Viro <viro@zeniv.linux.org.uk> |
move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
|
#
76c313f6 |
| 13-Sep-2024 |
Keith Busch <kbusch@kernel.org> |
blk-integrity: improved sg segment mapping
Make the integrity mapping more like data mapping, blk_rq_map_sg. Use the request to validate the segment count, and update the callers so they don't have to.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20240913191746.2628196-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
f4330766 |
| 13-Sep-2024 |
Keith Busch <kbusch@kernel.org> |
nvme-rdma: use request to get integrity segments
The request tracks the integrity segments already, so no need to recount the segments again.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20240913182854.2445457-8-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
cead0b89 |
| 26-Aug-2024 |
Anuj Gupta <anuj20.g@samsung.com> |
nvme: rename apptag and appmask to lbat and lbatm
Rename apptag and appmask to lbat and lbatm so that they match the field names used in the NVMe spec.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
03c3d7c7 |
| 15-Aug-2024 |
Niklas Cassel <cassel@kernel.org> |
nvme-rdma: send cntlid in the RDMA_CM_REQUEST Private Data
When sending a RDMA_CM_REQUEST, the NVMe RDMA Transport Specification allows you to populate the cntlid field in the RDMA_CM_REQUEST Private Data.
The cntlid is returned by the target on completion of the first RDMA_CM_REQUEST command (which creates the admin queue).
The cntlid field can then be populated by the host when the I/O queues are created (using additional RDMA_CM_REQUEST commands), such that the target can perform extra validation for additional RDMA_CM_REQUEST commands.
An additional error code and error message are also added, so that nvme_rdma_cm_msg() will display the proper error message if the target fails the RDMA_CM_REQUEST command because of this extra validation.
Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
210b1f65 |
| 24-Jun-2024 |
Keith Busch <kbusch@kernel.org> |
nvme-pci: do not directly handle subsys reset fallout
Scheduling reset_work after an nvme subsystem reset is expected to fail on pcie, but this also prevents potential handling that the platform's pcie services may provide, which might successfully recover the link without re-enumeration. Examples include AER, DPC, and power's EEH.
Provide a pci specific operation that safely initiates a subsystem reset, and instead of scheduling reset work, read back the status register to trigger a pcie read error.
Since this only affects pci, the other fabrics drivers subscribe to a generic nvmf subsystem reset that is exactly the same as before. The loop fabric doesn't use it because nvmet doesn't support setting that property anyway.
And since we're using the magic NSSR value in two places now, provide a symbolic define for it.
Reported-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
1a9e2181 |
| 04-Jun-2024 |
Keith Busch <kbusch@kernel.org> |
nvme: split device add from initialization
Combining both creates an ambiguous cleanup scenario for the caller if an error is returned: does the device reference need to be dropped or did the error occur before the device was initialized? If an error occurs after the device is added, then the existing cleanup routines will leak memory.
Furthermore, the nvme core is taking it upon itself to free the device's kobj name under certain conditions rather than go through the core device API. We shouldn't be peeking into these implementation details.
Split the device initialization from the addition to make it easier to know the error handling actions, fix the existing memory leaks, and stop the device layering violations.
Link: https://lore.kernel.org/linux-nvme/c4050a37-ecc9-462c-9772-65e25166f439@grimberg.me/ Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
ea47c471 |
| 04-Jun-2024 |
Keith Busch <kbusch@kernel.org> |
nvme: rdma: split controller bringup handling
Drivers must call nvme_uninit_ctrl after a successful nvme_init_ctrl. Split the allocation side out to make the error handling boundary easier to navigate. The nvme rdma driver's error handling had different returns in the error goto labels, which harms readability.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
54a76c87 |
| 05-May-2024 |
Tokunori Ikegami <ikegami.t@gmail.com> |
nvme-rdma, nvme-tcp: include max reconnects for reconnect logging
Makes clear the max reconnects value, which is derived from ctrl loss tmo and reconnect delay.
Signed-off-by: Tokunori Ikegami <ikegami.t@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
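As a rough illustration of how a maximum reconnect count can follow from the two settings named above, the sketch below assumes a round-up division of the controller loss timeout by the reconnect delay, with a negative timeout meaning "retry forever". The function name and the exact rules are assumptions made for this example, not the driver's code.

```c
/*
 * Hypothetical helper (not a kernel function): derive the maximum number
 * of reconnect attempts from ctrl_loss_tmo and reconnect_delay, both in
 * seconds.
 */
int sketch_max_reconnects(int ctrl_loss_tmo, int reconnect_delay)
{
	/* Assumption: a negative timeout means "never give up", shown as -1. */
	if (ctrl_loss_tmo < 0)
		return -1;

	/* Guard against a zero or negative delay in this sketch. */
	if (reconnect_delay <= 0)
		return 0;

	/* Round up so a partial delay interval still counts as one attempt. */
	return (ctrl_loss_tmo + reconnect_delay - 1) / reconnect_delay;
}
```

With these assumptions, a 600 second timeout and a 10 second delay give 60 attempts, which is the kind of figure a reconnect log line could show next to the current attempt number.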
|
#
adfde7ed |
| 30-Apr-2024 |
Hannes Reinecke <hare@suse.de> |
nvme-fabrics: short-circuit reconnect retries
Returning an nvme status from nvme_tcp_setup_ctrl() indicates that the association was established and we have received a status from the controller; consequently we should honour the DNR bit. If not, any future reconnect attempts will just return the same error, so we can short-circuit the reconnect attempts and fail the connection directly.
Signed-off-by: Hannes Reinecke <hare@suse.de> [dwagner: - extended nvme_should_reconnect] Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
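The short-circuit described above can be sketched as a small decision function. This is a simplified illustration, not the kernel's nvme_should_reconnect(): the function name, the DNR bit value, and the convention that positive values are controller statuses while negative values are transport errors are all assumptions of this example.

```c
#include <stdbool.h>

/* Illustrative "Do Not Retry" bit; the value is an assumption of this sketch. */
#define SKETCH_STATUS_DNR 0x4000

/*
 * Hypothetical decision helper.
 *   status > 0 : NVMe status reported by the controller
 *   status < 0 : transport-level error (negative errno)
 *   max_reconnects == -1 : retry forever
 */
bool sketch_should_reconnect(int status, int nr_reconnects, int max_reconnects)
{
	/* The controller answered and asked us not to retry: give up now. */
	if (status > 0 && (status & SKETCH_STATUS_DNR))
		return false;

	/* Otherwise keep retrying until the attempt budget is used up. */
	if (max_reconnects == -1 || nr_reconnects < max_reconnects)
		return true;

	return false;
}
```

The point of checking the status first is that a retry budget is irrelevant when the controller has already said the same request will fail again.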
|
#
ad178ba9 |
| 23-Jan-2024 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvme-rdma: clamp queue size according to ctrl cap
If a controller is configured with metadata support, clamp the maximal queue size to be 128 since there are more resources that are needed for metadata operations. Otherwise, clamp it to 256.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
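A minimal sketch of the clamping rule described above, using the two limits from the commit message (128 with metadata support, 256 otherwise). The macro and function names are hypothetical and chosen only for this example.

```c
/* Limits taken from the commit message; the names are hypothetical. */
#define SKETCH_MAX_QUEUE_SIZE          256
#define SKETCH_MAX_METADATA_QUEUE_SIZE 128

/* Clamp a requested queue size to the limit that applies to the controller. */
unsigned int sketch_clamp_queue_size(unsigned int requested, int has_metadata)
{
	unsigned int limit = has_metadata ? SKETCH_MAX_METADATA_QUEUE_SIZE
					  : SKETCH_MAX_QUEUE_SIZE;

	return requested > limit ? limit : requested;
}
```

Requests at or below the limit pass through unchanged; only oversized requests are reduced.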
|
#
d2045e6a |
| 29-Jan-2024 |
Nitin U. Yewale <nyewale@redhat.com> |
nvme-rdma: show hostnqn when connecting to rdma target
Log the hostnqn when connecting to an nvme target. As the hostnqn can be changed, logging this information in syslog at the appropriate time may help in troubleshooting.
Signed-off-by: Nitin U. Yewale <nyewale@redhat.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
7d23e836 |
| 31-Jan-2024 |
Caleb Sander <csander@purestorage.com> |
nvme: split out fabrics version of nvme_opcode_str()
nvme_opcode_str() currently supports admin, IO, and fabrics commands. However, fabrics commands aren't allowed for the pci transport. Currently the pci caller passes 0 as the fctype, which means any fabrics command would be displayed as "Property Set".
Move fabrics command support into a function nvme_fabrics_opcode_str() and remove the fctype argument to nvme_opcode_str(). This way, a fabrics command will display as "Unknown" for pci. Convert the rdma and tcp transports to use nvme_fabrics_opcode_str().
Signed-off-by: Caleb Sander <csander@purestorage.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
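To illustrate the split described above, here is a simplified sketch with one lookup for regular I/O opcodes and a separate one for fabrics command types. The function names and the reduced opcode tables are assumptions of this example and do not reproduce the kernel's helpers.

```c
/* Fabrics commands share one opcode and are distinguished by fctype. */
#define SKETCH_FABRICS_OPCODE 0x7f

/* Hypothetical lookup for a few I/O opcodes; no fabrics handling here. */
const char *sketch_io_opcode_str(unsigned char opcode)
{
	switch (opcode) {
	case 0x00: return "Flush";
	case 0x01: return "Write";
	case 0x02: return "Read";
	default:   return "Unknown";
	}
}

/* Hypothetical lookup used only by fabrics transports (rdma, tcp). */
const char *sketch_fabrics_opcode_str(unsigned char fctype)
{
	switch (fctype) {
	case 0x00: return "Property Set";
	case 0x01: return "Connect";
	case 0x04: return "Property Get";
	default:   return "Unknown";
	}
}
```

With this shape, a caller that never deals with fabrics commands gets "Unknown" for opcode 0x7f instead of a misleading "Property Set".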
|
#
15ade5bf |
| 24-Jan-2024 |
Israel Rukshin <israelr@nvidia.com> |
nvme-rdma: Fix transfer length when write_generate/read_verify are 0
When the block layer doesn't generate/verify metadata, the SG length is smaller than the transfer length. This is because the SG length doesn't include the metadata length that is added by the HW on the wire. The target fails those commands with "Data SGL Length Invalid" by comparing the transfer length and the SG length. Fix it by adding the metadata length to the transfer length when there is no metadata SGL. The bug reproduces when setting read_verify/write_generate configs to 0 at the child multipath device or at the primary device when NVMe multipath is disabled.
Note that setting those configs to 0 on the multipath device (ns_head) doesn't have any impact on the I/Os.
Fixes: 5ec5d3bddc6b ("nvme-rdma: add metadata/T10-PI support") Signed-off-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
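The length arithmetic behind this fix can be sketched as follows. The function and parameter names are hypothetical, and the example assumes the metadata added on the wire is a fixed number of bytes per logical block; it is an illustration of the idea, not the driver's code.

```c
/*
 * Hypothetical helper: compute the transfer length to advertise.
 *   sg_len        : bytes described by the data SG list
 *   nr_blocks     : number of logical blocks in the request
 *   md_per_block  : metadata bytes the HW adds per block (assumption)
 *   has_md_sgl    : non-zero when a separate metadata SGL exists
 */
unsigned int sketch_transfer_len(unsigned int sg_len, unsigned int nr_blocks,
				 unsigned int md_per_block, int has_md_sgl)
{
	/* Assumption: with a metadata SGL, the data length is left as-is. */
	if (has_md_sgl)
		return sg_len;

	/* No metadata SGL: account for the metadata the HW puts on the wire. */
	return sg_len + nr_blocks * md_per_block;
}
```

For example, eight 512-byte blocks with 8 bytes of metadata each give 4096 bytes in the SG list but 4160 bytes on the wire, which is the mismatch the target would otherwise reject.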
|
#
92b0b0ff |
| 23-Jan-2024 |
Chaitanya Kulkarni <kch@nvidia.com> |
nvme: add module description to stop warnings
Add MODULE_DESCRIPTION() in order to remove warnings & get clean build:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/host/nvme-core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/host/nvme.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/host/nvme-fabrics.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/host/nvme-rdma.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/host/nvme-fc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/host/nvme-tcp.o
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
a5c1a87c |
| 07-Jan-2024 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvme-rdma: enhance timeout kernel log
Print the command_id alongside blk-mq's tag to help match commands with protocol wire traces and logs.
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
0372dd4e |
| 18-Dec-2023 |
Daniel Wagner <dwagner@suse.de> |
nvme: refactor ns info helpers
Pass in the nvme_ns_head pointer directly. This reduces the need for the caller to have the nvme_ns data structure present. Thus we can refactor the caller side in the next step as well.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
9419e71b |
| 18-Dec-2023 |
Daniel Wagner <dwagner@suse.de> |
nvme: move ns id info to struct nvme_ns_head
Move the namespace info to struct nvme_ns_head, because it's the same for all associated namespaces.
Note: with multipathing enabled the PI information is shared between all paths. If a path is using a different PI configuration it will overwrite the previous settings. This is obviously not correct and such configuration will be rejected in future. For the time being we expect a correctly configured storage.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
e6e7f7ac |
| 27-Oct-2023 |
Keith Busch <kbusch@kernel.org> |
nvme: ensure reset state check ordering
A different CPU may be setting the ctrl->state value, so ensure proper barriers to prevent optimizing to a stale state. Normally it isn't a problem to observe the wrong state as it is merely advisory to take a quicker path during initialization and error recovery, but seeing an old state can report unexpected ENETRESET errors when a reset request was in fact successful.
Reported-by: Minh Hoang <mh2022@meta.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Hannes Reinecke <hare@suse.de>
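The stale-state problem described above can be illustrated with a userspace analogue built on C11 atomics. This is only a sketch: the kernel uses its own primitives rather than `<stdatomic.h>`, and the struct, enum, and function names here are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical, reduced set of controller states for this sketch. */
enum sketch_ctrl_state { SKETCH_RESETTING, SKETCH_CONNECTING, SKETCH_LIVE };

struct sketch_ctrl {
	_Atomic int state;
};

/* Writer side: publish the new state so other threads can observe it. */
void sketch_set_state(struct sketch_ctrl *ctrl, enum sketch_ctrl_state s)
{
	atomic_store_explicit(&ctrl->state, s, memory_order_release);
}

/* Reader side: always reload the state instead of reusing a cached value. */
enum sketch_ctrl_state sketch_get_state(struct sketch_ctrl *ctrl)
{
	return (enum sketch_ctrl_state)
		atomic_load_explicit(&ctrl->state, memory_order_acquire);
}

/* Report the outcome of a reset based on a fresh read of the state. */
int sketch_reset_result(struct sketch_ctrl *ctrl)
{
	return sketch_get_state(ctrl) == SKETCH_LIVE ? 0 : -ENETRESET;
}
```

The design point is that the final check reads the state through an accessor the compiler cannot hoist or cache, so a reset that succeeded on another CPU is not reported as ENETRESET because of an old value.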
|
#
3af755a4 |
| 21-Nov-2023 |
Hannes Reinecke <hare@suse.de> |
nvme: move nvme_stop_keep_alive() back to original position
Stopping keep-alive not only stops the keep-alive workqueue, but also needs to be synchronized with I/O termination as we must not send a keep-alive command after all I/O has been terminated. So to avoid any regressions move the call to stop_keep_alive() back to its original position and ensure that keep-alive is correctly stopped when failing to set up the admin queue.
Fixes: 4733b65d82bd ("nvme: start keep-alive after admin queue setup") Suggested-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
3820c4fd |
| 31-Jul-2023 |
Maurizio Lombardi <mlombard@redhat.com> |
nvme-rdma: do not try to stop unallocated queues
Trying to stop a queue which hasn't been allocated will result in a warning due to calling mutex_lock() against an uninitialized mutex.
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 4 PID: 104150 at kernel/locking/mutex.c:579
Call trace:
RIP: 0010:__mutex_lock+0x1173/0x14a0
nvme_rdma_stop_queue+0x1b/0xa0 [nvme_rdma]
nvme_rdma_teardown_io_queues.part.0+0xb0/0x1d0 [nvme_rdma]
nvme_rdma_delete_ctrl+0x50/0x100 [nvme_rdma]
nvme_do_delete_ctrl+0x149/0x158 [nvme_core]
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
29b434d1 |
| 11-Jul-2023 |
Ming Lei <ming.lei@redhat.com> |
nvme-rdma: fix potential unbalanced freeze & unfreeze
Move start_freeze into nvme_rdma_configure_io_queues(). There are at least two benefits:
1) fix unbalanced freeze and unfreeze, since re-connection work may fail or be broken by removal
2) IO during error recovery can fail fast because nvme fabrics unquiesces queues after teardown.
One side effect is that a !mpath request may time out during connecting because of a queue topology change, but that does not look like a big deal:
1) the same problem exists with the current code base
2) compared with !mpath, the mpath use case is dominant
Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic") Cc: stable@vger.kernel.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
a249d306 |
| 26-Apr-2023 |
Keith Busch <kbusch@kernel.org> |
nvme-fabrics: add queue setup helpers
tcp and rdma transports have lots of duplicate code setting up the different queue mappings. Add common helpers.
Cc: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
f3f28373 |
| 19-Apr-2023 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvme-rdma: fix typo in comment
There is no ib_stop_cq API and the need for the +1 is for ib_drain_qp.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
edde9e70 |
| 22-Mar-2023 |
Sagi Grimberg <sagi@grimberg.me> |
blk-mq-rdma: remove queue mapping helper for rdma devices
No rdma device exposes its irq vector affinity today. So the only mapping that we have left is the default blk_mq_map_queues, which we fall back to anyway. Also fix up the only consumer of this helper (nvme-rdma).
Remove this now dead code.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
|