#
d31563b5 |
| 06-Feb-2024 |
Wen Yang <wenyang.linux@foxmail.com> |
eventfd: strictly check the count parameter of eventfd_write to avoid inputting illegal strings
Since eventfd's document has clearly stated: A write(2) call adds the 8-byte integer value supplied in
eventfd: strictly check the count parameter of eventfd_write to avoid inputting illegal strings
Since eventfd's document has clearly stated: A write(2) call adds the 8-byte integer value supplied in its buffer to the counter.
However, in the current implementation, the following code snippet did not cause an error:
char str[16] = "hello world"; uint64_t value; ssize_t size; int fd;
fd = eventfd(0, 0); size = write(fd, &str, strlen(str)); printf("eventfd: test writing a string, size=%ld\n", size); size = read(fd, &value, sizeof(value)); printf("eventfd: test reading as uint64, size=%ld, valus=0x%lX\n", size, value);
close(fd);
And its output is: eventfd: test writing a string, size=8 eventfd: test reading as uint64, size=8, valus=0x6F77206F6C6C6568
By checking whether count is equal to sizeof(ucnt), such errors could be detected. It also follows the requirements of the manual.
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com> Link: https://lore.kernel.org/r/tencent_10AAA44731FFFA493F9F5501521F07DD4D0A@qq.com Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Eric Biggers <ebiggers@google.com> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
bd46543d |
| 15-Jan-2024 |
Wen Yang <wenyang.linux@foxmail.com> |
eventfd: move 'eventfd-count' printing out of spinlock
When printing eventfd->count, interrupts will be disabled and a spinlock will be obtained, competing with eventfd_write(). By moving the "event
eventfd: move 'eventfd-count' printing out of spinlock
When printing eventfd->count, interrupts will be disabled and a spinlock will be obtained, competing with eventfd_write(). By moving the "eventfd-count" print out of the spinlock and merging multiple seq_printf() into one, it could improve a bit, just like timerfd_show().
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com> Link: https://lore.kernel.org/r/tencent_B0B3D2BD9861FD009E03AB18A81783322709@qq.com Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dylan Yudaken <dylany@fb.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Eric Biggers <ebiggers@google.com> Cc: <linux-fsdevel@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
6b6ec4ca |
| 10-Jan-2024 |
Wen Yang <wenyang.linux@foxmail.com> |
eventfd: add a BUILD_BUG_ON() to ensure consistency between EFD_SEMAPHORE and the uapi
introduce a BUILD_BUG_ON to check that the EFD_SEMAPHORE is equal to its definition in the uapi file, just like
eventfd: add a BUILD_BUG_ON() to ensure consistency between EFD_SEMAPHORE and the uapi
introduce a BUILD_BUG_ON to check that the EFD_SEMAPHORE is equal to its definition in the uapi file, just like EFD_CLOEXEC and EFD_NONBLOCK.
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com> Link: https://lore.kernel.org/r/tencent_0BAA2DEAF9208D49987457E6583F9BE79507@qq.com Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jan Kara <jack@suse.cz> Cc: <linux-fsdevel@vger.kernel.org> Cc: <linux-kernel@vger.kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
43422523 |
| 10-Dec-2023 |
Christophe JAILLET <christophe.jaillet@wanadoo.fr> |
eventfd: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
This is less verbose.
Signed-
eventfd: Remove usage of the deprecated ida_simple_xx() API
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove().
This is less verbose.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/575dcecd51097dd30c5515f9f0ed92076b4ef403.1702229520.git.christophe.jaillet@wanadoo.fr Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
b7638ad0 |
| 22-Nov-2023 |
Christian Brauner <brauner@kernel.org> |
eventfd: make eventfd_signal{_mask}() void
No caller care about the return value.
Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-4-bd549b14ce0c@kernel.org Reviewed-by: Jan Kara <jac
eventfd: make eventfd_signal{_mask}() void
No caller care about the return value.
Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-4-bd549b14ce0c@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
120ae585 |
| 22-Nov-2023 |
Christian Brauner <brauner@kernel.org> |
eventfd: simplify eventfd_signal_mask()
The eventfd_signal_mask() helper was introduced for io_uring and similar to eventfd_signal() it always passed 1 for @n. So don't bother with that argument at
eventfd: simplify eventfd_signal_mask()
The eventfd_signal_mask() helper was introduced for io_uring and similar to eventfd_signal() it always passed 1 for @n. So don't bother with that argument at all.
Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-3-bd549b14ce0c@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
3652117f |
| 22-Nov-2023 |
Christian Brauner <brauner@kernel.org> |
eventfd: simplify eventfd_signal()
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed
eventfd: simplify eventfd_signal()
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument.
Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
758b4920 |
| 09-Jul-2023 |
Wen Yang <wenyang.linux@foxmail.com> |
eventfd: prevent underflow for eventfd semaphores
For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
An unde
eventfd: prevent underflow for eventfd semaphores
For eventfd with flag EFD_SEMAPHORE, when its ctx->count is 0, calling eventfd_ctx_do_read will cause ctx->count to overflow to ULLONG_MAX.
An underflow can happen with EFD_SEMAPHORE eventfds in at least the following three subsystems:
(1) virt/kvm/eventfd.c (2) drivers/vfio/virqfd.c (3) drivers/virt/acrn/irqfd.c
where (2) and (3) are just modeled after (1). An eventfd must be specified for use with the KVM_IRQFD ioctl(). This can also be an EFD_SEMAPHORE eventfd. When the eventfd count is zero or has been decremented to zero an underflow can be triggered when the irqfd is shut down by raising the KVM_IRQFD_FLAG_DEASSIGN flag in the KVM_IRQFD ioctl():
// ctx->count == 0 kvm_vm_ioctl() -> kvm_irqfd() -> kvm_irqfd_deassign() -> irqfd_deactivate() -> irqfd_shutdown() -> eventfd_ctx_remove_wait_queue(&cnt) -> eventfd_ctx_do_read(&cnt)
Userspace polling on the eventfd wouldn't notice the underflow because 1 is always returned as the value from eventfd_read() while ctx->count would've underflowed. It's not a huge deal because this should only be happening when the irqfd is shutdown but we should still fix it and avoid the spurious wakeup.
Fixes: cb289d6244a3 ("eventfd - allow atomic read and waitqueue remove") Signed-off-by: Wen Yang <wenyang.linux@foxmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dylan Yudaken <dylany@fb.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <tencent_7588DFD1F365950A757310D764517A14B306@qq.com> [brauner: rewrite commit message and add explanation how this underflow can happen] Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
33d8b5d7 |
| 13-Jun-2023 |
Wen Yang <wenyang.linux@foxmail.com> |
eventfd: show the EFD_SEMAPHORE flag in fdinfo
The EFD_SEMAPHORE flag should be displayed in fdinfo, as different value could affect the behavior of eventfd.
Suggested-by: Christian Brauner <braune
eventfd: show the EFD_SEMAPHORE flag in fdinfo
The EFD_SEMAPHORE flag should be displayed in fdinfo, as different value could affect the behavior of eventfd.
Suggested-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Wen Yang <wenyang.linux@foxmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christian Brauner <brauner@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dylan Yudaken <dylany@fb.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Eric Biggers <ebiggers@google.com> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <tencent_05B9CFEFE6B9BC2A9B3A27886A122A7D9205@qq.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
38f1755a |
| 11-May-2023 |
Min-Hua Chen <minhuadotchen@gmail.com> |
fs: use correct __poll_t type
Fix the following sparse warnings by using __poll_t instead of unsigned type.
fs/eventpoll.c:541:9: sparse: warning: restricted __poll_t degrades to integer fs/eventfd
fs: use correct __poll_t type
Fix the following sparse warnings by using __poll_t instead of unsigned type.
fs/eventpoll.c:541:9: sparse: warning: restricted __poll_t degrades to integer fs/eventfd.c:67:17: sparse: warning: restricted __poll_t degrades to integer
Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com> Message-Id: <20230511164628.336586-1-minhuadotchen@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
113348a4 |
| 05-Apr-2023 |
Wen Yang <wenyang.linux@foxmail.com> |
eventfd: use wait_event_interruptible_locked_irq() helper
wait_event_interruptible_locked_irq was introduced by commit 22c43c81a51e ("wait_event_interruptible_locked() interface"), but older code su
eventfd: use wait_event_interruptible_locked_irq() helper
wait_event_interruptible_locked_irq was introduced by commit 22c43c81a51e ("wait_event_interruptible_locked() interface"), but older code such as eventfd_{write,read} still uses the open code implementation. Inspired by commit 8120a8aadb20 ("fs/timerfd.c: make use of wait_event_interruptible_locked_irq()"), this patch replaces the open code implementation with a single macro call.
No functional change intended.
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dylan Yudaken <dylany@fb.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Fu Wei <wefu@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michal Nazarewicz <m.nazarewicz@samsung.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <tencent_16F9553E8354D950D704214D6EA407315F0A@qq.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
show more ...
|
#
03e02acd |
| 20-Nov-2022 |
Jens Axboe <axboe@kernel.dk> |
eventfd: provide a eventfd_signal_mask() helper
This is identical to eventfd_signal(), but it allows the caller to pass in a mask to be used for the poll wakeup key. The use case is avoiding repeate
eventfd: provide a eventfd_signal_mask() helper
This is identical to eventfd_signal(), but it allows the caller to pass in a mask to be used for the poll wakeup key. The use case is avoiding repeated multishot triggers if we have a dependency between eventfd and io_uring.
If we setup an eventfd context and register that as the io_uring eventfd, and at the same time queue a multishot poll request for the eventfd context, then any CQE posted will repeatedly trigger the multishot request until it terminates when the CQ ring overflows.
In preparation for io_uring detecting this circular dependency, add the mentioned helper so that io_uring can pass in EPOLL_URING as part of the poll wakeup key.
Cc: stable@vger.kernel.org # 6.0 [axboe: fold in !CONFIG_EVENTFD fix from Zhang Qilong] Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
9f0deaa1 |
| 16-Aug-2022 |
Dylan Yudaken <dylany@fb.com> |
eventfd: guard wake_up in eventfd fs calls as well
Guard wakeups that the user can trigger, and that may end up triggering a call back into eventfd_signal. This is in addition to the current approac
eventfd: guard wake_up in eventfd fs calls as well
Guard wakeups that the user can trigger, and that may end up triggering a call back into eventfd_signal. This is in addition to the current approach that only guards in eventfd_signal.
Rename in_eventfd_signal -> in_eventfd at the same time to reflect this.
Without this there would be a deadlock in the following code using libaio:
int main() { struct io_context *ctx = NULL; struct iocb iocb; struct iocb *iocbs[] = { &iocb }; int evfd; uint64_t val = 1;
evfd = eventfd(0, EFD_CLOEXEC); assert(!io_setup(2, &ctx)); io_prep_poll(&iocb, evfd, POLLIN); io_set_eventfd(&iocb, evfd); assert(1 == io_submit(ctx, 1, iocbs)); write(evfd, &val, 8); }
Signed-off-by: Dylan Yudaken <dylany@fb.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20220816135959.1490641-1-dylany@fb.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
b542e383 |
| 29-Jul-2021 |
Thomas Gleixner <tglx@linutronix.de> |
eventfd: Make signal recursion protection a task bit
The recursion protection for eventfd_signal() is based on a per CPU variable and relies on the !RT semantics of spin_lock_irqsave() for protectin
eventfd: Make signal recursion protection a task bit
The recursion protection for eventfd_signal() is based on a per CPU variable and relies on the !RT semantics of spin_lock_irqsave() for protecting this per CPU variable. On RT kernels spin_lock_irqsave() neither disables preemption nor interrupts which allows the spin lock held section to be preempted. If the preempting task invokes eventfd_signal() as well, then the recursion warning triggers.
Paolo suggested to protect the per CPU variable with a local lock, but that's heavyweight and actually not necessary. The goal of this protection is to prevent the task stack from overflowing, which can be achieved with a per task recursion protection as well.
Replace the per CPU variable with a per task bit similar to other recursion protection bits like task_struct::in_page_owner. This works on both !RT and RT kernels and removes as a side effect the extra per CPU storage.
No functional change for !RT kernels.
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Daniel Bristot de Oliveira <bristot@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/87wnp9idso.ffs@tglx
show more ...
|
#
28f13267 |
| 27-Oct-2020 |
David Woodhouse <dwmw@amazon.co.uk> |
eventfd: Export eventfd_ctx_do_read()
Where events are consumed in the kernel, for example by KVM's irqfd_wakeup() and VFIO's virqfd_wakeup(), they currently lack a mechanism to drain the eventfd's
eventfd: Export eventfd_ctx_do_read()
Where events are consumed in the kernel, for example by KVM's irqfd_wakeup() and VFIO's virqfd_wakeup(), they currently lack a mechanism to drain the eventfd's counter.
Since the wait queue is already locked while the wakeup functions are invoked, all they really need to do is call eventfd_ctx_do_read().
Add a check for the lock, and export it for them.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20201027135523.646811-2-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show more ...
|
#
12aceb89 |
| 01-May-2020 |
Jens Axboe <axboe@kernel.dk> |
eventfd: convert to f_op->read_iter()
eventfd is using ->read() as it's file_operations read handler, but this prevents passing in information about whether a given IO operation is blocking or not.
eventfd: convert to f_op->read_iter()
eventfd is using ->read() as it's file_operations read handler, but this prevents passing in information about whether a given IO operation is blocking or not. We can only use the file flags for that. To support async (-EAGAIN/poll based) retries for io_uring, we need ->read_iter() support. Convert eventfd to using ->read_iter().
With ->read_iter(), we can support IOCB_NOWAIT. Ensure the fd setup is done such that we set file->f_mode with FMODE_NOWAIT.
[missing include added]
Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
b5e683d5 |
| 02-Feb-2020 |
Jens Axboe <axboe@kernel.dk> |
eventfd: track eventfd_signal() recursion depth
eventfd use cases from aio and io_uring can deadlock due to circular or resursive calling, when eventfd_signal() tries to grab the waitqueue lock. On
eventfd: track eventfd_signal() recursion depth
eventfd use cases from aio and io_uring can deadlock due to circular or resursive calling, when eventfd_signal() tries to grab the waitqueue lock. On top of that, it's also possible to construct notification chains that are deep enough that we could blow the stack.
Add a percpu counter that tracks the percpu recursion depth, warn if we exceed it. The counter is also exposed so that users of eventfd_signal() can do the right thing if it's non-zero in the context where it is called.
Cc: stable@vger.kernel.org # 4.19+ Signed-off-by: Jens Axboe <axboe@kernel.dk>
show more ...
|
#
457c8996 |
| 19-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was use
treewide: Add SPDX license identifier for missed files
Add SPDX license identifiers to all files which:
- Have no license information of any form
- Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file
These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
ce528c4c |
| 14-May-2019 |
YueHaibing <yuehaibing@huawei.com> |
fs/eventfd.c: make eventfd_ida static
Fix sparse warning:
fs/eventfd.c:26:1: warning: symbol 'eventfd_ida' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20190413142348.347
fs/eventfd.c: make eventfd_ida static
Fix sparse warning:
fs/eventfd.c:26:1: warning: symbol 'eventfd_ida' was not declared. Should it be static?
Link: http://lkml.kernel.org/r/20190413142348.34716-1-yuehaibing@huawei.com Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
b556db17 |
| 14-May-2019 |
Masatake YAMATO <yamato@redhat.com> |
eventfd: present id to userspace via fdinfo
Finding endpoints of an IPC channel is one of essential task to understand how a user program works. Procfs and netlink socket provide enough hints to fi
eventfd: present id to userspace via fdinfo
Finding endpoints of an IPC channel is one of essential task to understand how a user program works. Procfs and netlink socket provide enough hints to find endpoints for IPC channels like pipes, unix sockets, and pseudo terminals. However, there is no simple way to find endpoints for an eventfd file from userland. An inode number doesn't hint. Unlike pipe, all eventfd files share the same inode object.
To provide the way to find endpoints of an eventfd file, this patch adds "eventfd-id" field to /proc/PID/fdinfo of eventfd as identifier. Integers managed by an IDA are used as ids.
A tool like lsof can utilize the information to print endpoints.
Link: http://lkml.kernel.org/r/20190327181823.20222-1-yamato@redhat.com Signed-off-by: Masatake YAMATO <yamato@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
a11e1d43 |
| 28-Jun-2018 |
Linus Torvalds <torvalds@linux-foundation.org> |
Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "-
Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls.
Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections.
But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign.
[ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ]
Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|
#
4d572d9f |
| 08-Jun-2018 |
Avi Kivity <avi@scylladb.com> |
eventfd: only return events requested in poll_mask()
The ->poll_mask() operation has a mask of events that the caller is interested in, but we're returning all events regardless.
Change to return o
eventfd: only return events requested in poll_mask()
The ->poll_mask() operation has a mask of events that the caller is interested in, but we're returning all events regardless.
Change to return only the events the caller is interested in. This fixes aio IO_CMD_POLL returning immediately when called with POLLIN on an eventfd, since an eventfd is almost always ready for a write.
Signed-off-by: Avi Kivity <avi@scylladb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
show more ...
|
#
9e42f195 |
| 31-Dec-2017 |
Christoph Hellwig <hch@lst.de> |
eventfd: switch to ->poll_mask
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
2fc96f83 |
| 11-Mar-2018 |
Dominik Brodowski <linux@dominikbrodowski.net> |
fs: add do_eventfd() helper; remove internal call to sys_eventfd()
Using this helper removes an in-kernel call to the sys_eventfd() syscall.
This patch is part of a series which removes in-kernel c
fs: add do_eventfd() helper; remove internal call to sys_eventfd()
Using this helper removes an in-kernel call to the sys_eventfd() syscall.
This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net
Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
show more ...
|
#
a9a08845 |
| 11-Feb-2018 |
Linus Torvalds <torvalds@linux-foundation.org> |
vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAN
vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
show more ...
|