forked from kata-containers/kata-containers
-
Notifications
You must be signed in to change notification settings - Fork 2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request kata-containers#8431 from UiPath/fix-vsock-packets…
…-drop kernel: Fix vsock packets drop when the driver initializes
- Loading branch information
Showing
13 changed files
with
834 additions
and
3 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1 @@ | ||
116 | ||
117 |
73 changes: 73 additions & 0 deletions
73
...ing/kernel/patches/5.10.x/0001-vsock-virtio-avoid-potential-deadlock-when-vsock-dev.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
From d03fb89b5e6dcaf399480708ee2d5a32dca70e7a Mon Sep 17 00:00:00 2001 | ||
From: "Longpeng(Mike)" <[email protected]> | ||
Date: Thu, 12 Aug 2021 13:30:56 +0800 | ||
Subject: [PATCH 1/7] vsock/virtio: avoid potential deadlock when vsock device | ||
remove | ||
|
||
There's a potential deadlock case when remove the vsock device or | ||
process the RESET event: | ||
|
||
vsock_for_each_connected_socket: | ||
spin_lock_bh(&vsock_table_lock) ----------- (1) | ||
... | ||
virtio_vsock_reset_sock: | ||
lock_sock(sk) --------------------- (2) | ||
... | ||
spin_unlock_bh(&vsock_table_lock) | ||
|
||
lock_sock() may do initiative schedule when the 'sk' is owned by | ||
other thread at the same time, we would receivce a warning message | ||
that "scheduling while atomic". | ||
|
||
Even worse, if the next task (selected by the scheduler) try to | ||
release a 'sk', it need to request vsock_table_lock and the deadlock | ||
occur, cause the system into softlockup state. | ||
Call trace: | ||
queued_spin_lock_slowpath | ||
vsock_remove_bound | ||
vsock_remove_sock | ||
virtio_transport_release | ||
__vsock_release | ||
vsock_release | ||
__sock_release | ||
sock_close | ||
__fput | ||
____fput | ||
|
||
So we should not require sk_lock in this case, just like the behavior | ||
in vhost_vsock or vmci. | ||
|
||
Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko") | ||
Cc: Stefan Hajnoczi <[email protected]> | ||
Signed-off-by: Longpeng(Mike) <[email protected]> | ||
Reviewed-by: Stefano Garzarella <[email protected]> | ||
Link: https://lore.kernel.org/r/[email protected] | ||
Signed-off-by: Jakub Kicinski <[email protected]> | ||
--- | ||
net/vmw_vsock/virtio_transport.c | 7 +++++-- | ||
1 file changed, 5 insertions(+), 2 deletions(-) | ||
|
||
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c | ||
index 2700a63ab095..3a056f8affd1 100644 | ||
--- a/net/vmw_vsock/virtio_transport.c | ||
+++ b/net/vmw_vsock/virtio_transport.c | ||
@@ -356,11 +356,14 @@ static void virtio_vsock_event_fill(struct virtio_vsock *vsock) | ||
|
||
static void virtio_vsock_reset_sock(struct sock *sk) | ||
{ | ||
- lock_sock(sk); | ||
+ /* vmci_transport.c doesn't take sk_lock here either. At least we're | ||
+ * under vsock_table_lock so the sock cannot disappear while we're | ||
+ * executing. | ||
+ */ | ||
+ | ||
sk->sk_state = TCP_CLOSE; | ||
sk->sk_err = ECONNRESET; | ||
sk->sk_error_report(sk); | ||
- release_sock(sk); | ||
} | ||
|
||
static void virtio_vsock_update_guest_cid(struct virtio_vsock *vsock) | ||
-- | ||
2.34.1 | ||
|
151 changes: 151 additions & 0 deletions
151
...ging/kernel/patches/5.10.x/0002-vsock-each-transport-cycles-only-on-its-own-sockets.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,151 @@ | ||
From a44b98fc926521d6cf642ed945e1d0781973d0d8 Mon Sep 17 00:00:00 2001 | ||
From: Jiyong Park <[email protected]> | ||
Date: Fri, 11 Mar 2022 11:00:16 +0900 | ||
Subject: [PATCH 2/7] vsock: each transport cycles only on its own sockets | ||
|
||
When iterating over sockets using vsock_for_each_connected_socket, make | ||
sure that a transport filters out sockets that don't belong to the | ||
transport. | ||
|
||
There actually was an issue caused by this; in a nested VM | ||
configuration, destroying the nested VM (which often involves the | ||
closing of /dev/vhost-vsock if there was h2g connections to the nested | ||
VM) kills not only the h2g connections, but also all existing g2h | ||
connections to the (outmost) host which are totally unrelated. | ||
|
||
Tested: Executed the following steps on Cuttlefish (Android running on a | ||
VM) [1]: (1) Enter into an `adb shell` session - to have a g2h | ||
connection inside the VM, (2) open and then close /dev/vhost-vsock by | ||
`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb | ||
session is not reset. | ||
|
||
[1] https://android.googlesource.com/device/google/cuttlefish/ | ||
|
||
Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") | ||
Reviewed-by: Stefano Garzarella <[email protected]> | ||
Acked-by: Michael S. Tsirkin <[email protected]> | ||
Signed-off-by: Jiyong Park <[email protected]> | ||
Link: https://lore.kernel.org/r/[email protected] | ||
Signed-off-by: Jakub Kicinski <[email protected]> | ||
--- | ||
drivers/vhost/vsock.c | 3 ++- | ||
include/net/af_vsock.h | 3 ++- | ||
net/vmw_vsock/af_vsock.c | 9 +++++++-- | ||
net/vmw_vsock/virtio_transport.c | 7 +++++-- | ||
net/vmw_vsock/vmci_transport.c | 5 ++++- | ||
5 files changed, 20 insertions(+), 7 deletions(-) | ||
|
||
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c | ||
index a483cec31d5c..e413aaaca8f9 100644 | ||
--- a/drivers/vhost/vsock.c | ||
+++ b/drivers/vhost/vsock.c | ||
@@ -695,7 +695,8 @@ static int vhost_vsock_dev_release(struct inode *inode, struct file *file) | ||
|
||
/* Iterating over all connections for all CIDs to find orphans is | ||
* inefficient. Room for improvement here. */ | ||
- vsock_for_each_connected_socket(vhost_vsock_reset_orphans); | ||
+ vsock_for_each_connected_socket(&vhost_transport.transport, | ||
+ vhost_vsock_reset_orphans); | ||
|
||
vhost_vsock_stop(vsock); | ||
vhost_vsock_flush(vsock); | ||
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h | ||
index b1c717286993..4d8589244dc7 100644 | ||
--- a/include/net/af_vsock.h | ||
+++ b/include/net/af_vsock.h | ||
@@ -197,7 +197,8 @@ struct sock *vsock_find_bound_socket(struct sockaddr_vm *addr); | ||
struct sock *vsock_find_connected_socket(struct sockaddr_vm *src, | ||
struct sockaddr_vm *dst); | ||
void vsock_remove_sock(struct vsock_sock *vsk); | ||
-void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)); | ||
+void vsock_for_each_connected_socket(struct vsock_transport *transport, | ||
+ void (*fn)(struct sock *sk)); | ||
int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk); | ||
bool vsock_find_cid(unsigned int cid); | ||
|
||
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c | ||
index 791955f5e7ec..bc3b85838158 100644 | ||
--- a/net/vmw_vsock/af_vsock.c | ||
+++ b/net/vmw_vsock/af_vsock.c | ||
@@ -333,7 +333,8 @@ void vsock_remove_sock(struct vsock_sock *vsk) | ||
} | ||
EXPORT_SYMBOL_GPL(vsock_remove_sock); | ||
|
||
-void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)) | ||
+void vsock_for_each_connected_socket(struct vsock_transport *transport, | ||
+ void (*fn)(struct sock *sk)) | ||
{ | ||
int i; | ||
|
||
@@ -342,8 +343,12 @@ void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)) | ||
for (i = 0; i < ARRAY_SIZE(vsock_connected_table); i++) { | ||
struct vsock_sock *vsk; | ||
list_for_each_entry(vsk, &vsock_connected_table[i], | ||
- connected_table) | ||
+ connected_table) { | ||
+ if (vsk->transport != transport) | ||
+ continue; | ||
+ | ||
fn(sk_vsock(vsk)); | ||
+ } | ||
} | ||
|
||
spin_unlock_bh(&vsock_table_lock); | ||
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c | ||
index 3a056f8affd1..e131121533ad 100644 | ||
--- a/net/vmw_vsock/virtio_transport.c | ||
+++ b/net/vmw_vsock/virtio_transport.c | ||
@@ -24,6 +24,7 @@ | ||
static struct workqueue_struct *virtio_vsock_workqueue; | ||
static struct virtio_vsock __rcu *the_virtio_vsock; | ||
static DEFINE_MUTEX(the_virtio_vsock_mutex); /* protects the_virtio_vsock */ | ||
+static struct virtio_transport virtio_transport; /* forward declaration */ | ||
|
||
struct virtio_vsock { | ||
struct virtio_device *vdev; | ||
@@ -383,7 +384,8 @@ static void virtio_vsock_event_handle(struct virtio_vsock *vsock, | ||
switch (le32_to_cpu(event->id)) { | ||
case VIRTIO_VSOCK_EVENT_TRANSPORT_RESET: | ||
virtio_vsock_update_guest_cid(vsock); | ||
- vsock_for_each_connected_socket(virtio_vsock_reset_sock); | ||
+ vsock_for_each_connected_socket(&virtio_transport.transport, | ||
+ virtio_vsock_reset_sock); | ||
break; | ||
} | ||
} | ||
@@ -635,7 +637,8 @@ static void virtio_vsock_remove(struct virtio_device *vdev) | ||
synchronize_rcu(); | ||
|
||
/* Reset all connected sockets when the device disappear */ | ||
- vsock_for_each_connected_socket(virtio_vsock_reset_sock); | ||
+ vsock_for_each_connected_socket(&virtio_transport.transport, | ||
+ virtio_vsock_reset_sock); | ||
|
||
/* Stop all work handlers to make sure no one is accessing the device, | ||
* so we can safely call vdev->config->reset(). | ||
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c | ||
index 8b65323207db..87de3477cb6d 100644 | ||
--- a/net/vmw_vsock/vmci_transport.c | ||
+++ b/net/vmw_vsock/vmci_transport.c | ||
@@ -75,6 +75,8 @@ static u32 vmci_transport_qp_resumed_sub_id = VMCI_INVALID_ID; | ||
|
||
static int PROTOCOL_OVERRIDE = -1; | ||
|
||
+static struct vsock_transport vmci_transport; /* forward declaration */ | ||
+ | ||
/* Helper function to convert from a VMCI error code to a VSock error code. */ | ||
|
||
static s32 vmci_transport_error_to_vsock_error(s32 vmci_error) | ||
@@ -883,7 +885,8 @@ static void vmci_transport_qp_resumed_cb(u32 sub_id, | ||
const struct vmci_event_data *e_data, | ||
void *client_data) | ||
{ | ||
- vsock_for_each_connected_socket(vmci_transport_handle_detach); | ||
+ vsock_for_each_connected_socket(&vmci_transport, | ||
+ vmci_transport_handle_detach); | ||
} | ||
|
||
static void vmci_transport_recv_pkt_work(struct work_struct *work) | ||
-- | ||
2.34.1 | ||
|
43 changes: 43 additions & 0 deletions
43
...aging/kernel/patches/5.10.x/0003-vsock-virtio-initialize-vdev-priv-before-using-VQs.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
From d57616af2cfcd67e964f91867cdcc20124b49091 Mon Sep 17 00:00:00 2001 | ||
From: Stefano Garzarella <[email protected]> | ||
Date: Wed, 23 Mar 2022 18:36:23 +0100 | ||
Subject: [PATCH 3/7] vsock/virtio: initialize vdev->priv before using VQs | ||
|
||
When we fill VQs with empty buffers and kick the host, it may send | ||
an interrupt. `vdev->priv` must be initialized before this since it | ||
is used in the virtqueue callbacks. | ||
|
||
Fixes: 0deab087b16a ("vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsock") | ||
Suggested-by: Michael S. Tsirkin <[email protected]> | ||
Signed-off-by: Stefano Garzarella <[email protected]> | ||
Acked-by: Michael S. Tsirkin <[email protected]> | ||
Reviewed-by: Stefan Hajnoczi <[email protected]> | ||
Signed-off-by: Jakub Kicinski <[email protected]> | ||
--- | ||
net/vmw_vsock/virtio_transport.c | 3 ++- | ||
1 file changed, 2 insertions(+), 1 deletion(-) | ||
|
||
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c | ||
index e131121533ad..0711aaed17da 100644 | ||
--- a/net/vmw_vsock/virtio_transport.c | ||
+++ b/net/vmw_vsock/virtio_transport.c | ||
@@ -599,6 +599,8 @@ static int virtio_vsock_probe(struct virtio_device *vdev) | ||
INIT_WORK(&vsock->event_work, virtio_transport_event_work); | ||
INIT_WORK(&vsock->send_pkt_work, virtio_transport_send_pkt_work); | ||
|
||
+ vdev->priv = vsock; | ||
+ | ||
mutex_lock(&vsock->tx_lock); | ||
vsock->tx_run = true; | ||
mutex_unlock(&vsock->tx_lock); | ||
@@ -613,7 +615,6 @@ static int virtio_vsock_probe(struct virtio_device *vdev) | ||
vsock->event_run = true; | ||
mutex_unlock(&vsock->event_lock); | ||
|
||
- vdev->priv = vsock; | ||
rcu_assign_pointer(the_virtio_vsock, vsock); | ||
|
||
mutex_unlock(&the_virtio_vsock_mutex); | ||
-- | ||
2.34.1 | ||
|
38 changes: 38 additions & 0 deletions
38
tools/packaging/kernel/patches/5.10.x/0004-vsock-virtio-enable-VQs-early-on-probe.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
From cbb4f49c1f4e761d1d35dab0bbed3b6aed52d456 Mon Sep 17 00:00:00 2001 | ||
From: Stefano Garzarella <[email protected]> | ||
Date: Wed, 23 Mar 2022 18:36:25 +0100 | ||
Subject: [PATCH 4/7] vsock/virtio: enable VQs early on probe | ||
|
||
virtio spec requires drivers to set DRIVER_OK before using VQs. | ||
This is set automatically after probe returns, but virtio-vsock | ||
driver uses VQs in the probe function to fill rx and event VQs | ||
with new buffers. | ||
|
||
Let's fix this, calling virtio_device_ready() before using VQs | ||
in the probe function. | ||
|
||
Fixes: 0ea9e1d3a9e3 ("VSOCK: Introduce virtio_transport.ko") | ||
Signed-off-by: Stefano Garzarella <[email protected]> | ||
Acked-by: Michael S. Tsirkin <[email protected]> | ||
Reviewed-by: Stefan Hajnoczi <[email protected]> | ||
Signed-off-by: Jakub Kicinski <[email protected]> | ||
--- | ||
net/vmw_vsock/virtio_transport.c | 2 ++ | ||
1 file changed, 2 insertions(+) | ||
|
||
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c | ||
index 0711aaed17da..ae22cc8e1b27 100644 | ||
--- a/net/vmw_vsock/virtio_transport.c | ||
+++ b/net/vmw_vsock/virtio_transport.c | ||
@@ -601,6 +601,8 @@ static int virtio_vsock_probe(struct virtio_device *vdev) | ||
|
||
vdev->priv = vsock; | ||
|
||
+ virtio_device_ready(vdev); | ||
+ | ||
mutex_lock(&vsock->tx_lock); | ||
vsock->tx_run = true; | ||
mutex_unlock(&vsock->tx_lock); | ||
-- | ||
2.34.1 | ||
|
Oops, something went wrong.