References: <01b0b8e8-af1d-4fbe-951e-278e882283fd@linux.dev>
From: Zhongkun He
Date: Fri, 22 Mar 2024 11:16:03 +0800
Subject: Re: [External] Re: [bug report] mm/zswap :memory corruption after zswap_load().
To: Nhat Pham
Cc: Chengming Zhou, Johannes Weiner, Yosry Ahmed, Andrew Morton, linux-mm, wuyun.abel@bytedance.com, zhouchengming@bytedance.com

On Thu, Mar 21, 2024 at
11:25 PM Nhat Pham wrote:
>
> On Thu, Mar 21, 2024 at 2:28 AM Chengming Zhou wrote:
> >
> > On 2024/3/21 14:36, Zhongkun He wrote:
> > > On Thu, Mar 21, 2024 at 1:24 PM Chengming Zhou wrote:
> > >>
> > >> On 2024/3/21 13:09, Zhongkun He wrote:
> > >>> On Thu, Mar 21, 2024 at 12:42 PM Chengming Zhou
> > >>> wrote:
> > >>>>
> > >>>> On 2024/3/21 12:34, Zhongkun He wrote:
> > >>>>> Hey folks,
> > >>>>>
> > >>>>> Recently, I tested the zswap with memory reclaiming in the mainline
> > >>>>> (6.8) and found a memory corruption issue related to exclusive loads.
> > >>>>
> > >>>> Is this fix included? 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
> > >>>> This fix avoids concurrent swapin using the same swap entry.
> > >>>>
> > >>>
> > >>> Yes, This fix avoids concurrent swapin from different cpu, but the
> > >>> reported issue occurs on the same cpu.
> > >>
> > >> I think you may misunderstand the race description in this fix changelog,
> > >> the CPU0 and CPU1 just mean two concurrent threads, not real two CPUs.
> > >>
> > >> Could you verify if the problem still exists with this fix?
> > >
> > > Yes, I'm sure the problem still exists with this patch.
> > > There is some debug info, not mainline.
> > >
> > > bpftrace -e'k:swap_readpage {printf("%lld, %lld,%ld,%ld,%ld\n%s",
> > > ((struct page *)arg0)->private,nsecs,tid,pid,cpu,kstack)}' --include
> > > linux/mm_types.h
> >
> > Ok, this problem seems only happen on SWP_SYNCHRONOUS_IO swap backends,
> > which now include zram, ramdisk, pmem, nvdimm.
> >
> > It maybe not good to use zswap on these swap backends?

Hi Nhat,

>
> My gut reaction is to say yes, but I'll refrain from making sweeping
> statements about backends I'm not too familiar with. Let's see:
>
> 1. zram: I don't even know why we're putting a compressed cache... in
> front of a compressed faux swap device? Ramdisk == other in-memory
> swap backend right?

It is currently used for testing only; later it will be deployed in
production as a temporary solution to prevent performance jitter.

> 2. I looked it up, and it seemed SWP_SYNCHRONOUS_IO was introduced for
> fast swap storage (see the original patch series [1]). If this is the
> case, one could argue there are diminishing returns for applying zswap
> on top of this.
>

Sounds good.

> [1]: https://lore.kernel.org/linux-mm/1505886205-9671-1-git-send-email-minchan@kernel.org/
> >
> > The problem here is the page fault handler tries to skip swapcache to
> > swapin the folio (swap entry count == 1), but then it can't install folio
> > to pte entry since some changes happened such as concurrent fork of entry.
> >
> > Maybe we should writeback that folio in this special case.
>
> But yes, if this is simple maybe we can do this first to fix the bug?
>
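
To make the failure mode quoted above easier to follow, here is a toy
userspace model of it. It is only a sketch, not kernel code: every name in
it (ToySwap, swapin_skip_swapcache, swapin_via_swapcache) is hypothetical,
and it assumes that an exclusive zswap load removes the only valid copy of
the page and that the backing device was never written.

```python
# Toy model of the race discussed in this thread. NOT kernel code.
# All class and function names are hypothetical illustrations.


class ToySwap:
    def __init__(self):
        self.zswap = {}      # swap entry -> page data held only by zswap
        self.backing = {}    # swap entry -> whatever is on the backing device
        self.swapcache = {}  # swap entry -> folio kept in the swap cache

    def store(self, entry, data):
        # Assumption: zswap accepted the page, so the backing device
        # still holds stale bytes for this entry.
        self.zswap[entry] = data
        self.backing[entry] = b"stale"

    def load(self, entry, exclusive):
        if entry in self.zswap:
            data = self.zswap[entry]
            if exclusive:
                # Exclusive load: the zswap copy is invalidated.
                del self.zswap[entry]
            return data
        # Nothing in zswap: fall back to the backing device.
        return self.backing[entry]


def swapin_skip_swapcache(swap, entry, pte_install_ok):
    """Fault path that bypasses the swap cache (swap count == 1)."""
    folio = swap.load(entry, exclusive=True)
    if pte_install_ok:
        return folio
    # The PTE changed underneath us (e.g. a concurrent fork), so the
    # folio is dropped and the fault is retried later.
    return None


def swapin_via_swapcache(swap, entry, pte_install_ok):
    """Fault path that keeps the folio in the swap cache."""
    if entry not in swap.swapcache:
        swap.swapcache[entry] = swap.load(entry, exclusive=True)
    folio = swap.swapcache[entry]
    return folio if pte_install_ok else None


# Bypassing the swap cache: the failed attempt loses the only good copy.
lossy = ToySwap()
lossy.store(1, b"good")
first_try = swapin_skip_swapcache(lossy, 1, pte_install_ok=False)
retry_skip = swapin_skip_swapcache(lossy, 1, pte_install_ok=True)

# Going through the swap cache: the folio survives the failed attempt.
safe = ToySwap()
safe.store(1, b"good")
swapin_via_swapcache(safe, 1, pte_install_ok=False)
retry_cache = swapin_via_swapcache(safe, 1, pte_install_ok=True)
```

In this model retry_skip ends up as b"stale" while retry_cache is b"good",
which is the difference the suggestion to write back (or otherwise keep)
the folio in this special case is aiming at.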