Message-ID: <01b0b8e8-af1d-4fbe-951e-278e882283fd@linux.dev>
Date: Thu, 21 Mar 2024 12:42:26 +0800
Subject: Re: [bug report] mm/zswap :memory corruption after zswap_load().
To: Zhongkun He, Johannes Weiner, Yosry Ahmed, Andrew Morton
Cc: linux-mm, wuyun.abel@bytedance.com, zhouchengming@bytedance.com, Nhat Pham
From: Chengming Zhou

On 2024/3/21 12:34, Zhongkun He wrote:
> Hey folks,
>
> Recently, I tested the zswap with memory reclaiming in the mainline
> (6.8) and found a memory corruption issue related to exclusive loads.

Is this fix included? 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
This fix avoids concurrent swapin using the same swap entry.

Thanks.

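The kind of race asked about here, two swapins hitting the same swap entry while loads are exclusive, can be pictured with a small toy model. This is only a sketch and not kernel code: `ToyZswap`, `simulate`, and the `in_flight` set are hypothetical names made up for the illustration, and the "retry" behaviour is an assumed simplification of backing off and redoing the fault.

```python
class ToyZswap:
    """Toy stand-in for zswap with exclusive loads (not kernel code)."""

    def __init__(self):
        self.entries = {}

    def store(self, index, data):
        self.entries[index] = data

    def load(self, index):
        # Exclusive load: a successful load also frees the entry.
        return self.entries.pop(index, None)


def simulate(guarded):
    """Run two overlapping swapins of one index; return what each one got."""
    index = 1318876          # arbitrary swap index for the illustration
    store = ToyZswap()
    store.store(index, b"page")
    in_flight = set()        # hypothetical "swapin in progress" marker

    # Fault A starts and loads the page.
    if guarded:
        in_flight.add(index)
    got_a = store.load(index)

    # Fault B arrives with the same entry before A has finished.
    if guarded and index in in_flight:
        got_b = "retry"      # back off and redo the fault later
    else:
        got_b = store.load(index)

    # Fault A finishes.
    in_flight.discard(index)
    return got_a, got_b


print(simulate(guarded=False))
print(simulate(guarded=True))
```

Without the guard the second load finds nothing, because the first one already freed the entry; with the guard the second fault never reaches the store and simply retries.
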
>
>
> root@**:/sys/fs/cgroup/zz# stress --vm 5 --vm-bytes 1g --vm-hang 3 --vm-keep
> stress: info: [31753] dispatching hogs: 0 cpu, 0 io, 5 vm, 0 hdd
> stress: FAIL: [31758] (522) memory corruption at: 0x7f347ed1a010
> stress: FAIL: [31753] (394) <-- worker 31758 returned error 1
> stress: WARN: [31753] (396) now reaping child worker processes
> stress: FAIL: [31753] (451) failed run completed in 14s
>
>
> 1. Test step(the frequency of memory reclaiming has been accelerated):
> -------------------------
> a. set up the zswap, zram and cgroup V2
> b. echo 0 > /sys/kernel/mm/lru_gen/enabled
>    (Increase the probability of problems occurring)
> c. mkdir /sys/fs/cgroup/zz
>    echo $$ > /sys/fs/cgroup/zz/cgroup.procs
>    cd /sys/fs/cgroup/zz/
>    stress --vm 5 --vm-bytes 1g --vm-hang 3 --vm-keep
>
> e. in other shell:
>    while :;do for i in {1..5};do echo 20g > /sys/fs/cgroup/zz/memory.reclaim & done;sleep 1;done
>
> 2. Root cause:
> --------------------------
> With a small probability, the page fault will occur twice with the
> original pte, even if a new pte has been successfully set.
> Unfortunately, zswap_entry has been released during the first page fault
> with exclusive loads, so zswap_load will fail, and there is no corresponding
> data in swap space, memory corruption occurs.
>
> bpftrace -e'k:zswap_load {printf("%lld, %lld\n", ((struct page *)arg0)->private,nsecs)}' --include linux/mm_types.h > a.txt
>
> look up the same index:
>
> index nsecs
> 1318876, 8976040736819
> 1318876, 8976040746078
>
> 4123110, 8976234682970
> 4123110, 8976234689736
>
> 2268896, 8976660124792
> 2268896, 8976660130607
>
> 4634105, 8976662117938
> 4634105, 8976662127596
>
> 3. Solution
>
> Should we free zswap_entry in batches so that zswap_entry will be
> valid when the next page fault occurs with the
> original pte? It would be great if there are other better solutions.
>
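
The "look up the same index" step in the quoted report could be automated along these lines. This is a rough sketch that assumes a.txt contains the `index, nsecs` lines produced by the bpftrace one-liner above, possibly mixed with other bpftrace output. The helper name `repeated_loads` and the 1 ms window are assumptions chosen for illustration, not something taken from the report.

```python
import re
from collections import defaultdict

# Matches lines of the form "<index>, <nsecs>"; anything else is skipped.
LINE = re.compile(r"^\s*(-?\d+),\s*(-?\d+)\s*$")


def repeated_loads(lines, window_ns=1_000_000):
    """Return (index, first_nsecs, delta_ns) for loads of the same index
    that happen within window_ns of each other, ordered by time."""
    seen = defaultdict(list)
    for line in lines:
        match = LINE.match(line)
        if match:
            seen[int(match.group(1))].append(int(match.group(2)))

    hits = []
    for index, stamps in seen.items():
        stamps.sort()
        for first, second in zip(stamps, stamps[1:]):
            if second - first <= window_ns:
                hits.append((index, first, second - first))
    return sorted(hits, key=lambda hit: hit[1])


# The pairs quoted in the report, used here as sample input.
SAMPLE = """\
1318876, 8976040736819
1318876, 8976040746078
4123110, 8976234682970
4123110, 8976234689736
2268896, 8976660124792
2268896, 8976660130607
4634105, 8976662117938
4634105, 8976662127596
"""

for index, first, delta in repeated_loads(SAMPLE.splitlines()):
    print(f"index {index}: loaded again after {delta} ns")
```

On the quoted sample, each index is loaded a second time within roughly 6 to 10 microseconds of the first load. To run it against a real capture, pass `open("a.txt")` instead of the sample lines.
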