Date: Fri, 22 Mar 2024 11:27:53 +0800
Subject: Re: [External] Re: [bug report] mm/zswap :memory corruption after zswap_load().
To: Yosry Ahmed, Nhat Pham
Cc: Zhongkun He, Johannes Weiner, Andrew Morton, linux-mm, wuyun.abel@bytedance.com, zhouchengming@bytedance.com
References: <01b0b8e8-af1d-4fbe-951e-278e882283fd@linux.dev>
From: Chengming Zhou

On 2024/3/22 02:32, Yosry Ahmed wrote:
> On Thu, Mar 21, 2024 at 08:25:26AM -0700, Nhat Pham wrote:
>> On Thu, Mar 21, 2024 at 2:28 AM Chengming Zhou wrote:
>>>
>>> On 2024/3/21 14:36, Zhongkun He wrote:
>>>> On Thu, Mar 21, 2024 at 1:24 PM Chengming Zhou wrote:
>>>>>
>>>>> On 2024/3/21 13:09, Zhongkun He wrote:
>>>>>> On Thu, Mar 21, 2024 at 12:42 PM Chengming Zhou
>>>>>> wrote:
>>>>>>>
>>>>>>> On 2024/3/21 12:34, Zhongkun He wrote:
>>>>>>>> Hey folks,
>>>>>>>>
>>>>>>>> Recently, I tested the zswap with memory reclaiming in the mainline
>>>>>>>> (6.8) and found a memory corruption issue related to exclusive loads.
>>>>>>>
>>>>>>> Is this fix included? 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
>>>>>>> This fix avoids concurrent swapin using the same swap entry.
>>>>>>>
>>>>>>
>>>>>> Yes, This fix avoids concurrent swapin from different cpu, but the
>>>>>> reported issue occurs
>>>>>> on the same cpu.
>>>>>
>>>>> I think you may misunderstand the race description in this fix changelog,
>>>>> the CPU0 and CPU1 just mean two concurrent threads, not real two CPUs.
>>>>>
>>>>> Could you verify if the problem still exists with this fix?
>>>>
>>>> Yes,I'm sure the problem still exists with this patch.
>>>> There is some debug info, not mainline.
>>>>
>>>> bpftrace -e'k:swap_readpage {printf("%lld, %lld,%ld,%ld,%ld\n%s",
>>>> ((struct page *)arg0)->private,nsecs,tid,pid,cpu,kstack)}' --include
>>>> linux/mm_types.h
>>>
>>> Ok, this problem seems only happen on SWP_SYNCHRONOUS_IO swap backends,
>>> which now include zram, ramdisk, pmem, nvdimm.
>>>
>>> It maybe not good to use zswap on these swap backends?
>>
>> My gut reaction is to say yes, but I'll refrain from making sweeping
>> statements about backends I'm not too familiar with. Let's see:
>>
>> 1. zram: I don't even know why we're putting a compressed cache... in
>> front of a compressed faux swap device? Ramdisk == other in-memory
>> swap backend right?
>
> I personally use it for testing because it's easy, but I doubt any prod
> setups actually do that. That being said, I don't think we need to
> disable zswap completely for these swap backends just to address this
> bug.

Right, agree! We'd better fix it.

>
>> 2. I looked it up, and it seemed SWP_SYNCHRONOUS_IO was introduced for
>> fast swap storage (see the original patch series [1]). If this is the
>> case, one could argue there are diminishing returns for applying zswap
>> on top of this.
>>
>> [1]: https://lore.kernel.org/linux-mm/1505886205-9671-1-git-send-email-minchan@kernel.org/
>>
>>>
>>> The problem here is the page fault handler tries to skip swapcache to
>>> swapin the folio (swap entry count == 1), but then it can't install folio
>>> to pte entry since some changes happened such as concurrent fork of entry.
>>>
>>> Maybe we should writeback that folio in this special case.
>>
>> But yes, if this is simple maybe we can do this first to fix the bug?
>
> Can we just enforce using the swapcache if zswap is in-use? We cannot
> simply check if zswap is enabled, because it could be the case that we
> stored some pages into zswap then disabled it.
>
> Perhaps we could keep track of whether zswap was ever enabled or if any
> pages were ever stored in zswap, and skip the no swap cache optimization
> then?

Hmm, this way we have to add something to the swap_info_struct, to check if
it has used zswap or not.

Another way I can think of is to add that folio to the swapcache if we can't
install it successfully, so next time it can find it in swapcache.
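
To make the two options above concrete, here is a rough userspace toy model in Python. It is only a sketch, not kernel code: SwapDevice, zswap_ever_used, zswap_store(), can_skip_swapcache() and fault() are all hypothetical names invented for illustration, and the model assumes the zswap load is exclusive (the compressed copy is dropped once it has been read).

```python
from dataclasses import dataclass


@dataclass
class SwapDevice:
    """Toy stand-in for swap_info_struct; not the real kernel structure."""
    synchronous_io: bool            # models the SWP_SYNCHRONOUS_IO flag
    zswap_ever_used: bool = False   # hypothetical sticky flag (option 1)


def zswap_store(dev: SwapDevice) -> None:
    """Option 1: remember that this device has had pages stored in zswap."""
    dev.zswap_ever_used = True      # never cleared, even if zswap is disabled


def can_skip_swapcache(dev: SwapDevice, swap_count: int) -> bool:
    """Option 1: bypass the swapcache only if zswap was never involved."""
    return dev.synchronous_io and swap_count == 1 and not dev.zswap_ever_used


def fault(entry, zswap: dict, swapcache: dict, pte_install_ok: bool):
    """Option 2: keep the folio in the swapcache when the PTE install fails.

    Returns the page data on success, or None if the fault must be retried.
    """
    if entry in swapcache:
        data = swapcache[entry]     # a previous failed attempt left it here
    else:
        data = zswap.pop(entry)     # exclusive load: zswap copy is gone now
    if not pte_install_ok:
        swapcache[entry] = data     # fallback: do not lose the only copy
        return None
    return data
```

In this model, a first fault that fails to install the PTE leaves the data in the swapcache dictionary, so a retried fault still finds it even though the zswap copy no longer exists. Option 1 avoids the situation entirely, at the cost of one extra piece of per-device state.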