From: Yosry Ahmed
Date: Fri, 22 Mar 2024 18:36:25 -0700
Subject: Re: [External] Re: [bug report] mm/zswap :memory corruption after zswap_load().
To: Zhongkun He
Cc: Chengming Zhou, Johannes Weiner, Andrew Morton, linux-mm, wuyun.abel@bytedance.com, zhouchengming@bytedance.com, Nhat Pham, Kairui Song, Minchan Kim, David Hildenbrand, Barry Song <21cnbao@gmail.com>, Chris Li, Ying
References: <01b0b8e8-af1d-4fbe-951e-278e882283fd@linux.dev>

On Fri, Mar 22, 2024 at 6:35 PM Zhongkun He wrote:
>
> On Sat, Mar 23, 2024 at 3:35 AM Yosry Ahmed wrote:
> >
> > On Thu, Mar 21, 2024 at 8:04 PM Zhongkun He
> > wrote:
> > >
> > > On Thu, Mar 21, 2024 at 5:29 PM Chengming Zhou wrote:
> > > >
> > > > On 2024/3/21 14:36, Zhongkun He wrote:
> > > > > On Thu, Mar 21, 2024 at 1:24 PM Chengming Zhou wrote:
> > > > >>
> > > > >> On 2024/3/21 13:09, Zhongkun He wrote:
> > > > >>> On Thu, Mar 21, 2024 at 12:42 PM Chengming Zhou
> > > > >>> wrote:
> > > > >>>>
> > > > >>>> On 2024/3/21 12:34, Zhongkun He wrote:
> > > > >>>>> Hey folks,
> > > > >>>>>
> > > > >>>>> Recently, I tested the zswap with memory reclaiming in the mainline
> > > > >>>>> (6.8) and found a memory corruption issue related to exclusive loads.
> > > > >>>>
> > > > >>>> Is this fix included? 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
> > > > >>>> This fix avoids concurrent swapin using the same swap entry.
> > > > >>>>
> > > > >>>
> > > > >>> Yes, This fix avoids concurrent swapin from different cpu, but the
> > > > >>> reported issue occurs
> > > > >>> on the same cpu.
> > > > >>
> > > > >> I think you may misunderstand the race description in this fix changelog,
> > > > >> the CPU0 and CPU1 just mean two concurrent threads, not real two CPUs.
> > > > >>
> > > > >> Could you verify if the problem still exists with this fix?
> > > > >
> > > > > Yes, I'm sure the problem still exists with this patch.
> > > > > There is some debug info, not mainline.
> > > > >
> > > > > bpftrace -e'k:swap_readpage {printf("%lld, %lld,%ld,%ld,%ld\n%s",
> > > > > ((struct page *)arg0)->private,nsecs,tid,pid,cpu,kstack)}' --include
> > > > > linux/mm_types.h
> > > >
> > > > Ok, this problem seems only happen on SWP_SYNCHRONOUS_IO swap backends,
> > > > which now include zram, ramdisk, pmem, nvdimm.
> > >
> > > Yes.
> > >
> > > >
> > > > It maybe not good to use zswap on these swap backends?
> > > >
> > > > The problem here is the page fault handler tries to skip swapcache to
> > > > swapin the folio (swap entry count == 1), but then it can't install folio
> > > > to pte entry since some changes happened such as concurrent fork of entry.
> > > >
> > >
> > > The first page fault returned VM_FAULT_RETRY because
> > > folio_lock_or_retry() failed.
>
> Hi Yosry,
>
> > How so? The folio is newly allocated and not visible to any other
> > threads or CPUs. swap_read_folio() unlocks it and then returns and we
> > immediately try to lock it again with folio_lock_or_retry(). How does
> > this fail?
>
> Haha, it makes me very confused. Based on the steps to reproduce the problem,
> I think the page is locked by shrink_folio_list(). Please see the
> following situation.

I missed the call to folio_add_lru() before swap_read_folio(). Reclaim
would be able to lock the folio in this case once it's unlocked by
swap_read_folio(). Thanks for elaborating.
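
To make the window concrete, here is a rough userspace sketch of the ordering as I understand it from this thread. It is a toy model in Python, not kernel code: the Folio class, swapin_skip_swapcache() and the FAULT_DONE label are names made up for illustration, and the relock step is simplified to a plain trylock (the real folio_lock_or_retry() has more cases).

```python
from dataclasses import dataclass

VM_FAULT_RETRY = "VM_FAULT_RETRY"  # stand-in for the kernel's retry result
FAULT_DONE = "FAULT_DONE"          # hypothetical label for a completed fault


@dataclass
class Folio:
    """Toy folio: only the lock bit and LRU visibility are modelled."""
    locked: bool = True    # assumption: the new swapin folio starts out locked
    on_lru: bool = False   # reclaim can only find the folio once it is on the LRU

    def trylock(self) -> bool:
        if self.locked:
            return False
        self.locked = True
        return True

    def unlock(self) -> None:
        self.locked = False


def swapin_skip_swapcache(reclaim_runs_in_window: bool, add_to_lru: bool = True) -> str:
    """Replay the fault path ordering discussed above, single-threaded."""
    folio = Folio()
    folio.on_lru = add_to_lru  # models folio_add_lru() before the read
    folio.unlock()             # models swap_read_folio() unlocking the folio

    # Window between the unlock and the fault handler's relock.
    if reclaim_runs_in_window and folio.on_lru:
        folio.trylock()        # models reclaim (shrink_folio_list()) taking the lock

    # Simplified model of folio_lock_or_retry(): give up if the lock is held.
    if not folio.trylock():
        return VM_FAULT_RETRY
    return FAULT_DONE
```

In this model the retry only happens when the folio is on the LRU and reclaim gets in between the unlock and the relock; without the folio_add_lru() step the folio stays invisible and the relock always succeeds.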