From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=66.187.233.73; helo=mx1.redhat.com; envelope-from=lersek@redhat.com; receiver=edk2-devel@lists.01.org Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 86E0D202E5351 for ; Wed, 27 Jun 2018 05:00:09 -0700 (PDT) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C2C244075798; Wed, 27 Jun 2018 12:00:08 +0000 (UTC) Received: from lacos-laptop-7.usersys.redhat.com (ovpn-121-54.rdu2.redhat.com [10.10.121.54]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7A35E2027047; Wed, 27 Jun 2018 12:00:07 +0000 (UTC) To: "Ni, Ruiyu" References: <20180625025402.201636-1-ruiyu.ni@intel.com> <4DFBB17A-3FCF-448D-B8F0-C4D66A33CF9F@apple.com> From: Laszlo Ersek Cc: edk2-devel@lists.01.org Message-ID: Date: Wed, 27 Jun 2018 14:00:06 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Wed, 27 Jun 2018 12:00:08 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Wed, 27 Jun 2018 12:00:08 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'lersek@redhat.com' RCPT:'' Subject: Re: [PATCH] UefiCpuPkg/MpInitLib: AP uses memory preceding IDT to store CpuMpData X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 27 Jun 2018 12:00:10 -0000 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit On 06/27/18 08:00, Ni, Ruiyu wrote: > On 6/27/2018 2:57 AM, Laszlo Ersek wrote: >> Second, even assuming that PushCpuMpData() and PopCpuMpData() are >> atomic, the order in which APs complete the AP procedure is not >> deterministic, and it need not be the exact reverse of the entry order. >> We may have the following order, for example: >> >> - AP 1 writes CpuMpData, and saves the original PEI services pointer on >>    its stack, >> - AP 2 writes CpuMpData, and saves the same CpuMpData value (originally >>    written by AP 1) on its stack, >> - AP 1 completes the AP procedure and restores the original PEI services >>    pointer from its stack, >> - AP 2 completes the AP procedure, and overwrites the PEI services >>    pointer, with the CpuMpData value from its stack, that was originally >>    written by AP 1. >> > > Thanks for the analysis. It's a stupid bug that I should be aware of. > That can also explain why I cannot reproduce it. It's random. > >> Assuming I (remotely) understand what's going on, I'd (vaguely) suggest >> three alternatives: >> >> - instead of the idea captured in this patch, we should use an APIC ID >>    search similar to the one done initially in "MpFuncs.nasm", > > Don't understand. Can you kindly explain more? Roughly, I figured that we could set the pre-IDT pointer on the BSP, *before* starting up the APs, to a mapping table. The mapping table would consist of two columns, the first containing the APIC ID (of each CPU), the second containing AP-specific pointer values. Then, after starting up the APs, each AP could locate its own row (based on APIC ID) in the table, and the context pointer that belongs to it. But, this is likely not useful, as we want to expose just the same one pointer value to all APs together. > >> >> - or else, we should stick with the current idea, but use atomic >>    compare-and-swap operations, so that the original PEI services pointer >>    value be restored ultimately, at the least, > > I like this idea. Will generate the V2 patch. > >> >> - or else (possibly the simplest fix), allocate a separate IDT for each >>    AP, including the UINTN that precedes the (now AP-specific) IDT. This >>    means that the PEI services pointer*location*  would be separate for >>    each AP, and no contention would occur. > > I think it's the most complicated fix:) It might take the most code, but I guess it would be the simplest to reason about. No data sharing --> no locking necessary, and no races possible. Anyway, I saw your v2 (just the subject, and your note that we should ignore v2 for now). I'll stay tuned for v3. Meta-question: some of your emails are apparently addressed to the list only, and not CC'd to people who commented earlier on the thread. Did you drop CCs deliberately, or is it some kind of mail agent glitch? Thanks! Laszlo