From: Ard Biesheuvel
Date: Fri, 10 Nov 2017 11:01:34 +0000
To: Laszlo Ersek
Cc: Leif Lindholm, "edk2-devel@lists.01.org", "Gao, Liming"
Subject: Re: [PATCH v2] ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI core
In-Reply-To: <74fd1f40-70d9-ffd9-01c3-d628efb4dd44@redhat.com>
References: <20171103113352.8604-1-ard.biesheuvel@linaro.org> <20171105055245.xbicmlagfeu7xt2o@bivouac.eciton.net> <74fd1f40-70d9-ffd9-01c3-d628efb4dd44@redhat.com>

On 10 November 2017 at 09:29, Laszlo Ersek wrote:
> On 11/09/17 22:11, Ard Biesheuvel wrote:
>> On 7 November 2017 at 18:13, Ard Biesheuvel wrote:
>>> On 7 November 2017 at 18:09, Laszlo Ersek wrote:
>>>> On 11/05/17 17:29, Ard Biesheuvel wrote:
>>>>> On 5 November 2017 at 16:27, Ard Biesheuvel wrote:
>>>>>> On 5 November 2017 at 05:52, Leif Lindholm wrote:
>>>>>>> On Fri, Nov 03, 2017 at 11:33:52AM +0000, Ard Biesheuvel wrote:
>>>>>>>> DEBUG builds of PEI code will print a diagnostic message regarding
>>>>>>>> the utilization of temporary RAM before switching to permanent RAM.
>>>>>>>> For example,
>>>>>>>>
>>>>>>>> Total temporary memory: 16352 bytes.
>>>>>>>> temporary memory stack ever used: 4820 bytes.
>>>>>>>> temporary memory heap used for HobList: 4720 bytes.
>>>>>>>>
>>>>>>>> Tracking stack utilization like this requires the stack to be seeded
>>>>>>>> with a known magic value, and this needs to occur before entering C
>>>>>>>> code, given that it uses the stack.
>>>>>>>> Currently, only Nt32Pkg appears
>>>>>>>> to implement this feature, but it is useful nonetheless, so let's
>>>>>>>> wire it up for PrePeiCore as well.
>>>>>>>>
>>>>>>>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=748
>>>>>>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>>>>>>> Signed-off-by: Ard Biesheuvel
>>>>>>>
>>>>>>> OK, this may sound completely unreasonable, but seeing those
>>>>>>> implementations overwrite callee-saved registers without saving them
>>>>>>> makes my brain unhappy. (Yes, I know.)
>>>>>>>
>>>>>>> Could they either:
>>>>>>> - Have a comment prepended establishing the implicit ABI of which
>>>>>>>   registers the caller cannot rely on reusing after return.
>>>>>>>   Preferably somewhat echoed at the call site.
>>>>>>> - Be rewritten to use only scratch registers?
>>>>>>>
>>>>>>
>>>>>> I think it is implied that the startup code does not adhere to the
>>>>>> AAPCS. That code already uses r5 and r6 without stacking them, simply
>>>>>> because we're in the middle of preparing the stack and other execution
>>>>>> context, precisely so the C code we call into can rely on AAPCS
>>>>>> guarantees.
>>>>>
>>>>>
>>>>> Ehm, hold on, what do you mean by 'call site'? This code just runs and
>>>>> jumps back to a local label. There are no functions calls here until
>>>>> the point where we call into C (with the exception of the lovely
>>>>> ArmPlatformPeiBootAction() we added so Juno can find out how much DRAM
>>>>> it can use)
>>>>
>>>> Please continue the discussion with Leif on this; from my side, I'm
>>>> happy with the patch (I've sort of deduced what the assembly code does,
>>>> also relying on your v1 notes).
>>>>
>>>> The only eyebrow-raising part was:
>>>>
>>>> + MOV64 (x9, FixedPcdGet32 (PcdInitValueInTempStack) |\
>>>> + FixedPcdGet32 (PcdInitValueInTempStack) << 32)
>>>>
>>>> where we left-shift a constant that is "in theory" UINT32 by 32 binary
>>>> places, using the << operator.
>>>> In C that would be undefined behavior,
>>>> but this is assembly, so what do I know? ¯\_(ツ)_/¯
>>>>
>>>> Acked-by: Laszlo Ersek
>>>>
>>>
>>> Thanks. And you're right, this is not C so no need to worry about that.
>>>
>>>> (
>>>>
>>>> By the way, just to see if I remember correctly, isn't STP:
>>>>
>>>> +0:stp x9, x9, [x8], #16
>>>>
>>>> the kind of instruction that modifies multiple operands at once, and so
>>>> if it faults, it cannot be virtualized well? (Because the syndrome
>>>> register or whatever does not tell the VMM the whole picture about the
>>>> fault?)
>>>>
>>>> Totally irrelevant here, I'm just curious.
>>>>
>>>
>>> STP == STore Pair, and so it stores the values in the registers to
>>> memory. The only register that gets modified here is x8, due to the
>>> post-increment.
>>>
>>
>> ... which actually doesn't mean it is not affected by the same issue.
>>
>> The reason such instructions are more difficult to virtualize is that
>> it requires KVM to decode the instruction, rather than read the
>> syndrome registers that can tell it which register we intended to
>> read/write from. So it is in fact perfectly feasible to virtualize it,
>> but the KVM authors just haven't bothered yet.
>
> Hm, I'm slightly curious if and how this differs from x86 KVM :) In x86
> KVM there are huge instruction tables for emulation etc.
>

It does differ from x86: on ARM, you can derive most information you
need to emulate an instruction from the CPU registers that describe
the fault condition (i.e., the syndrome register and the fault address
register). However, those registers can describe only a single
general-purpose register, so any instruction that uses more than one
is difficult to emulate. It is essentially laziness on the part of the
KVM/ARM authors, because they have been able to get away with it up to
this point :-)