From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=2607:f8b0:4001:c06::242; helo=mail-io0-x242.google.com; envelope-from=ard.biesheuvel@linaro.org; receiver=edk2-devel@lists.01.org Received: from mail-io0-x242.google.com (mail-io0-x242.google.com [IPv6:2607:f8b0:4001:c06::242]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 1A1AE224DCA42 for ; Thu, 8 Mar 2018 14:49:51 -0800 (PST) Received: by mail-io0-x242.google.com with SMTP id b34so1511240ioj.6 for ; Thu, 08 Mar 2018 14:56:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=xDa/oCtgc6KMBcafGTt2XH2ucQkpd5dX9sXH7FHwI9M=; b=XEeRazE+4WUEPdyWIJZdnCmswfLUI2MUfNExITHz+pBXav+iDi8D7ej7OSSJfuzviY Jjc3phAJxBwC6AREj8yx7jKxUyGLv20f7/OUCZBn9W45DFQzTEwg3U5G00LwfYgbPwf0 TQnhdNnVMHi7bzTzm3RdZhiaz/BhAcskjQ1oA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=xDa/oCtgc6KMBcafGTt2XH2ucQkpd5dX9sXH7FHwI9M=; b=CugW9JY72bu0yGOCxIX27gp7l7HyZXckJmLTQcTy1PTTN3hCAwCmKO4oiWVUpOX0CN MthXhcjChxZQJ+LpRtlwL30I1rBDxZJJl0wJMRC/x1tyGoCa5V3bPMR3VQpEWl8JCXut O+d3mWoQS1ayiHH9arxE8WJ4cX3fsOTiN3gFqYYZiqoH+1MMOJMyFSTakMujNGOaoA6v hXhMI18pIeIn3uoClm6yHO6U3ltZmzz7zXZ1AB/Cd+DAsSEMmjdQIUFIvHE9uBDR+6+M ouVS+Xuu9mYF2B90xp5lxmopkDoyPa3hpWyXQ8wrlOj8XcEjfaTStW42qcgvdDmKcyyZ 82sw== X-Gm-Message-State: AElRT7EPxs4T9yWTphM6LvQN7sGfVBeY+vxLbGpiHt5BH/hDdscfRMBa 7WMtdnptVzW55N2hntN5ASL0k1J5fawJqWsAXkCx+A== X-Google-Smtp-Source: AG47ELu0yPlu/QiUXd081PmMRZt2q47H2zcQfyGj6aDnjQf8aeor74PcHP6Q6WEFgBksK3nRxruqXrCdNhdDBPR5WxI= X-Received: by 10.107.151.74 with SMTP id z71mr31171753iod.277.1520549767767; Thu, 08 Mar 2018 14:56:07 -0800 (PST) MIME-Version: 1.0 Received: by 10.107.138.209 with HTTP; Thu, 8 Mar 2018 14:56:07 -0800 (PST) In-Reply-To: <60cdf9e6-7c5e-37cc-997a-a740a30569d5@arm.com> References: <20180308131353.17389-1-ard.biesheuvel@linaro.org> <60cdf9e6-7c5e-37cc-997a-a740a30569d5@arm.com> From: Ard Biesheuvel Date: Thu, 8 Mar 2018 22:56:07 +0000 Message-ID: To: Jeremy Linton Cc: "edk2-devel@lists.01.org" , Leif Lindholm , Graeme Gregory Subject: Re: [PATCH edk2-platforms] Silicon/SynQuacer: add PPTT ACPI table to describe cache topology X-BeenThere: edk2-devel@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: EDK II Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Mar 2018 22:49:52 -0000 Content-Type: text/plain; charset="UTF-8" On 8 March 2018 at 21:48, Jeremy Linton wrote: > Hi, > > > On 03/08/2018 02:38 PM, Ard Biesheuvel wrote: >> >> On 8 March 2018 at 17:59, Ard Biesheuvel >> wrote: >>> >>> On 8 March 2018 at 17:50, Jeremy Linton wrote: >>>> >>>> Hi, >>>> >>>> >>>> On 03/08/2018 11:27 AM, Ard Biesheuvel wrote: >>>>> >>>>> >>>>> On 8 March 2018 at 17:24, Jeremy Linton wrote: >>>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>> >>>>>> On 03/08/2018 07:13 AM, Ard Biesheuvel wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> Add a ACPI Processor Properties Topology Table (PPTT) to the >>>>>>> SynQuacer >>>>>>> builds. This information is used by the OS to tune the scheduler. >>>>>>> >>>>>>> Contributed-under: TianoCore Contribution Agreement 1.1 >>>>>>> Signed-off-by: Ard Biesheuvel >>>>>>> --- >>>>>>> This produces the following topology after applying Jeremy's patches: >>>>>>> >>>>>>> $ lstopo-no-graphics >>>>>>> Machine (31GB) >>>>>>> Package L#0 + L3 L#0 (4096KB) >>>>>>> L2 L#0 (256KB) >>>>>>> L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0) >>>>>>> L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1) >>>>>>> L2 L#1 (256KB) >>>>>>> L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2) >>>>>>> L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3) >>>>>>> L2 L#2 (256KB) >>>>>>> L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4) >>>>>>> L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5) >>>>>>> L2 L#3 (256KB) >>>>>>> L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6) >>>>>>> L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7) >>>>>>> L2 L#4 (256KB) >>>>>>> L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8) >>>>>>> L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9) >>>>>>> L2 L#5 (256KB) >>>>>>> L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 >>>>>>> (P#10) >>>>>>> L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 >>>>>>> (P#11) >>>>>>> L2 L#6 (256KB) >>>>>>> L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 >>>>>>> (P#12) >>>>>>> L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 >>>>>>> (P#13) >>>>>>> L2 L#7 (256KB) >>>>>>> L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 >>>>>>> (P#14) >>>>>>> L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 >>>>>>> (P#15) >>>>>>> L2 L#8 (256KB) >>>>>>> L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 >>>>>>> (P#16) >>>>>>> L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 >>>>>>> (P#17) >>>>>>> L2 L#9 (256KB) >>>>>>> L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 >>>>>>> (P#18) >>>>>>> L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 >>>>>>> (P#19) >>>>>>> L2 L#10 (256KB) >>>>>>> L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU L#20 >>>>>>> (P#20) >>>>>>> L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU L#21 >>>>>>> (P#21) >>>>>>> L2 L#11 (256KB) >>>>>>> L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU L#22 >>>>>>> (P#22) >>>>>>> L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU L#23 >>>>>>> (P#23) >>>>>>> HostBridge L#0 >>>>>>> PCIBridge >>>>>>> PCIBridge >>>>>>> PCI 1b21:0612 >>>>>>> Block(Disk) L#0 "sda" >>>>>>> HostBridge L#3 >>>>>>> PCI 10de:128b >>>>>>> GPU L#1 "renderD128" >>>>>>> GPU L#2 "card0" >>>>>>> GPU L#3 "controlD64" >>>>>>> >>>>>>> Silicon/Socionext/SynQuacer/AcpiTables/AcpiTables.inf | 1 + >>>>>>> Silicon/Socionext/SynQuacer/AcpiTables/Pptt.aslc | 221 >>>>>>> ++++++++++++++++++++ >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> So, the above looks good. OTOH, this is yet another manually >>>>>> created/hard-coded ACPI table, subject to change every-time another >>>>>> SOC >>>>>> is >>>>>> released. I have a couple similar ones, but haven't post them because >>>>>> I >>>>>> believe the HiSi folks have done us a favor and created a table >>>>>> generator >>>>>> which does 90% of this work by probing the hardware, and creating a >>>>>> "compressed" representation of the table. Leaving the individual >>>>>> platforms >>>>>> to only fill out LLCs and such which can't be probed. >>>>>> >>>>>> It would be great if the remaining HiSi bits were removed and the >>>>>> whole >>>>>> thing made generic enough to plug in to these individual platforms, so >>>>>> that >>>>>> they only need supply their nonstandard bits and the rest is taken >>>>>> care >>>>>> of. >>>>>> >>>>> >>>>> Yeah, I am aware of that. But to be honest, for a platform such as >>>>> this one, where the information is 100% static, I'd much rather have a >>>>> single .c file like this that never changes once you check it in. >>>>> >>>> >>>> Maybe, but even if there is never an identical machine with a few cores >>>> extra (or removed), you have to deal with the possibility that the >>>> standard >>>> is going to be updated (say to add a leaf node flag). If/when that >>>> happens >>>> you now have the technical debt of having to go manually touch the >>>> tables vs >>>> updating the generator code and being done with it across all the >>>> platforms. >>>> Particularly since other people are just going to take the same shortcut >>>> next time of just copy-pasting this table and tweaking it, rather than >>>> trying to create a library out of the HiSi code. >>> >>> >>> I agree up to a point, and I think we are conflating two different >>> things here. I am all for abstracting PPTT table generation, but only >>> if it doesn't move processing of information known at compile time to >>> runtime. >> >> >> BTW the architecture does not permit inferring cache geometry from the >> CCSIDR field: >> >> """ >> The parameters NumSets, Associativity, and LineSize in these registers >> define the architecturally visible parameters >> that are required for the cache maintenance by Set/Way instructions. >> They are not guaranteed to represent the actual >> microarchitectural features of a design. You cannot make any inference >> about the actual sizes of caches based on >> these parameters. >> """ >> >> and we know that there are cores that describe their caches as having >> a single set and a single way, because they don't implement >> maintenance by set/way at all and can only be cleaned or invalidated >> wholesale. >> > > That is true in the architectural sense, but as we all know for many of > these machines it can be inferred, and since the firmware blobs would be per > machine it would be a question of overriding/correcting whatever is wrong > per machine rather than having the entire table be per machine. > Yes, but my point is that if you're doing that, why do you read the cache registers at all, and not simply fill out all the info. Firmware is tightly coupled to the platform anyway, unlike the kernel, so I don't see the value in having runtime code that infers information that never changes. > Put another way, you can support a wide swath of machines with A5x & A7x's > with a single piece of code which only hard-codes the L3 (if any) > information.