From: Laszlo Ersek
To: Jeff Fan, Michael Kinney, Jiewen Yao
Cc: edk2-devel-01, Gerd Hoffmann, Paolo Bonzini
Subject: SMRAM sizes on large hosts
Date: Tue, 2 May 2017 20:16:23 +0200
Message-ID: <1382eb04-9646-133b-9ce5-8293cb54745f@redhat.com>

Hi All,

in your experience, how much SMRAM do "big hosts" provide? (Machines
that have, say, ~300 CPU cores.)

With QEMU's Q35 board, which provides 8MB of SMRAM (TSEG), we're hitting
various out-of-SMRAM conditions with OVMF at around 230-240 VCPUs. We'd
like to go to a higher VCPU count than that.

* So, in your experience, how much SMRAM do physical boards, that carry
  such a high number of cores, provide?

* Perhaps we'll have to do something about the SMRAM size on QEMU in the
  longer term, but until then, can you guys recommend various "cheap
  tricks" to decrease per-VCPU SMRAM usage?

  For example, in OVMF we have a 16KB SMM stack per VCPU, and we also
  enable the SMM stack overflow guard page -- we had been hit by an SMM
  stack overflow with the original 8KB stack size, and so we increased
  both the stack size and enabled the guard page; see commits

    509f8425b75d UefiCpuPkg: change PcdCpuSmmStackGuard default to TRUE
    0d0c245dfb14 OvmfPkg: set SMM stack size to 16KB

  I've now tried to decrease the stack size to the "middle point" 12KB.
  That stack size does not reproduce the SMM stack overflow seen
  originally, but it also doesn't help with the SMRAM exhaustion -- we
  cannot go to any higher VCPU count with it.

  Are there any other "tweakables" (PCDs) we could massage to see the
  per-VCPU SMRAM usage go down?

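To put rough numbers on the stack figures above, here is a minimal
back-of-the-envelope sketch. It counts only the SMM stacks, and it
assumes (this is an assumption, not something taken from the SMM driver
source) that the guard page costs one extra 4KB page per VCPU. The
function name and the model are hypothetical; other per-VCPU SMRAM
consumers are not modelled, so treat the result as a lower bound.

```python
KIB = 1024
MIB = 1024 * KIB


def smm_stack_bytes(vcpus, stack_size, guard_bytes=4 * KIB):
    """Bytes of SMRAM taken by SMM stacks alone, under the assumed model.

    guard_bytes is an assumption: one 4KB guard page per VCPU. The real
    driver may reserve more (or differently aligned) space per CPU.
    """
    return vcpus * (stack_size + guard_bytes)


if __name__ == "__main__":
    tseg = 8 * MIB   # Q35 TSEG size mentioned above
    vcpus = 240      # roughly where the out-of-SMRAM conditions appear
    for stack_kib in (8, 12, 16):
        used = smm_stack_bytes(vcpus, stack_kib * KIB)
        print(f"{stack_kib:2d}KB stack: {used / MIB:.2f}MB "
              f"of {tseg // MIB}MB TSEG for stacks alone")
```

Under these assumptions the stacks account for a sizable share of the
8MB TSEG, but not all of it, so whatever else scales with the VCPU count
would have to be measured separately.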
Here's a (somewhat indiscriminate) list of PCDs, from the OVMF Ia32X64
build report file, where each PCD's name contains "smm":

     PcdCpuSmmBlockStartupThisAp           : FLAG  (BOOLEAN) = 0
     PcdCpuSmmDebug                        : FLAG  (BOOLEAN) = 0
     PcdCpuSmmFeatureControlMsrLock        : FLAG  (BOOLEAN) = 1
     PcdCpuSmmProfileEnable                : FLAG  (BOOLEAN) = 0
     PcdCpuSmmProfileRingBuffer            : FLAG  (BOOLEAN) = 0
     PcdCpuSmmStackGuard                   : FLAG  (BOOLEAN) = 1
  *P PcdCpuSmmEnableBspElection            : FLAG  (BOOLEAN) = 0
  *P PcdSmmSmramRequire                    : FLAG  (BOOLEAN) = 1
     PcdCpuSmmCodeAccessCheckEnable        : FIXED (BOOLEAN) = 1
     PcdCpuSmmProfileSize                  : FIXED (UINT32)  = 0x200000
     PcdCpuSmmStaticPageTable              : FIXED (BOOLEAN) = 1
  *P PcdCpuSmmStackSize                    : FIXED (UINT32)  = 0x4000
     PcdLoadFixAddressSmmCodePageNumber    : PATCH (UINT32)  = 0
     PcdS3BootScriptTablePrivateSmmDataPtr : DYN   (UINT64)  = 0x0
  *P PcdCpuSmmApSyncTimeout                : DYN   (UINT64)  = 100000
  *P PcdCpuSmmSyncMode                     : DYN   (UINT8)   = 0x01

(BTW, security features should not be disabled, even if they saved some
SMRAM.)

Thank you,
Laszlo