From: Andrew Fish
Sender: afish@apple.com
Date: Thu, 04 May 2017 04:32:55 -0700
To: Amit kumar
Cc: Mike Kinney, "edk2-devel@lists.01.org"
Subject: Re: Accessing AVX/AVX2 instruction in UEFI.

> On May 4, 2017, at 4:13 AM, Amit kumar wrote:
>
> Hi,
>
> Even after using AVX2 instructions, my code showed no performance improvement in UEFI, although there is a substantial improvement when I run similar code in Windows.
>
> Am I missing something?
>

Is the data aligned the same in both environments?

Thanks,

Andrew Fish

> I am using the MSVC compiler, and the code is written in ASM.
>
> Thanks And Regards
>
> Amit
>
> ________________________________
> From: edk2-devel on behalf of Amit kumar
> Sent: Wednesday, May 3, 2017 11:18:39 AM
> To: Kinney, Michael D; Andrew Fish
> Cc: edk2-devel@lists.01.org
> Subject: Re: [edk2] Accessing AVX/AVX2 instruction in UEFI.
>
> Thank you Michael and Andrew
>
> Regards
>
> Amit
>
> ________________________________
> From: Kinney, Michael D
> Sent: Tuesday, May 2, 2017 10:33:45 PM
> To: Andrew Fish; Amit kumar; Kinney, Michael D
> Cc: edk2-devel@lists.01.org
> Subject: RE: [edk2] Accessing AVX/AVX2 instruction in UEFI.
>
> Amit,
>
> The information from Andrew is correct.
>
> The document that covers this topic is the
> Intel(r) 64 and IA-32 Architectures Software Developer Manuals
>
> https://software.intel.com/en-us/articles/intel-sdm
>
> Volume 1, Section 13.5.3 describes the AVX State. There are
> more details about detecting and enabling different AVX features
> in that document.
>
> If the CPU supports AVX, then the basic assembly instructions
> required to use AVX instructions are the following, which set
> bits 0, 1, 2 of XCR0.
>
>   mov     rcx, 0
>   xgetbv
>   or      rax, 0007h
>   xsetbv
>
> One additional item you need to be aware of is that UEFI firmware only
> saves/restores the CPU registers required for the UEFI ABI calling convention
> when a timer interrupt or exception is processed.
>
> This means CPU state such as the YMM registers is not saved/restored
> across an interrupt and may be modified if code in interrupt context
> also uses YMM registers.
>
> When you enable the use of extended registers, the interrupt state should be
> saved, interrupts disabled, and the state restored around the extended
> register usage.
>
> You can use the following functions from MdePkg BaseLib to do this:
>
> /**
>   Disables CPU interrupts and returns the interrupt state prior to the disable
>   operation.
>
>   @retval TRUE  CPU interrupts were enabled on entry to this call.
>   @retval FALSE CPU interrupts were disabled on entry to this call.
>
> **/
> BOOLEAN
> EFIAPI
> SaveAndDisableInterrupts (
>   VOID
>   );
>
> /**
>   Set the current CPU interrupt state.
>
>   Sets the current CPU interrupt state to the state specified by
>   InterruptState. If InterruptState is TRUE, then interrupts are enabled. If
>   InterruptState is FALSE, then interrupts are disabled. InterruptState is
>   returned.
>
>   @param InterruptState  TRUE if interrupts should be enabled. FALSE if
>                          interrupts should be disabled.
>
>   @return InterruptState
>
> **/
> BOOLEAN
> EFIAPI
> SetInterruptState (
>   IN BOOLEAN  InterruptState
>   );
>
> Algorithm:
> ============
> {
>   BOOLEAN  InterruptState;
>
>   InterruptState = SaveAndDisableInterrupts ();
>
>   // Enable use of AVX/AVX2 instructions
>
>   // Use AVX/AVX2 instructions
>
>   SetInterruptState (InterruptState);
> }
>
> Best regards,
>
> Mike
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Andrew Fish
>> Sent: Tuesday, May 2, 2017 8:12 AM
>> To: Amit kumar
>> Cc: edk2-devel@lists.01.org
>> Subject: Re: [edk2] Accessing AVX/AVX2 instruction in UEFI.
>>
>>
>>> On May 2, 2017, at 6:57 AM, Amit kumar wrote:
>>>
>>> Hi,
>>>
>>> I am trying to optimize an application using AVX/AVX2, but my code hangs while trying
>>> to access YMM registers.
>>> The instruction where my code hangs is:
>>>
>>>
>>> vmovups ymm0, YMMWORD PTR [rax]
>>>
>>>
>>> I have verified the CPUID in the OS, and it supports AVX and AVX2 instructions. The processor
>>> is a 6th-gen i7.
>>> Can somebody help me out here? Is there a way to enable YMM registers?
>>>
>>
>> Amit,
>>
>> I think these instructions will generate an illegal instruction fault until you enable
>> AVX. You need to check the CPUID bits in your code, then set BIT18 of CR4. After
>> that the XGETBV/XSETBV instructions are enabled and you can OR in the lower 3 bits
>> (bits 0, 1, and 2) of XCR0. This basic operation is in the Intel docs, it is just hard
>> to find. Usually the OS has done this for the programmer and all the code needs to do
>> is check the CPUID.
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>>
>>> Thanks And Regards
>>> Amit Kumar
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
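
For reference, the enable sequence discussed in this thread (check CPUID, set CR4 bit 18, then set XCR0 bits 0-2) can be sketched in C as plain bit manipulation on register values. This is only a minimal sketch, not code from EDK II: the helper names are hypothetical, and the CPUID leaf 1 ECX bit positions (26 for XSAVE, 28 for AVX) come from the Intel SDM rather than from the messages above. Actually reading and writing CPUID, CR4, and XCR0 requires the privileged instructions mentioned in the thread and is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits (per the Intel SDM; not stated in the thread). */
#define CPUID1_ECX_XSAVE  (1u << 26)   /* XSAVE/XGETBV/XSETBV supported */
#define CPUID1_ECX_AVX    (1u << 28)   /* AVX supported */

/* CR4 bit 18 (OSXSAVE): must be set before XGETBV/XSETBV may execute. */
#define CR4_OSXSAVE       (1ull << 18)

/* XCR0 state-component bits 0, 1, 2. */
#define XCR0_X87          (1ull << 0)
#define XCR0_SSE          (1ull << 1)
#define XCR0_AVX          (1ull << 2)

/* Hypothetical helper: true if CPUID.1:ECX reports both XSAVE and AVX. */
bool cpu_has_avx(uint32_t cpuid1_ecx)
{
    uint32_t need = CPUID1_ECX_XSAVE | CPUID1_ECX_AVX;
    return (cpuid1_ecx & need) == need;
}

/* Hypothetical helper: the CR4 value to write back, with OSXSAVE set. */
uint64_t cr4_enable_osxsave(uint64_t cr4)
{
    return cr4 | CR4_OSXSAVE;
}

/* Hypothetical helper: the XCR0 value to pass to XSETBV (ECX = 0),
   equivalent to the "or rax, 0007h" step in the assembly above. */
uint64_t xcr0_enable_avx(uint64_t xcr0)
{
    return xcr0 | XCR0_X87 | XCR0_SSE | XCR0_AVX;
}

/* Hypothetical helper: true if an XCR0 value has SSE and AVX state enabled,
   i.e. YMM registers are usable. */
bool avx_state_enabled(uint64_t xcr0)
{
    uint64_t need = XCR0_SSE | XCR0_AVX;
    return (xcr0 & need) == need;
}
```

In firmware, the whole sequence (and the AVX code that follows it) would sit between SaveAndDisableInterrupts() and SetInterruptState(), since the YMM registers are not preserved across interrupts.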