From: Amit kumar
To: Andrew Fish
CC: Mike Kinney, "edk2-devel@lists.01.org"
Date: Thu, 4 May 2017 12:18:11 +0000
References: <0E40AA0F-3FDD-420D-9982-43FB8E0DE81A@apple.com>
Subject: Re: Accessing AVX/AVX2 instruction in UEFI.
List-Id: EDK II Development

Yes, I am aligning the data on a 32-byte boundary when allocating memory in both environments.

In Windows, using:

_aligned_malloc(size,32);

In UEFI:

Offset = (UINTN)src & 0xFF;
src = (CHAR8 *)((UINTN) src - Offset + 0x20);

Thanks
Amit

________________________________
From: afish@apple.com on behalf of Andrew Fish
Sent: Thursday, May 4, 2017 5:02:55 PM
To: Amit kumar
Cc: Mike Kinney; edk2-devel@lists.01.org
Subject: Re: [edk2] Accessing AVX/AVX2 instruction in UEFI.

> On May 4, 2017, at 4:13 AM, Amit kumar wrote:
>
> Hi,
>
>
> Even after using AVX2 instructions, my code showed no performance improvement in UEFI, although there is a substantial improvement when I run similar code in Windows.
>
> Am I missing something?
>

Is the data aligned the same in both environments?

Thanks,

Andrew Fish

> Using MSVC compiler and the codes written in ASM.
>
> Thanks And Regards
>
> Amit
>
> ________________________________
> From: edk2-devel on behalf of Amit kumar
> Sent: Wednesday, May 3, 2017 11:18:39 AM
> To: Kinney, Michael D; Andrew Fish
> Cc: edk2-devel@lists.01.org
> Subject: Re: [edk2] Accessing AVX/AVX2 instruction in UEFI.
>
> Thank you Michael and Andrew
>
>
> Regards
>
> Amit
>
> ________________________________
> From: Kinney, Michael D
> Sent: Tuesday, May 2, 2017 10:33:45 PM
> To: Andrew Fish; Amit kumar; Kinney, Michael D
> Cc: edk2-devel@lists.01.org
> Subject: RE: [edk2] Accessing AVX/AVX2 instruction in UEFI.
>
> Amit,
>
> The information from Andrew is correct.
>
> The document that covers this topic is the
> Intel(r) 64 and IA-32 Architectures Software Developer Manuals
>
> https://software.intel.com/en-us/articles/intel-sdm
>
> Volume 1, Section 13.5.3 describes the AVX State. There are
> more details about detecting and enabling different AVX features
> in that document.
>
> If the CPU supports AVX, then the basic assembly instructions
> required to use AVX instructions are the following that sets
> bits 0, 1, 2 of XCR0.
>
>   mov     rcx, 0
>   xgetbv
>   or      rax, 0007h
>   xsetbv
>
> One additional item you need to be aware of is that UEFI firmware only
> saves/restores CPU registers required for the UEFI ABI calling convention
> when a timer interrupt or exception is processed.
>
> This means CPU state such as the YMM registers are not saved/restored
> across an interrupt and may be modified if code in interrupt context
> also uses YMM registers.
>
> When you enable the use of extended registers, interrupts should be
> saved/disabled and restored around the extended register usage.
>
> You can use the following functions from MdePkg BaseLib to do this
>
> /**
>   Disables CPU interrupts and returns the interrupt state prior to the disable
>   operation.
>
>   @retval TRUE  CPU interrupts were enabled on entry to this call.
>   @retval FALSE CPU interrupts were disabled on entry to this call.
>
> **/
> BOOLEAN
> EFIAPI
> SaveAndDisableInterrupts (
>   VOID
>   );
>
> /**
>   Set the current CPU interrupt state.
>
>   Sets the current CPU interrupt state to the state specified by
>   InterruptState. If InterruptState is TRUE, then interrupts are enabled. If
>   InterruptState is FALSE, then interrupts are disabled. InterruptState is
>   returned.
>
>   @param InterruptState TRUE if interrupts should be enabled. FALSE if
>                         interrupts should be disabled.
>
>   @return InterruptState
>
> **/
> BOOLEAN
> EFIAPI
> SetInterruptState (
>   IN BOOLEAN InterruptState
>   );
>
> Algorithm:
> ============
> {
>   BOOLEAN InterruptState;
>
>   InterruptState = SaveAndDisableInterrupts();
>
>   // Enable use of AVX/AVX2 instructions
>
>   // Use AVX/AVX2 instructions
>
>   SetInterruptState (InterruptState);
> }
>
> Best regards,
>
> Mike
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Andrew Fish
>> Sent: Tuesday, May 2, 2017 8:12 AM
>> To: Amit kumar
>> Cc: edk2-devel@lists.01.org
>> Subject: Re: [edk2] Accessing AVX/AVX2 instruction in UEFI.
>>
>>
>>> On May 2, 2017, at 6:57 AM, Amit kumar wrote:
>>>
>>> Hi,
>>>
>>> I am trying to optimize an application using AVX/AVX2, but my code hangs while trying to access YMM registers.
>>> The instruction where my code hangs is:
>>>
>>>
>>> vmovups ymm0, YMMWORD PTR [rax]
>>>
>>>
>>> I have verified the CPUID in the OS and it supports AVX and AVX2 instructions. Processor: i7 6th gen.
>>> Can somebody help me out here? Is there a way to enable YMM registers?
>>>
>>
>> Amit,
>>
>> I think these instructions will generate an illegal instruction fault until you enable
>> AVX. You need to check the Cpu ID bits in your code, then write BIT18 of CR4. After
>> that XGETBV/XSETBV instructions are enabled and you can or in the lower 2 bits of
>> XCR0. This basic operation is in the Intel Docs, it is just hard to find.
>> Usually the OS has done this for the programmer and all the code needs to do is check the CPU ID.
>>
>> Thanks,
>>
>> Andrew Fish
>>
>>>
>>> Thanks And Regards
>>> Amit Kumar
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
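
On the alignment question: the UEFI snippet at the top of this message masks with 0xFF and then adds 0x20. That does give a 32-byte-aligned address, but when the low byte of src is above 0x20 the result lies below the original pointer, i.e. before the start of the buffer. A minimal sketch of the usual round-up approach follows, in plain C so it can be tried outside firmware. The name align_up is hypothetical and not an EDK II API, and the caller is assumed to have over-allocated by (align - 1) bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of EDK II): round ptr up to the next
 * multiple of align. align must be a non-zero power of two. The result is
 * always >= ptr, so it stays inside a buffer that was over-allocated by
 * (align - 1) bytes.
 */
static void *align_up(void *ptr, size_t align)
{
    uintptr_t addr = (uintptr_t)ptr;
    uintptr_t mask = (uintptr_t)align - 1;

    /* Reject zero and non-power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Add (align - 1), then clear the low bits. */
    return (void *)((addr + mask) & ~mask);
}
```

For 32-byte YMM loads this would be used as align_up(buffer, 32) on a buffer allocated with size + 31 bytes, keeping the original pointer for the later free. EDK II's MdePkg Base.h is understood to offer an ALIGN_POINTER macro for the same rounding; check your tree before relying on it.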
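
The enabling steps quoted in this thread (check CPUID, set bit 18 of CR4, then set bits 0, 1 and 2 of XCR0) come down to a few bit operations. The sketch below shows only that bit arithmetic in plain C. The function and macro names are hypothetical, and the privileged register accesses themselves (CPUID, CR4 read/write, XGETBV/XSETBV) are deliberately left out, so this is an illustration of which bits are involved and not a drop-in firmware routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits. */
#define CPUID1_ECX_XSAVE  (1u << 26)
#define CPUID1_ECX_AVX    (1u << 28)
/* CPUID leaf 7 (sub-leaf 0), EBX feature bit. */
#define CPUID7_EBX_AVX2   (1u << 5)
/* CR4.OSXSAVE: enables XGETBV/XSETBV. */
#define CR4_OSXSAVE       (1ull << 18)
/* XCR0 state-component bits. */
#define XCR0_X87          (1ull << 0)
#define XCR0_SSE          (1ull << 1)
#define XCR0_AVX          (1ull << 2)

/* True if the CPU reports both XSAVE and AVX in CPUID.1:ECX. */
static bool cpu_has_avx(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_XSAVE) != 0 &&
           (cpuid1_ecx & CPUID1_ECX_AVX) != 0;
}

/* True if CPUID.(EAX=7,ECX=0):EBX reports AVX2. */
static bool cpu_has_avx2(uint32_t cpuid7_ebx)
{
    return (cpuid7_ebx & CPUID7_EBX_AVX2) != 0;
}

/* Value to write back to CR4 so that XGETBV/XSETBV become usable. */
static uint64_t cr4_with_osxsave(uint64_t cr4)
{
    return cr4 | CR4_OSXSAVE;
}

/* Value to write back to XCR0: x87, SSE and AVX state enabled (bits 0..2). */
static uint64_t xcr0_with_avx(uint64_t xcr0)
{
    return xcr0 | XCR0_X87 | XCR0_SSE | XCR0_AVX;
}
```

The XCR0 helper corresponds to the `or rax, 0007h` step between xgetbv and xsetbv in the assembly quoted above. The SSE bit (bit 1) is set together with the AVX bit (bit 2) because the processor does not accept an XCR0 value with AVX enabled and SSE disabled.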