From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ard Biesheuvel
Date: Mon, 1 Aug 2016 13:53:09 +0200
To: edk2-devel-01, "Gao, Liming", "Zhu, Yonghong"
Cc: Leif Lindholm, "Cohen, Eugene", Ard Biesheuvel
Subject: Re: [PATCH 1/2] BaseTools/GenFw AARCH64: convert ADRP to ADR if binary size allows it
In-Reply-To: <1469618762-7648-1-git-send-email-ard.biesheuvel@linaro.org>
References: <1469618762-7648-1-git-send-email-ard.biesheuvel@linaro.org>
List-Id: EDK II Development
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8

On 27 July 2016 at 13:26, Ard Biesheuvel wrote:
> The ADRP instruction in the AArch64 ISA requires the link time and load
> time offsets of a binary to be equal modulo 4 KB. The reason is that this
> instruction always produces a multiple of 4 KB, and relies on a subsequent
> ADD or LDR instruction to set the offset into the page. The resulting
> symbol reference only produces the correct value if the symbol in question
> resides at that exact offset into the page, and so loading the binary at
> arbitrary offsets is not possible.
>
> Due to the various levels of padding when packing FVs into FVs into FDs,
> this alignment is very costly for XIP code, and so we would like to relax
> this alignment requirement if possible.
>
> Given that symbols that are sufficiently close (within 1 MB) of the
> reference can also be reached using an ADR instruction which does not
> suffer from this alignment issue, let's replace ADRP instructions with ADR
> after linking if the offset can be encoded in this instruction's immediate
> field. Note that this only makes sense if the section alignment is < 4 KB.
> Otherwise, replacing the ADRP has no benefit, considering that the
> subsequent ADD or LDR instruction is retained, and that micro-architectures
> are more likely to be optimized for ADRP/ADD pairs (i.e., via micro op
> fusing) than for ADR/ADD pairs, which are non-typical.
>
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Ard Biesheuvel

@Liming, @Leif: are there any objections to these patches?
I know it is unfortunate that we need to modify instructions as part of
the ELF to PE/COFF conversion, but it is very effective.

ArmVirtQemu-AARCH64 built with CLANG35:

Before:
FVMAIN_COMPACT [41%Full] 2093056 total, 868416 used, 1224640 free
FVMAIN [99%Full] 4848064 total, 4848008 used, 56 free

After:
FVMAIN_COMPACT [36%Full] 2093056 total, 768064 used, 1324992 free
FVMAIN [99%Full] 4848064 total, 4848008 used, 56 free

For comparison, GCC49
FVMAIN_COMPACT [35%Full] 2093056 total, 749960 used, 1343096 free
FVMAIN [99%Full] 3929088 total, 3929032 used, 56 free

and GCC5 (with LTO)
FVMAIN_COMPACT [34%Full] 2093056 total, 732400 used, 1360656 free
FVMAIN [99%Full] 3730240 total, 3730216 used, 24 free

In other words, it turns CLANG35 from a pathetic outlier into something
usable :-)

Regards,
Ard.
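
For readers following the thread, the ADRP to ADR rewrite described in the
quoted commit message can be sketched roughly as below. This is only an
illustration in Python, not the GenFw code from the patch: the function
names and the `pc` parameter (the instruction's link-time offset) are made
up for the example. It assumes the standard A64 encoding, where ADRP and ADR
share a 21-bit immediate (immlo in bits 30:29, immhi in bits 23:5) and
differ only in bit 31. The rewritten ADR points at the start of the page
holding the symbol, so the following ADD or LDR can stay unchanged.

```python
ADR_RANGE = 1 << 20  # ADR reaches +/- 1 MB around the instruction


def is_adrp(insn):
    # ADRP: bit 31 set, bits 28:24 == 0b10000
    return (insn & 0x9F000000) == 0x90000000


def sign_extend(value, bits):
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode_imm21(insn):
    # Immediate shared by ADR and ADRP: immhi (bits 23:5) : immlo (bits 30:29)
    immlo = (insn >> 29) & 0x3
    immhi = (insn >> 5) & 0x7FFFF
    return sign_extend((immhi << 2) | immlo, 21)


def encode_adr(rd, offset):
    # ADR: bit 31 clear, same immediate layout, byte granularity
    imm = offset & 0x1FFFFF
    return 0x10000000 | ((imm & 0x3) << 29) | ((imm >> 2) << 5) | rd


def adrp_to_adr(insn, pc):
    """Return the equivalent ADR encoding, or None if no rewrite is possible.

    pc is the (hypothetical) link-time offset of the instruction.
    """
    if not is_adrp(insn):
        return None
    # ADRP counts pages from the page containing the instruction ...
    page_delta = decode_imm21(insn) << 12
    # ... whereas ADR counts bytes from the instruction itself.
    offset = page_delta - (pc & 0xFFF)
    if not -ADR_RANGE <= offset < ADR_RANGE:
        return None  # too far away: the module needs 4 KB alignment instead
    return encode_adr(insn & 0x1F, offset)
```

When the function returns None because the target is out of range, the only
remaining option in this sketch is to keep the ADRP and retain 4 KB section
alignment for that module.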