From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mx.groups.io with SMTP id smtpd.web11.17634.1668696667541889534 for ; Thu, 17 Nov 2022 06:51:07 -0800 Authentication-Results: mx.groups.io; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=Pz22Bl4k; spf=pass (domain: linux.microsoft.com, ip: 13.77.154.182, mailfrom: mikuback@linux.microsoft.com) Received: from [192.168.4.22] (unknown [47.201.8.94]) by linux.microsoft.com (Postfix) with ESMTPSA id 7150020B83C2; Thu, 17 Nov 2022 06:51:06 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 7150020B83C2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1668696667; bh=sswikFZ/4QqF4jzJOK6EiMlOvtlJ+n8xDuthekp56Zk=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=Pz22Bl4kNizMiqRb6OUJ66w+eeqiwGRapH5/Of/V+o6UTAeXfZqbUP6MT7KphgMtK IbL6mwFqHMyqNNkZ953XpXz8CeIxabdz6OJx7B3T8D4rdJoi67hsCa+hnpebCU4T8M eDgBQ2L00kBOdwBRykTv5fz93a7fvbUfonGLmkOo= Message-ID: Date: Thu, 17 Nov 2022 09:51:05 -0500 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.13.1 Subject: Re: [edk2-devel] [edk2-wiki][PATCH v1 1/1] Adds EDK II Code Scanning page To: devel@edk2.groups.io, michael.d.kinney@intel.com, "michael.kubacki@outlook.com" Cc: Sean Brogan References: From: "Michael Kubacki" In-Reply-To: Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit The wiki page is now published: https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Code-Scanning Thanks, Michael On 11/16/2022 12:24 PM, Michael D Kinney wrote: > Hi Michael, > > Thank you for putting this summary together on CodeQL. Looks like a great start to this content. > > Reviewed-by: Michael D Kinney > > Mike > >> -----Original Message----- >> From: devel@edk2.groups.io On Behalf Of Michael Kubacki >> Sent: Wednesday, November 16, 2022 7:57 AM >> To: devel@edk2.groups.io >> Cc: Sean Brogan ; Michael Kubacki ; Kinney, Michael D >> >> Subject: [edk2-devel] [edk2-wiki][PATCH v1 1/1] Adds EDK II Code Scanning page >> >> From: Michael Kubacki >> >> This page includes: >> >> 1. An explanation of how Code Scanning works in edk2 >> 2. An overview of CodeQL >> 3. Links to key CodeQL resources >> 4. Links to key CodeQL files in the edk2 repo >> 5. A CodeQL query target list for edk2 with completion status >> 6. An explanation on how query filtering is used in edk2 >> 7. The process for proposing new CodeQL queries for edk2 >> 8. Basic CodeQL CLI usage for the edk2 build process >> >> This is meant to be a starting point to document CodeQL information >> for edk2. It will be updated over time. >> >> Cc: Sean Brogan >> Cc: Michael Kubacki >> Cc: Michael D Kinney >> Signed-off-by: Michael Kubacki >> --- >> >> Notes: >> A rendered version of this markdown document is >> available on my fork: >> >> https://github.com/makubacki/tianocore.github.io/blob/add_edk2_code_scanning_page/EDK-II-Code-Scanning.md >> >> EDK-II-Code-Scanning.md | 232 ++++++++++++++++++++ >> 1 file changed, 232 insertions(+) >> >> diff --git a/EDK-II-Code-Scanning.md b/EDK-II-Code-Scanning.md >> new file mode 100644 >> index 000000000000..c54c7bef4214 >> --- /dev/null >> +++ b/EDK-II-Code-Scanning.md >> @@ -0,0 +1,232 @@ >> +# EDK II Code Scanning >> + >> +CodeQL is a code analysis engine developed by Github to automate security checks. >> + >> +It is used for Code Scanning in the TianoCore edk2 repository. >> + >> +## Table of Contents >> + >> +1. [Overview](#overview) >> +2. [CodeQL Usage in edk2](#codeql-usage-in-edk2) >> + - [Query Target List](#query-target-list) >> + - [Query Filtering in edk2](#query-filtering-in-edk2) >> + - [Process for Suggesting New Queries for edk2](#process-for-suggesting-new-queries-for-edk2) >> + - [Query Enabling Process](#query-enabling-process) >> + - [CodeQL in Pull Requests](#codeql-in-pull-requests) >> + - [Dismissing CodeQL Alerts](#dismissing-codeql-alerts) >> +3. [CodeQL CLI Local Commands](#codeql-cli-local-commands) >> +4. [The CodeQL Project](#the-codeql-project) >> + >> +## Overview >> + >> +CodeQL is open source and free for open source projects. It is maintained by GitHub and naturally has excellent >> +integration with GitHub projects. CodeQL uses a semantic code analysis engine to discover vulnerabilities in a >> +number of programming languages (both compiled and interpreted). >> + >> +[General CodeQL Information](https://codeql.github.com/) >> + >> +TianoCore uses CodeQL C/C++ queries to find common programming errors and security vulnerabilities in firmware code. >> +Many open-source queries are officially supported and comprise the vulnerability analysis performed against the >> +database. >> + >> +[CodeQL Query Repository](https://github.com/github/codeql) >> + >> +In addition, anyone can leverage the code analysis engine by writing a custom query. Information around writing a >> +custom query is available in the official documentation. >> + >> +[CodeQL Query Documentation](https://codeql.github.com/docs/writing-codeql-queries/codeql-queries/#codeql-queries). >> + >> +The [edk2](https://github.com/tianocore/edk2) repository uses GitHub's Code Scanning feature (free for public >> +repositories on GitHub.com) to show alerts directly in the repository and run CodeQL on pull requests and pushes >> +to the repository. >> + >> +[About GitHub Code Scanning](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for- >> vulnerabilities-and-errors/about-code-scanning) >> + >> +Current CodeQL scanning results in the edk2 project are available in the "Actions" page of the GitHub repository. >> + >> +[edk2 CodeQL Workflow](https://github.com/tianocore/edk2/actions) >> + >> +A CodeQL command-line interface (CLI) is also available which can be run locally. A CodeQL CLI reference and manual >> +are available in the documentation to learn how to use the CLI. >> + >> +[CodeQL CLI Documenation](https://codeql.github.com/docs/codeql-cli/) >> + >> +At a high-level, there's two main phases of CodeQL execution to be aware of. >> + >> +1. CodeQL database generation >> + - [CodeQL Database Documentation](https://codeql.github.com/docs/codeql-cli/creating-codeql-databases/) >> +2. CodeQL database analysis >> + - [CodeQL Analysis Documentation](https://codeql.github.com/docs/codeql-cli/analyzing-databases-with-the-codeql-cli/) >> + >> +The CodeQL CLI hooks into the normal firmware build process to generate a CodeQL database. Once the database is >> +generated, any number of CodeQL queries can be run against the database for analysis. >> + >> +CodeQL analysis results can be stored in the >> +[SARIF](https://sarifweb.azurewebsites.net/) (Static Analysis Results Interchange Format) file format. >> + >> +[CodeQL SARIF documentation](https://codeql.github.com/docs/codeql-cli/sarif-output/) >> + >> +SARIF files are JSON following the SARIF specification/schema. The files can be opened with SARIF viewers to more >> +conveniently view the results in the file. >> + >> +For example, the [SARIF Viewer extension for VS Code](https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif- >> viewer) >> +can open a .sarif file generated by the CodeQL CLI and allow you to click links directly to the problematic line in >> +source files. >> + >> +In summary, the edk2 repository runs CodeQL on pull requests and CI builds. Any alerts will be flagged in the pull >> +request status checks area. The queries used by the edk2 repository are stored in the edk2 CodeQL query set file. >> + >> +[edk2 CodeQL Query Set](https://github.com/tianocore/edk2/blob/master/.github/codeql/edk2.qls) >> + >> +## CodeQL Usage in edk2 >> + >> +CodeQL provides the capability to debug the actual queries and for our (TianoCore) community to write our own queries >> +and even contribute back to the upstream repo when appropriate. In other cases, we might choose to keep our own >> +queries in a separate TianoCore repo or within a directory in the edk2 code tree. >> + >> +This is all part of CodeQL Scanning. Information on the particular topic of running additional custom queries in >> +Code Scanning is documented [here](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for- >> vulnerabilities-and-errors/configuring-code-scanning#running-additional-queries) >> +in that page. >> + >> +In addition, CodeQL offers the flexibility to: >> + >> +- Build databases locally >> +- Retrieve databases from server builds >> +- Relatively quickly test queries locally against a database for a fast feedback loop >> +- Suppress false positives >> +- Customize the files and queries used in the edk2 project and quickly keep this list in sync between the server and local >> execution >> + >> +### Query Target List >> + >> +While CodeQL can scan various languages including Python and C/C++, the TianoCore project is only focused on C/C++ >> +checks at this time. TianoCore has an initial set of queries to evaluate shown below (checked boxes are done). >> + >> +- [x] [cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 457/ConditionallyUninitializedVariable.ql) >> +- [x] [cpp/infinite-loop-with-unsatisfiable-exit- >> condition](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 835/InfiniteLoopWithUnsatisfiableExitCondition.ql) >> +- [x] [cpp/overflow-buffer](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-119/OverflowBuffer.ql) >> +- [x] [cpp/pointer-overflow- >> check](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/PointerOverflow.ql) >> +- [x] [cpp/potential-buffer- >> overflow](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/PotentialBufferOverflow.ql) >> +- [ ] [cpp/toctou-race-condition](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 367/TOCTOUFilesystemRace.ql) >> +- [ ] [cpp/unclear-array-index-validation](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 129/ImproperArrayIndexValidation.ql) >> +- [ ] [cpp/unsafe- >> strncat](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/SuspiciousCallToStrncat.ql) >> +- [ ] [cpp/use-after-free](https://github.com/github/codeql/blob/main/cpp/ql/src/Critical/UseAfterFree.ql) >> +- [ ] [cpp/user-controlled-null-termination-tainted](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 170/ImproperNullTerminationTainted.ql) >> +- [ ] [cpp/wrong-number-format- >> arguments](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Format/WrongNumberOfFormatArguments.ql) >> +- [ ] [cpp/wrong-type-format- >> argument](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Format/WrongTypeFormatArguments.ql) >> + >> +Additional queries completed: >> + >> +- [x] [cpp/overrunning-write](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-120/OverrunWrite.ql) >> +- [x] [cpp/overrunning-write-with-float](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 120/OverrunWriteFloat.ql) >> +- [x] [cpp/very-likely-overrunning-write](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 120/VeryLikelyOverrunWrite.ql) >> + >> +### Query Filtering in edk2 >> + >> +CodeQL query files (`.ql` files) contain metadata about the query. For example, >> +[cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 457/ConditionallyUninitializedVariable.ql) >> +states the following about the query: >> + >> +```plaintext >> +/** >> + * @name Conditionally uninitialized variable >> + * @description An initialization function is used to initialize a local variable, but the >> + * returned status code is not checked. The variable may be left in an uninitialized >> + * state, and reading the variable may result in undefined behavior. >> + * @kind problem >> + * @problem.severity warning >> + * @security-severity 7.8 >> + * @id cpp/conditionally-uninitialized-variable >> + * @tags security >> + * external/cwe/cwe-457 >> + */ >> +``` >> + >> +edk2 automatically include queries against certain criteria using "query filters". For example, this could include any >> +`problem` query above a certain `security-severity` level. Or all queries with `security` in `tags`. >> + >> +Because edk2 favors consistency in CI results, the project maintains a relatively fixed query set that is updated with >> +individual queries over time. >> + >> +- [edk2 Active Queries](https://github.com/tianocore/edk2/blob/master/.github/codeql/edk2.qls) >> +- [edk2 Active CodeQL Configuration](https://github.com/tianocore/edk2/blob/master/.github/codeql/codeql-config.yml) >> + >> +> _Note:_ Additional queries can be found here as well - https://lgtm.com/search?q=cpp&t=rules >> + >> +### Process for Suggesting New Queries for edk2 >> + >> +New query adoption in edk2 can be proposed by sending an RFC to the TianoCore development mailing list >> +(devel@edk2.groups.io) with the query link and justification for adopting the query in edk2. >> + >> +Everyone is welcome to suggest new queries. >> + >> +### Query Enabling Process >> + >> +Enabling a new query may trigger zero to thousands of alerts. Therefore, two paths are used to enable a new query in >> +the project. >> + >> +1. A single patch series - The first set of patches fixes the issues needed for the query to pass. The later set of >> + patches enables the query. >> +2. A query enabling branch - A branch is created where multiple contributors can work together on fixing issues related >> + to enabling a new query. Once the branch is ready, the history is cleaned up into a patch series that is submitted >> + to the edk2 project. >> + >> +(1) is recommended if the query is relatively simple to enable and one or two people are doing the work. (2) is >> +recommended if a lot of effort is needed to fix issues for the query especially issues spanning across packages. >> + >> +If a query is deemed fruitless during enabling testing, it can simply be rejected. The goal for CodeQL in edk2 is to >> +enable an effective set of queries that improve the codebase. As the list of enabled queries grows, total CodeQL >> +coverage will increase against active pull requests. We want to have relevant and effective coverage. >> + >> +### CodeQL in Pull Requests >> + >> +TianoCore is enabling CodeQL in a step-by-step fashion. The goal with this approach is to make steady progress >> +enabling CodeQL to become more comprehensive and useful while not impacting day-to-day code contributions. >> + >> +Throughout the process described in this section, CodeQL Code Scanning is be a mandatory status check for edk2 >> +pull requests. >> + >> +### Dismissing CodeQL Alerts >> + >> +The following documentation describes how to dismiss alerts: >> +[Dismissing Alerts](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for- >> vulnerabilities-and-errors/managing-code-scanning-alerts-for-your-repository#dismissing--alerts) >> + >> +> _Note:_ If query has a false positive a GitHub Issue can be submitted in the >> +> [CodeQL repo issues page](https://github.com/github/codeql/issues) with the `false-positive` tag to help improve >> +> the query. >> + >> +## CodeQL CLI Local Commands >> + >> +The [CodeQL CLI](https://codeql.github.com/docs/codeql-cli/) can be used as follows to wrap around the edk2 build >> +process (`MdeModulePkg` in this case) to generate a database in the directory `cpp-database`. The example shown uses >> +[stuart](https://github.com/tianocore/edk2-pytool-extensions) build commands. >> + >> +```cmd >> +codeql database create cpp-database --language=cpp --command="stuart_ci_build -c .pytool/CISettings.py -p MdeModulePkg >> +-a IA32,X64 TOOL_CHAIN_TAG=VS2019 Target=DEBUG --clean" --overwrite >> +``` >> + >> +The following command can be used to generate a [SARIF file](https://codeql.github.com/docs/codeql-cli/sarif-output/) >> +(called `query-results.sarif`) from that database with the results of the >> +[cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE- >> 457/ConditionallyUninitializedVariable.ql) query: >> + >> +```cmd >> +codeql database analyze cpp-database codeql\cpp\ql\src\Security\CWE\CWE-457\ConditionallyUninitializedVariable.ql -- >> format=sarifv2.1.0 --output=query-results.sarif >> +``` >> + >> +SARIF logs can be read by log viewers such as the [Sarif Viewer](https://marketplace.visualstudio.com/items?itemName=MS- >> SarifVSCode.sarif-viewer) extension for [VS Code](https://code.visualstudio.com/). >> + >> +## The CodeQL Project >> + >> +CodeQL is an actively maintained project. Here is a comparison of edk2 commit activity versus CodeQL for reference: >> + >> +- [CodeQL Commit Activity](https://github.com/github/codeql/graphs/commit-activity) >> +- [edk2 Commit Activity](https://github.com/github/codeql/graphs/commit-activity) >> + >> +Because CodeQL does maintain a strong open-source presence, the TianoCore community should be able to file >> +[issues](https://github.com/github/codeql/issues) and [pull requests](https://github.com/github/codeql/pulls) >> +into the project. >> + >> +--- >> + >> +> The original RFC for adoption of CodeQL in edk2 is available here for reference: >> +> [Adoption of CodeQL in edk2](https://github.com/tianocore/edk2/discussions/3258#discussioncomment-3682099) >> -- >> 2.28.0.windows.1 >> >> >> >> >> > > > > > >