public inbox for devel@edk2.groups.io
 help / color / mirror / Atom feed
From: "Michael Kubacki" <mikuback@linux.microsoft.com>
To: devel@edk2.groups.io, michael.d.kinney@intel.com,
	"michael.kubacki@outlook.com" <michael.kubacki@outlook.com>
Cc: Sean Brogan <sean.brogan@microsoft.com>
Subject: Re: [edk2-devel] [edk2-wiki][PATCH v1 1/1] Adds EDK II Code Scanning page
Date: Thu, 17 Nov 2022 09:51:05 -0500	[thread overview]
Message-ID: <b95d97b2-4bee-c819-15fc-101a42a23a67@linux.microsoft.com> (raw)
In-Reply-To: <CO1PR11MB4929D0464D9AFC1D65CD24D8D2079@CO1PR11MB4929.namprd11.prod.outlook.com>

The wiki page is now published: 
https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Code-Scanning

Thanks,
Michael

On 11/16/2022 12:24 PM, Michael D Kinney wrote:
> Hi Michael,
> 
> Thank you for putting this summary together on CodeQL.  Looks like a great start to this content.
> 
> Reviewed-by: Michael D Kinney <michael.d.kinney@intel.com>
> 
> Mike
> 
>> -----Original Message-----
>> From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Michael Kubacki
>> Sent: Wednesday, November 16, 2022 7:57 AM
>> To: devel@edk2.groups.io
>> Cc: Sean Brogan <sean.brogan@microsoft.com>; Michael Kubacki <mikuback@linux.microsoft.com>; Kinney, Michael D
>> <michael.d.kinney@intel.com>
>> Subject: [edk2-devel] [edk2-wiki][PATCH v1 1/1] Adds EDK II Code Scanning page
>>
>> From: Michael Kubacki <michael.kubacki@microsoft.com>
>>
>> This page includes:
>>
>> 1. An explanation of how Code Scanning works in edk2
>> 2. An overview of CodeQL
>> 3. Links to key CodeQL resources
>> 4. Links to key CodeQL files in the edk2 repo
>> 5. A CodeQL query target list for edk2 with completion status
>> 6. An explanation on how query filtering is used in edk2
>> 7. The process for proposing new CodeQL queries for edk2
>> 8. Basic CodeQL CLI usage for the edk2 build process
>>
>> This is meant to be a starting point to document CodeQL information
>> for edk2. It will be updated over time.
>>
>> Cc: Sean Brogan <sean.brogan@microsoft.com>
>> Cc: Michael Kubacki <mikuback@linux.microsoft.com>
>> Cc: Michael D Kinney <michael.d.kinney@intel.com>
>> Signed-off-by: Michael Kubacki <michael.kubacki@microsoft.com>
>> ---
>>
>> Notes:
>>      A rendered version of this markdown document is
>>      available on my fork:
>>
>>      https://github.com/makubacki/tianocore.github.io/blob/add_edk2_code_scanning_page/EDK-II-Code-Scanning.md
>>
>>   EDK-II-Code-Scanning.md | 232 ++++++++++++++++++++
>>   1 file changed, 232 insertions(+)
>>
>> diff --git a/EDK-II-Code-Scanning.md b/EDK-II-Code-Scanning.md
>> new file mode 100644
>> index 000000000000..c54c7bef4214
>> --- /dev/null
>> +++ b/EDK-II-Code-Scanning.md
>> @@ -0,0 +1,232 @@
>> +# EDK II Code Scanning
>> +
>> +CodeQL is a code analysis engine developed by Github to automate security checks.
>> +
>> +It is used for Code Scanning in the TianoCore edk2 repository.
>> +
>> +## Table of Contents
>> +
>> +1. [Overview](#overview)
>> +2. [CodeQL Usage in edk2](#codeql-usage-in-edk2)
>> +   - [Query Target List](#query-target-list)
>> +   - [Query Filtering in edk2](#query-filtering-in-edk2)
>> +   - [Process for Suggesting New Queries for edk2](#process-for-suggesting-new-queries-for-edk2)
>> +   - [Query Enabling Process](#query-enabling-process)
>> +   - [CodeQL in Pull Requests](#codeql-in-pull-requests)
>> +   - [Dismissing CodeQL Alerts](#dismissing-codeql-alerts)
>> +3. [CodeQL CLI Local Commands](#codeql-cli-local-commands)
>> +4. [The CodeQL Project](#the-codeql-project)
>> +
>> +## Overview
>> +
>> +CodeQL is open source and free for open source projects. It is maintained by GitHub and naturally has excellent
>> +integration with GitHub projects. CodeQL uses a semantic code analysis engine to discover vulnerabilities in a
>> +number of programming languages (both compiled and interpreted).
>> +
>> +[General CodeQL Information](https://codeql.github.com/)
>> +
>> +TianoCore uses CodeQL C/C++ queries to find common programming errors and security vulnerabilities in firmware code.
>> +Many open-source queries are officially supported and comprise the vulnerability analysis performed against the
>> +database.
>> +
>> +[CodeQL Query Repository](https://github.com/github/codeql)
>> +
>> +In addition, anyone can leverage the code analysis engine by writing a custom query. Information around writing a
>> +custom query is available in the official documentation.
>> +
>> +[CodeQL Query Documentation](https://codeql.github.com/docs/writing-codeql-queries/codeql-queries/#codeql-queries).
>> +
>> +The [edk2](https://github.com/tianocore/edk2) repository uses GitHub's Code Scanning feature (free for public
>> +repositories on GitHub.com) to show alerts directly in the repository and run CodeQL on pull requests and pushes
>> +to the repository.
>> +
>> +[About GitHub Code Scanning](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-
>> vulnerabilities-and-errors/about-code-scanning)
>> +
>> +Current CodeQL scanning results in the edk2 project are available in the "Actions" page of the GitHub repository.
>> +
>> +[edk2 CodeQL Workflow](https://github.com/tianocore/edk2/actions)
>> +
>> +A CodeQL command-line interface (CLI) is also available which can be run locally. A CodeQL CLI reference and manual
>> +are available in the documentation to learn how to use the CLI.
>> +
>> +[CodeQL CLI Documenation](https://codeql.github.com/docs/codeql-cli/)
>> +
>> +At a high-level, there's two main phases of CodeQL execution to be aware of.
>> +
>> +1. CodeQL database generation
>> +   - [CodeQL Database Documentation](https://codeql.github.com/docs/codeql-cli/creating-codeql-databases/)
>> +2. CodeQL database analysis
>> +   - [CodeQL Analysis Documentation](https://codeql.github.com/docs/codeql-cli/analyzing-databases-with-the-codeql-cli/)
>> +
>> +The CodeQL CLI hooks into the normal firmware build process to generate a CodeQL database. Once the database is
>> +generated, any number of CodeQL queries can be run against the database for analysis.
>> +
>> +CodeQL analysis results can be stored in the
>> +[SARIF](https://sarifweb.azurewebsites.net/) (Static Analysis Results Interchange Format) file format.
>> +
>> +[CodeQL SARIF documentation](https://codeql.github.com/docs/codeql-cli/sarif-output/)
>> +
>> +SARIF files are JSON following the SARIF specification/schema. The files can be opened with SARIF viewers to more
>> +conveniently view the results in the file.
>> +
>> +For example, the [SARIF Viewer extension for VS Code](https://marketplace.visualstudio.com/items?itemName=MS-SarifVSCode.sarif-
>> viewer)
>> +can open a .sarif file generated by the CodeQL CLI and allow you to click links directly to the problematic line in
>> +source files.
>> +
>> +In summary, the edk2 repository runs CodeQL on pull requests and CI builds. Any alerts will be flagged in the pull
>> +request status checks area. The queries used by the edk2 repository are stored in the edk2 CodeQL query set file.
>> +
>> +[edk2 CodeQL Query Set](https://github.com/tianocore/edk2/blob/master/.github/codeql/edk2.qls)
>> +
>> +## CodeQL Usage in edk2
>> +
>> +CodeQL provides the capability to debug the actual queries and for our (TianoCore) community to write our own queries
>> +and even contribute back to the upstream repo when appropriate. In other cases, we might choose to keep our own
>> +queries in a separate TianoCore repo or within a directory in the edk2 code tree.
>> +
>> +This is all part of CodeQL Scanning. Information on the particular topic of running additional custom queries in
>> +Code Scanning is documented [here](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-
>> vulnerabilities-and-errors/configuring-code-scanning#running-additional-queries)
>> +in that page.
>> +
>> +In addition, CodeQL offers the flexibility to:
>> +
>> +- Build databases locally
>> +- Retrieve databases from server builds
>> +- Relatively quickly test queries locally against a database for a fast feedback loop
>> +- Suppress false positives
>> +- Customize the files and queries used in the edk2 project and quickly keep this list in sync between the server and local
>> execution
>> +
>> +### Query Target List
>> +
>> +While CodeQL can scan various languages including Python and C/C++, the TianoCore project is only focused on C/C++
>> +checks at this time. TianoCore has an initial set of queries to evaluate shown below (checked boxes are done).
>> +
>> +- [x] [cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 457/ConditionallyUninitializedVariable.ql)
>> +- [x] [cpp/infinite-loop-with-unsatisfiable-exit-
>> condition](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 835/InfiniteLoopWithUnsatisfiableExitCondition.ql)
>> +- [x] [cpp/overflow-buffer](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-119/OverflowBuffer.ql)
>> +- [x] [cpp/pointer-overflow-
>> check](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/PointerOverflow.ql)
>> +- [x] [cpp/potential-buffer-
>> overflow](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/PotentialBufferOverflow.ql)
>> +- [ ] [cpp/toctou-race-condition](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 367/TOCTOUFilesystemRace.ql)
>> +- [ ] [cpp/unclear-array-index-validation](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 129/ImproperArrayIndexValidation.ql)
>> +- [ ] [cpp/unsafe-
>> strncat](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Memory%20Management/SuspiciousCallToStrncat.ql)
>> +- [ ] [cpp/use-after-free](https://github.com/github/codeql/blob/main/cpp/ql/src/Critical/UseAfterFree.ql)
>> +- [ ] [cpp/user-controlled-null-termination-tainted](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 170/ImproperNullTerminationTainted.ql)
>> +- [ ] [cpp/wrong-number-format-
>> arguments](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Format/WrongNumberOfFormatArguments.ql)
>> +- [ ] [cpp/wrong-type-format-
>> argument](https://github.com/github/codeql/blob/main/cpp/ql/src/Likely%20Bugs/Format/WrongTypeFormatArguments.ql)
>> +
>> +Additional queries completed:
>> +
>> +- [x] [cpp/overrunning-write](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-120/OverrunWrite.ql)
>> +- [x] [cpp/overrunning-write-with-float](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 120/OverrunWriteFloat.ql)
>> +- [x] [cpp/very-likely-overrunning-write](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 120/VeryLikelyOverrunWrite.ql)
>> +
>> +### Query Filtering in edk2
>> +
>> +CodeQL query files (`.ql` files) contain metadata about the query. For example,
>> +[cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 457/ConditionallyUninitializedVariable.ql)
>> +states the following about the query:
>> +
>> +```plaintext
>> +/**
>> + * @name Conditionally uninitialized variable
>> + * @description An initialization function is used to initialize a local variable, but the
>> + *              returned status code is not checked. The variable may be left in an uninitialized
>> + *              state, and reading the variable may result in undefined behavior.
>> + * @kind problem
>> + * @problem.severity warning
>> + * @security-severity 7.8
>> + * @id cpp/conditionally-uninitialized-variable
>> + * @tags security
>> + *       external/cwe/cwe-457
>> + */
>> +```
>> +
>> +edk2 automatically include queries against certain criteria using "query filters". For example, this could include any
>> +`problem` query above a certain `security-severity` level. Or all queries with `security` in `tags`.
>> +
>> +Because edk2 favors consistency in CI results, the project maintains a relatively fixed query set that is updated with
>> +individual queries over time.
>> +
>> +- [edk2 Active Queries](https://github.com/tianocore/edk2/blob/master/.github/codeql/edk2.qls)
>> +- [edk2 Active  CodeQL Configuration](https://github.com/tianocore/edk2/blob/master/.github/codeql/codeql-config.yml)
>> +
>> +> _Note:_ Additional queries can be found here as well - https://lgtm.com/search?q=cpp&t=rules
>> +
>> +### Process for Suggesting New Queries for edk2
>> +
>> +New query adoption in edk2 can be proposed by sending an RFC to the TianoCore development mailing list
>> +(devel@edk2.groups.io) with the query link and justification for adopting the query in edk2.
>> +
>> +Everyone is welcome to suggest new queries.
>> +
>> +### Query Enabling Process
>> +
>> +Enabling a new query may trigger zero to thousands of alerts. Therefore, two paths are used to enable a new query in
>> +the project.
>> +
>> +1. A single patch series - The first set of patches fixes the issues needed for the query to pass. The later set of
>> +   patches enables the query.
>> +2. A query enabling branch - A branch is created where multiple contributors can work together on fixing issues related
>> +   to enabling a new query. Once the branch is ready, the history is cleaned up into a patch series that is submitted
>> +   to the edk2 project.
>> +
>> +(1) is recommended if the query is relatively simple to enable and one or two people are doing the work. (2) is
>> +recommended if a lot of effort is needed to fix issues for the query especially issues spanning across packages.
>> +
>> +If a query is deemed fruitless during enabling testing, it can simply be rejected. The goal for CodeQL in edk2 is to
>> +enable an effective set of queries that improve the codebase. As the list of enabled queries grows, total CodeQL
>> +coverage will increase against active pull requests. We want to have relevant and effective coverage.
>> +
>> +### CodeQL in Pull Requests
>> +
>> +TianoCore is enabling CodeQL in a step-by-step fashion. The goal with this approach is to make steady progress
>> +enabling CodeQL to become more comprehensive and useful while not impacting day-to-day code contributions.
>> +
>> +Throughout the process described in this section, CodeQL Code Scanning is be a mandatory status check for edk2
>> +pull requests.
>> +
>> +### Dismissing CodeQL Alerts
>> +
>> +The following documentation describes how to dismiss alerts:
>> +[Dismissing Alerts](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-
>> vulnerabilities-and-errors/managing-code-scanning-alerts-for-your-repository#dismissing--alerts)
>> +
>> +> _Note:_ If query has a false positive a GitHub Issue can be submitted in the
>> +> [CodeQL repo issues page](https://github.com/github/codeql/issues) with the `false-positive` tag to help improve
>> +> the query.
>> +
>> +## CodeQL CLI Local Commands
>> +
>> +The [CodeQL CLI](https://codeql.github.com/docs/codeql-cli/) can be used as follows to wrap around the edk2 build
>> +process (`MdeModulePkg` in this case) to generate a database in the directory `cpp-database`. The example shown uses
>> +[stuart](https://github.com/tianocore/edk2-pytool-extensions) build commands.
>> +
>> +```cmd
>> +codeql database create cpp-database --language=cpp --command="stuart_ci_build -c .pytool/CISettings.py -p MdeModulePkg
>> +-a IA32,X64 TOOL_CHAIN_TAG=VS2019 Target=DEBUG --clean" --overwrite
>> +```
>> +
>> +The following command can be used to generate a [SARIF file](https://codeql.github.com/docs/codeql-cli/sarif-output/)
>> +(called `query-results.sarif`) from that database with the results of the
>> +[cpp/conditionally-uninitialized-variable](https://github.com/github/codeql/blob/main/cpp/ql/src/Security/CWE/CWE-
>> 457/ConditionallyUninitializedVariable.ql) query:
>> +
>> +```cmd
>> +codeql database analyze cpp-database codeql\cpp\ql\src\Security\CWE\CWE-457\ConditionallyUninitializedVariable.ql --
>> format=sarifv2.1.0 --output=query-results.sarif
>> +```
>> +
>> +SARIF logs can be read by log viewers such as the [Sarif Viewer](https://marketplace.visualstudio.com/items?itemName=MS-
>> SarifVSCode.sarif-viewer) extension for [VS Code](https://code.visualstudio.com/).
>> +
>> +## The CodeQL Project
>> +
>> +CodeQL is an actively maintained project. Here is a comparison of edk2 commit activity versus CodeQL for reference:
>> +
>> +- [CodeQL Commit Activity](https://github.com/github/codeql/graphs/commit-activity)
>> +- [edk2 Commit Activity](https://github.com/github/codeql/graphs/commit-activity)
>> +
>> +Because CodeQL does maintain a strong open-source presence, the TianoCore community should be able to file
>> +[issues](https://github.com/github/codeql/issues) and [pull requests](https://github.com/github/codeql/pulls)
>> +into the project.
>> +
>> +---
>> +
>> +> The original RFC for adoption of CodeQL in edk2 is available here for reference:
>> +> [Adoption of CodeQL in edk2](https://github.com/tianocore/edk2/discussions/3258#discussioncomment-3682099)
>> --
>> 2.28.0.windows.1
>>
>>
>>
>>
>>
> 
> 
> 
> 
> 
> 

      reply	other threads:[~2022-11-17 14:51 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-16 15:56 [edk2-wiki][PATCH v1 1/1] Adds EDK II Code Scanning page Michael Kubacki
2022-11-16 17:24 ` [edk2-devel] " Michael D Kinney
2022-11-17 14:51   ` Michael Kubacki [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b95d97b2-4bee-c819-15fc-101a42a23a67@linux.microsoft.com \
    --to=devel@edk2.groups.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox