Add correct relationships to MergeableContent by DaveTryon · Pull Request #1133 · microsoft/sbom-tool

Overview

This PR is the first step toward including relationship data in aggregated SBOMs. It applies only to SPDX 2.2 files (support for aggregating SPDX 3.0 files will be added later). It focuses on getting the relationship data into each MergeableContent object so that a future PR can incorporate it into the aggregated output. Specific changes:

  1. Add all DEPENDS_ON relationships to the MergeableContent
  2. Within each source SBOM:
    • Assign a unique SPDX ID to the SBOM root package. The old code used the same ID for every root, so aggregating N SBOMs produced N packages with the same ID. The IDs were identical because each one was generated by hashing the same string, "SPDXRef-RootPackage"
    • Remap every DEPENDS_ON relationship whose source is "SPDXRef-RootPackage" so that its source is the unique SPDX ID we just generated
    • Add a DEPENDS_ON relationship to connect the aggregated root to the unique SPDX ID that we just assigned
  3. Refactor SPDX ID generation code that was duplicated in the 2 parsers and move it to a central location where it can also be used from the MergeableContentProvider. Add a special case so that package IDs that are already well-formed SPDX IDs are preserved, which lets us persist the same SPDX ID from the input SBOM to the output SBOM. There is a tiny chance that this could impact the generation code if a package name were to begin with SPDXRef-Package, but that seems like a case that we can probably live with. Protecting against it would require setting a global state. I've chosen not to do this but could be persuaded otherwise.
  4. Add logging to record the activity. Events that occur once per source SBOM are logged at Debug, while events that can occur more than once per source SBOM are logged at Verbose
  5. Unit tests!
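
For reviewers who want the shape of item 2 at a glance, here is a minimal Python sketch of the per-SBOM root remapping. It is not the tool's C# code: the `Relationship` type, `unique_root_id`, `remap_root`, and the `seed` parameter are all hypothetical names, and both the use of SHA-256 and the choice of seed (anything unique per source SBOM) are assumptions.

```python
import hashlib
from dataclasses import dataclass

ROOT_ID = "SPDXRef-RootPackage"


@dataclass(frozen=True)
class Relationship:
    """Hypothetical stand-in for an SPDX 2.2 relationship entry."""
    source: str
    kind: str
    target: str


def unique_root_id(seed: str) -> str:
    # Assumption: `seed` is some value unique to the source SBOM.
    # Hashing a per-SBOM value avoids N roots sharing one ID.
    digest = hashlib.sha256(seed.encode("utf-8")).hexdigest().upper()
    return f"SPDXRef-Package-{digest}"


def remap_root(relationships, seed):
    """Return (new_root_id, relationships) for one source SBOM."""
    new_root = unique_root_id(seed)
    remapped = []
    for rel in relationships:
        if rel.kind != "DEPENDS_ON":
            continue  # only DEPENDS_ON relationships are carried over
        # Relationships that originated at the shared root now originate
        # at this SBOM's unique root; all others are kept unchanged.
        source = new_root if rel.source == ROOT_ID else rel.source
        remapped.append(Relationship(source, rel.kind, rel.target))
    # Connect the aggregated root to this SBOM's unique root.
    remapped.append(Relationship(ROOT_ID, "DEPENDS_ON", new_root))
    return new_root, remapped
```

In this sketch, a source SBOM whose root has no dependencies still ends up with exactly one relationship (the new root link), and one with K root dependencies ends up with K + 1.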

Redacted sample of output logging:

##[debug]Attempting to parse SPDX file at 'C:\repos\CsvToJson\CsvToJson\bin\Debug\net8.0\_manifest\spdx_2.2\manifest.spdx.json'.
##[debug]Remapped root package ID from 'SPDXRef-RootPackage' to 'SPDXRef-Package-C8D4982D8356503F1912C637E4DFB7A53400AF98C08BA4732BB9F3CF70F628A9'
##[debug]Added new root relationship from 'SPDXRef-RootPackage' to 'SPDXRef-Package-C8D4982D8356503F1912C637E4DFB7A53400AF98C08BA4732BB9F3CF70F628A9'
##[debug]MergeableContent includes 3 package(s) and 1 relationship(s).
##[debug]Attempting to parse SPDX file at 'C:\scratch\SBOMTool-GitHub-Test\_manifest\spdx_2.2\manifest.spdx.json'.
##[debug]Remapped root package ID from 'SPDXRef-RootPackage' to 'SPDXRef-Package-6C3FCBFE8A8061781D6FA5DB922F6B292BEE13D2E107B3789F93C4343A48974D'
##[verbose]Remapped root package dependency on 'SPDXRef-Package-4CC76DE73D5AAB43B651EC777CEE740E936A6095E1DD28A99DCBB15C55909C50'
##[verbose]Remapped root package dependency on 'SPDXRef-Package-31A255ADCAF03261EC7411AE2C6E6FCC954C2A3A7919C3442AF4FA85159BA5EA'

(skipping about 260 verbose outputs)

##[verbose]Remapped root package dependency on 'SPDXRef-Package-696CFF09D3DFC001C46BC23A9CC041F600107F995A2395E67C16B93975364346'
##[verbose]Remapped root package dependency on 'SPDXRef-Package-59734D799ADDC55D5E50CD7B75B8C6344238FDE24BA85C53F2E4D4B58C3AEF98'
##[verbose]Remapped root package dependency on 'SPDXRef-Package-35FDBB7ACF69BB379B4E6C8CE96F08DADF1B593026AFDDABAC1AB6AFC96A5620'
##[debug]Added new root relationship from 'SPDXRef-RootPackage' to 'SPDXRef-Package-6C3FCBFE8A8061781D6FA5DB922F6B292BEE13D2E107B3789F93C4343A48974D'
##[debug]MergeableContent includes 266 package(s) and 266 relationship(s).
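
The package IDs in the log are `SPDXRef-Package-` followed by 64 hex digits, which is consistent with a SHA-256 digest. Under that assumption, the ID generation and the "preserve well-formed IDs" special case from item 3 can be sketched in Python as follows. The function name is hypothetical, the prefix check is inferred from the edge case described in item 3, and this is not the tool's C# code.

```python
import hashlib

PACKAGE_PREFIX = "SPDXRef-Package-"


def generate_spdx_package_id(package_id: str) -> str:
    # Special case: an ID that already looks like an SPDX package ID is
    # preserved, so the input SBOM's ID survives into the output SBOM.
    # Assumption: "looks like" is a simple prefix check.
    if package_id.startswith(PACKAGE_PREFIX):
        return package_id
    # Otherwise derive the ID by hashing the input (SHA-256 is assumed).
    digest = hashlib.sha256(package_id.encode("utf-8")).hexdigest().upper()
    return PACKAGE_PREFIX + digest


# Hashing the same string always gives the same ID, which is why every
# root used to collide when each one was derived from "SPDXRef-RootPackage".
old_root_a = generate_spdx_package_id("SPDXRef-RootPackage")
old_root_b = generate_spdx_package_id("SPDXRef-RootPackage")
assert old_root_a == old_root_b

# The edge case from item 3: a package literally named with the prefix
# passes through without being hashed.
assert generate_spdx_package_id("SPDXRef-Package-foo") == "SPDXRef-Package-foo"
```

A stricter check (for example, requiring exactly 64 hex digits after the prefix) would narrow the edge case, at the cost of rejecting input IDs produced by a different scheme.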