Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162

ESS-ENN · 2025-03-03T11:03:09Z

Summary by CodeRabbit

New Features
- Introduced enhanced security rules to detect insecure cryptographic hashing and unsafe XML parsing in C and C++ projects, guiding use of stronger alternatives.
Tests
- Added comprehensive test cases and detailed snapshots to validate secure usage of cryptographic functions and XML processing.
- Included scenarios to ensure safe handling of system commands and to flag insecure operations effectively.

CLAassistant · 2025-03-03T11:03:15Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ ESS-ENN
❌ Sakshis

Sakshis seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

coderabbitai · 2025-03-03T11:03:15Z

Walkthrough

This pull request introduces several new YAML configuration files that define security rules for detecting insecure hashing algorithms in both C and C++ and for auditing XML parsing mechanisms using the libxml2 library. The rules include detailed match patterns, warning severities, and relevant CWE references. In parallel, new snapshot files have been added to record the actual usage patterns of these functions. Additionally, test configuration files for both C and C++ have been created to validate the correct and incorrect usages of hash functions, XML parsing calls, and system command prohibitions, ensuring thorough static analysis coverage.

Changes

File(s)	Change Summary
`rules/c/security/insecure-hash-c.yml`, `rules/c/security/libxml2-audit-parser-c.yml`	New rules for detecting insecure hashing algorithms and auditing libxml2 XML parsing in C code.
`rules/cpp/insecure-hash-cpp.yml`, `rules/cpp/libxml2-audit-parser-cpp.yml`	New rules for identifying insecure hash usage and auditing libxml2 XML parsing in C++ code.
`tests/__snapshots__/insecure-hash-c-snapshot.yml`, `tests/__snapshots__/insecure-hash-cpp-snapshot.yml`	New snapshot files capturing insecure hash function call patterns in C and C++.
`tests/__snapshots__/libxml2-audit-parser-c-snapshot.yml`, `tests/__snapshots__/libxml2-audit-parser-cpp-snapshot.yml`	New snapshot files for auditing XML parsing function calls of libxml2 in C and C++.
`tests/__snapshots__/null-library-function-c-snapshot.yml`	New snapshot capturing function behavior involving `strcpy` and `getenv` in C.
`tests/c/dont-call-system-c-test.yml`	New test cases to validate prohibition of system command calls in C code.
`tests/c/insecure-hash-c-test.yml`, `tests/c/libxml2-audit-parser-c-test.yml`, `tests/cpp/insecure-hash-cpp-test.yml`, `tests/cpp/libxml2-audit-parser-cpp-test.yml`	New test configuration files for insecure hash and libxml2 audit rules in C and C++.

Sequence Diagram(s)

sequenceDiagram
    participant Developer
    participant StaticAnalyzer
    participant RuleEngine
    participant SnapshotTester

    Developer->>StaticAnalyzer: Submit C/C++ code
    StaticAnalyzer->>RuleEngine: Load YAML rule configurations (insecure-hash & libxml2-audit-parser)
    RuleEngine->>StaticAnalyzer: Return match results for insecure patterns
    StaticAnalyzer->>SnapshotTester: Execute snapshot tests to validate call patterns
    SnapshotTester-->>Developer: Report warnings and test outcomes

Possibly related PRs

Rules - insecure-hash-c/cpp #22: Introduces similar rule configurations for detecting insecure hashing in C and C++ with overlapping patterns and functions.

Suggested reviewers

ganeshpatro321

Poem

I'm a rabbit with a coding hop,
Secure rules make my heart go pop.
Hashes and XML under careful sight,
Each snapshot and test shines security light.
With YAML tunes and pattern delight,
I nibble bugs away with all my might!
🐇 Hoppity-code, day and night!

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

tests/c/libxml2-audit-parser-c-test.yml (1)

1-26: YAML Structure and Content Validation

The YAML file is well structured and clearly defines both valid and invalid XML parsing function call scenarios for the libxml2 C library. The multi-line block scalars for function calls are consistently indented, and the separation between the valid and invalid sections is clear. One suggestion: consider adding inline comments or descriptive keys (if supported) for each invalid case to aid future maintainers in understanding the rationale behind marking certain usages as invalid.

tests/c/dont-call-system-c-test.yml (1)

1-35: Test Cases for Prohibiting Unsafe System Calls

The file successfully demonstrates both valid and invalid examples concerning the use of system calls. The valid test case avoids system command invocation, while the invalid cases clearly illustrate typical error handling gaps when invoking system(). One consideration: verify that the macro BUFFERSIZE is defined in the test environment or document its expected definition, as its absence could lead to confusion during test execution.

tests/__snapshots__/null-library-function-c-snapshot.yml (1)

1-125: Robust Snapshot Definitions for Library Function Analysis

The snapshot file provides detailed mappings of function calls and their label information. The use of complex keys (with block scalars) to capture function implementations—such as the void f() example with calls to strcpy and getenv—is well executed. The multiple label entries with precise start and end positions help create a rich context for static analysis and regression testing.
A couple of suggestions:

Consider providing a brief commentary or description (as a YAML comment) above each snapshot entry to explain its intent, which could improve maintainability.

Verify that all numerical positions (start/end) are correctly synchronized with the actual content if the snapshots are auto-generated.

tests/__snapshots__/insecure-hash-cpp-snapshot.yml (1)

1-267: Detailed Snapshot Definitions for Insecure Hash Functions in C++.
The snapshot file is comprehensive and well-annotated. The use of multiple snapshots along with detailed primary and secondary labeling ensures that various insecure hash calls (via both EVP_MD_fetch/EVP_get_digestbyname and GNU Crypto functions) are properly captured.
One minor nitpick: the trailing line (line 267) appears to be a stray number without associated content. Please verify if this is intentional; if not, it should be removed.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2024633 and 020e923.

📒 Files selected for processing (14)

rules/c/security/insecure-hash-c.yml (1 hunks)
rules/c/security/libxml2-audit-parser-c.yml (1 hunks)
rules/cpp/insecure-hash-cpp.yml (1 hunks)
rules/cpp/libxml2-audit-parser-cpp.yml (1 hunks)
tests/__snapshots__/insecure-hash-c-snapshot.yml (1 hunks)
tests/__snapshots__/insecure-hash-cpp-snapshot.yml (1 hunks)
tests/__snapshots__/libxml2-audit-parser-c-snapshot.yml (1 hunks)
tests/__snapshots__/libxml2-audit-parser-cpp-snapshot.yml (1 hunks)
tests/__snapshots__/null-library-function-c-snapshot.yml (1 hunks)
tests/c/dont-call-system-c-test.yml (1 hunks)
tests/c/insecure-hash-c-test.yml (1 hunks)
tests/c/libxml2-audit-parser-c-test.yml (1 hunks)
tests/cpp/insecure-hash-cpp-test.yml (1 hunks)
tests/cpp/libxml2-audit-parser-cpp-test.yml (1 hunks)

✅ Files skipped from review due to trivial changes (3)

tests/snapshots/libxml2-audit-parser-c-snapshot.yml
tests/snapshots/libxml2-audit-parser-cpp-snapshot.yml
tests/snapshots/insecure-hash-c-snapshot.yml

🔇 Additional comments (13)

tests/cpp/insecure-hash-cpp-test.yml (1)

1-30: Configuration File for Insecure C++ Hash Functions

This new YAML configuration file cleanly distinguishes between valid and invalid usages of cryptographic hash functions in C++. The valid example is minimal and the invalid block covers a comprehensive range of insecure functions as expected. Ensure that any corresponding rules (such as those in rules/cpp/insecure-hash-cpp.yml) are kept in sync with these test cases.

tests/cpp/libxml2-audit-parser-cpp-test.yml (1)

1-26: Consistency in XML Parser Testing for C++

This configuration file mirrors the structure of its C counterpart and properly encapsulates valid and invalid test cases for libxml2 usage in C++. It promotes consistency across C and C++ testing by reusing similar function call patterns in both valid and invalid sections. Please ensure that any shared expectations in the snapshot files are updated accordingly.

tests/c/insecure-hash-c-test.yml (1)

1-30: YAML Test Cases for Insecure Hash Functions in C.
The file is well-structured with clear separation between the valid and invalid cases. The invalid list covers a diverse set of function calls (including fetching digest algorithms and various initialization/update routines) that align with the security rules.

rules/c/security/insecure-hash-c.yml (1)

1-242: Comprehensive Rule Definition for Insecure Hash Functions in C.
This rule file thoroughly covers multiple insecure hash algorithms (including MD2, MD4, MD5, and SHA1 variants) using several util patterns. The regexes are well defined and the structure is clear, which supports maintainability and potential future extensions.

rules/cpp/insecure-hash-cpp.yml (1)

1-128: Consistent and Detailed Rule for Insecure Hash Functions in C++.
The configuration mirrors the C variant well and uses similarly comprehensive patterns to detect insecure hash function usage. The utility patterns and the overall rule logic are clear and align with the security objectives.

rules/c/security/libxml2-audit-parser-c.yml (1)

1-266: Robust Audit Rule for libxml2 XML Parsing in C.
This file defines several granular utility patterns to match XML parsing functions that should be audited for secure processing (e.g., ensuring safe parsing options and no unwanted network access). The detailed patterns for different argument counts (three, four, five, six, and seven children) make the rule both flexible and precise. Consider adding inline comments to explain the rationale behind each pattern block to aid future maintainability, though the current structure is very comprehensive.

rules/cpp/libxml2-audit-parser-cpp.yml (7)

1-13: Well-Defined Rule Header
The header section establishes a clear rule identity with fields for id, language, and severity as well as detailed multi-line descriptions in message and note (including the CWE reference and a helpful external link). This clear description helps set the context for the static analysis rule.

14-49: Clear Definition for Pattern_having_three_child
The utility block defining Pattern_having_three_child is well structured. It properly matches call expressions for xmlReadFile by ensuring exactly three arguments (each non-comment) and that a fourth argument does not exist. For maintainability, consider adding inline comments summarizing the intent of this pattern.

50-96: Robust Pattern_having_five_child Implementation
This pattern correctly handles multiple function identifiers (e.g., xmlParseInNodeContext, xmlReadMemory, etc.) with exactly five arguments by checking each child for non-comment status and excluding a potential sixth argument. The use of a regex to encapsulate the function names is appropriate.

97-137: Well-Structured Pattern_having_four_child
The pattern for functions expecting four arguments (e.g., xmlReadDoc, xmlReadFd, xmlCtxtReadFile) is consistent with the approach taken in the other utility patterns. Each argument is validated to not be a comment, and an extra argument is explicitly excluded.

138-190: Accurate Pattern_having_six_child Verification
The definition for Pattern_having_six_child for functions like xmlReadIO and xmlCtxtReadMemory correctly stipulates the requirement for exactly six children. The rule meticulously checks each expected argument, ensuring no extra (non-comment) argument is present.

191-249: Comprehensive Pattern_having_seven_child Definition
This pattern, targeting xmlCtxtReadIO, is clearly defined to require exactly seven child arguments. The structure mirrors the patterns above, ensuring consistency and clarity in rule creation. Consider adding a short comment to explain why exactly seven arguments are required, if that rationale isn’t documented elsewhere.

250-266: Consolidated Rule Construction
The final rule section integrates all the individual patterns using an any list and then excludes nodes that have ERROR kinds via not conditions. This aggregation is well thought out and aligns with the overall objective to audit XML parsing calls securely. As a best practice, double-check that the properties (such as stopBy) are correctly handled by the rule engine.

#162) * removed missing-secure-java * httponly-false-csharp * use-of-md5-digest-utils-java * removing use-of-md5-digest-utils and httponly-false-csharp * insecure-hash-c * libxml2-audit-parser-c * Added test file for dont-call-system-c * insecure-hash-cpp * libxml2-audit-parser-cpp --------- Co-authored-by: Sakshis <sakshil@abc.com>

Sakshis and others added 24 commits December 16, 2024 13:09

removed missing-secure-java

bb7b8e1

Merge branch 'coderabbitai:main' into main

308e947

Merge branch 'coderabbitai:main' into main

46efd55

Merge branch 'coderabbitai:main' into main

3ac0204

Merge branch 'coderabbitai:main' into main

7caba92

Merge branch 'coderabbitai:main' into main

d3a2776

Merge branch 'coderabbitai:main' into main

2e86380

Merge branch 'coderabbitai:main' into main

b072db7

Merge branch 'coderabbitai:main' into main

65dce29

httponly-false-csharp

12bb3aa

use-of-md5-digest-utils-java

2c5ea88

removing use-of-md5-digest-utils and httponly-false-csharp

d3067f1

Merge branch 'coderabbitai:main' into main

56d4011

Merge branch 'coderabbitai:main' into main

f7f1d72

Merge branch 'coderabbitai:main' into main

0d53f8e

Merge branch 'coderabbitai:main' into main

b622bca

Merge branch 'coderabbitai:main' into main

67459bf

Merge branch 'coderabbitai:main' into main

b8b2adb

Merge branch 'coderabbitai:main' into main

9b29d0a

insecure-hash-c

e9c5b1d

libxml2-audit-parser-c

a166001

Added test file for dont-call-system-c

f7fb1b1

insecure-hash-cpp

f3a3bd7

libxml2-audit-parser-cpp

020e923

coderabbitai bot changed the title ~~@coderabbitai~~ Add YAML rules for insecure C/C++ hash detection and XML parsing audit Mar 3, 2025

coderabbitai bot reviewed Mar 3, 2025

View reviewed changes

coderabbitai bot approved these changes Mar 3, 2025

View reviewed changes

ganeshpatro321 approved these changes Mar 3, 2025

View reviewed changes

ganeshpatro321 merged commit a7b3a4c into coderabbitai:main Mar 3, 2025
1 of 2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162

Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162

Uh oh!

ESS-ENN commented Mar 3, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

CLAassistant commented Mar 3, 2025

Uh oh!

coderabbitai bot commented Mar 3, 2025 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162

Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162

Uh oh!

Conversation

ESS-ENN commented Mar 3, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

CLAassistant commented Mar 3, 2025

Uh oh!

coderabbitai bot commented Mar 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Possibly related PRs

Suggested reviewers

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ESS-ENN commented Mar 3, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Mar 3, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)