Skip to content

Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ESS-ENN
Copy link
Contributor

@ESS-ENN ESS-ENN commented Mar 3, 2025

Summary by CodeRabbit

  • New Features

    • Introduced enhanced security rules to detect insecure cryptographic hashing and unsafe XML parsing in C and C++ projects, guiding use of stronger alternatives.
  • Tests

    • Added comprehensive test cases and detailed snapshots to validate secure usage of cryptographic functions and XML processing.
    • Included scenarios to ensure safe handling of system commands and to flag insecure operations effectively.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ ESS-ENN
❌ Sakshis


Sakshis seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link

coderabbitai bot commented Mar 3, 2025

Walkthrough

This pull request introduces several new YAML configuration files that define security rules for detecting insecure hashing algorithms in both C and C++ and for auditing XML parsing mechanisms using the libxml2 library. The rules include detailed match patterns, warning severities, and relevant CWE references. In parallel, new snapshot files have been added to record the actual usage patterns of these functions. Additionally, test configuration files for both C and C++ have been created to validate the correct and incorrect usages of hash functions, XML parsing calls, and system command prohibitions, ensuring thorough static analysis coverage.

Changes

File(s) Change Summary
rules/c/security/insecure-hash-c.yml, rules/c/security/libxml2-audit-parser-c.yml New rules for detecting insecure hashing algorithms and auditing libxml2 XML parsing in C code.
rules/cpp/insecure-hash-cpp.yml, rules/cpp/libxml2-audit-parser-cpp.yml New rules for identifying insecure hash usage and auditing libxml2 XML parsing in C++ code.
tests/__snapshots__/insecure-hash-c-snapshot.yml, tests/__snapshots__/insecure-hash-cpp-snapshot.yml New snapshot files capturing insecure hash function call patterns in C and C++.
tests/__snapshots__/libxml2-audit-parser-c-snapshot.yml, tests/__snapshots__/libxml2-audit-parser-cpp-snapshot.yml New snapshot files for auditing XML parsing function calls of libxml2 in C and C++.
tests/__snapshots__/null-library-function-c-snapshot.yml New snapshot capturing function behavior involving strcpy and getenv in C.
tests/c/dont-call-system-c-test.yml New test cases to validate prohibition of system command calls in C code.
tests/c/insecure-hash-c-test.yml, tests/c/libxml2-audit-parser-c-test.yml, tests/cpp/insecure-hash-cpp-test.yml, tests/cpp/libxml2-audit-parser-cpp-test.yml New test configuration files for insecure hash and libxml2 audit rules in C and C++.

Sequence Diagram(s)

sequenceDiagram
    participant Developer
    participant StaticAnalyzer
    participant RuleEngine
    participant SnapshotTester

    Developer->>StaticAnalyzer: Submit C/C++ code
    StaticAnalyzer->>RuleEngine: Load YAML rule configurations (insecure-hash & libxml2-audit-parser)
    RuleEngine->>StaticAnalyzer: Return match results for insecure patterns
    StaticAnalyzer->>SnapshotTester: Execute snapshot tests to validate call patterns
    SnapshotTester-->>Developer: Report warnings and test outcomes
Loading

Possibly related PRs

Suggested reviewers

  • ganeshpatro321

Poem

I'm a rabbit with a coding hop,
Secure rules make my heart go pop.
Hashes and XML under careful sight,
Each snapshot and test shines security light.
With YAML tunes and pattern delight,
I nibble bugs away with all my might!
🐇 Hoppity-code, day and night!


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot changed the title @coderabbitai Add YAML rules for insecure C/C++ hash detection and XML parsing audit Mar 3, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
tests/c/libxml2-audit-parser-c-test.yml (1)

1-26: YAML Structure and Content Validation

The YAML file is well structured and clearly defines both valid and invalid XML parsing function call scenarios for the libxml2 C library. The multi-line block scalars for function calls are consistently indented, and the separation between the valid and invalid sections is clear. One suggestion: consider adding inline comments or descriptive keys (if supported) for each invalid case to aid future maintainers in understanding the rationale behind marking certain usages as invalid.

tests/c/dont-call-system-c-test.yml (1)

1-35: Test Cases for Prohibiting Unsafe System Calls

The file successfully demonstrates both valid and invalid examples concerning the use of system calls. The valid test case avoids system command invocation, while the invalid cases clearly illustrate typical error handling gaps when invoking system(). One consideration: verify that the macro BUFFERSIZE is defined in the test environment or document its expected definition, as its absence could lead to confusion during test execution.

tests/__snapshots__/null-library-function-c-snapshot.yml (1)

1-125: Robust Snapshot Definitions for Library Function Analysis

The snapshot file provides detailed mappings of function calls and their label information. The use of complex keys (with block scalars) to capture function implementations—such as the void f() example with calls to strcpy and getenv—is well executed. The multiple label entries with precise start and end positions help create a rich context for static analysis and regression testing.
A couple of suggestions:

  • Consider providing a brief commentary or description (as a YAML comment) above each snapshot entry to explain its intent, which could improve maintainability.
  • Verify that all numerical positions (start/end) are correctly synchronized with the actual content if the snapshots are auto-generated.
tests/__snapshots__/insecure-hash-cpp-snapshot.yml (1)

1-267: Detailed Snapshot Definitions for Insecure Hash Functions in C++.
The snapshot file is comprehensive and well-annotated. The use of multiple snapshots along with detailed primary and secondary labeling ensures that various insecure hash calls (via both EVP_MD_fetch/EVP_get_digestbyname and GNU Crypto functions) are properly captured.
One minor nitpick: the trailing line (line 267) appears to be a stray number without associated content. Please verify if this is intentional; if not, it should be removed.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2024633 and 020e923.

📒 Files selected for processing (14)
  • rules/c/security/insecure-hash-c.yml (1 hunks)
  • rules/c/security/libxml2-audit-parser-c.yml (1 hunks)
  • rules/cpp/insecure-hash-cpp.yml (1 hunks)
  • rules/cpp/libxml2-audit-parser-cpp.yml (1 hunks)
  • tests/__snapshots__/insecure-hash-c-snapshot.yml (1 hunks)
  • tests/__snapshots__/insecure-hash-cpp-snapshot.yml (1 hunks)
  • tests/__snapshots__/libxml2-audit-parser-c-snapshot.yml (1 hunks)
  • tests/__snapshots__/libxml2-audit-parser-cpp-snapshot.yml (1 hunks)
  • tests/__snapshots__/null-library-function-c-snapshot.yml (1 hunks)
  • tests/c/dont-call-system-c-test.yml (1 hunks)
  • tests/c/insecure-hash-c-test.yml (1 hunks)
  • tests/c/libxml2-audit-parser-c-test.yml (1 hunks)
  • tests/cpp/insecure-hash-cpp-test.yml (1 hunks)
  • tests/cpp/libxml2-audit-parser-cpp-test.yml (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • tests/snapshots/libxml2-audit-parser-c-snapshot.yml
  • tests/snapshots/libxml2-audit-parser-cpp-snapshot.yml
  • tests/snapshots/insecure-hash-c-snapshot.yml
🔇 Additional comments (13)
tests/cpp/insecure-hash-cpp-test.yml (1)

1-30: Configuration File for Insecure C++ Hash Functions

This new YAML configuration file cleanly distinguishes between valid and invalid usages of cryptographic hash functions in C++. The valid example is minimal and the invalid block covers a comprehensive range of insecure functions as expected. Ensure that any corresponding rules (such as those in rules/cpp/insecure-hash-cpp.yml) are kept in sync with these test cases.

tests/cpp/libxml2-audit-parser-cpp-test.yml (1)

1-26: Consistency in XML Parser Testing for C++

This configuration file mirrors the structure of its C counterpart and properly encapsulates valid and invalid test cases for libxml2 usage in C++. It promotes consistency across C and C++ testing by reusing similar function call patterns in both valid and invalid sections. Please ensure that any shared expectations in the snapshot files are updated accordingly.

tests/c/insecure-hash-c-test.yml (1)

1-30: YAML Test Cases for Insecure Hash Functions in C.
The file is well-structured with clear separation between the valid and invalid cases. The invalid list covers a diverse set of function calls (including fetching digest algorithms and various initialization/update routines) that align with the security rules.

rules/c/security/insecure-hash-c.yml (1)

1-242: Comprehensive Rule Definition for Insecure Hash Functions in C.
This rule file thoroughly covers multiple insecure hash algorithms (including MD2, MD4, MD5, and SHA1 variants) using several util patterns. The regexes are well defined and the structure is clear, which supports maintainability and potential future extensions.

rules/cpp/insecure-hash-cpp.yml (1)

1-128: Consistent and Detailed Rule for Insecure Hash Functions in C++.
The configuration mirrors the C variant well and uses similarly comprehensive patterns to detect insecure hash function usage. The utility patterns and the overall rule logic are clear and align with the security objectives.

rules/c/security/libxml2-audit-parser-c.yml (1)

1-266: Robust Audit Rule for libxml2 XML Parsing in C.
This file defines several granular utility patterns to match XML parsing functions that should be audited for secure processing (e.g., ensuring safe parsing options and no unwanted network access). The detailed patterns for different argument counts (three, four, five, six, and seven children) make the rule both flexible and precise. Consider adding inline comments to explain the rationale behind each pattern block to aid future maintainability, though the current structure is very comprehensive.

rules/cpp/libxml2-audit-parser-cpp.yml (7)

1-13: Well-Defined Rule Header
The header section establishes a clear rule identity with fields for id, language, and severity as well as detailed multi-line descriptions in message and note (including the CWE reference and a helpful external link). This clear description helps set the context for the static analysis rule.


14-49: Clear Definition for Pattern_having_three_child
The utility block defining Pattern_having_three_child is well structured. It properly matches call expressions for xmlReadFile by ensuring exactly three arguments (each non-comment) and that a fourth argument does not exist. For maintainability, consider adding inline comments summarizing the intent of this pattern.


50-96: Robust Pattern_having_five_child Implementation
This pattern correctly handles multiple function identifiers (e.g., xmlParseInNodeContext, xmlReadMemory, etc.) with exactly five arguments by checking each child for non-comment status and excluding a potential sixth argument. The use of a regex to encapsulate the function names is appropriate.


97-137: Well-Structured Pattern_having_four_child
The pattern for functions expecting four arguments (e.g., xmlReadDoc, xmlReadFd, xmlCtxtReadFile) is consistent with the approach taken in the other utility patterns. Each argument is validated to not be a comment, and an extra argument is explicitly excluded.


138-190: Accurate Pattern_having_six_child Verification
The definition for Pattern_having_six_child for functions like xmlReadIO and xmlCtxtReadMemory correctly stipulates the requirement for exactly six children. The rule meticulously checks each expected argument, ensuring no extra (non-comment) argument is present.


191-249: Comprehensive Pattern_having_seven_child Definition
This pattern, targeting xmlCtxtReadIO, is clearly defined to require exactly seven child arguments. The structure mirrors the patterns above, ensuring consistency and clarity in rule creation. Consider adding a short comment to explain why exactly seven arguments are required, if that rationale isn’t documented elsewhere.


250-266: Consolidated Rule Construction
The final rule section integrates all the individual patterns using an any list and then excludes nodes that have ERROR kinds via not conditions. This aggregation is well thought out and aligns with the overall objective to audit XML parsing calls securely. As a best practice, double-check that the properties (such as stopBy) are correctly handled by the rule engine.

@ganeshpatro321 ganeshpatro321 merged commit a7b3a4c into coderabbitai:main Mar 3, 2025
1 of 2 checks passed
gatsby003 pushed a commit that referenced this pull request Jul 24, 2025
#162)

* removed missing-secure-java

* httponly-false-csharp

* use-of-md5-digest-utils-java

* removing use-of-md5-digest-utils and httponly-false-csharp

* insecure-hash-c

* libxml2-audit-parser-c

* Added test file for dont-call-system-c

* insecure-hash-cpp

* libxml2-audit-parser-cpp

---------

Co-authored-by: Sakshis <sakshil@abc.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants