-
Notifications
You must be signed in to change notification settings - Fork 9
Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add YAML rules for insecure C/C++ hash detection and XML parsing audit #162
Conversation
Sakshis seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
WalkthroughThis pull request introduces several new YAML configuration files that define security rules for detecting insecure hashing algorithms in both C and C++ and for auditing XML parsing mechanisms using the libxml2 library. The rules include detailed match patterns, warning severities, and relevant CWE references. In parallel, new snapshot files have been added to record the actual usage patterns of these functions. Additionally, test configuration files for both C and C++ have been created to validate the correct and incorrect usages of hash functions, XML parsing calls, and system command prohibitions, ensuring thorough static analysis coverage. Changes
Sequence Diagram(s)sequenceDiagram
participant Developer
participant StaticAnalyzer
participant RuleEngine
participant SnapshotTester
Developer->>StaticAnalyzer: Submit C/C++ code
StaticAnalyzer->>RuleEngine: Load YAML rule configurations (insecure-hash & libxml2-audit-parser)
RuleEngine->>StaticAnalyzer: Return match results for insecure patterns
StaticAnalyzer->>SnapshotTester: Execute snapshot tests to validate call patterns
SnapshotTester-->>Developer: Report warnings and test outcomes
Possibly related PRs
Suggested reviewers
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
tests/c/libxml2-audit-parser-c-test.yml (1)
1-26
: YAML Structure and Content ValidationThe YAML file is well structured and clearly defines both valid and invalid XML parsing function call scenarios for the libxml2 C library. The multi-line block scalars for function calls are consistently indented, and the separation between the valid and invalid sections is clear. One suggestion: consider adding inline comments or descriptive keys (if supported) for each invalid case to aid future maintainers in understanding the rationale behind marking certain usages as invalid.
tests/c/dont-call-system-c-test.yml (1)
1-35
: Test Cases for Prohibiting Unsafe System CallsThe file successfully demonstrates both valid and invalid examples concerning the use of system calls. The valid test case avoids system command invocation, while the invalid cases clearly illustrate typical error handling gaps when invoking
system()
. One consideration: verify that the macroBUFFERSIZE
is defined in the test environment or document its expected definition, as its absence could lead to confusion during test execution.tests/__snapshots__/null-library-function-c-snapshot.yml (1)
1-125
: Robust Snapshot Definitions for Library Function AnalysisThe snapshot file provides detailed mappings of function calls and their label information. The use of complex keys (with block scalars) to capture function implementations—such as the
void f()
example with calls tostrcpy
andgetenv
—is well executed. The multiple label entries with precise start and end positions help create a rich context for static analysis and regression testing.
A couple of suggestions:
- Consider providing a brief commentary or description (as a YAML comment) above each snapshot entry to explain its intent, which could improve maintainability.
- Verify that all numerical positions (start/end) are correctly synchronized with the actual content if the snapshots are auto-generated.
tests/__snapshots__/insecure-hash-cpp-snapshot.yml (1)
1-267
: Detailed Snapshot Definitions for Insecure Hash Functions in C++.
The snapshot file is comprehensive and well-annotated. The use of multiple snapshots along with detailed primary and secondary labeling ensures that various insecure hash calls (via bothEVP_MD_fetch
/EVP_get_digestbyname
and GNU Crypto functions) are properly captured.
One minor nitpick: the trailing line (line 267) appears to be a stray number without associated content. Please verify if this is intentional; if not, it should be removed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
rules/c/security/insecure-hash-c.yml
(1 hunks)rules/c/security/libxml2-audit-parser-c.yml
(1 hunks)rules/cpp/insecure-hash-cpp.yml
(1 hunks)rules/cpp/libxml2-audit-parser-cpp.yml
(1 hunks)tests/__snapshots__/insecure-hash-c-snapshot.yml
(1 hunks)tests/__snapshots__/insecure-hash-cpp-snapshot.yml
(1 hunks)tests/__snapshots__/libxml2-audit-parser-c-snapshot.yml
(1 hunks)tests/__snapshots__/libxml2-audit-parser-cpp-snapshot.yml
(1 hunks)tests/__snapshots__/null-library-function-c-snapshot.yml
(1 hunks)tests/c/dont-call-system-c-test.yml
(1 hunks)tests/c/insecure-hash-c-test.yml
(1 hunks)tests/c/libxml2-audit-parser-c-test.yml
(1 hunks)tests/cpp/insecure-hash-cpp-test.yml
(1 hunks)tests/cpp/libxml2-audit-parser-cpp-test.yml
(1 hunks)
✅ Files skipped from review due to trivial changes (3)
- tests/snapshots/libxml2-audit-parser-c-snapshot.yml
- tests/snapshots/libxml2-audit-parser-cpp-snapshot.yml
- tests/snapshots/insecure-hash-c-snapshot.yml
🔇 Additional comments (13)
tests/cpp/insecure-hash-cpp-test.yml (1)
1-30
: Configuration File for Insecure C++ Hash FunctionsThis new YAML configuration file cleanly distinguishes between valid and invalid usages of cryptographic hash functions in C++. The valid example is minimal and the invalid block covers a comprehensive range of insecure functions as expected. Ensure that any corresponding rules (such as those in
rules/cpp/insecure-hash-cpp.yml
) are kept in sync with these test cases.tests/cpp/libxml2-audit-parser-cpp-test.yml (1)
1-26
: Consistency in XML Parser Testing for C++This configuration file mirrors the structure of its C counterpart and properly encapsulates valid and invalid test cases for libxml2 usage in C++. It promotes consistency across C and C++ testing by reusing similar function call patterns in both valid and invalid sections. Please ensure that any shared expectations in the snapshot files are updated accordingly.
tests/c/insecure-hash-c-test.yml (1)
1-30
: YAML Test Cases for Insecure Hash Functions in C.
The file is well-structured with clear separation between the valid and invalid cases. The invalid list covers a diverse set of function calls (including fetching digest algorithms and various initialization/update routines) that align with the security rules.rules/c/security/insecure-hash-c.yml (1)
1-242
: Comprehensive Rule Definition for Insecure Hash Functions in C.
This rule file thoroughly covers multiple insecure hash algorithms (including MD2, MD4, MD5, and SHA1 variants) using several util patterns. The regexes are well defined and the structure is clear, which supports maintainability and potential future extensions.rules/cpp/insecure-hash-cpp.yml (1)
1-128
: Consistent and Detailed Rule for Insecure Hash Functions in C++.
The configuration mirrors the C variant well and uses similarly comprehensive patterns to detect insecure hash function usage. The utility patterns and the overall rule logic are clear and align with the security objectives.rules/c/security/libxml2-audit-parser-c.yml (1)
1-266
: Robust Audit Rule for libxml2 XML Parsing in C.
This file defines several granular utility patterns to match XML parsing functions that should be audited for secure processing (e.g., ensuring safe parsing options and no unwanted network access). The detailed patterns for different argument counts (three, four, five, six, and seven children) make the rule both flexible and precise. Consider adding inline comments to explain the rationale behind each pattern block to aid future maintainability, though the current structure is very comprehensive.rules/cpp/libxml2-audit-parser-cpp.yml (7)
1-13
: Well-Defined Rule Header
The header section establishes a clear rule identity with fields forid
,language
, andseverity
as well as detailed multi-line descriptions inmessage
andnote
(including the CWE reference and a helpful external link). This clear description helps set the context for the static analysis rule.
14-49
: Clear Definition for Pattern_having_three_child
The utility block definingPattern_having_three_child
is well structured. It properly matches call expressions forxmlReadFile
by ensuring exactly three arguments (each non-comment) and that a fourth argument does not exist. For maintainability, consider adding inline comments summarizing the intent of this pattern.
50-96
: Robust Pattern_having_five_child Implementation
This pattern correctly handles multiple function identifiers (e.g.,xmlParseInNodeContext
,xmlReadMemory
, etc.) with exactly five arguments by checking each child for non-comment status and excluding a potential sixth argument. The use of a regex to encapsulate the function names is appropriate.
97-137
: Well-Structured Pattern_having_four_child
The pattern for functions expecting four arguments (e.g.,xmlReadDoc
,xmlReadFd
,xmlCtxtReadFile
) is consistent with the approach taken in the other utility patterns. Each argument is validated to not be a comment, and an extra argument is explicitly excluded.
138-190
: Accurate Pattern_having_six_child Verification
The definition forPattern_having_six_child
for functions likexmlReadIO
andxmlCtxtReadMemory
correctly stipulates the requirement for exactly six children. The rule meticulously checks each expected argument, ensuring no extra (non-comment) argument is present.
191-249
: Comprehensive Pattern_having_seven_child Definition
This pattern, targetingxmlCtxtReadIO
, is clearly defined to require exactly seven child arguments. The structure mirrors the patterns above, ensuring consistency and clarity in rule creation. Consider adding a short comment to explain why exactly seven arguments are required, if that rationale isn’t documented elsewhere.
250-266
: Consolidated Rule Construction
The final rule section integrates all the individual patterns using anany
list and then excludes nodes that have ERROR kinds vianot
conditions. This aggregation is well thought out and aligns with the overall objective to audit XML parsing calls securely. As a best practice, double-check that the properties (such asstopBy
) are correctly handled by the rule engine.
#162) * removed missing-secure-java * httponly-false-csharp * use-of-md5-digest-utils-java * removing use-of-md5-digest-utils and httponly-false-csharp * insecure-hash-c * libxml2-audit-parser-c * Added test file for dont-call-system-c * insecure-hash-cpp * libxml2-audit-parser-cpp --------- Co-authored-by: Sakshis <sakshil@abc.com>
Summary by CodeRabbit
New Features
Tests