Skip to content

DWARF CFI Checker prototype #148296

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 22 commits into
base: main
Choose a base branch
from
Draft

Conversation

amsen20
Copy link
Contributor

@amsen20 amsen20 commented Jul 11, 2025

In this RFC, the design behind this prototype is explained.

This prototype implements the full algorithm explained in the RFC, as opposed to the current DWARFCFIChecker in LLVM, which is only capable of implementing the invalidation part of the algorithm.

The main difference is that this implementation has access to semantic information through BOLT MCPlusBuilder, which is an extension of MCInstrAnalysis providing the following information:

  • matches, which can be used to see if an instruction is an add/sub/multiple/... and what the arguments are, are they const or registers.
  • Helper functions for extracting load and storing memory address targets.
  • And a lot more information about control flow.

Using this information, this implementation is capable of applying the abstract execution mentioned in the RFC properly.
Consider the following example:

pushq   %rdi # spill rdi
.cfi_adjust_cfa_offset 8
.cfi_rel_offset %rdi, 0
pushq   %rsi # spill rsi
.cfi_adjust_cfa_offset 8
.cfi_rel_offset %rsi, 0

callq some_func

popq    %rsi # load rsi
.cfi_adjust_cfa_offset -8
.cfi_same_value %rdi # prototype: error: loading %rdi from wrong location
popq    %rdi # spill rdi
# (landed and prototype) error: changed register RDI, that register RDI's unwinding rule uses, but there is no CFI directives about it
.cfi_adjust_cfa_offset -8
.cfi_same_value %rsi # prototype: error: loading %rsi from wrong location

In this example, two registers (rdi and rsi) are spilled but are loaded in reverse order. The prototype can identify when rdi and rsi are loaded from incorrect memory locations, a capability not shared by the currently landed checker, which only detects modifications to rdi, with its unwinding rule set to same_value.

To fix the issue, we need to add more semantic information to MCInst. There are a couple of options proposed in the RFC. I recommend the first option as my personal choice, which is:
Option 1: Extend MCInstrAnalysis with functionalities currently in BOLT’s MCPlus.
The reason to choose this option is that it’s the only truly incremental option that has already been implemented, tested, and explored by BOLT.

@amsen20 amsen20 marked this pull request as draft July 11, 2025 21:08
Copy link
Member

@paschalis-mpeis paschalis-mpeis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @amsen20,

I've only read the introduction to your proposal and it looks interesting!
Thanks for your effort. I'll try to have a better look in the coming days.

I know this is only a draft, but regarding the couple of 'nit' bolt changes, those could be added in a separate patch?

@@ -146,7 +146,7 @@ cl::opt<bool> Lite("lite", cl::desc("skip processing of cold functions"),
cl::cat(BoltCategory));

cl::opt<std::string>
OutputFilename("o",
OutputFilename("ooo",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is renamed by mistake?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a temporary fix for linking the llvm-mc tool with BOLT, when the linker tried to link them, having the same CL option with the same value caused errors.

@amsen20
Copy link
Contributor Author

amsen20 commented Jul 14, 2025

Hey @amsen20,

I've only read the introduction to your proposal and it looks interesting! Thanks for your effort. I'll try to have a better look in the coming days.

I know this is only a draft, but regarding the couple of 'nit' bolt changes, those could be added in a separate patch?

This draft is not meant to be merged; the reviewers are added automatically because I changed parts of BOLT's code.
The minimal version of the checker has landed in this PR.
However, I was wondering if any of the BOLT owners are interested in moving MCPlusBuilder to the main LLVM MC layer. In the RFC, I have discussed that the main bottleneck of the analysis is the semantic information, which only MCPlusBuilder in BOLT currently possesses.

Thank you for your time. Let me know if you have any suggestions/ideas on how we should add more semantic information to the MC layer.

@amsen20
Copy link
Contributor Author

amsen20 commented Jul 16, 2025

Hey @amsen20,
I've only read the introduction to your proposal and it looks interesting! Thanks for your effort. I'll try to have a better look in the coming days.
I know this is only a draft, but regarding the couple of 'nit' bolt changes, those could be added in a separate patch?

This draft is not meant to be merged; the reviewers are added automatically because I changed parts of BOLT's code. The minimal version of the checker has landed in this PR. However, I was wondering if any of the BOLT owners are interested in moving MCPlusBuilder to the main LLVM MC layer. In the RFC, I have discussed that the main bottleneck of the analysis is the semantic information, which only MCPlusBuilder in BOLT currently possesses.

Thank you for your time. Let me know if you have any suggestions/ideas on how we should add more semantic information to the MC layer.

@paschalis-mpeis
Just wanted to remind you in case you missed this.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants