Introduce PyUtf8Str and fix(sqlite): validate surrogates in SQL statements #5969
stdlib/src/sqlite.rs
Outdated
let _ = sql.try_to_str(vm)?;
let sql_cstr = sql.to_cstring(vm)?;
This means we shouldn't create the CString from WTF-8; we have to create it from a str.
The underlying logic of this approach is fine, but discarding the result of try_to_str with let _ = sql.try_to_str(vm)?; is weird.
How about adding a new helper method, fn ensure_utf8(&self) -> PyResult<()>, which provides basically the same feature but with better naming? It can call try_to_str inside.
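A minimal sketch of the ensure_utf8 idea, outside of RustPython. LooseStr is a hypothetical stand-in for a WTF-8 backed string, and String is used as the error type instead of PyResult; neither is the project's real API. The only point is that the helper reuses try_to_str and makes the "validate only" intent explicit at the call site.

```rust
/// Hypothetical stand-in for a WTF-8 backed Python string (not RustPython's type).
struct LooseStr {
    bytes: Vec<u8>,
}

impl LooseStr {
    /// Returns the contents as `&str` only when the bytes are valid UTF-8.
    /// WTF-8 encoded lone surrogates (e.g. ED A0 80) are rejected here.
    fn try_to_str(&self) -> Result<&str, String> {
        std::str::from_utf8(&self.bytes)
            .map_err(|e| format!("string is not valid UTF-8: {}", e))
    }

    /// Same check as `try_to_str`, but the name says that only validation is wanted,
    /// so callers no longer need `let _ = ...` to discard the slice.
    fn ensure_utf8(&self) -> Result<(), String> {
        self.try_to_str().map(|_| ())
    }
}
```

With a helper like this, the call site would read sql.ensure_utf8()? rather than binding and discarding a value.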
Another suggestion, and the proper way to handle it:
impl PyRef<PyStr> {
    ...
    fn into_utf8(self) -> PyRef<PyUtf8Str>
    ...
}
struct PyUtf8Str(PyStr);
This will be useful when we actually need a UTF-8 string instead of WTF-8.
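The newtype suggestion can be sketched in isolation as follows. LooseStr, Utf8Str, and sql_to_cstring are hypothetical names for illustration, and the conversion is made fallible here (returning the original value on failure), which is an assumption; the signature in the comment above does not say how invalid input is handled.

```rust
use std::ffi::{CString, NulError};

/// Hypothetical stand-in for a WTF-8 backed string (not RustPython's PyStr).
#[derive(Debug)]
struct LooseStr {
    bytes: Vec<u8>,
}

/// Newtype whose invariant is "the inner bytes are valid UTF-8".
#[derive(Debug)]
struct Utf8Str(LooseStr);

impl LooseStr {
    /// Validates once; on failure the original value is handed back unchanged.
    fn into_utf8(self) -> Result<Utf8Str, LooseStr> {
        if std::str::from_utf8(&self.bytes).is_ok() {
            Ok(Utf8Str(self))
        } else {
            Err(self)
        }
    }
}

impl Utf8Str {
    /// Returns the underlying string slice without re-validating.
    fn as_str(&self) -> &str {
        // Safety: `into_utf8` is the only constructor and it validated the bytes.
        unsafe { std::str::from_utf8_unchecked(&self.0.bytes) }
    }
}

/// Builds the C string from the validated `&str`, never from raw WTF-8 bytes.
fn sql_to_cstring(sql: &Utf8Str) -> Result<CString, NulError> {
    CString::new(sql.as_str())
}
```

The design choice is that validation happens once at the type boundary, so any function taking the wrapper can rely on UTF-8 without checking again.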
vm/src/builtins/str.rs
Outdated
/// Returns the underlying string slice. This is safe because the
/// type invariant guarantees UTF-8 validity.
pub fn as_str(&self) -> &str {
    debug_assert!(
        self.0.is_utf8(),
        "PyUtf8Str invariant violated: inner string is not valid UTF-8"
    );
    unsafe { self.0.to_str().unwrap_unchecked() }
}
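The safety argument is trivial for users, but not for devs, so it belongs in a Safety comment next to the unsafe block rather than in the doc comment.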
- /// Returns the underlying string slice. This is safe because the
- /// type invariant guarantees UTF-8 validity.
- pub fn as_str(&self) -> &str {
-     debug_assert!(
-         self.0.is_utf8(),
-         "PyUtf8Str invariant violated: inner string is not valid UTF-8"
-     );
-     unsafe { self.0.to_str().unwrap_unchecked() }
- }
+ /// Returns the underlying string slice.
+ pub fn as_str(&self) -> &str {
+     debug_assert!(
+         self.0.is_utf8(),
+         "PyUtf8Str invariant violated: inner string is not valid UTF-8"
+     );
+     // Safety: This is safe because the type invariant guarantees UTF-8 validity.
+     unsafe { self.0.to_str().unwrap_unchecked() }
+ }
#[derive(Debug)]
pub struct PyUtf8Str(PyStr);

impl std::ops::Deref for PyUtf8Str {
- impl std::ops::Deref for PyUtf8Str {
+ // TODO: Remove this Deref which may hide missing optimized methods of PyUtf8Str
+ impl std::ops::Deref for PyUtf8Str {
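To illustrate the concern behind the TODO, here is a small sketch with hypothetical types (LooseStr, Utf8Str, and char_count are not RustPython names). Because of Deref, a method call on the wrapper compiles even though the wrapper defines no such method, so a missing UTF-8 specific fast path goes unnoticed.

```rust
use std::ops::Deref;

/// Hypothetical stand-in for a WTF-8 backed string.
struct LooseStr {
    bytes: Vec<u8>,
}

impl LooseStr {
    /// General path: has to validate before it can count characters,
    /// and has to report failure through `Option`.
    fn char_count(&self) -> Option<usize> {
        std::str::from_utf8(&self.bytes)
            .ok()
            .map(|s| s.chars().count())
    }
}

/// Wrapper that is assumed to hold only valid UTF-8.
struct Utf8Str(LooseStr);

impl Deref for Utf8Str {
    type Target = LooseStr;

    fn deref(&self) -> &LooseStr {
        &self.0
    }
}
```

Calling char_count on a Utf8Str resolves to the LooseStr method through Deref, so it re-validates and returns an Option even though the wrapper could return usize directly. Without the Deref impl the call would fail to compile, which would flag the missing method.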
Thank you!