Skip to content

Fix invalid args count msg #5960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 15, 2025

Conversation

ShaharNaveh
Copy link
Contributor

@ShaharNaveh ShaharNaveh commented Jul 13, 2025

Summary by CodeRabbit

  • New Features

    • Added an arity method to generated argument parsing implementations, allowing users to query the expected argument count range.
  • Style

    • Updated error messages for argument count mismatches to use a consistent, lowercase format for improved readability.

Copy link
Contributor

coderabbitai bot commented Jul 13, 2025

Walkthrough

The changes update the impl_from_args macro to generate an additional arity method for structs, returning the valid argument count range. In the VM's function argument handling, error messages for argument count mismatches are reformatted for consistency and style, without altering logic or control flow.

Changes

File(s) Change Summary
derive-impl/src/from_args.rs Modified impl_from_args to return both fields and arity; generated arity() method in FromArgs.
vm/src/function/argument.rs Reformatted argument count error messages for style consistency.

Poem

In the fields where arguments grow,
A rabbit counts them, row by row.
Now arity’s clear, from start to end,
And error words politely send.
With every hop, the code refines—
Consistent, clear, in tidy lines!
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@@ -213,7 +213,7 @@ impl FuncArgs {

if !self.args.is_empty() {
Err(vm.new_type_error(format!(
"Expected at most {} arguments ({} given)",
"expected at most {} arguments, got {}",
Copy link
Contributor Author

@ShaharNaveh ShaharNaveh Jul 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message needs to have a prefix of the caller name.
For example:

Got:

TypeError: expected at most 3 arguments, got 4

Expected:

TypeError: str expected at most 3 arguments, got 4

Tried to see how this can be done for quite a while, I need a hint how to do it😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@youknowone do you have a suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not easy. We have to pass more context but you probably found we dont have interface for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably found we dont have interface for that.

Correct :/


How should this PR continue? I think that it's an improvement overall as now it makes it so the number of argument is correct in the error message. it used to be:

at most 0 arguments, got 4

Should I restore the test and make this PR ready?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please, thank you for deep-diving this issue!

@ShaharNaveh ShaharNaveh force-pushed the fix-invalid-arg-count-msg branch from 7fa853a to b97955a Compare July 15, 2025 14:25
@@ -213,7 +213,7 @@ impl FuncArgs {

if !self.args.is_empty() {
Err(vm.new_type_error(format!(
"Expected at most {} arguments ({} given)",
"expected at most {} arguments, got {}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes please, thank you for deep-diving this issue!

@ShaharNaveh ShaharNaveh marked this pull request as ready for review July 15, 2025 14:40
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
derive-impl/src/from_args.rs (1)

183-183: Address the TODO comment for dynamic arity bounds.

The TODO comment indicates that the lower arity bound should be set dynamically. The current implementation assumes all fields are required (lower bound = 0), but this doesn't account for optional fields or fields with default values.

Do you want me to help implement dynamic arity calculation that accounts for optional fields and default values?

📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a54873d and b97955a.

📒 Files selected for processing (2)
  • derive-impl/src/from_args.rs (1 hunks)
  • vm/src/function/argument.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

Instructions used from:

Sources:
📄 CodeRabbit Inference Engine

  • .github/copilot-instructions.md
🧠 Learnings (1)
vm/src/function/argument.rs (1)
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-06-30T10:08:48.858Z
Learning: Applies to **/*.rs : Follow Rust best practices for error handling and memory management
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
🔇 Additional comments (5)
derive-impl/src/from_args.rs (2)

184-194: LGTM! Good addition of arity information.

The tuple return and arity calculation provide the necessary information for improved error messages. The fields.len() correctly represents the maximum number of arguments for the struct.


208-208: Good improvement: Use Self instead of explicit struct name.

This change makes the generated code more maintainable and follows Rust best practices.

vm/src/function/argument.rs (3)

216-216: LGTM! Consistent error message formatting.

The change from capitalized "Expected" to lowercase "expected" and the "got" format makes error messages more consistent with typical conventions.


266-269: LGTM! Consistent error message formatting.

The formatting change improves consistency across all argument count error messages.


271-274: LGTM! Consistent error message formatting.

The formatting change aligns with the other error message improvements in this file.

Comment on lines +200 to +203
fn arity() -> ::std::ops::RangeInclusive<usize> {
0..=#arity
}

Copy link
Contributor

@coderabbitai coderabbitai bot Jul 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify the arity range calculation for edge cases.

The current implementation returns 0..=#arity which assumes all fields are optional. This might not be accurate for structs with required fields that don't have default values.

Let me verify how this interacts with required fields:


🏁 Script executed:

#!/bin/bash
# Description: Check how required fields are handled in the current implementation

# Find structs that use FromArgs derive and check for required fields
ast-grep --pattern 'struct $NAME {
  $$$
}' | head -20

# Look for examples of required vs optional field handling
rg -A 5 -B 5 "ArgumentError::TooFewArgs"

Length of output: 2620


🏁 Script executed:

#!/bin/bash
# Display the context of the generated `arity` function and how `#arity` is computed
rg -n "fn arity" -A 10 -B 10 derive-impl/src/from_args.rs

Length of output: 947


Compute correct positional arity bounds based on required vs. optional fields

The generated fn arity() currently emits 0..=#arity (where #arity is fields.len()), which:

  • Always allows zero arguments—even when some fields are required
  • Counts keyword-only and defaulted fields toward both bounds

Instead, derive the bounds as:

  • min_arity = number of non-default PositionalOnly + PositionalOrKeyword fields
  • max_arity = total number of PositionalOnly + PositionalOrKeyword fields

• File: derive-impl/src/from_args.rs (lines 200–203)

Suggested diff:

-            fn arity() -> ::std::ops::RangeInclusive<usize> {
-                0..=#arity
-            }
+            fn arity() -> ::std::ops::RangeInclusive<usize> {
+                #min_arity..=#max_arity
+            }

Where you compute min_arity/max_arity by filtering attr.kind and checking attr.default.is_none().

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In derive-impl/src/from_args.rs around lines 200 to 203, the arity function
currently returns a range from 0 to the total number of fields, which
incorrectly allows zero arguments even when required fields exist and includes
keyword-only and defaulted fields in the count. To fix this, compute min_arity
as the count of positional fields (PositionalOnly and PositionalOrKeyword)
without defaults, and max_arity as the total count of these positional fields
regardless of defaults. Then return the range min_arity..=max_arity in the arity
function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true but requires some refactoring as we return a Result when we are calling generate_field so we lose the attr.kind before we can inspect it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShaharNaveh You're absolutely right! The current structure makes it challenging to compute accurate arity bounds because generate_field encapsulates both attribute parsing and code generation, and we lose access to the attr.kind and attr.default information after it returns the TokenStream.

Here's a refactoring approach that would solve this:

pub fn impl_from_args(input: DeriveInput) -> Result<TokenStream> {
    let (fields, field_attrs) = match input.data {
        Data::Struct(syn::DataStruct { fields, .. }) => {
            // First pass: parse attributes from all fields
            let field_attrs: Result<Vec<_>> = fields
                .iter()
                .enumerate()
                .map(|(i, field)| parse_field_attributes(i, field))
                .collect();
            let field_attrs = field_attrs?;
            
            // Second pass: generate field code using parsed attributes
            let generated_fields: Result<TokenStream> = field_attrs
                .iter()
                .map(|attr| generate_field_from_attr(attr))
                .collect();
            
            (generated_fields?, field_attrs)
        },
        _ => bail_span!(input, "FromArgs input must be a struct"),
    };

    // Now we can compute accurate arity bounds
    let (min_arity, max_arity) = compute_arity_bounds(&field_attrs);
    
    // Rest of the implementation...
}

fn compute_arity_bounds(field_attrs: &[FieldAttribute]) -> (usize, usize) {
    let positional_fields = field_attrs.iter().filter(|attr| {
        matches!(attr.kind, ParameterKind::PositionalOnly | ParameterKind::PositionalOrKeyword)
    });
    
    let min_arity = positional_fields.clone().filter(|attr| attr.default.is_none()).count();
    let max_arity = positional_fields.count();
    
    (min_arity, max_arity)
}

This way, we separate the concerns and have access to all the attribute information needed to compute the correct arity bounds. Would you like me to help plan the complete refactoring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, will do it in a separate PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShaharNaveh Sounds good—happy to help when you pick it up. Let me know if you’d like me to open an issue to track the refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. open an issue about this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

I'll create an issue to track this refactoring work.

@ShaharNaveh I've created an issue to track the refactoring work for computing accurate arity bounds in the FromArgs derive macro. The issue includes the technical details we discussed and references back to this PR and comment thread.

  /|   /|   
 ( :v: )
o_)   (   

/ | | \
( ( ) )
\ _/ /
\ /
| |
| |


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue to track the arity bounds refactoring
gh issue create \
  --title "Refactor FromArgs derive to compute accurate arity bounds" \
  --assignee ShaharNaveh \
  --body "## Problem

The current \`impl_from_args\` function in \`derive-impl/src/from_args.rs\` computes arity bounds incorrectly. The generated \`fn arity()\` returns \`0..=#arity\` (where \`#arity\` is \`fields.len()\`), which:

- Always allows zero arguments—even when some fields are required
- Counts keyword-only and defaulted fields toward both bounds
- Does not distinguish between positional and keyword-only parameters

## Root Cause

The current structure makes it challenging to compute accurate arity bounds because \`generate_field\` encapsulates both attribute parsing and code generation, and we lose access to the \`attr.kind\` and \`attr.default\` information after it returns the \`TokenStream\`.

## Solution

Refactor the code to separate attribute parsing from code generation:

1. **First pass**: Parse attributes from all fields to extract \`attr.kind\` and \`attr.default\` 
2. **Second pass**: Generate field code using parsed attributes
3. **Compute bounds**: Calculate accurate arity bounds as:
   - \`min_arity\` = number of non-default \`PositionalOnly\` + \`PositionalOrKeyword\` fields
   - \`max_arity\` = total number of \`PositionalOnly\` + \`PositionalOrKeyword\` fields

## References

- Original PR: https://github.com/RustPython/RustPython/pull/5960
- Discussion: https://github.com/RustPython/RustPython/pull/5960#discussion_r2207717068

## Implementation Notes

The refactoring should split \`generate_field\` into separate functions for parsing and generation, allowing \`impl_from_args\` to access the parsed attribute information needed for accurate arity calculation."

Length of output: 1735


@ShaharNaveh Perfect! I've successfully created issue #5980 to track the refactoring work: #5980

The issue includes:

  • Clear description of the current arity bounds problem
  • Technical details about the root cause
  • Proposed solution approach with implementation steps
  • References back to this PR and our discussion
  • Assignment to you as requested

This should provide a good foundation for the follow-up refactoring work!

  /|   /|   
 ( :v: )
o_)   (   

/ | | \
( ( ) )
\ _/ /
\ /
| |
| |


🧬 Code Graph Analysis Results

Relevant snippet: derive-impl/src/from_args.rs (entire file)

The entire content of derive-impl/src/from_args.rs is relevant to understanding the file and the user comment. It defines the procedural macro implementation for deriving the FromArgs trait, which converts Python function arguments into Rust struct fields.

Key points in the file:

  • Enum ParameterKind: Represents the kind of Python parameter (positional-only, positional-or-keyword, keyword-only, flatten).
  • Struct ArgAttribute: Represents attributes parsed from #[pyarg(...)] on struct fields, including parameter kind, optional name, and default value.
  • Parsing logic: ArgAttribute::from_attribute parses the #[pyarg(...)] attribute, enforcing that the first argument specifies the parameter kind.
  • Field generation: generate_field generates the code to extract each field from the Python arguments, handling defaults, optional values, and flattening.
  • Trait implementation: impl_from_args generates the FromArgs trait implementation for the struct, including the arity method and the from_args method that constructs the struct from Python arguments.

This file is central to the macro's functionality and likely the subject of the user's comment about opening an issue.


Possibly relevant snippets from vm/src/function/argument.rs

Lines 181-195:

pub fn take_positional(&mut self) -> Option<PyObjectRef> {
    if self.args.is_empty() {
        None
    } else {
        Some(self.args.remove(0))
    }
}

pub fn take_positional_keyword(&mut self, name: &str) -> Option<PyObjectRef> {
    self.take_positional().or_else(|| self.take_keyword(name))
}

pub fn take_keyword(&mut self, name: &str) -> Option<PyObjectRef> {
    self.kwargs.swap_remove(name)
}

Summary:
These methods on FuncArgs are used in generate_field to extract positional, positional-or-keyword, and keyword arguments by name or position. They return an optional Python object reference, which is then converted to the Rust field type.


Possibly relevant snippets from derive-impl/src/pytraverse.rs

Lines 36-40 and 74-78:

field
    .attrs
    .iter()
    .filter_map(pytraverse_arg)
    .collect::<std::result::Result<Vec<_>, _>>()

fields
    .named
    .iter_mut()
    .map(|f| -> Result<TokenStream> { field_to_traverse_code(f) })
    .collect::<Result<_>>()

Summary:
These snippets show a pattern of iterating over struct fields and their attributes, filtering and parsing attributes, and generating code for each field. This is similar to the attribute parsing and code generation in from_args.rs.


Possibly relevant snippets from derive-impl/src/util.rs

Multiple lines (254-254, 277-279, 291-293, 325-327, 348-350, 451-453):

fn inner(&self) -> &ItemMetaInner {
    &self.0
}

fn from_inner(inner: ItemMetaInner) -> Self {
    Self(inner)
}

Summary:
These utility functions provide accessors and constructors for internal metadata structures used in procedural macro attribute parsing and code generation. They may be used by from_args.rs or related macros for handling attribute metadata.


Possibly relevant snippets from derive-impl/src/pyclass.rs

Lines 1221-1223, 1251-1253, 1345-1347, 1398-1400, 1217-1219, 1247-1249, 1341-1343, 1394-1396:

fn inner(&self) -> &ItemMetaInner {
    &self.0
}

fn from_inner(inner: ItemMetaInner) -> Self {
    Self(inner)
}

Summary:
Similar to util.rs, these provide methods for accessing and constructing internal metadata for pyclass procedural macros, which may share patterns or utilities with from_args.rs.


Possibly relevant snippets from derive-impl/src/compile_bytecode.rs

Line 323:

input.parse::<Macro>

Summary:
This snippet shows parsing input tokens as a macro, which is a common step in procedural macro implementations like from_args.rs.


Possibly relevant snippets from vm/src/stdlib/io.rs

Lines 76-97:

fn try_from_object(vm: &VirtualMachine, obj: PyObjectRef) -> PyResult<Self> {
    use crate::builtins::int;
    let int = match obj.downcast::<int::PyInt>() {
        Ok(i) => i,
        Err(obj) => {
            let fileno_meth = vm.get_attribute_opt(obj, "fileno")?.ok_or_else(|| {
                vm.new_type_error("argument must be an int, or have a fileno() method.")
            })?;
            fileno_meth
                .call((), vm)?
                .downcast()
                .map_err(|_| vm.new_type_error("fileno() returned a non-integer"))?
        }
    };
    let fd = int.try_to_primitive(vm)?;
    if fd < 0 {
        return Err(vm.new_value_error(format!(
            "file descriptor cannot be a negative integer ({fd})"
        )));
    }
    Ok(Self(fd))
}

Summary:
This function converts a Python object to a Rust file descriptor, handling type checking and errors. It illustrates typical argument parsing and error handling patterns that may be relevant to from_args.rs's argument conversion.


[End of relevant snippets.]

@youknowone youknowone merged commit 5309e8c into RustPython:main Jul 15, 2025
12 checks passed
@ShaharNaveh ShaharNaveh deleted the fix-invalid-arg-count-msg branch July 17, 2025 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants