
MultiNorm class #29876


Open · wants to merge 5 commits into main from multivariate-plot-prapare
Conversation

trygvrad
Contributor

@trygvrad trygvrad commented Apr 6, 2025

PR summary

This PR continues the work of #28658 and #28454, aiming to close #14168. (Feature request: Bivariate colormapping)

This is part one of the former PR, #29221. Please see #29221 for the previous discussion

Features included in this PR:

  • A MultiNorm class. This is a subclass of colors.Normalize and holds n_variate norms.
  • Testing of the MultiNorm class

Features not included in this PR:

  • Changes to colorizer.py needed to expose the MultiNorm class
  • Exposing the functionality provided by MultiNorm together with BivarColormap and MultivarColormap to the plotting functions axes.imshow(...), axes.pcolor(...), and axes.pcolormesh(...)
  • Testing of the new plotting methods
  • Examples in the docs
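As a rough illustration of the intent (the class name and exact API below are a toy sketch inferred from the review discussion, not the PR's implementation), a MultiNorm-like object holds one norm per component and normalizes each component independently:

```python
import numpy as np

class ToyMultiNorm:
    """Toy stand-in for the proposed MultiNorm: holds one linear norm per
    component and normalizes each component independently (illustrative)."""

    def __init__(self, limits):
        # limits: one (vmin, vmax) pair per component
        self._limits = list(limits)

    @property
    def n_components(self):
        return len(self._limits)

    def __call__(self, data):
        # data: tuple/list with one array-like per component
        if len(data) != self.n_components:
            raise ValueError(f"expected {self.n_components} components")
        return tuple((np.asarray(d, float) - lo) / (hi - lo)
                     for d, (lo, hi) in zip(data, self._limits))

norm = ToyMultiNorm([(0, 10), (-1, 1)])
out = norm((np.array([0.0, 5.0]), np.array([0.0, 1.0])))
# each component is scaled by its own (vmin, vmax) pair
```

The real class additionally subclasses colors.Normalize and delegates to n constituent Normalize instances; this sketch only mirrors the call shape.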

Comment on lines 4103 to 4105
in the case where an invalid string is used. This cannot use
`_api.check_getitem()`, because the norm keyword accepts arguments
other than strings.
Member

I'm confused because this function is only called for isinstance(norm, str).

Contributor Author

So this function exists because the norm keyword accepts Normalize objects in addition to strings.
This is fundamentally the same error you get if you give an invalid norm to a Colorizer object.

In main, the @norm.setter on colorizer.Colorizer reads:

    @norm.setter
    def norm(self, norm):
        _api.check_isinstance((colors.Normalize, str, None), norm=norm)
        if norm is None:
            norm = colors.Normalize()
        elif isinstance(norm, str):
            try:
                scale_cls = scale._scale_mapping[norm]
            except KeyError:
                raise ValueError(
                    "Invalid norm str name; the following values are "
                    f"supported: {', '.join(scale._scale_mapping)}"
                ) from None
            norm = _auto_norm_from_scale(scale_cls)()
    ...

The _get_scale_cls_from_str() exists in this PR because this functionality is now needed by both colorizer.Colorizer.norm() and colors.MultiNorm.
Note this PR does not include changes to colorizer.Colorizer.norm() so that it makes use of _get_scale_cls_from_str(). These changes follow in the next PR: #29877 .

@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch from 55b85e3 to f42d65b Compare April 17, 2025 15:18
Contributor

@anntzer anntzer left a comment

Just some minor points, plus re-pinging @timhoffm in case he has an opinion re: n_input / input_dims naming?

@trygvrad
Contributor Author

trygvrad commented May 4, 2025

Thank you for the feedback @anntzer !
Hopefully we can hear if @timhoffm has any thoughts on n_input / input_dims naming within the coming week.

@timhoffm
Member

timhoffm commented May 5, 2025

See #29876 (comment)

@trygvrad
Contributor Author

trygvrad commented May 7, 2025

Thank you @timhoffm
The PR should now be as we agreed (#29876 (comment)) :)

@trygvrad
Contributor Author

trygvrad commented Jun 1, 2025

@QuLogic Thank you again and apologies for my tardiness (I was sick)
@timhoffm Do you think you could approve this PR now?

vmin, vmax : float, None, or list of float or None
Limits of the constituent norms.
If a list, each value is assigned to each of the constituent
norms. Single values are repeated to form a list of appropriate size.
Member

Is broadcasting reasonable here? I would assume that most MultiNorms have different scales and thus need per-element entries anyway. It could also be an oversight to pass a single value instead of multiple values.

I'm therefore tempted to not allow scalars here but require exactly n_variables values. A more narrow and explicit interface may be the better start. We can always later expand the API to broadcast scalars if we see that's a typical case and reasonable in terms of usability.

Contributor Author

@timhoffm Perhaps this is also a topic for the weekly meeting :)

Contributor Author

I'm perfectly fine with removing this here, and perhaps that is a good starting point.

My entry into this topic was a use case (dark-field X-ray microscopy, DFXRM) where we typically want vmax0 = vmax1 = -vmin0 = -vmin1, i.e. equal normalizations centered on zero, and given that entry point it felt natural to me to include broadcasting.
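For context, a minimal sketch of the scalar-broadcasting behavior under discussion (the helper name is hypothetical):

```python
import numpy as np

def broadcast_limit(value, n_components):
    """Broadcast a scalar vmin/vmax to one value per constituent norm;
    a sequence must already have exactly n_components entries (sketch)."""
    if np.iterable(value):
        value = list(value)
        if len(value) != n_components:
            raise ValueError(f"expected {n_components} values, got {len(value)}")
        return value
    return [value] * n_components

# DFXRM-style symmetric limits: one scalar serves all components
vmaxs = broadcast_limit(1.0, 2)
vmins = broadcast_limit(-1.0, 2)
```

The stricter alternative discussed above would simply drop the final `return [value] * n_components` branch and always require a sequence.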

@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch from 6b86d63 to 32247f5 Compare June 4, 2025 21:01
@trygvrad
Contributor Author

trygvrad commented Jun 6, 2025

This is on hold until we sort out #30149 (Norm Protocol)
see #29876 (comment)

@trygvrad
Contributor Author

I have rebased this PR and updated the MultiNorm to inherit from the Norm ABC now that #30178 has been merged :)

@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch 2 times, most recently from 47b5116 to 63feb6b Compare June 29, 2025 14:09
assert norm.vmax[0] == 2
assert norm.vmax[1] == 2

# test call with clip
Member

We need more extensive testing for call with multiple values.

Contributor Author

Agreed, we need some more tests.
Should we make them here, or test via the top level plotting functions once they are implemented?

Member

I would do the tests here because you can directly test the numeric result of the norm function. With plotting functions, you'll get colors as the result, which are much harder to test / have less expressiveness.

Here you can do a lot of cases as one-liners.
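Since MultiNorm itself is still under review, here is a numpy-only sketch of the kind of one-liner numeric checks meant here; the reference function stands in for the expected per-component result of a MultiNorm(['linear', 'linear']) call:

```python
import numpy as np

def linnorm(x, vmin, vmax):
    """Reference 1D linear normalization (the expected per-component result)."""
    return (np.asarray(x, float) - vmin) / (vmax - vmin)

# one-liner numeric checks, component by component
assert np.allclose(linnorm([0, 5, 10], 0, 10), [0, 0.5, 1])
assert np.allclose(linnorm([-1, 0, 1], -1, 1), [0, 0.5, 1])
# with clip=True the result would additionally be bounded to [0, 1]
assert np.allclose(np.clip(linnorm([20], 0, 10), 0, 1), [1])
```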

@trygvrad
Contributor Author

trygvrad commented Jul 3, 2025

@timhoffm Thank you for taking the time to give detailed comments!

@trygvrad
Contributor Author

trygvrad commented Jul 3, 2025

@timhoffm You made a comment here #29876 (comment):

Then (if we still want to expose MultiNorm), let's just call this n_variables and keep it private.

The variable has since changed name to n_components, but I feel like we haven't really explored the merits of keeping it private/public.

The current status is the public property:

    @property
    def n_components(self):
        """Number of norms held by this `MultiNorm`."""
        return len(self._norms)

I think we actually have 3 options:

  1. The current status: A public property.
  2. Remove n_components completely, and replace it with len(self._norms) wherever it occurs. (This would require a type-check for MultiNorm before len(self._norms) is accessed, if the object could also be a Normalize).
  3. Make n_components private.

This impacts how we communicate with the user via docstrings, i.e. in MultiNorm.__call__()
[screenshot of the MultiNorm.__call__ docstring]

Here we could have:

  1. - If tuple, must be of length n_components
  2. - If tuple, must be of length equal to the number of norms held by this MultiNorm

i.e. we can choose to make n_components part of the vocabulary we use, or we can choose not to. My hunch is that this is a useful term to have for both internal discussions and external communication.
It is also useful to keep in mind that MultiNorm should mirror the situation with colormaps, where currently Colormap, BivarColormap and MultivarColormap have n_components values of 1, 2, and len(self), respectively.

Once the top level plotting functions are made, there will be a check if the data pipeline is consistent, i.e.
number of fields in the data == cmap.n_components == norm.n_components
If we believe that there are users (i.e. developers of packages that rely on matplotlib) that want to use this functionality in some interesting way, I think it would be useful for them to have access to n_components

On the other hand, while I am quite certain that multivariate plotting functionality will be used by a number of users, it is unclear to me if any/how many will build advanced functionality that requires access to n_components, which in reality matters most for error handling etc., and the typical user does not write custom error messages.
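The pipeline consistency check described above could look roughly like this (the function name and signature are hypothetical):

```python
def check_pipeline(n_data_fields, cmap_n_components, norm_n_components):
    """Sketch of the consistency check between data, colormap and norm:
    all three must agree on the number of components."""
    if not (n_data_fields == cmap_n_components == norm_n_components):
        raise ValueError(
            f"Mismatched multivariate pipeline: data has {n_data_fields} "
            f"fields, cmap expects {cmap_n_components} components, norm "
            f"expects {norm_n_components} components")

check_pipeline(2, 2, 2)  # consistent: passes silently
```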


@timhoffm Let me know if I should remove n_components or make it private. Also if I should do the same for the colormap classes.

@github-actions github-actions bot added the Documentation: API files in lib/ and doc/api label Jul 3, 2025
@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch 2 times, most recently from ab61439 to 910b343 Compare July 3, 2025 12:54
@timhoffm
Member

timhoffm commented Jul 4, 2025

I think making it public is the right way to go. While it slightly complicates the mental model for all the current 1D norms, it is by design that they are now only a special case, and hence slightly more complicated.

@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch from 910b343 to dbeca30 Compare July 10, 2025 19:18
"""
Parameters
----------
norms : list of (str, `Normalize` or None)
Member

@timhoffm timhoffm Jul 10, 2025

Why do we accept None here? I don't see an advantage of [None, None] over ["linear", "linear"]. Quite the opposite: the first one is less readable.

Member

Maybe in case folks want to update the list later?

Member

@timhoffm timhoffm Jul 11, 2025

That doesn't make sense to me. (1) None is resolved to Normalize immediately. (2) Not sure why one would want to update later, but that's possible either way.

I rather suspect that it's along the lines of: in 1D, some other places - e.g. ScalarMappable.set_norm - accept None. But likely the other places only accept it intentionally or unintentionally because norm=None is somewhere a default kwarg and that is passed through.

Contributor Author

I rather suspect that it's along the lines of: in 1D, some other places - e.g. ScalarMappable.set_norm - accept None. But likely the other places only accept it intentionally or unintentionally because norm=None is somewhere a default kwarg and that is passed through.

Exactly this.
My thinking was that a common use case is not to supply a norm:

ax.imshow((A, B), cmap='BiOrangeBlue')

In this case the norm keyword (default: None) is singular, but to be compatible with the data/cmap it needs to have a length of 2, so in this case the norm keyword is repeated and becomes [None, None], and this then gets passed to the MultiNorm constructor.

In any case, it is better if, as you say, we only accept strings.
I will make the change later.


See colorizer.py → _ensure_norm(norm, n_variates=n_variates) for the proposed implementation. (This can also be simplified if we no longer accept None.)
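The actual helper lives in the PR's colorizer.py; as a rough sketch of what it might do (hypothetical, not the PR's code), it expands a singular norm spec to one spec per variate:

```python
def ensure_norm_sketch(norm, n_variates=1):
    """Hypothetical sketch of _ensure_norm: expand a single norm spec
    (e.g. "linear" or None) to one spec per variate (illustrative)."""
    if n_variates == 1:
        return norm
    if isinstance(norm, (list, tuple)):
        if len(norm) != n_variates:
            raise ValueError(f"expected {n_variates} norms, got {len(norm)}")
        return list(norm)
    # a singular spec is repeated so it matches the data/cmap
    return [norm] * n_variates
```

With norm=None and n_variates=2 this yields [None, None], which is exactly the pass-through case discussed above.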

Member

That doesn't make sense to me. (1) None is resolved to Normalize immediately. (2) Not sure why one would want to update later, but that's possible either way.

What happens with ['linear', None]?

Contributor Author

It would be parsed to ['linear', 'linear'], but will now (with the update) produce an error.
If we find we need this functionality, we can always bring it back in a later PR.

Member

@story645 story645 Jul 16, 2025

So how do you denote ['linear', mpl.NoNorm()]? is that ['linear', 'none']?

@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch from 23777b3 to 3a4f6b9 Compare July 12, 2025 09:52
@trygvrad
Contributor Author

Thank you @timhoffm , the changes should now be in :)

Data to normalize, as tuple, scalar array or structured array.

- If tuple, must be of length `n_components`
- If scalar array, the first axis must be of length `n_components`
Member

Did we have a discussion on the axis order? I would have expected it the other way round. Note that we represent RGB arrays as (N, 3), where N is the color count and 3 are the r, g, b values. The call here should be comparable, i.e. the components should be the second dimension.

Contributor Author

@anntzer had the same question above (#29876 (comment)) I will repeat my answer (#29876 (comment)) here:


We had a brief discussion on this here #14168 (comment), but we followed that up on a weekly meeting 14 September 2023 https://hackmd.io/@matplotlib/Skaz1lrh6#Mutlivariate-colormapping .

The argument is not formulated well in the log, but basically (V, N, M) because there are V cmaps and norms, and in this way the input for the data mirrors that of the vmin, vmax and norm keywords, i.e.: ax.imshow((data0, data1), vmin=(vmin0, vmin1), cmap='BiOrangeBlue') or

ax.imshow((pressure, temperature),
           vmin=(vmin_pressure, vmin_temperature),
           cmap='BiOrangeBlue')

Quite often the data is qualitatively different, like the example at the bottom of the page here: https://trygvrad.github.io/mpl_docs/Multivariate%20colormaps.html where GDP_per_capita and Annual_growth are plotted together. I find it is not very intuitive to wrap these in a single array.

(also: this allows the different variates to have different data types)


While I see the argument that this has similarities to RGB images, I believe it is also qualitatively quite different, because this works through the norm→colormap pipeline while RGB images bypass that machinery by design.

The way I see it is that the three values of a pixel in an RGB image (in the typical case, i.e. a drawing/diagram or picture taken with a camera) work together to encode one piece of information (color), while if you plot GDP_per_capita and Annual_growth together using separate norms and a 2D colormap, you have two pieces of information at each point, even though those two pieces are represented by a single color via a colormap.

Member

@timhoffm timhoffm Jul 13, 2025

Thanks for repeating the answer. The topic is so large that it’s difficult to keep an overview of the state of the discussion.

Your arguments are valid, but only cover the case of a list/tuple of 1D array-likes, and there I agree. However, I argue that 2D arrays are a separate topic and should be handled differently, because there datasets are typically structured in columns. See e.g. the heights parameter in https://matplotlib.org/devdocs/api/_as_gen/matplotlib.axes.Axes.grouped_bar.html#matplotlib.axes.Axes.grouped_bar

Contributor Author

Aha!

Just to make sure, you are arguing that a tuple/list of 2D arrays should be handled differently than a 3D array?
I have no problems with this, if you think it is OK from a syntax point of view :)

In practice this would change only

"""
            - If scalar array, the first axis must be of length `n_components`
"""

to:

"""
            - If scalar array, the last axis must be of length `n_components`
"""

And in the implementation, this would add an else: to MultiNorm._iterable_components_in_data():

    @staticmethod
    def _iterable_components_in_data(data, n_components):
        """
        Provides an iterable over the components contained in the data.

        An input array with `n_components` fields is returned as a tuple of
        length `n_components` referencing slices of the original array.

        Parameters
        ----------
        data : np.ndarray, tuple or list
            The input data, as tuple, scalar array or structured array.

            - If tuple, must be of length `n_components`
            - If scalar array, the last axis must be of length `n_components`
            - If structured array, must have `n_components` fields.


        Returns
        -------
        tuple of np.ndarray

        """
        if isinstance(data, np.ndarray):
            if data.dtype.fields is not None:
                data = tuple(data[descriptor[0]] for descriptor in data.dtype.descr)
            else:
                data = tuple(data[..., i] for i in range(data.shape[-1]))
        if len(data) != n_components:
            raise ValueError("The input to this `MultiNorm` must be of shape "
                             f"(..., {n_components}), or be a structured array "
                             f"with {n_components} fields.")
        return data

There will also be some implications when we check consistency between the data, colormap and norm in colorizer, but that is a problem for a later PR.

Member

Yes, this is my idea.

Contributor Author

I updated the MultiNorm so that only tuples, lists, or structured arrays are acceptable as input to call, inverse, etc.

The details are in MultiNorm._iterable_components_in_data(data, n_components)

I made an attempt to have error messages that steer the user in the right direction.

multi_norm = mpl.colors.MultiNorm(['linear', 'linear'])
multi_norm(np.zeros((3, 2)))

gives the error:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. You can use rfn.unstructured_to_structured(data) available with from numpy.lib import recfunctions as rfn to convert the input array of shape (3, 2) to a structured format

while

multi_norm(np.zeros((2, 3)))

gives:

The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. You can use list(data) to convert the input data of shape (2, 3) to a compatible list

Does this make sense to you @timhoffm?
I would really like to point users in the right direction, but I have limited experience writing errors of this kind :)

(PS: we may want to revisit the wording here in a later PR, because I suspect this error message will come up most commonly from the top level plotting functions, and in that case I would prefer an error message of the kind "Using axes.imshow(...) with a bivariate or multivariate colormap requires the input data..." rather than mentioning MultiNorm by name, but time will tell how easily that can be achieved, and if it is worth it.)

Member

@jklymak jklymak Jul 20, 2025

I'd argue structured arrays are rarely used and that most people who need "structured arrays" use xarray. I'd remove the last sentence of the first error. If someone doesn't already have the data as a structured array, I doubt they really want to make it one for plotting (versus just making a list or tuple). If you want a second sentence, then I'd show them how to make a list data_as_list = [data[:, i] for i in range(data.shape[1])].

The second error should use the same name: data_as_list = list(data). However, I do think that we should just let that pass silently, even if we don't document such arrays as being a legitimate argument.

Member

I agree with @jklymak that structured arrays are too special so that we should not recommend them if people are not using them already. - They should be mentioned in the docstring but not in a suggestion in an error message. If the array is 2D list(data.T) is the simplest way, and [data[..., i] for i in range(data.shape[-1])] for more than 2D.

I do think that we should just let that pass silently, even if we don't document such arrays as being a legitimate argument.

I'm coming to the conclusion that we should simply accept "sequence of arrays". This means

  • We can check that len(data) == n_components
  • We can iterate over the component arrays.
  • It works for list/tuple.
  • It works for n-dim arrays - since they are a sequence of n-1 dim arrays.

Bonus points if you give a helpful error message in the case len(data) != n_components when the input is an array with data.shape[-1] == n_components.

The ambiguity argument, which led me to propose not accepting arrays, is actually not that strong. Either the array has the right shape, so list(data) is just extra work, or the order is different, but we can fail with a clear error message then. The only case where we cannot detect whether the user has organized the data correctly is (n, n)-shaped arrays, but that should be rare, as typically 2 <= n_components <= a few, while data lengths are either 1 or many.
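A sketch of this "sequence of arrays" acceptance, including the bonus hint for arrays whose last axis matches (illustrative only; names hypothetical):

```python
import numpy as np

def components_from_sequence(data, n_components):
    """Accept any length-n sequence of component arrays; hint at a
    transpose when an ndarray's last axis matches instead (sketch)."""
    try:
        n = len(data)
    except TypeError:
        raise ValueError(f"expected a sequence, got {data!r}") from None
    if n != n_components:
        hint = ""
        if isinstance(data, np.ndarray) and data.shape[-1] == n_components:
            hint = " (did you mean list(data.T)?)"
        raise ValueError(f"expected {n_components} components, got {n}{hint}")
    return tuple(data[i] for i in range(n))

# an ndarray works implicitly: it is a sequence of its n-1 dim sub-arrays
pair = components_from_sequence(np.zeros((2, 5)), 2)
```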

Contributor Author

thank you @jklymak and @timhoffm for the feedback :D

I'm coming to the conclusion that we should simply accept "sequence of arrays". This means

The way I have implemented this is a type check isinstance(data, (tuple, list)). We could accept other container classes as well, but in my mind it is reasonable to lock it down (and this simplifies the typing/errors).


Using multi_norm = mpl.colors.MultiNorm(['linear', 'linear']) there are now 7 different error messages. All of them start with the text:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields.

multi_norm(np.zeros((2, 3))) gives:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. You can use data_as_list = list(data) to convert the input data of shape (2, 3) to a compatible list

multi_norm(np.zeros((3, 2))) gives:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. You can use data_as_list = list(data.T) to convert the input data of shape (3, 2) to a compatible list

multi_norm(np.zeros((3, 3, 2))) gives:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. You can use data_as_list = [data[..., i] for i in range(data.shape[-1])] to convert the input data of shape (3, 3, 2) to a compatible list

If the axes are ambiguous, so that both the first and last axes match, or neither does, i.e. multi_norm(np.zeros((2, 2))), this gives:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. An np.ndarray of shape (2, 2) is not compatible

The above covers the cases with a scalar np.ndarray. There is also:

For a structured array with the wrong number of fields:

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. A structured array with 3 fields is not compatible

For an unsupported class: multi_norm(1)

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. Input of type <class 'int'> is not supported

and finally, for a list/tuple of wrong length: multi_norm((1, 1, 1))

ValueError: The input to this MultiNorm must be a list or tuple of length 2, or be structured array with 2 fields. A <class 'tuple'> of length 3 is not compatible

Member

I would have thought that this is a sufficient and reasonable implementation:

    def _iterable_components_in_data(data, n_components):
        if isinstance(data, np.ndarray) and data.dtype.fields is not None:
            # structured array
            if len(data.dtype.fields) != n_components:
                raise ValueError(
                    "Structured array inputs to MultiNorm must have the same "
                    "number of fields as components in the MultiNorm. Expected "
                    f"{n_components}, but got {len(data.dtype.fields)} fields")
            else:
                return tuple(data[field] for field in data.dtype.names)
        try:
            n_elements = len(data)
        except TypeError:
            raise ValueError("MultiNorm expects a sequence with one element per "
                             f"component as input, but got {data!r} instead")
        if n_elements != n_components:
            raise ValueError(
                "MultiNorm expects a sequence with one element per component. "
                f"This MultiNorm has {n_components} components, but got a sequence "
                f"with {n_elements} elements")
        return tuple(data[i] for i in range(n_elements))

This implicitly accepts arrays and other sequence-like containers. Do you see inputs where the error messages are not clear enough? Optionally, one can add special messages for incorrectly shaped arrays in the last if to give conversion guidance.

@trygvrad trygvrad force-pushed the multivariate-plot-prapare branch 3 times, most recently from dd8f0c6 to 67b8264 Compare July 13, 2025 11:56
trygvrad and others added 4 commits July 20, 2025 16:05
This commit merges a number of commits now contained in  https://github.com/trygvrad/matplotlib/tree/multivariate-plot-prapare-backup , keeping only the MultiNorm class
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>
Comment on lines +3544 to +3548
data = tuple(data[descriptor[0]] for descriptor in data.dtype.descr)
if len(data) != n_components:
raise ValueError(f"{MultiNorm._get_input_err(n_components)}"
f". A structured array with "
f"{len(data)} fields is not compatible")
Member

Suggested change
data = tuple(data[descriptor[0]] for descriptor in data.dtype.descr)
if len(data) != n_components:
raise ValueError(f"{MultiNorm._get_input_err(n_components)}"
f". A structured array with "
f"{len(data)} fields is not compatible")
if len(data.dtype.fields) != n_components:
    raise ValueError(
        "Structured array inputs to MultiNorm must have the same "
        "number of fields as components in the MultiNorm. Expected "
        f"{n_components}, but got {len(data.dtype.fields)} fields.")
else:
    return tuple(data[field] for field in data.dtype.names)

I think we can:

  • Separate the cases better. With if raise else return it's obviously clear that the structured array case is handled exhaustively. Creating data and letting it fall through to the final return data increases mental load.
  • Use a more focused error message for structured arrays. If people are inputting structured arrays, they very likely know what they want and probably just have the wrong number of elements. They don't need the full possible set of inputs.


Successfully merging this pull request may close these issues.

Feature request: Bivariate colormapping
7 participants