Skip to content

Axes3D.plot_surface: Allow masked arrays and NaN values #20725

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 12, 2021
Merged

Axes3D.plot_surface: Allow masked arrays and NaN values #20725

merged 6 commits into from
Aug 12, 2021

Conversation

tomneep
Copy link
Contributor

@tomneep tomneep commented Jul 23, 2021

PR Summary

This is my third attempt at solving the issues with using masked arrays in Axes3D.plot_surface, previous attempts are #18114 and #20646. Hopefully it is third time lucky.

I got the impression that the feedback on #18114 was reasonably positive but this approach solves more issues and also behaves like contour with corner_mask=True which was a preference expressed by @ianthomas23.

The suggestion from #20646 was that it would be better to deal with NaNs before we get to Poly3DCollection.do_3d_projection which is called every time we adjust an interactive figure.

I've gone "all in" on this PR and removed the warning about not supporting NaNs and added masked array support, if you prefer to take this step-by-step e.g. removing the masked array part or keeping the NaN warning, then let me know, and I'll happily change.

There are no tests yet, once I've had some feedback on this I'll add some based on the issues that this closes.

Examples of the issues that this PR solves (along with the code to make them) are below. I didn't add the example from #8222 which loads a large dataset but the before and after are visually the same as the examples shown in #20646.

Examples before this PR

plot_surface_master

Examples after this PR

plot_surface_fixes

Code for examples

import matplotlib
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(
    nrows=2,
    ncols=2,
    figsize=(10, 10),
    subplot_kw={"projection": "3d"},
    constrained_layout=True,
)

# 487
x, y = np.mgrid[1:10:1, 1:10:1]
z = x ** 3 + y ** 3 - 500
z = np.ma.masked_array(z, z < 0)

plot_opts = dict(rstride=1, cstride=1, linewidth=0, cmap="inferno")
axes[0, 0].plot_surface(x, y, z, **plot_opts)
axes[0, 0].view_init(35, -90)
axes[0, 0].set(title="Issue #487")

# 4941
x, y = np.mgrid[-1:1:0.05, -1:1:0.05]
z = (x ** 2 + y ** 2) + 0.01
z[10:20, 10:20] = 0.0

masked_array = np.ma.masked_equal(z, 0.0)
palette = plt.get_cmap("gray").copy()
palette.set_over("g", 1.0)
palette.set_under("b", 1.0)
palette.set_bad("r", 1.0)

plot_opts = dict(rstride=1, cstride=1, cmap=palette, vmin=-1, vmax=1)
axes[0, 1].plot_surface(x, y, masked_array, **plot_opts)
axes[0, 1].view_init(35, -90)
axes[0, 1].set(title="Issue #4941")

# 12395
z = (1 - y / x).clip(min=-5, max=5)
# place NaNs at the discontinuity
pos = np.where(np.abs(np.diff(z, axis=0)) >= 5.0)
z[pos] = np.nan

plot_opts = dict(rstride=1, cstride=1, cmap="coolwarm", antialiased=False)
axes[1, 0].plot_surface(x, y, z, **plot_opts)
axes[1, 0].set(title="Issue #12395")

#16470
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
y = [1, 2, 3, 4, 5, 6, 7, 8]

x, y = np.meshgrid(x, y)
nan = np.nan
matrix = np.array(
    [
        [nan, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [nan, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 3.0, 2.0, 1.0, 1.0],
        [nan, nan, 4.0, 5.0, 6.0, 8.0, 6.0, 5.0, 4.0, 3.0, nan],
        [nan, nan, 7.0, 8.0, 11.0, 12.0, 11.0, 8.0, 7.0, nan, nan],
        [nan, nan, 8.0, 9.0, 10.0, 16.0, 10.0, 9.0, 10.0, 7.0, nan],
        [nan, nan, nan, 12.0, 16.0, 20.0, 16.0, 12.0, 11.0, nan, nan],
        [nan, nan, nan, nan, 22.0, 24.0, 22.0, 20.0, 18.0, nan, nan],
        [nan, nan, nan, nan, nan, 28.0, 26.0, 25.0, nan, nan, nan],
    ]
)
z = np.ma.masked_invalid(matrix)
normC = matplotlib.colors.Normalize(vmax=z.max(), vmin=z.min())
colors = plt.get_cmap("plasma")(normC(z))
axes[1, 1].plot_surface(x, y, z, facecolors=colors)
axes[1, 1].view_init(30, -80)
axes[1, 1].set(title="Issue #16470")

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
  • New features are documented, with examples if plot related.
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

@jklymak jklymak mentioned this pull request Jul 23, 2021
7 tasks
Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly worried that we are throwing away data and not storing the original inputs in the PolyCollection. Do we have precedent for that?

Nevertheless I'm tempted to accept the solution as it is much simpler and faster than any later data cleaning. Maybe we should add a note on the data cleaning in the docstring.

@tomneep
Copy link
Contributor Author

tomneep commented Jul 26, 2021

I'm slightly worried that we are throwing away data and not storing the original inputs in the PolyCollection. Do we have precedent for that?

I don't know, hopefully someone else does. The one thought I had was to see what is done in contour/contourf. Here it looks like the masked data are not saved in the collections (but perhaps I am missing something)? Would this be an equivalent case?

import matplotlib.pyplot as plt
import numpy as np

x, y = np.mgrid[-10:10.1:2, -10:10.1:2]
z = np.ma.masked_less(x ** 2 + y ** 2, 50)

fig, ax = plt.subplots(figsize=(6, 6))
contours = ax.contourf(x, y, z)

segs = np.concatenate([np.concatenate(s) for s in contours.allsegs if s]).T
paths = np.concatenate(
    [i.vertices for paths in contours.collections for i in paths.get_paths()]
).T

ax.plot(segs[0], segs[1], "rs")
ax.plot(paths[0], paths[1], ".b")

gives
contour_test

@jklymak
Copy link
Member

jklymak commented Jul 27, 2021

The user threw the data away when they NaN-ed or masked it. I don't think we are going to offer a way to unmask it after the surface is made, so I don't think there is harm in removing it. But maybe I'm missing a subtlety in the data->polygon pipeline. I assume if the user changes x, y, Z in the data then the polygons are regenerated anyway?

@tomneep
Copy link
Contributor Author

tomneep commented Jul 27, 2021

Sorry @jklymak I don't understand your question. What kind of changing of x, y and Z do you have in mind?

@jklymak
Copy link
Member

jklymak commented Jul 27, 2021

via set_data or whatever the 3-d equivalent is...

@tomneep
Copy link
Contributor Author

tomneep commented Jul 27, 2021

I'm not sure what the equivalent would be. You could get the paths with get_paths and perhaps modify those, but it would be tricky. Maybe there is some easy method that I am missing.

The original x, y and z don't get passed to the Poly3DCollection. I suppose I could imagine a new collection that computed the polygons from the x, y and z and would recalculate the polygons if set_data was called, but that isn't how it currently works. Is that what you had in mind?

@jklymak
Copy link
Member

jklymak commented Jul 27, 2021

Sorry if I'm taking you off on a tangent - many of our methods save a representation of the original data. If this has no such representation, then there really is no harm done in throwing out masked data because the user can't modify it post-facto anyways.

@tomneep
Copy link
Contributor Author

tomneep commented Jul 27, 2021

The Poly3DCollection is made directly from the vertices of the polygons and doesn't see the original data. It isn't just the masking because there are the stride/count options in plot_surface that (can) also throw data away before the polygons are made. So I think we are ok on that front?

@jklymak jklymak added this to the v3.5.0 milestone Jul 28, 2021
@jklymak
Copy link
Member

jklymak commented Jul 28, 2021

@eric-wieser @WeatherGod @ianthomas23 do any of you have time to comment on this solution to the masking problem? It seems reasonable to me but I don't know if I've ever used plot_surface in matplotlib ;-) Thanks!

@eric-wieser
Copy link
Contributor

eric-wieser commented Jul 29, 2021

Can you add a test showing how this behaves with striding? For instance, if a facet with stride=4 has all the . entries masked in this pattern, does matplotlib draw a diamond or two triangles?

++...
+....
.....
....+
...++

@tomneep
Copy link
Contributor Author

tomneep commented Jul 29, 2021

@eric-wieser Here is the example you request (for before and after this PR). It is a bit hard to show here but from the side angle (third plot) it doesn't look right, neither before and after this PR. Is this what you wanted to see?

Example without PR

stride_tests_master

Example with PR

stride_tests_pr

Code for plots

import matplotlib.pyplot as plt
import numpy as np

x, y = np.mgrid[-2:2.1:1, -2:2.1:1]
z = np.ma.masked_less(x * y, 2)

fig, (ax1, ax2, ax3) = plt.subplots(
    figsize=(9, 3), ncols=3, subplot_kw={"projection": "3d"}
)
ax1.plot_surface(x, y, z)
ax1.set_title("Default strides")
ax1.view_init(60, -45)

surf = ax2.plot_surface(x, y, z, rstride=4, cstride=4)
ax2.set_title("r/cstride = 4")
ax2.view_init(60, -45)

surf = ax3.plot_surface(x, y, z, rstride=4, cstride=4)
ax3.set_title("r/cstride = 4")
ax3.view_init(2, 30)

@eric-wieser
Copy link
Contributor

Thanks, that does show what I was expecting to see. I don't think the side view is interesting, but I think the middle plot might be worth adding a test for in case someone decides they want different behavior later. Using np.mgrid[-6:6.1:1, -6:6.1:1] but leaving the stride at 4 would probably produce a slightly more insightful plot for deciding whether the current behavior is sensible.

@tomneep
Copy link
Contributor Author

tomneep commented Jul 29, 2021

Thanks, I've added a test with the range you suggest. Indeed the plot looks a bit more interesting than my previous comment, with a double arrowhead/bowtie arrangement. Does this look ok to you?

strides test

@eric-wieser
Copy link
Contributor

Yes, that looks like a good test case to include, thanks! My question (possibly for a future PR) would be whether it makes sense to split that middle polygon in two, into two triangles - but it's not immediately clear to me how such an approach would generalize.

@tomneep
Copy link
Contributor Author

tomneep commented Jul 29, 2021

Yes I had the same thought after the first example (whether the polygon could be split into pieces to help the side view). I'm not sure how simple it is, so I think as you say it might be a question for a future PR.

@jklymak
Copy link
Member

jklymak commented Jul 29, 2021

I'm not sure we should be worried about striding effects - if you decimate your data you should expect aliasing, and if you decimate a mask it is luck of the draw where on the mask your stride will fall. If you need more control, it really isn't hard to stride your own data.

@eric-wieser
Copy link
Contributor

I'd dispute that the claim that it's easy to stride your own data - this isn't just simple striding that takes 1/n² of the elements, it's preserving full resolution along the gridlines and discarding data in the cells, taking (2n-1)/n² of the elements. That means there's no way to represent the striding in a numpy array.

That's not to say you're wrong in saying that the weirdness in the plot above is unimportant - I think it's fine to merge with the current behavior, and was only pushing for a test to ensure we were aware of the quirk.

@jklymak
Copy link
Member

jklymak commented Jul 29, 2021

this isn't just simple striding that takes 1/n² of the elements, it's preserving full resolution along the gridlines and discarding data in the cells, taking (2n-1)/n² of the elements. That means there's no way to represent the striding in a numpy array.

probably off topic, but that is confusing. The docs say rstride, cstride : Downsampling stride in each direction. Given that, I'm not sure what "full resolution along the gridlines and discarding data in the cells" means.

Copy link
Member

@ianthomas23 ianthomas23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a couple of CI failures, but I've rerun them and they now pass.

As my opinion was requested, I've taken a look at this PR and am happy with the output. But I am not a frequent user of mplot3d.

@eric-wieser
Copy link
Contributor

eric-wieser commented Aug 4, 2021

this isn't just simple striding that takes 1/n² of the elements, it's preserving full resolution along the gridlines and discarding data in the cells, taking (2n-1)/n² of the elements. That means there's no way to represent the striding in a numpy array.

probably off topic, but that is confusing. The docs say rstride, cstride : Downsampling stride in each direction. Given that, I'm not sure what "full resolution along the gridlines and discarding data in the cells" means.

I mean that rstride=cstride=3 samples the input data as

#######
#  #  #
#  #  #
#######
#  #  #
#  #  #
#######

Not

#  #  #


#  #  #


#  #  #

(Where every character is an entry in the input data array, and every # is a point that gets plotted)

@jklymak
Copy link
Member

jklymak commented Aug 4, 2021

So is the problem that the docs are wrong or my understanding of the term "stride" is wrong?

@eric-wieser
Copy link
Contributor

The docs are using "stride" in a vague sense, so they're not wrong per se, but certainly easy to misread. The reading I have of them that is consistent with the implementation is "Split the surface into row and column gridlines spaced rstride and cstride entries apart, but draw the perimeters of the resulting grid squares without downsampling". I suppose this phrasing in the docs is actively misleading:

If the input data is larger, it will be downsampled (by slicing) to these numbers of points.

I think the current behavior of the striding (ignoring this PR) is the right behavior, but the docs don't describe it well.

@QuLogic
Copy link
Member

QuLogic commented Aug 12, 2021

It appears this is good to go, and the striding discussion is orthogonal, so I'm going to merge this.

@QuLogic QuLogic merged commit 3277b9d into matplotlib:master Aug 12, 2021
@tomneep tomneep deleted the polygon-nan-removal branch August 19, 2021 08:41
LemonBoy added a commit to LemonBoy/matplotlib that referenced this pull request Jul 4, 2023
Extend the logic introduced by matplotlib#20725 for NaN values to inf values,
avoiding the introduction of infinite vertices in the mesh that wreak
havoc when performing the z-sorting.

Sadly I don't have a minimal test case for this problem.
LemonBoy added a commit to LemonBoy/matplotlib that referenced this pull request Jul 31, 2023
Extend the logic introduced by matplotlib#20725 for NaN values to inf values,
avoiding the introduction of infinite vertices in the mesh that wreak
havoc when performing the z-sorting.

Sadly I don't have a minimal test case for this problem.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants