Add multiple label support for Axes.plot() #16178

yozhikoff · 2020-01-10T16:02:32Z

PR Summary

Since plt.plot() supports multidimensional input it would be reasonable to support multiple labels.
It was mentioned on SO here.

from matplotlib import pyplot as plt

x = [1, 2, 5]

y = [[2, 4, 3],
    [4, 7, 1],
    [3, 9, 2]]

plt.plot(x, y, label=['one', 'two', 'three'])
plt.legend()

It works with all iterable types; with string instances and non-iterables the behavior is the same as before.
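As a minimal sketch of the behaviour described here, the labels can be read back from the lines that `plot()` returns. It assumes a Matplotlib release that includes this change (the thread milestones it for v3.4.0) and uses the non-interactive Agg backend; the helper name `plot_and_get_labels` is hypothetical and only there for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 5]
y = [[2, 4, 3],
     [4, 7, 1],
     [3, 9, 2]]


def plot_and_get_labels(label):
    """Hypothetical helper: plot y against x and return each line's label."""
    fig, ax = plt.subplots()
    lines = ax.plot(x, y, label=label)  # one Line2D per column of y
    labels = [line.get_label() for line in lines]
    plt.close(fig)
    return labels


# A sequence supplies one label per column of y.
list_labels = plot_and_get_labels(['one', 'two', 'three'])
# A plain string is not split into characters; every line shares it.
string_labels = plot_and_get_labels('same')
```

The string case is the backward-compatible path: a single label is applied to every line, as it was before this PR.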

PR Checklist

Has Pytest style unit tests
Code is Flake 8 compliant
New features are documented, with examples if plot related
Documentation is sphinx and numpydoc compliant
Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

anntzer · 2020-01-10T16:13:24Z

I don't like the fact that plot() supports that to start with, but given that it does (and that that's unlikely to get deprecated), I guess the feature makes sense.
This will need some tests and doc, though.

jklymak · 2020-01-10T16:59:17Z

I'm -0.75 on this. What about all the other things associated with plot? By analogy we should allow an Iterable of markers, markersizes, colors, linewidths, linestyles, etc. plot is already messy enough and tries to do too much, so I don't think it's too much to ask folks to iterate through their matrix and explicitly label each line. The following is a tiny bit more verbose, but allows more customization of each line, and is a whole lot more specific:

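import numpy as np

y = np.asarray(y)  # a nested list cannot be sliced with [:, i]
labels = ['one', 'two', 'three']
for i in range(3):
    # plot(x, y) draws one line per column of y, so label column i
    plt.plot(x, y[:, i], label=labels[i])
plt.legend()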

yozhikoff · 2020-01-10T17:27:21Z

I agree that plot is overcomplicated in some respects, but I don't think that the analogy with markers, markersizes, colors, etc. works here. This matrix plotting functionality was obviously made for 'fast plotting' purposes, where customization isn't that important. Labels, however, are crucial for determining what the plotted lines mean.

ImportanceOfBeingErnest · 2020-01-10T17:57:53Z

I think some more vectorized input options to many of the plotting functions would be great; but I do see the problem of maintainability of those organically grown functions.
Specifically for legend labels for lines, there seems to be no urgent need, because

import matplotlib.pyplot as plt

x = [1, 2, 5]
y = [[2, 4, 3], [4, 7, 1], [3, 9, 2]]

lines = plt.plot(x, y)
labels = ['one', 'two', 'three']
plt.legend(lines, labels)
plt.show()

works fine already now.
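A closely related variant, sketched here as an assumption about how one might use the same pattern rather than anything proposed in the thread, attaches the labels to the returned `Line2D` objects with `set_label()` so that a plain `legend()` call picks them up. It uses the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 5]
y = [[2, 4, 3], [4, 7, 1], [3, 9, 2]]

fig, ax = plt.subplots()
lines = ax.plot(x, y)  # one Line2D per column of y
for line, label in zip(lines, ['one', 'two', 'three']):
    line.set_label(label)  # store the label on the artist itself

legend = ax.legend()  # no explicit handles needed, labels come from the lines
legend_texts = [text.get_text() for text in legend.get_texts()]
plt.close(fig)
```

Either form keeps the handles to the lines, which is useful if further per-line customization is needed.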

timhoffm · 2020-01-10T23:03:11Z

If one has multiple labelled datasets, using a structured data structure like pandas.DataFrame is also a good idea.

import pandas as pd

x = [1, 2, 5]
y = [[2, 4, 3], [4, 7, 1], [3, 9, 2]]

pd.DataFrame(y, index=x, columns=['one', 'two', 'three']).plot()

timhoffm · 2020-01-10T23:31:57Z

I'm +/- 0 on this.

It's a valid point that plt.plot(x, y) is one of the fastest ways to inspect data. Supporting labels for that adds value. Though @ImportanceOfBeingErnest's example is a reasonable pattern as well.
Supporting lists of values for parameters is not necessary. They are automatically cycled, which is good enough for a quick view.
OTOH, supporting multiple datasets in one call is not a good design in the first place. Encouraging its use by adding features is questionable. Do we have an opinion if we could/would deprecate this use of multiple datasets? To me, that would be an indicator for the practical relevance of that functionality.
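The automatic cycling of other properties mentioned in this comment can be checked with a short sketch. It assumes the default style (default `rcParams` property cycle) and the Agg backend, and only inspects the colors of the returned lines.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 5]
y = [[2, 4, 3], [4, 7, 1], [3, 9, 2]]

fig, ax = plt.subplots()
lines = ax.plot(x, y)  # no color given, so the property cycle is used
colors = [line.get_color() for line in lines]
plt.close(fig)

# With the default cycle, each of the three lines gets its own color.
all_distinct = len(set(colors)) == len(lines)
```

Colors can be generated like this because any distinct value will do; a meaningful label has no such default, which is the asymmetry the discussion keeps returning to.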

tacaswell

This was the source of a long discussion on this week's phone call: https://hackmd.io/hELmT6nMToSPhpP8-0mUZA?both#16178-labels-to-all-lines-plotted-in-plot

The very short summary:

we like the idea, it should go in
only work for a single x, y pair of inputs (like we do with the data kwarg)
needs a whats_new entry

QuLogic · 2020-05-19T20:40:40Z

Please rebase to fix the conflicts.

QuLogic · 2020-06-17T01:56:15Z

I don't think you rebased quite correctly; there should not be so many commits here.

timhoffm

Commits should be squashed before/when merging.

jklymak · 2020-07-10T14:45:49Z

@tacaswell you are blocking on this...

efiring · 2020-07-10T20:44:57Z

Looking at the tests: the first and last look good, but the middle two indicate that the intended behavior is to allow mismatches between the number of lines and the number of labels, and that labels don't even have to be strings. I don't think this is a good idea. Instead of bending over backwards to accept almost any user input, I think we should leave the user with some responsibility to provide inputs that make sense. Am I misinterpreting something?

timhoffm · 2020-07-10T21:03:06Z

@efiring in the current implementation labels don't have to be strings. So this is just backwards-compatible. Whether we want strings-only should be discussed separately because it's an API change.

tacaswell · 2020-07-15T15:54:31Z

This seems reasonable, but it looks like it is not checking whether we are getting multiple x-y pairs and failing in that case:

plt.plot([1, 2], [[3, 4], [10, 11]], [5, 6], [7, 8], label=['a', 'b'])
plt.legend()

This probably also needs an API change note. The signature on set_label (which is how the label kwarg eventually actually gets handled) is "object that can be converted to string via str(obj)" not "string" so I suspect we are going to have a few users who are going to be broken by this.
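The `str(obj)` conversion mentioned here can be seen on a bare `Line2D`. This is a small sketch of that documented `set_label` behaviour only; it does not go through `plot()`, and the example values are arbitrary.

```python
from matplotlib.lines import Line2D

line = Line2D([0, 1], [0, 1])

line.set_label(42)  # a non-string label is converted with str()
int_label = line.get_label()

line.set_label(('a', 'b'))  # a tuple is also just an object to stringify
tuple_label = line.get_label()
```

This is why a sequence passed as `label` for a single line previously ended up as one stringified label, and why giving sequences a new meaning in `plot()` is an API change for anyone relying on that.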

tacaswell

The issue of allowing this with plt.plot(x, y, x1, y1, label=[...]) should either be addressed or intentionally dismissed.

timhoffm · 2020-12-23T21:55:25Z

@tacaswell's comment #16178 (review) is not addressed (this can be checked with the example above the linked comment).

I'd be ok with the current state. I.e. multiple labels only supporting the case plot(x, ys) where ys contains data in multiple columns. And in particular not supporting plot(x, y, x2, y2). But that limitation should be mentioned explicitly in the docstring description of the label parameter.

jklymak · 2020-12-23T22:02:49Z

I'm still not a fan of this just because plot is a big beast and I don't think we should be encouraging this API by making it more expansive and frankly inconsistent. For instance if I plot two data sets:

import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3])
y = np.array([[1, 2, 3], [4, 5, 6]]).T

x1 = np.array([1, 2, 3, 4]) * 2
y1 = np.array([[1, 2, 3, 5], [4, 5, 6, 7]]).T
fig, ax = plt.subplots()
ax.plot(x, y, x1, y1, label=[['one', 'two'], ['three', 'four']])
ax.legend()
plt.show()

I get: [plot screenshot not preserved]

If I instead do:

ax.plot(x, y, x1, y1, label=['one', 'two', 'three', 'four'])

We get an error:

Traceback (most recent call last):
  File "./testLabels.py", line 10, in <module>
    ax.plot(x, y, x1, y1, label=['one', 'two', 'three', 'four'])
  File "/Users/jklymak/matplotlib/lib/matplotlib/axes/_axes.py", line 1599, in plot
    lines = [*self._get_lines(*args, data=data, **kwargs)]
  File "/Users/jklymak/matplotlib/lib/matplotlib/axes/_base.py", line 304, in __call__
    yield from self._plot_args(this, kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/axes/_base.py", line 454, in _plot_args
    raise ValueError(f"label must be scalar or have the same "
ValueError: label must be scalar or have the same length as the input data, but found 4 for 2 datasets.

I am not convinced that it is a good idea to add a feature that only half works with the existing API. This is all in addition to my knee-jerk reaction that if we add this, then we will need to accept tuples for other properties, and we really don't want to go down that road.
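The "4 for 2 datasets" message is easier to follow with a pure-Python sketch of per-group label matching. The traceback suggests that each (x, y) group is processed on its own and receives the whole `label` value; the helper `labels_for_group` below is hypothetical and only mimics that reading, it is not Matplotlib's code.

```python
def labels_for_group(label, n_datasets):
    """Hypothetical helper: one label per dataset of a single (x, y) group."""
    # Strings, None and other objects without a length count as one label.
    if isinstance(label, str) or not hasattr(label, '__len__'):
        return [label] * n_datasets
    if len(label) == n_datasets:
        return list(label)
    raise ValueError(
        "label must be scalar or have the same length as the input data, "
        f"but found {len(label)} for {n_datasets} datasets.")


# plot(x, y, x1, y1, ...) has two groups, each with two columns of data.
nested = [['one', 'two'], ['three', 'four']]
# The nested list has length 2, so it passes the check for each group, and
# each line in a group is handed an inner list as its label.
nested_result = [labels_for_group(nested, 2) for _ in range(2)]

# The flat list has length 4, which no single two-column group can match.
try:
    labels_for_group(['one', 'two', 'three', 'four'], 2)
except ValueError as err:
    flat_error = str(err)
```

Under this reading the nested form is accepted only by coincidence of lengths, and the flat form fails because the check never sees all four datasets at once.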

I think our API should make basic things easy, but as soon as you want to get fancy, the user is just going to have to do a bit of work, i.e. doing the above as

# y and y1 hold one dataset per column, so iterate over the transposes
for yy, lab in zip(y.T, ['first', 'second']):
    ax.plot(x, yy, label=lab)
for yy, lab in zip(y1.T, ['third', 'fourth']):
    ax.plot(x1, yy, label=lab)

is just as clear, and allows the user to specify other things in the for-loop, and/or keep the handles to the lines.

timhoffm · 2020-12-23T23:01:41Z

@jklymak I appreciate your skepticism, and shared it at the beginning. But #16178 (review) changed my mind that this is actually worth adding for the case plot(x, ys).

I don't care too much about the plot(x, y, x2, y2) API. While we cannot get rid of it for backward compatibility reasons, it's not worth jumping through extra hoops to make that work as well (or error out gracefully) for multiple labels. In this case, IMHO it's sufficient to document clearly that that call does not support multiple labels.

Concerning "accept tuples for other properties": Labels are special in that compared to other properties they cannot be automatically generated/cycled. Due to this an extension of just label can be justified to better support "quick draws for inspection of data" (see #16178 (comment)).

jklymak · 2020-12-24T00:03:15Z

Fair enough - but if we go down this road, I'm afraid we need a better error message: if I pass in 4 data sets and it tells me I only passed in two, it is relatively confusing.

timhoffm · 2020-12-24T00:53:14Z

Agreed. I suspect that the proper place to check is before

matplotlib/lib/matplotlib/axes/_base.py

Line 300 in 3798e5f

while args:

with something like

if len(args) >= 4 and [has multiple labels]:
    raise ValueError("plot() with multiple groups of data does not support multiple labels")
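The placeholder condition can be made concrete in a standalone sketch. The reasoning behind `len(args) >= 4` is that a single group has at most three positional arguments (x, y and a format string), so four or more imply at least two groups. Both function names below are hypothetical stand-ins; this is not the code that was merged.

```python
def looks_like_single_label(label):
    """Hypothetical stand-in: True for strings and non-iterable objects."""
    return isinstance(label, str) or not hasattr(label, '__iter__')


def check_multiple_labels(args, label):
    """Reject a sequence of labels when more than one data group is given."""
    if len(args) >= 4 and not looks_like_single_label(label):
        raise ValueError(
            "plot() with multiple groups of data does not support "
            "multiple labels")


x, y = [1, 2, 3], [[1, 4], [2, 5], [3, 6]]
x1, y1 = [2, 4, 6], [[1, 4], [2, 5], [3, 6]]

check_multiple_labels((x, y), ['one', 'two'])  # one group: allowed
check_multiple_labels((x, y, x1, y1), 'same')  # one shared label: allowed

try:
    check_multiple_labels((x, y, x1, y1), ['one', 'two', 'three', 'four'])
    rejected = False
except ValueError:
    rejected = True
```

Checking before the per-group loop means the error can describe the actual problem instead of reporting a length mismatch for only one of the groups.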

tacaswell · 2020-12-24T15:46:52Z

I agree with @jklymak and @timhoffm here. The behavior in #16178 (comment) is definitely "wrong".

If we start out raising now we will not break anything that used to work (as this is a new feature) and if someone comes with a compelling reason to support multiple labels (either through nested broadcasting or the raveled version) we can always add that in the future in a backwards compatible way.

@yozhikoff Sorry this has taken almost a year to get through.

tacaswell

I think we should raise in the multiple sets of input case because it is not clear what the broadcasting rules should be.

tacaswell · 2020-12-24T21:31:54Z

Thanks @yozhikoff ! Congratulations on what (I think) is your first merged Matplotlib PR 🎉 Hopefully we will hear from you again!

yozhikoff · 2020-12-24T21:33:59Z

@tacaswell Hopefully yes! I am a huge fan of Matplotlib, would love to work on improving it.
It took almost a year largely because I had some trouble figuring out the best solution for the multiple data groups case - raising an error was an elegant idea, thanks @timhoffm !

tacaswell added this to the v3.3.0 milestone Jan 27, 2020

tacaswell previously requested changes Jan 28, 2020

yozhikoff requested a review from tacaswell May 19, 2020 07:13

QuLogic added the status: needs rebase label May 19, 2020

tacaswell modified the milestones: v3.3.0, v3.4.0 May 19, 2020

yozhikoff requested a review from QuLogic June 11, 2020 21:54

yozhikoff force-pushed the add-multiple-label-support branch from b71d144 to f33a30e Compare July 4, 2020 21:22

timhoffm reviewed Jul 8, 2020

timhoffm approved these changes Jul 10, 2020

timhoffm removed the status: needs rebase label Jul 10, 2020

QuLogic reviewed Jul 10, 2020

efiring approved these changes Jul 10, 2020

tacaswell requested changes Jul 15, 2020

QuLogic added status: needs rebase status: needs revision labels Sep 25, 2020

yozhikoff force-pushed the add-multiple-label-support branch from 2c5cab0 to 6a979fb Compare November 13, 2020 14:32

yozhikoff and others added 6 commits December 23, 2020 22:56

Apply suggestions from code review

7fac6d3

Co-authored-by: Tim Hoffmann <2836374+timhoffm@users.noreply.github.com>

Make tests more readable

dee18de

Remove unused imports

73942be

Update multiple_labels_for_Axes_plot.rst

9558099

Fix pep8

79c1b63

Use pytest.mark.parametrize

24325da

yozhikoff force-pushed the add-multiple-label-support branch from e3fd5fc to 24325da Compare December 23, 2020 20:04

Fix pep8

6bbd04e

jklymak removed the status: needs rebase label Dec 23, 2020

tacaswell requested changes Dec 24, 2020

yozhikoff added 6 commits December 24, 2020 23:18

Raise exception in multiple data groups case

4288379

Add multiple label support to plot() docstring

143febf

Fix line length

88ad403

Fix line length

48ebde0

Fix missing label fail

afe34c5

Fix line length

a161ae3

tacaswell approved these changes Dec 24, 2020

tacaswell merged commit 6305e8d into matplotlib:master Dec 24, 2020

QuLogic removed the status: needs revision label Mar 17, 2021

jklymak mentioned this pull request Apr 17, 2021

Support for list/array of labels when plotting matrices with plot using label option #20008

Closed

timhoffm mentioned this pull request Feb 9, 2024

[Bug]: Inconsistent treatment of list of labels in plot when the input is a dataframe #27762

Closed

timhoffm mentioned this pull request Jun 22, 2024

[Bug]: Setting exactly 2 colors with tuple in plot method gives confusing error #28434

Closed
