DataFrame.plot() produces incorrect legend labels when plotting multiple series on the same axis · Issue #18222 · pandas-dev/pandas
Code Sample, a copy-pastable example if possible
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(data=[[1, 1, 1, 1], [2, 2, 4, 8]], columns=['x', 'r', 'g', 'b'])
fig, ax = plt.subplots(nrows=1, ncols=3)

# Left plot
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[0])
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[0])
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[0])

# Center plot
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[1])
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[1])
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[1])

# Right plot
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[2])
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[2])
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[2])

# Produces correct output when uncommented:
# ax[0].legend()
# ax[1].legend()
# ax[2].legend()
plt.show()
Problem description
Note: Possibly related to #14958, #17939, #14563; however, this issue discusses how the behaviour depends on the order in which df.plot() is called.
- In certain situations, df.plot() may generate incorrect legend labels (see example)
- Incorrect legend labels may appear when df.plot() plots multiple series on the same axis
- Plotting only one series per axis always appears to produce the correct legend label
- When multiple series are plotted on the same axis:
  - Only the last df.plot() call will produce the correct legend label
  - Calling ax.legend() after the final df.plot() will produce the correct legend entry
- Calling df.plot() with markerstyle='' always appears to produce the correct legend label for that series, regardless of the order in which df.plot() is called or the number of other series on the same axis
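The ax.legend() workaround from the list above can be checked without looking at the figure. This is a minimal sketch, not a fix for the underlying behaviour. It reuses the data from the code sample on a single axis and compares the legend text against the labels of the lines on that axis. The Agg backend and the names legend_labels and line_labels are assumptions added for illustration. The comparison reflects what I would expect on recent pandas and matplotlib releases and may differ on the versions listed in this report.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(data=[[1, 1, 1, 1], [2, 2, 4, 8]], columns=["x", "r", "g", "b"])

# Same three calls as the left plot in the code sample, on one axis.
fig, ax = plt.subplots()
df.plot(x="x", y="r", linewidth=0, marker="o", color="r", ax=ax)
df.plot(x="x", y="g", linewidth=1, marker="x", color="g", ax=ax)
df.plot(x="x", y="b", linewidth=1, marker="o", color="b", ax=ax)

# Workaround: rebuild the legend from the labelled lines after the final plot call.
legend = ax.legend()

# Hypothetical helper names: text shown in the legend vs. labels stored on the lines.
legend_labels = [text.get_text() for text in legend.get_texts()]
line_labels = [line.get_label() for line in ax.get_lines()]

print("legend:", legend_labels)
print("lines: ", line_labels)
```

If the two lists match, every series has a legend entry with its own label. Note that this only compares the label text; the marker and line style drawn next to each entry still need a visual check.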
Expected Output
This is the incorrect output produced by the sample code
Note that in each plot, only the last series to be plotted has a correct legend label
This is the expected output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.10.0-37-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: None
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None