This is a minor bug-fix release in the 0.20.x series and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version.
What’s new in v0.20.2
Enhancements
Performance improvements
Bug fixes
Conversion
Indexing
I/O
Plotting
Groupby/resample/rolling
Sparse
Reshaping
Numeric
Categorical
Other
Contributors
Unblocked access to additional compression types supported in PyTables: ‘blosc:blosclz’, ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:snappy’, ‘blosc:zlib’, ‘blosc:zstd’ (GH14478)
Series provides a to_latex method (GH16180)
A new groupby method ngroup(), parallel to the existing cumcount(), has been added to return the group order (GH11642).
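As a rough illustration of how ngroup() relates to cumcount(), the sketch below numbers the groups and the rows within each group. The column name `key` and the sample values are made up for the example.

```python
import pandas as pd

# Hypothetical sample data; "key" is an illustrative column name.
df = pd.DataFrame({"key": ["b", "a", "b", "a", "c"]})
grouped = df.groupby("key")

# ngroup() labels each row with the number of the group it belongs to
# (groups are numbered in sorted key order by default).
group_number = grouped.ngroup()

# cumcount() labels each row with its position inside its own group.
row_in_group = grouped.cumcount()

summary = df.assign(group_number=group_number, row_in_group=row_in_group)
print(summary)
```

Here every "a" row shares one group number while cumcount() counts 0, 1, ... within that group.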
Performance regression fix when indexing with a list-like (GH16285)
Performance regression fix for MultiIndexes (GH16319, GH16346)
Improved performance of .clip() with scalar arguments (GH15400)
Improved performance of groupby with categorical groupers (GH16413)
Improved performance of MultiIndex.remove_unused_levels() (GH16556)
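For reference, the scalar form of .clip() mentioned above looks roughly like this. The values are arbitrary, and the sketch shows only the call shape, not the performance change itself.

```python
import pandas as pd

# Arbitrary sample values for illustration.
s = pd.Series([-5, 0, 5, 15])

# Scalar bounds: values below 0 become 0, values above 10 become 10.
clipped = s.clip(lower=0, upper=10)
print(clipped.tolist())
```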
Silenced a warning on some Windows environments about “tput: terminal attributes: No such device or address” when detecting the terminal size. This fix only applies to Python 3 (GH16496)
Bug in using pathlib.Path or py.path.local objects with io functions (GH16291)
Bug in Index.symmetric_difference() on two equal MultiIndexes, which raised a TypeError (GH13490)
Bug in DataFrame.update() with overwrite=False and NaN values (GH15593)
Passing an invalid engine to read_csv() now raises an informative ValueError rather than UnboundLocalError. (GH16511)
Bug in unique() on an array of tuples (GH16519)
Bug in cut() when labels are set, resulting in incorrect label ordering (GH16459)
Fixed a compatibility issue with IPython 6.0’s tab completion showing deprecation warnings on Categoricals (GH16409)
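A few of the fixes above can be exercised with a short sketch. The data, bin edges, labels, and the engine name `no-such-engine` are all invented for illustration. The snippet shows the expected behaviour after the fixes and does not reproduce the original bug reports.

```python
import io

import numpy as np
import pandas as pd

# DataFrame.update(overwrite=False): only NaN cells in the caller are filled.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0]})
other = pd.DataFrame({"a": [10.0, 20.0, 30.0]})
df.update(other, overwrite=False)

# cut() with explicit labels: each label follows the order of its bin.
binned = pd.cut([1, 5, 9], bins=[0, 3, 6, 10], labels=["low", "mid", "high"])

# read_csv() with an invalid engine name should raise ValueError.
try:
    pd.read_csv(io.StringIO("a,b\n1,2\n"), engine="no-such-engine")
except ValueError as err:
    engine_error = str(err)
else:
    engine_error = None

print(df["a"].tolist(), list(binned), engine_error)
```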
Bug in to_numeric() in which empty data inputs were causing a segfault of the interpreter (GH16302)
Silenced NumPy warnings when broadcasting DataFrame to Series with comparison ops (GH16378, GH16306)
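A minimal sketch of the empty-input case for to_numeric() follows. The empty object-dtype Series is an assumed stand-in for the "empty data inputs" mentioned above, and the exact inputs from the original report are not reproduced.

```python
import pandas as pd

# An empty object-dtype Series as an assumed example of empty input.
empty = pd.Series([], dtype=object)

# After the fix this should return an empty result rather than crash.
converted = pd.to_numeric(empty)
print(len(converted))
```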
Bug in DataFrame.reset_index(level=) with single level index (GH16263)
Bug in partial string indexing with a monotonic, but not strictly-monotonic, index incorrectly reversing the slice bounds (GH16515)
Bug in MultiIndex.remove_unused_levels() that would not return a MultiIndex equal to the original. (GH16556)
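The two indexing fixes that name specific methods can be sketched as follows. The index names `key`, `num`, and `letter` and the data are hypothetical.

```python
import pandas as pd

# reset_index(level=...) on a single-level index moves it into a column.
df = pd.DataFrame({"x": [10, 20]}, index=pd.Index(["a", "b"], name="key"))
flat = df.reset_index(level="key")

# remove_unused_levels() drops level values that no longer appear,
# while the resulting MultiIndex still compares equal to the original.
mi = pd.MultiIndex.from_product([[1, 2], ["a", "b"]], names=["num", "letter"])
subset = mi[:2]  # only num == 1 is used here
cleaned = subset.remove_unused_levels()

print(list(flat.columns))
print(list(subset.levels[0]), list(cleaned.levels[0]), cleaned.equals(subset))
```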
Bug in read_csv() when comment is passed in a space delimited text file (GH16472)
Bug in read_csv() not raising an exception with nonexistent columns in usecols when it had the correct length (GH14671)
Bug that would force importing of the clipboard routines unnecessarily, potentially causing an import error on startup (GH16288)
Bug that raised IndexError when HTML-rendering an empty DataFrame (GH15953)
Bug in read_csv() in which tarfile object inputs were raising an error in Python 2.x for the C engine (GH16530)
Bug where DataFrame.to_html() ignored the index_names parameter (GH16493)
Bug where pd.read_hdf() returned NumPy strings for index names (GH13492)
Bug in HDFStore.select_as_multiple() where start/stop arguments were not respected (GH16209)
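The usecols check described above can be illustrated with a small in-memory file. The column names are made up, and the requested list deliberately has the same length as the real header while naming a column that does not exist.

```python
import io

import pandas as pd

csv_text = "a,b\n1,2\n3,4\n"

# "c" is not in the file, but the list has the same length as the header.
try:
    pd.read_csv(io.StringIO(csv_text), usecols=["a", "c"])
except ValueError as err:
    usecols_error = str(err)
else:
    usecols_error = None

# A valid selection still works as usual.
valid = pd.read_csv(io.StringIO(csv_text), usecols=["a"])
print(usecols_error, list(valid.columns))
```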
Bug in DataFrame.plot with a single column and a list-like color (GH3486)
Bug in plot where NaT in DatetimeIndex results in Timestamp.min (GH12405)
Bug in DataFrame.boxplot where figsize keyword was not respected for non-grouped boxplots (GH11959)
Bug in creating a time-based rolling window on an empty DataFrame (GH15819)
Bug in rolling.cov() with offset window (GH16058)
Bug in .resample() and .groupby() when aggregating on integers (GH16361)
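As a simple reference for aggregating integer data with .resample() and .groupby(), consider the sketch below. The dates, the frequency, and the column names `k` and `v` are assumptions chosen for the example, and it does not attempt to reproduce the specific failure from GH16361.

```python
import pandas as pd

# Integer values on a daily index (illustrative data).
s = pd.Series(
    [1, 2, 3, 4],
    index=pd.date_range("2017-01-01", periods=4, freq="D"),
)

# Sum within two-day bins.
two_day = s.resample("2D").sum()

# Sum integers per group key.
df = pd.DataFrame({"k": ["a", "b", "a", "b"], "v": [1, 2, 3, 4]})
per_key = df.groupby("k")["v"].sum()

print(two_day.tolist(), per_key.to_dict())
```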
Bug in construction of SparseDataFrame from scipy.sparse.dok_matrix (GH16179)
Bug in DataFrame.stack with unsorted levels in MultiIndex columns (GH16323)
Bug in pd.wide_to_long() where no error was raised when i was not a unique identifier (GH16382)
Bug in Series.isin(..) with a list of tuples (GH16394)
Bug in construction of a DataFrame with mixed dtypes including an all-NaT column. (GH16395)
Bug in DataFrame.agg() and Series.agg() when aggregating on non-callable attributes (GH16405)
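The Series.isin(..) case with a list of tuples can be sketched as below. The tuples are invented sample data, and each element of the Series is itself a tuple that is matched as a whole.

```python
import pandas as pd

# Each element of the Series is a tuple (illustrative values).
pairs = pd.Series([(1, "a"), (2, "b"), (3, "f")])

# Membership is tested per tuple, not per tuple member.
mask = pairs.isin([(1, "a")])
print(mask.tolist())
```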
Bug in .interpolate(), where limit_direction was not respected when limit=None (default) was passed (GH16282)
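To show what respecting limit_direction with the default limit=None means in practice, here is a small sketch with made-up values. It compares the default direction with limit_direction="both".

```python
import numpy as np
import pandas as pd

# Gaps at the start, in the middle, and at the end (illustrative data).
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# Default direction is forward: the leading gap is left unfilled.
forward = s.interpolate()

# With limit_direction="both" and no limit, every gap should be filled.
both = s.interpolate(limit_direction="both")

print(forward.tolist())
print(both.tolist())
```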
Fixed comparison operations considering the order of the categories when both categoricals are unordered (GH16014)
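A minimal sketch of the unordered comparison: the two categoricals below hold the same values but list their categories in a different order. The values are illustrative. Since neither is ordered, the category order should not affect equality.

```python
import pandas as pd

# Same values, categories declared in a different order, both unordered.
c1 = pd.Categorical(["a", "b"], categories=["a", "b"])
c2 = pd.Categorical(["a", "b"], categories=["b", "a"])

# Element-wise comparison is based on the values, not the category order.
matches = c1 == c2
print(matches.tolist())
```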
Bug in DataFrame.drop() when passed an empty list on a DataFrame with a non-unique index (GH16270)
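The DataFrame.drop() case can be sketched as follows, using a made-up frame whose index contains a repeated label. Dropping an empty list of labels should leave the frame unchanged.

```python
import pandas as pd

# Index with a duplicated label "a" (illustrative data).
df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "a", "b"])

# Dropping nothing should return an equal frame.
unchanged = df.drop([])
print(unchanged.equals(df))
```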
A total of 34 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
Aaron Barber +
Andrew 亮 +
Becky Sweger +
Christian Prinoth +
Christian Stade-Schuldt +
DSM
Erik Fredriksen +
Hugues Valois +
Jeff Reback
Jeff Tratner
JimStearns206 +
John W. O’Brien
Joris Van den Bossche
JosephWagner +
Keith Webber +
Mehmet Ali “Mali” Akmanalp +
Pankaj Pandey
Patrick Luo +
Patrick O’Melveny +
Pietro Battiston
RobinFiveWords +
Ryan Hendrickson +
SimonBaron +
Tom Augspurger
WBare +
bpraggastis +
chernrick +
chris-b1
economy +
gfyoung
jaredsnyder +
keitakurita +
linebp
lloydkirk +