Commit 41dfccc

BUG: fixes #53935 Categorical order lost after call to remove_categories (#54027)
* Changed the default value of sort to False in the difference method. difference calls _difference, which in turn calls _maybe_try_sort. The _maybe_try_sort method sorts the values whenever sort is not False, which is why having sort=None in the original code would still sort the categories. This way the code will only sort if you set sort=True.
* Added a test to show that the changed value behaves the way we want it to.
* Added the bug fix to whatsnew.
* Changed the bug fix implementation to simply check whether ordered is set to True; if so, it passes sort=False in the call to difference in remove_categories.
* Switched the implementation to a ternary that checks for ordered. This seems to work better since we are not overriding a default argument this way.
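The sorting behavior described in the commit message can be reproduced directly on a plain Index. This is a minimal sketch rather than code from the commit; the variable names are illustrative, and it assumes a pandas version where Index.difference accepts the sort keyword.

```python
import pandas as pd

# Categories in a deliberately non-lexical order, as an ordered categorical might have.
categories = pd.Index(list("dcba"))
removals = pd.Index(["c"])

# Default (sort=None): the result is sorted when the values are comparable.
default_result = categories.difference(removals)

# sort=False: the result keeps the order of the calling index.
unsorted_result = categories.difference(removals, sort=False)

print(list(default_result))   # ['a', 'b', 'd']
print(list(unsorted_result))  # ['d', 'b', 'a']
```

With the default, the custom d > b > a arrangement is lost, which is the symptom reported in the issue. Passing sort=False keeps it.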
1 parent 9d1d1b1 commit 41dfccc

3 files changed, +21 -1 lines changed

doc/source/whatsnew/v2.1.0.rst

+1
@@ -376,6 +376,7 @@ Bug fixes
 
 Categorical
 ^^^^^^^^^^^
+- Bug in :meth:`CategoricalIndex.remove_categories` where ordered categories would not be maintained (:issue:`53935`).
 - Bug in :meth:`Series.astype` with ``dtype="category"`` for nullable arrays with read-only null value masks (:issue:`53658`)
 - Bug in :meth:`Series.map` , where the value of the ``na_action`` parameter was not used if the series held a :class:`Categorical` (:issue:`22527`).
 -

pandas/core/arrays/categorical.py

+5-1
@@ -1369,7 +1369,11 @@ def remove_categories(self, removals) -> Self:
             removals = [removals]
 
         removals = Index(removals).unique().dropna()
-        new_categories = self.dtype.categories.difference(removals)
+        new_categories = (
+            self.dtype.categories.difference(removals, sort=False)
+            if self.dtype.ordered is True
+            else self.dtype.categories.difference(removals)
+        )
         not_included = removals.difference(self.dtype.categories)
 
         if len(not_included) != 0:

pandas/tests/indexes/categorical/test_category.py

+15
@@ -373,3 +373,18 @@ def test_method_delegation(self):
         msg = "cannot use inplace with CategoricalIndex"
         with pytest.raises(ValueError, match=msg):
             ci.set_categories(list("cab"), inplace=True)
+
+    def test_remove_maintains_order(self):
+        ci = CategoricalIndex(list("abcdda"), categories=list("abcd"))
+        result = ci.reorder_categories(["d", "c", "b", "a"], ordered=True)
+        tm.assert_index_equal(
+            result,
+            CategoricalIndex(list("abcdda"), categories=list("dcba"), ordered=True),
+        )
+        result = result.remove_categories(["c"])
+        tm.assert_index_equal(
+            result,
+            CategoricalIndex(
+                ["a", "b", np.nan, "d", "d", "a"], categories=list("dba"), ordered=True
+            ),
+        )
