BUG: Python 3.14 may not increment refcount #61368

Open
3 tasks done
tacaswell opened this issue Apr 28, 2025 · 4 comments
Labels: Bug, Copy / view semantics, Needs Discussion, Warnings

Comments

@tacaswell
Contributor

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
import warnings

warnings.simplefilter('error')

df = pd.DataFrame(
        {'year': [2018, 2018, 2018],
         'month': [1, 1, 1],
         'day': [1, 2, 3],
         'value': [1, 2, 3]})
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

Issue Description

With Python 3.14 and the pandas main branch (or 2.2.3 with pd.options.mode.copy_on_write = "warn"), the above fails with:

Python 3.14.0a7+ (heads/main:276252565cc, Apr 27 2025, 16:05:04) [Clang 19.1.7 ]
Type 'copyright', 'credits' or 'license' for more information
IPython 9.3.0.dev -- An enhanced Interactive Python. Type '?' for help.
Tip: You can use LaTeX or Unicode completion, `\alpha<tab>` will insert the α symbol.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame(
   ...:         {'year': [2018, 2018, 2018],
   ...:          'month': [1, 1, 1],
   ...:          'day': [1, 2, 3],
   ...:          'value': [1, 2, 3]})
   ...: df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
<ipython-input-2-a8566e79621c>:6: ChainedAssignmentError: A value is trying to be set on a copy of a DataFrame or Series through chained assignment.
When using the Copy-on-Write mode, such chained assignment never works to update the original DataFrame or Series, because the intermediate object on which we are setting values always behaves as a copy.

Try using '.loc[row_indexer, col_indexer] = value' instead, to perform the assignment in a single step.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/copy_on_write.html
  df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

In [3]: import warnings

In [4]: warnings.simplefilter('error')

In [5]: df = pd.DataFrame(
   ...:         {'year': [2018, 2018, 2018],
   ...:          'month': [1, 1, 1],
   ...:          'day': [1, 2, 3],
   ...:          'value': [1, 2, 3]})
   ...: df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
---------------------------------------------------------------------------
ChainedAssignmentError                    Traceback (most recent call last)
<ipython-input-5-a8566e79621c> in ?()
      2         {'year': [2018, 2018, 2018],
      3          'month': [1, 1, 1],
      4          'day': [1, 2, 3],
      5          'value': [1, 2, 3]})
----> 6 df['date'] = pd.to_datetime(df[['year', 'month', 'day']])

~/.virtualenvs/cp314-clang/lib/python3.14/site-packages/pandas/core/frame.py in ?(self, key, value)
   4156     def __setitem__(self, key, value) -> None:
   4157         if not PYPY:
   4158             if sys.getrefcount(self) <= 3:
-> 4159                 warnings.warn(
   4160                     _chained_assignment_msg, ChainedAssignmentError, stacklevel=2
   4161                 )
   4162

ChainedAssignmentError: A value is trying to be set on a copy of a DataFrame or Series through chained assignment.
When using the Copy-on-Write mode, such chained assignment never works to update the original DataFrame or Series, because the intermediate object on which we are setting values always behaves as a copy.

Try using '.loc[row_indexer, col_indexer] = value' instead, to perform the assignment in a single step.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/copy_on_write.html

In [6]: pd.__version__
Out[6]: '3.0.0.dev0+2080.g44c5613568'

With Python 3.14 there will be an optimization where the reference count is not incremented if Python can be sure that something above the calling scope will hold a reference for the lifetime of a scope. This is causing a number of failures in test suites where reference counts are checked. In this case I think it is erroneously triggering the logic that decides the object is an intermediary.
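
To make the mechanism concrete, here is a minimal sketch of how a refcount read inside `__setitem__` can differ between a named object and an unnamed intermediate. `Frame`, `count_for_named`, and `count_for_temporary` are hypothetical names for illustration only and are not pandas code; the exact numbers depend on the interpreter version, so none are asserted here.

```python
import sys


class Frame:
    """Hypothetical stand-in for a DataFrame (not pandas code)."""

    def __init__(self, log):
        self.log = log  # shared list that outlives the Frame

    def __setitem__(self, key, value):
        # Record the refcount seen from inside __setitem__, the same place
        # where the traceback above shows sys.getrefcount(self) being read.
        self.log.append(sys.getrefcount(self))


def count_for_named():
    log = []
    frame = Frame(log)
    frame["a"] = 1  # `frame` is bound to a local variable
    return log[0]


def count_for_temporary():
    log = []
    Frame(log)["a"] = 1  # the Frame is an unnamed intermediate
    return log[0]
```

On interpreters that always take a new reference when loading a local, the named case should report a higher count than the temporary one, which is what a threshold check relies on. If the interpreter skips that increment, the two counts can coincide and a named object can look like an intermediate.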

Found this because it is failing the mpl test suite (this snippet is extracted from one of our tests).

With py313 I do not get this failure.

Expected Behavior

no warning

Installed Versions

These are mostly development versions of things; this same env with pandas main also fails.

INSTALLED VERSIONS

commit : 0691c5c
python : 3.14.0a7+
python-bits : 64
OS : Linux
OS-release : 6.14.2-arch1-1
Version : #1 SMP PREEMPT_DYNAMIC Thu, 10 Apr 2025 18:43:59 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.3
numpy : 2.3.0.dev0+git20250427.4961a14
pytz : 2025.2
dateutil : 2.9.0.post1.dev6+g35ed87a.d20250427
pip : 25.0.dev0
Cython : 3.1.0b1
sphinx : None
IPython : 9.3.0.dev
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.4
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2025.3.2
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : 6.0.0.alpha0
matplotlib : 3.11.0.dev732+g8fedcea7fc
numba : None
numexpr : 2.10.3.dev0
odfpy : None
openpyxl : 3.1.5
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : 8.3.0.dev32+g7ef189757
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.16.0.dev0+git20250427.55cae81
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : 2025.3.1
xlrd : 2.0.1
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None

@tacaswell added the Bug and Needs Triage labels on Apr 28, 2025
@rhshadrach
Member

Thanks for the report! It sounds like we may need to disable these warnings for Python 3.14+ if the refcount cannot be relied upon.
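
One way to picture the "disable for 3.14+" option is a version gate in front of the refcount comparison. This is only a sketch: `REFCOUNT_HEURISTIC_TRUSTED` and `looks_like_temporary` are hypothetical names, the default threshold of 3 is taken from the traceback above, and the assumption that 3.14 is the right cutoff comes from this report rather than from any pandas decision.

```python
import sys

# Assumption: the refcount heuristic is only trusted before CPython 3.14.
REFCOUNT_HEURISTIC_TRUSTED = sys.version_info < (3, 14)


def looks_like_temporary(refcount, threshold=3, trusted=REFCOUNT_HEURISTIC_TRUSTED):
    """Return True if an object with this refcount should trigger the warning.

    Hypothetical helper: when the heuristic is not trusted, never warn,
    trading missed warnings for the absence of false positives.
    """
    if not trusted:
        return False
    return refcount <= threshold
```

The trade-off is that users on newer interpreters would lose the warning entirely, including for genuine chained assignment.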

cc @jorisvandenbossche @phofl

@rhshadrach added the Warnings and Copy / view semantics labels and removed the Needs Triage label on Apr 28, 2025
@rhshadrach
Member

Since CoW is implemented using refcount, could there also be cases where we believe data is not being shared but it really is?

@rhshadrach added the Needs Discussion label on Apr 28, 2025
@rhshadrach changed the title from "BUG:" to "BUG: Python 3.14 may not increment refcount" on Apr 28, 2025
@jorisvandenbossche
Member

> Since CoW is implemented using refcount

The actual Copy-on-Write mechanism itself is implemented using weakrefs, and does not rely on refcounting, I think.
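
As an illustration of the general idea (not pandas' actual implementation), a copy-on-write scheme can track which holders share a buffer through weak references and copy only when more than one live holder exists. `RefTracker`, `Buffer`, `view_of`, and `set_value` are hypothetical names used for this sketch.

```python
import weakref


class RefTracker:
    """Tracks, via weak references, which holders share one buffer."""

    def __init__(self):
        self._holders = []

    def add(self, holder):
        self._holders.append(weakref.ref(holder))

    def discard(self, holder):
        # Drop dead references and the given holder.
        self._holders = [
            r for r in self._holders if r() is not None and r() is not holder
        ]

    def is_shared(self):
        # Prune holders that no longer exist, then count the live ones.
        self._holders = [r for r in self._holders if r() is not None]
        return len(self._holders) > 1


class Buffer:
    """Hypothetical data holder; `values` may be shared with other Buffers."""

    def __init__(self, values, refs=None):
        self.values = values
        self.refs = refs if refs is not None else RefTracker()
        self.refs.add(self)


def view_of(buf):
    # A view shares both the data and the tracker with its parent.
    return Buffer(buf.values, refs=buf.refs)


def set_value(buf, index, value):
    # Copy first if any other live holder still points at the same data.
    if buf.refs.is_shared():
        buf.refs.discard(buf)
        buf.values = list(buf.values)
        buf.refs = RefTracker()
        buf.refs.add(buf)
    buf.values[index] = value
```

Whether a copy is needed here depends on which holders are still alive, not on the numeric value of `sys.getrefcount`, which is why this kind of tracking would be less sensitive to an interpreter skipping refcount increments.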

The refcounts are used for the warning about chained assignments. While not essential for ensuring correct behaviour (correctly copying when needed), those warnings are quite important for users when migrating and for generally avoiding mistakes in the future (given how widespread chained assignment is).

So ideally we would be able to keep this warning working.

> With Python 3.14 there will be an optimization where the reference count is not incremented if Python can be sure that something above the calling scope will hold a reference for the lifetime of a scope.

Do you know if there is a technical explanation of this somewhere (or the PR implementing it)? I didn't directly find anything mentioned in the 3.14 whatsnew page.
I'll have to look a bit more into this change and the specific example to see if there is anything on our side that we can do to detect when this happens or to otherwise deal with it.

@mpage

mpage commented Apr 28, 2025

Hi! Sorry for the random comment, but @ngoldbaum pointed out this issue to me. I'm the author of the optimization. Happy to answer any questions or help brainstorm a solution with you.
