Commit 261d608

Expose max_internal_size and max_leaf_size as rw on the C classes
Just like on the Python classes. Fixes #166.

This takes a metaclass at the C level, but it's a very simple one. In addition to consistency, and letting the sizes be customized for an entire application, this has some nice properties:

- We can optimize some zope.interface object storage.
- We delete the DEFAULT_MAX_*_SIZE macros. Now there's only one source of truth for those, whether in C or Python: _datatypes.py.

In addition, I was able to make a small optimization for __slotnames__. Previously it was computed (to be empty) and then discarded over and over (every time you pickled or deactivated an object); now it is properly cached. This won't affect any subclasses.
1 parent 86fd464 commit 261d608
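
The metaclass idea from the commit message can be approximated in pure Python. This is only a sketch under stated assumptions: `RestrictedAttrsMeta`, `Tree`, and the numeric defaults are hypothetical stand-ins, not the BTrees API, and the actual change is implemented in C through `tp_setattro`. The allowlist mirrors the five names packed into `BTreeType_setattro_allowed_names` in the diff.

```python
# Hypothetical pure-Python analogue of the C-level metaclass in this commit.
ALLOWED_NAMES = (
    "max_internal_size",
    "max_leaf_size",
    "__implemented__",
    "__providedBy__",
    "__provides__",
)


class RestrictedAttrsMeta(type):
    """Metaclass that only lets allowlisted attributes be assigned on a class."""

    def __setattr__(cls, name, value):
        if name not in ALLOWED_NAMES:
            raise TypeError(
                "can't set attribute %r of type %r" % (name, cls.__name__)
            )
        super().__setattr__(name, value)


class Tree(metaclass=RestrictedAttrsMeta):
    # Placeholder defaults; names defined in the class body do not go
    # through the metaclass __setattr__.
    max_leaf_size = 120
    max_internal_size = 500


# Allowed: tune the node size for every user of the class.
Tree.max_leaf_size = 60

# Rejected: any name outside the allowlist.
try:
    Tree.some_other_attribute = 1
    rejected = False
except TypeError:
    rejected = True
```

Keeping the allowlist small preserves most of the "built-in types are immutable" guarantee while still permitting the two tunables and the zope.interface bookkeeping attributes.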

31 files changed: 284 additions and 103 deletions

CHANGES.rst

Lines changed: 12 additions & 3 deletions
@@ -2,11 +2,20 @@
 BTrees Changelog
 ==================

-4.8.1 (unreleased)
+4.9.0 (unreleased)
 ==================

-- Nothing changed yet.
-
+- Fix the C implementation to match the Python implementation and
+  allow setting custom node sizes for an entire application directly
+  by changing ``BTree.max_leaf_size`` and ``BTree.max_internal_size``
+  attributes, without having to create a new subclass. These
+  attributes can now also be read from the classes in the C
+  implementation. See `issue 166
+  <https://github.com/zopefoundation/BTrees/issues/166>`_.
+
+- Add various small performance improvements for storing
+  zope.interface attributes on ``BTree`` and ``TreeSet`` as well as
+  deactivating persistent objects from this package.


 4.8.0 (2021-04-14)
 ==================

docs/development.rst

Lines changed: 27 additions & 18 deletions
@@ -1,6 +1,6 @@
-=====================
-Developer Information
-=====================
+=======================
+Developer Information
+=======================

 This document provides information for developers who maintain or extend
 `BTrees`.
@@ -25,21 +25,6 @@ Configuration Macros
     A string (like "IO" or "OO") that provides the prefix used for the module.
     This gets used to generate type names and the internal module name string.

-``DEFAULT_MAX_BUCKET_SIZE``
-
-    An int giving the maximum bucket size (number of key/value pairs). When a
-    bucket gets larger than this due to an insertion *into a BTREE*, it
-    splits. Inserting into a bucket directly doesn't split, and functions
-    that produce a bucket output (e.g., ``union()``) also have no bound on how
-    large a bucket may get. Someday this will be tunable on `BTree`.
-    instances.
-
-``DEFAULT_MAX_BTREE_SIZE``
-
-    An ``int`` giving the maximum size (number of children) of an internal
-    btree node. Someday this will be tunable on ``BTree`` instances.
-
-
 Macros for Keys
 ---------------

@@ -194,6 +179,30 @@ Macros for Set Operations
     a ``multiunion()`` function (compute a union of many input sets at high
     speed). This currently makes sense only for structures with integer keys.

+Datatypes
+=========
+
+There are two tunable values exposed on BTree and TreeSet classes.
+Their default values are found in ``_datatypes.py`` and shared across
+C and Python.
+
+
+``max_leaf_size``
+
+    An int giving the maximum bucket size (number of key/value pairs).
+    When a bucket gets larger than this due to an insertion *into a
+    BTREE*, it splits. Inserting into a bucket directly doesn't split,
+    and functions that produce a bucket output (e.g., ``union()``)
+    also have no bound on how large a bucket may get. This used to
+    come from the C macro ``DEFAULT_MAX_BUCKET_SIZE``.
+
+
+``max_internal_size``
+
+    An ``int`` giving the maximum size (number of children) of an
+    internal btree node. This used to come from the C macro
+    ``DEFAULT_MAX_BTREE_SIZE``.
+

 BTree Clues
 ===========

docs/overview.rst

Lines changed: 3 additions & 0 deletions
@@ -462,6 +462,9 @@ values for ``max_leaf_size`` or ``max_internal_size`` in your subclass::
     ...     max_leaf_size = 500
     ...     max_internal_size = 1000

+As of version 4.9, you can also set these values directly on an
+existing BTree class if you wish to tune them across your entire application.
+
 ``max_leaf_size`` is used for leaf nodes in a BTree, either Buckets or
 Sets. ``max_internal_size`` is used for internal nodes, either BTrees
 or TreeSets.
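
The difference between per-subclass tuning and application-wide tuning can be sketched with ordinary Python classes. The classes below (`Tree`, `SmallTree`, `PlainSubclass`) and their numbers are hypothetical stand-ins chosen for illustration; they are not the real BTrees classes, and the sketch assumes only normal class-attribute inheritance.

```python
class Tree:
    """Hypothetical stand-in for a BTree class (not the BTrees API)."""
    max_leaf_size = 120        # placeholder default
    max_internal_size = 500    # placeholder default


class SmallTree(Tree):
    # Per-subclass tuning: only SmallTree and its subclasses are affected.
    max_leaf_size = 10


class PlainSubclass(Tree):
    # No override: follows whatever the base class says.
    pass


# Application-wide tuning: assign on the existing base class.
Tree.max_leaf_size = 60

sizes = {
    cls.__name__: cls.max_leaf_size
    for cls in (Tree, SmallTree, PlainSubclass)
}
```

A subclass that sets its own value keeps it, while subclasses without an override pick up the new base-class value.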

src/BTrees/BTreeModuleTemplate.c

Lines changed: 68 additions & 19 deletions
@@ -55,6 +55,7 @@

 static PyObject *sort_str, *reverse_str, *__setstate___str;
 static PyObject *_bucket_type_str, *max_internal_size_str, *max_leaf_size_str;
+static PyObject *__slotnames__str;
 static PyObject *ConflictError = NULL;

 static void PyVar_Assign(PyObject **v, PyObject *e) { Py_XDECREF(*v); *v=e;}
@@ -314,6 +315,7 @@ typedef struct BTree_s {
     long max_leaf_size;
 } BTree;

+static PyTypeObject BTreeTypeType;
 static PyTypeObject BTreeType;
 static PyTypeObject BucketType;

@@ -583,19 +585,50 @@ VALUEMACROS_H
 BTREEITEMSTEMPLATE_C
 ;

-int
-init_persist_type(PyTypeObject *type)
+static int
+init_type_with_meta_base(PyTypeObject *type, PyTypeObject* meta, PyTypeObject* base)
 {
+    int result;
+    PyObject* slotnames;
 #ifdef PY3K
-    ((PyObject*)type)->ob_type = &PyType_Type;
+    ((PyObject*)type)->ob_type = meta;
 #else
-    type->ob_type = &PyType_Type;
+    type->ob_type = meta;
 #endif
-    type->tp_base = cPersistenceCAPI->pertype;
+    type->tp_base = base;

     if (PyType_Ready(type) < 0)
         return 0;
+    /*
+      persistent looks for __slotnames__ in the dict at deactivation time,
+      and if it's not present, calls ``copyreg._slotnames``, which itself
+      looks in the dict again. Then it does some computation, and tries to
+      store the object in the dict --- which for built-in types, it can't.
+      So we can save some runtime if we store an empty slotnames for these classes.
+    */
+    slotnames = PyTuple_New(0);
+    if (!slotnames) {
+        return 0;
+    }
+    result = PyDict_SetItem(type->tp_dict, __slotnames__str, slotnames);
+    Py_DECREF(slotnames);
+    return result < 0 ? 0 : 1;
+}

+int /* why isn't this static? */
+init_persist_type(PyTypeObject* type)
+{
+    return init_type_with_meta_base(type, &PyType_Type, cPersistenceCAPI->pertype);
+}
+
+static int init_tree_type(PyTypeObject* type, PyTypeObject* bucket_type)
+{
+    if (!init_type_with_meta_base(type, &BTreeTypeType, cPersistenceCAPI->pertype)) {
+        return 0;
+    }
+    if (PyDict_SetItem(type->tp_dict, _bucket_type_str, (PyObject*)bucket_type) < 0) {
+        return 0;
+    }
     return 1;
 }

@@ -644,6 +677,24 @@ module_init(void)
     max_leaf_size_str = INTERN("max_leaf_size");
     if (! max_leaf_size_str)
         return NULL;
+    __slotnames__str = INTERN("__slotnames__");
+    if (!__slotnames__str)
+        return NULL;
+
+    BTreeType_setattro_allowed_names = PyTuple_Pack(
+        5,
+        /* BTree attributes */
+        max_internal_size_str,
+        max_leaf_size_str,
+        /* zope.interface attributes */
+        /*
+          Technically, INTERNING directly here leaks references,
+          but since we can't be unloaded, it's not a problem.
+        */
+        INTERN("__implemented__"),
+        INTERN("__providedBy__"),
+        INTERN("__provides__")
+    );

     /* Grab the ConflictError class */
     interfaces = PyImport_ImportModule("BTrees.Interfaces");
@@ -694,25 +745,22 @@ module_init(void)
     SetType.tp_new = PyType_GenericNew;
     BTreeType.tp_new = PyType_GenericNew;
     TreeSetType.tp_new = PyType_GenericNew;
+
     if (!init_persist_type(&BucketType))
         return NULL;
-    if (!init_persist_type(&BTreeType))
-        return NULL;
-    if (!init_persist_type(&SetType))
-        return NULL;
-    if (!init_persist_type(&TreeSetType))
-        return NULL;

-    if (PyDict_SetItem(BTreeType.tp_dict, _bucket_type_str,
-                       (PyObject *)&BucketType) < 0)
-    {
-        fprintf(stderr, "btree failed\n");
+    if (!init_type_with_meta_base(&BTreeTypeType, &PyType_Type, &PyType_Type)) {
         return NULL;
     }
-    if (PyDict_SetItem(TreeSetType.tp_dict, _bucket_type_str,
-                       (PyObject *)&SetType) < 0)
-    {
-        fprintf(stderr, "bucket failed\n");
+
+    if (!init_tree_type(&BTreeType, &BucketType)) {
+        return NULL;
+    }
+
+    if (!init_persist_type(&SetType))
+        return NULL;
+
+    if (!init_tree_type(&TreeSetType, &SetType)) {
         return NULL;
     }

@@ -727,6 +775,7 @@ module_init(void)

     /* Add some symbolic constants to the module */
     mod_dict = PyModule_GetDict(module);
+
     if (PyDict_SetItemString(mod_dict, MOD_NAME_PREFIX "Bucket",
                              (PyObject *)&BucketType) < 0)
         return NULL;
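
The `__slotnames__` behaviour described in the comment inside `init_type_with_meta_base` can be observed from Python. This sketch relies on `copyreg._slotnames`, the private standard-library helper named in that comment; `PlainPython` is a hypothetical class used only for contrast, and the sketch does not involve BTrees itself.

```python
import copyreg


class PlainPython:
    """Hypothetical ordinary Python class; its attributes can be assigned."""


# For a normal class the computed (empty) result is cached on the class.
python_result = copyreg._slotnames(PlainPython)
cached_on_python_class = "__slotnames__" in PlainPython.__dict__

# For a built-in type the helper cannot store the result, so the same
# computation would be repeated on every call.
builtin_result = copyreg._slotnames(int)
cached_on_builtin = "__slotnames__" in int.__dict__
```

Storing an empty value in the type's dictionary once, at type initialization, is what lets the C types avoid that repeated work.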

src/BTrees/BTreeTemplate.c

Lines changed: 85 additions & 6 deletions
@@ -20,12 +20,12 @@ _get_max_size(BTree *self, PyObject *name, long default_max)
 {
     PyObject *size;
     long isize;
-
     size = PyObject_GetAttr(OBJECT(OBJECT(self)->ob_type), name);
     if (size == NULL) {
-        PyErr_Clear();
-        return default_max;
+        PyErr_Clear();
+        return default_max;
     }
+
 #ifdef PY3K
     isize = PyLong_AsLong(size);
 #else
@@ -48,7 +48,7 @@ _max_internal_size(BTree *self)
     long isize;

     if (self->max_internal_size > 0) return self->max_internal_size;
-    isize = _get_max_size(self, max_internal_size_str, DEFAULT_MAX_BTREE_SIZE);
+    isize = _get_max_size(self, max_internal_size_str, -1);
     self->max_internal_size = isize;
     return isize;
 }
@@ -59,7 +59,7 @@ _max_leaf_size(BTree *self)
     long isize;

     if (self->max_leaf_size > 0) return self->max_leaf_size;
-    isize = _get_max_size(self, max_leaf_size_str, DEFAULT_MAX_BUCKET_SIZE);
+    isize = _get_max_size(self, max_leaf_size_str, -1);
     self->max_leaf_size = isize;
     return isize;
 }
@@ -1035,6 +1035,14 @@ BTree__p_deactivate(BTree *self, PyObject *args, PyObject *keywords)
         }
     }

+    /*
+      Always clear our node size cache, whether we're in a jar or not. It is
+      only read from the type anyway, and we'll do so on the next write after
+      we get activated.
+    */
+    self->max_internal_size = 0;
+    self->max_leaf_size = 0;
+
     if (self->jar && self->oid)
     {
         ghostify = self->state == cPersistent_UPTODATE_STATE;
@@ -2496,8 +2504,79 @@ static PyNumberMethods BTree_as_number_for_nonzero = {
     bucket_or,                              /* nb_or */
 };

-static PyTypeObject BTreeType = {
+static PyObject* BTreeType_setattro_allowed_names; /* initialized in module */
+
+static int
+BTreeType_setattro(PyTypeObject* type, PyObject* name, PyObject* value)
+{
+    /*
+      type.tp_setattro prohibits setting any attributes on a built-in type,
+      so we need to use our own (metaclass) type to handle it. The set of
+      allowable values needs to be carefully controlled.
+
+      Alternately, we could use heap-allocated types when they are supported
+      on all the versions we care about, because those do allow setting attributes.
+    */
+    int allowed;
+    allowed = PySequence_Contains(BTreeType_setattro_allowed_names, name);
+    if (allowed < 0) {
+        return -1;
+    }
+
+    if (allowed) {
+        PyDict_SetItem(type->tp_dict, name, value);
+        PyType_Modified(type);
+        if (PyErr_Occurred()) {
+            return -1;
+        }
+        return 0;
+    }
+    PyErr_Format(
+        PyExc_TypeError,
+        /* distinguish the error message from what type would produce */
+        "BTree: can't set attributes of built-in/extension type '%s'",
+        type->tp_name);
+    return -1;
+}
+
+static PyTypeObject BTreeTypeType = {
     PyVarObject_HEAD_INIT(NULL, 0)
+    MODULE_NAME MOD_NAME_PREFIX "BTreeType",
+    0,                                      /* tp_basicsize */
+    0,                                      /* tp_itemsize */
+    0,                                      /* tp_dealloc */
+    0,                                      /* tp_print */
+    0,                                      /* tp_getattr */
+    0,                                      /* tp_setattr */
+    0,                                      /* tp_compare */
+    0,                                      /* tp_repr */
+    0,                                      /* tp_as_number */
+    0,                                      /* tp_as_sequence */
+    0,                                      /* tp_as_mapping */
+    0,                                      /* tp_hash */
+    0,                                      /* tp_call */
+    0,                                      /* tp_str */
+    0,                                      /* tp_getattro */
+    (setattrofunc)BTreeType_setattro,       /* tp_setattro */
+    0,                                      /* tp_as_buffer */
+#ifndef PY3K
+    Py_TPFLAGS_CHECKTYPES |
+#endif
+    Py_TPFLAGS_DEFAULT |
+    Py_TPFLAGS_BASETYPE,                    /* tp_flags */
+    0,                                      /* tp_doc */
+    0,                                      /* tp_traverse */
+    0,                                      /* tp_clear */
+    0,                                      /* tp_richcompare */
+    0,                                      /* tp_weaklistoffset */
+    0,                                      /* tp_iter */
+    0,                                      /* tp_iternext */
+    0,                                      /* tp_methods */
+    0,                                      /* tp_members */
+};
+
+static PyTypeObject BTreeType = {
+    PyVarObject_HEAD_INIT(&BTreeTypeType, 0)
     MODULE_NAME MOD_NAME_PREFIX "BTree",    /* tp_name */
     sizeof(BTree),                          /* tp_basicsize */
     0,                                      /* tp_itemsize */
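
The per-instance size cache and its reset in `BTree__p_deactivate` can be modelled in a few lines of Python. This is a simplified sketch, not the real implementation: the `Tree` class, its method names, and the default of 120 are hypothetical, and the validation that `_get_max_size` performs on the value is omitted.

```python
class Tree:
    """Hypothetical model of the node-size cache (not the BTrees API)."""
    max_leaf_size = 120  # placeholder class-level default

    def __init__(self):
        # 0 means "not cached yet", matching the C check `> 0`.
        self._cached_max_leaf_size = 0

    def effective_max_leaf_size(self):
        if self._cached_max_leaf_size > 0:
            return self._cached_max_leaf_size
        # Read from the type; -1 plays the role of the new fallback default.
        size = getattr(type(self), "max_leaf_size", -1)
        self._cached_max_leaf_size = size
        return size

    def deactivate(self):
        # Mirrors the commit: always drop the cached size on deactivation.
        self._cached_max_leaf_size = 0


tree = Tree()
before = tree.effective_max_leaf_size()

Tree.max_leaf_size = 60
stale = tree.effective_max_leaf_size()   # still served from the instance cache

tree.deactivate()
after = tree.effective_max_leaf_size()   # re-read from the class
```

In this model an instance that has already cached a size keeps using it until it is deactivated, which is why the commit clears the cache unconditionally at that point.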

src/BTrees/_IFBTree.c

Lines changed: 2 additions & 2 deletions
@@ -26,8 +26,8 @@

 #define MOD_NAME_PREFIX "IF"

-#define DEFAULT_MAX_BUCKET_SIZE 120
-#define DEFAULT_MAX_BTREE_SIZE 500
+
+

 #include "_compat.h"
 #include "intkeymacros.h"

src/BTrees/_IIBTree.c

Lines changed: 2 additions & 2 deletions
@@ -26,8 +26,8 @@

 #define MOD_NAME_PREFIX "II"

-#define DEFAULT_MAX_BUCKET_SIZE 120
-#define DEFAULT_MAX_BTREE_SIZE 500
+
+

 #include "_compat.h"
 #include "intkeymacros.h"
