Skip to content

Introduce helper to dump Babelfish operator class defined over built-in data types #346

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Deepesh125
Copy link
Contributor

@Deepesh125 Deepesh125 commented Apr 22, 2024

Description

Extension changes - babelfish-for-postgresql/babelfish_extensions#2460

There is a bug (described below) in Postgres due to which it does not dump user-defined operator class over built-in data
types. This commit fixed that issue by introducing helper function which will dump operator class and all of its members
appropriately.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]

Issue details

Let's assume that we want to create certain operator between int (Oid = 23) and numeric (Oid = 1700). And we want to add this operator to existing operator family of integer (intger_ops (oid = 1976)) so that optimiser chooses index scan in certain situation over sequential scan. For this, one would follow following script to create operator class -

CREATE SCHEMA uds;

CREATE FUNCTION uds.int4_numeric_cmp (int, numeric)
RETURNS int
AS
$$
  Select numeric_cmp($1::numeric, $2)
$$
LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE;

CREATE FUNCTION uds.int4_numeric_eq (int, numeric)
RETURNS boolean
AS
$$
  Select numeric_eq($1::numeric, $2)
$$
LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE;

CREATE OPERATOR uds.= (
LEFTARG = int,
RIGHTARG = numeric,
FUNCTION = uds.int4_numeric_eq,
COMMUTATOR = =,
NEGATOR = <>,
RESTRICT = eqsel,
JOIN = eqjoinsel
);

CREATE OPERATOR CLASS uds.int_numeric FOR TYPE int4
USING btree FAMILY integer_ops AS
OPERATOR 3 uds.= (int, numeric),
FUNCTION 1 uds.int4_numeric_cmp(int, numeric);

This CREATE OPERATOR CLASS would create unique entries in pg_amop for each of the member operators and will create entry in pg_amproc for each of the support function.

postgres=# select oid, opcname, opcfamily opcintype from pg_opclass where opcname = 'int_numeric';
  oid  |   opcname   | opcintype
-------+-------------+-----------
 16390 | int_numeric |      1976
(1 row)

postgres=# select * from pg_amop where amoplefttype = 23 and amoprighttype = 1700;
  oid  | amopfamily | amoplefttype | amoprighttype | amopstrategy | amoppurpose | amopopr | amopmethod | amopsortfamily
-------+------------+--------------+---------------+--------------+-------------+---------+------------+----------------
 16403 |       1976 |           23 |          1700 |            3 | s           |   16389 |        403 |              0
(1 row)

postgres=# select * from pg_amproc where amproclefttype = 23 and amprocrighttype = 1700;
  oid  | amprocfamily | amproclefttype | amprocrighttype | amprocnum |        amproc
-------+--------------+----------------+-----------------+-----------+----------------------
 16404 |         1976 |             23 |            1700 |         1 | uds.int4_numeric_cmp
(1 row)

But PG does not store dependency of pg_amop or pg_amproc entry on pg_opclass or pg_opfamily . Reason for this is explain below.

postgres=#  select * from pg_depend where classid = 'pg_catalog.pg_amop'::pg_catalog.regclass and objid = 16403;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    2602 | 16403 |        0 |       2617 |    16389 |           0 | a
(1 row)

^^ it just stores dependency on pg_operator (oid = 2617)

postgres=# select * from pg_depend where classid = 'pg_catalog.pg_amproc'::pg_catalog.regclass and objid = 16404;
 classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
    2603 | 16404 |        0 |       1255 |    16385 |           0 | a
(1 row)

^^ it just stores dependency on pg_proc (oid = 1255)

This decision of not storing dependency of pg_amop on pg_opclass or pg_opfamily is causing two issues -

  1. PG will not clean up entries in pg_amop in case operator class is deleted.
postgres=# DROP OPERATOR CLASS uds.int_numeric USING btree;
DROP OPERATOR CLASS

postgres=# select oid, opcname, opcfamily opcintype from pg_opclass where opcname = 'int_numeric';
 oid | opcname | opcintype
-----+---------+-----------
(0 rows)

postgres=# select * from pg_amop where amoplefttype = 23 and amoprighttype = 1700;
  oid  | amopfamily | amoplefttype | amoprighttype | amopstrategy | amoppurpose | amopopr | amopmethod | amopsortfamily
-------+------------+--------------+---------------+--------------+-------------+---------+------------+----------------
 16403 |       1976 |           23 |          1700 |            3 | s           |   16389 |        403 |              0
(1 row)

postgres=# select * from pg_amproc where amproclefttype = 23 and amprocrighttype = 1700;
  oid  | amprocfamily | amproclefttype | amprocrighttype | amprocnum |        amproc
-------+--------------+----------------+-----------------+-----------+----------------------
 16404 |         1976 |             23 |            1700 |         1 | uds.int4_numeric_cmp
(1 row)

postgres=# CREATE OPERATOR CLASS uds.int_numeric FOR TYPE int4
postgres-# USING btree FAMILY integer_ops AS
postgres-# OPERATOR 3 uds.= (int, numeric),
postgres-# FUNCTION 1 uds.int4_numeric_cmp(int, numeric);
ERROR:  duplicate key value violates unique constraint "pg_amop_fam_strat_index"
DETAIL:  Key (amopfamily, amoplefttype, amoprighttype, amopstrategy)=(1976, 23, 1700, 3) already exists.
  1. PG will not dump operator class as per expectation which means we will loose index scan after upgrade. Expectation is that, it would dump operator class with its member operators and functions but its not doing it. I am also attaching dump file for the reference.
postgres=# select oid, opcname, opcfamily opcintype from pg_opclass where opcname = 'int_numeric';
  oid  |   opcname   | opcintype
-------+-------------+-----------
 16406 | int_numeric |      1976
(1 row)

postgres=# select * from pg_amproc where amproclefttype = 23 and amprocrighttype = 1700;
 oid | amprocfamily | amproclefttype | amprocrighttype | amprocnum | amproc
-----+--------------+----------------+-----------------+-----------+--------
(0 rows)

postgres=# select * from pg_amop where amoplefttype = 23 and amoprighttype = 1700;
 oid | amopfamily | amoplefttype | amoprighttype | amopstrategy | amoppurpose | amopopr | amopmethod | amopsortfamily
-----+------------+--------------+---------------+--------------+-------------+---------+------------+----------------
(0 rows)

why does not PG store dependency of pg_amop or pg_amproc entry on pg_opclass or pg_opfamily ?
The answer lies with the invention of btadjustmembers (code link). Here, Postgres is not trying to register dependency of pg_amop entry on int_numeric (Operator class) but it rather tries to store dependency on integer_ops (oid = 1976 / Builtin operator family for integer datatype). And Oid = 1976 is pinned object and since Postgres does not register dependency on pinned object, recordDependencyOn will not add dependency entry.

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the PostgreSQL license, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@Deepesh125 Deepesh125 changed the title Temp change to let Postgre register the dependency of pg_amop on pg_opclass Introduce helper to dump Babelfish operator class defined over built-in data types Apr 29, 2024
destroyPQExpBuffer(query);

if (strcasecmp(opclass, "\"sys\".\"int_numeric\"") == 0)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just a thought): Can't we somehow modify the original query in dumpOpclass function so that it automatically dumps these operator classes which we want to dump? We have already done this in past, for example see fixCursorForBbfTableData function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since pg is not registeting dependency between pg_amop and pg_opclass, its impossible to find member functions/operators using any query.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the only problem with this approach is that any change to opclass definition would require pg_dump utility changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! that is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are planning something to bring up this issue with pg community

@forestkeeper forestkeeper merged commit fe31fbc into babelfish-for-postgresql:BABEL_4_X_DEV__PG_16_X Apr 30, 2024
2 checks passed
forestkeeper pushed a commit to babelfish-for-postgresql/babelfish_extensions that referenced this pull request Apr 30, 2024
This commit fixes this issue by appropriately defining operator class and operators between int and numeric.

This changes also requires engine changes (babelfish-for-postgresql/postgresql_modified_for_babelfish#346) in order to dump these Babelfish defined operator class correctly during pg_upgrade.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]
@Deepesh125 Deepesh125 deleted the jira-babel-3385 branch May 2, 2024 04:06
Deepesh125 added a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request May 2, 2024
…in data types (babelfish-for-postgresql#346)

Extension changes - babelfish-for-postgresql/babelfish_extensions#2460

There is a bug (described below) in Postgres due to which it does not dump user-defined operator class over built-in data
types. This commit fixed that issue by introducing helper function which will dump operator class and all of its members
appropriately.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]
Deepesh125 added a commit to amazon-aurora/babelfish_extensions that referenced this pull request May 2, 2024
…fish-for-postgresql#2460)

This commit fixes this issue by appropriately defining operator class and operators between int and numeric.

This changes also requires engine changes (babelfish-for-postgresql/postgresql_modified_for_babelfish#346) in order to dump these Babelfish defined operator class correctly during pg_upgrade.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]
rishabhtanwar29 pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request May 3, 2024
roshan0708 pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Oct 15, 2024
… built-in data types (babelfish-for-postgresql#346)

Extension changes - babelfish-for-postgresql/babelfish_extensions#2460

There is a bug (described below) in Postgres due to which it does not dump user-defined operator class over built-in data
types. This commit fixed that issue by introducing helper function which will dump operator class and all of its members
appropriately.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]
roshan0708 pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Oct 18, 2024
… built-in data types (babelfish-for-postgresql#346)

Extension changes - babelfish-for-postgresql/babelfish_extensions#2460

There is a bug (described below) in Postgres due to which it does not dump user-defined operator class over built-in data
types. This commit fixed that issue by introducing helper function which will dump operator class and all of its members
appropriately.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]
roshan0708 pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Oct 18, 2024
… built-in data types (babelfish-for-postgresql#346)

Extension changes - babelfish-for-postgresql/babelfish_extensions#2460

There is a bug (described below) in Postgres due to which it does not dump user-defined operator class over built-in data
types. This commit fixed that issue by introducing helper function which will dump operator class and all of its members
appropriately.

Task: BABEL-3385
Signed-off-by: Dipesh Dhameliya [email protected]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants