-
Notifications
You must be signed in to change notification settings - Fork 104
Fix loading of data for TEXT column (NULL allowed) if primary key is defined using non-default collation from PG endpoint for logical database #3769
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: BABEL_4_X_DEV
Are you sure you want to change the base?
Conversation
…defined using non-default collation for DMS Signed-off-by: Shameem Ahmed <[email protected]>
Pull Request Test Coverage Report for Build 15438300366Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
Signed-off-by: Shameem Ahmed <[email protected]>
contrib/babelfishpg_tsql/src/hooks.c
Outdated
else if (sql_dialect == SQL_DIALECT_PG && pltsql_psql_logical_babelfish_db_name) | ||
{ | ||
/* | ||
* Check whether the type is collatable or not | ||
* If yes, return server level collation or return InvalidOid | ||
*/ | ||
return OidIsValid(typtup->typcollation) ? CLUSTER_COLLATION_OID() : InvalidOid; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can move this section out of this if condition to later before where we are handling case for TEXT. Also, add comment explaining when and why are we doing this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in latest rev
* No need to tell babelfishpg_common that logical db name has been set if dialect is TSQL | ||
* because the psql_logical_babelfish_db_name is to be used for PG dialect only | ||
*/ | ||
if (sql_dialect == SQL_DIALECT_PG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not 100% sure if this is really needed. Can you please explain use case/example which may break if we dont do thiss
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The is_logical_db_name_set
for TSQL endpoint as well. Now, for foreign constraint the dialect gets set to PG, hence during collation resolution in merge_collation_state, CLUSTER_COLLATION_OID()
returns babelfish server/db level collation. Whereas, it expects DEFAULT_COLLATION_OID
. This results in collation conflict and we get the following error during insertion of row:
1> INSERT INTO dms_db_babel_5077_t7(dms_db_babel_5077_t7_col2) VALUES ('unicode_cs_as');
2> go
Msg 33557097, Level 16, State 1, Server BABELFISH, Line 1
could not determine which collation to use for string comparison
By setting is_logical_db_name_set
for PG dialect only, we ensure that for foreign constraint case as well we return DEFAULT_COLLATION_OID
from CLUSTER_COLLATION_OID()
(which was the behaviour initially)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FKEY check will always be done under PG_DIALECT so same question remains. Why is this needed? can you actually show whole example you are referring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CREATE TABLE dms_db_babel_5077_t6(dms_db_babel_5077_t6_col1 text NULL, dms_db_babel_5077_t6_col2 [varchar](44) collate BBF_Unicode_General_CS_AS primary key);
GO
INSERT INTO dms_db_babel_5077_t6(dms_db_babel_5077_t6_col2) VALUES ('unicode_cs_as');
GO
CREATE TABLE dms_db_babel_5077_t7(dms_db_babel_5077_t7_col1 text NULL, dms_db_babel_5077_t7_col2 [varchar](44) collate BBF_Unicode_General_CS_AS references dms_db_babel_5077_t6(dms_db_babel_5077_t6_col2));
GO
1> INSERT INTO dms_db_babel_5077_t7(dms_db_babel_5077_t7_col2) VALUES ('unicode_cs_as');
2> go
Msg 33557097, Level 16, State 1, Server BABELFISH, Line 1
could not determine which collation to use for string comparison
init_and_check_collation_callbacks(); | ||
|
||
/* let babelfishpg_common know that psql_logical_babelfish_db_name has been updated */ | ||
(*collation_callbacks_ptr->set_logical_db_name_cache) (newval); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given this logic, shall we check for valid values of newval?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you indicating to check whether the logical database name exists or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, that's redundant as even if the logical db name is incorrect, we won't find any relations, functions, view, procedures etc for that incorrect/invalid logical db
* No need to tell babelfishpg_common that logical db name has been set if dialect is TSQL | ||
* because the psql_logical_babelfish_db_name is to be used for PG dialect only | ||
*/ | ||
if (sql_dialect == SQL_DIALECT_PG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FKEY check will always be done under PG_DIALECT so same question remains. Why is this needed? can you actually show whole example you are referring?
Description
If the babelfish table has a column whose type is anything other than text/ntext (eg: varchar, nvarchar etc) and the collation of the column is NOT same as babelfish database/server level collation and if the query is executed from PSQL endpoint, it will throw error saying it can not decide which collation to use.
If the same query is executed from TDS endpoint, it runs successfully.
This is because when the query is executed from PSQL endpoint -- the collation of varchar is server level collation. During resolution of collation at the parent node, the collation of left and right operand does NOT match and none of their collation matches with CLUSTER_COLLATION_OID (this returns DEFAULT_COLLATION_OID as the dialect is PG)
The query is working from TDS endpoint -- the collation of varchar is database level collation. During resolution of collation at the parent node, the collation of left and right operand does NOT match and collation of right operand matches with CLUSTER_COLLATION_OID (this returns database level collation as the dialect is TSQL)
This commit handles the case when the connection is DMS by using the dialect and psql_logical_babelfish_db_name GUC. If the connection is DMS, even if it is PG dialect, we will return babelfish server level collation.
Issues Resolved
BABEL-5077
Signed-off-by: Shameem Ahmed [email protected]
Test Scenarios Covered
Use case based -
Boundary conditions -
Arbitrary inputs -
Negative test cases -
Minor version upgrade tests -
Major version upgrade tests -
Performance tests -
Tooling impact -
Client tests -
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.