Skip to content

Return double with correct precision for UNIX_TIMESTAMP #3679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 10, 2025

Conversation

yuancu
Copy link
Contributor

@yuancu yuancu commented May 27, 2025

Description

When UNIX_TIMESTAMP is fed with a timestamp string, it could not return a correct double value since the string is implicitly cast to date before feeding to the function.

Deprecated solution:
I corrected this behavior by adding STRING as an acceptable type to UNIX_TIMESTAMP and adding corresponding handlers for strings. It is worth noting that timestamp string is mentioned as an acceptable argument in the documentation:
The date argument may be a DATE, or TIMESTAMP string
Therefore, The change to the function signature should be appropriate.

I reordered the signatures of UNIX_TIMESTAMP so that unix_timestamp(timestamp) will be prioritized when the input is a string. In such case, the string will be cast to a timestamp. The reordering is valid since this coercion should also be compatible with date and time types and will result in correct values for these types.

Related Issues

Resolves #3611

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Collaborator

@dai-chen dai-chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering how did we decide to fix these cases one by one instead of adding an implicit conversion from String to Timestamp?

@LantaoJin
Copy link
Member

LantaoJin commented May 29, 2025

Just wondering how did we decide to fix these cases one by one instead of adding an implicit conversion from String to Timestamp?

+1. @yuancu I have mentioned it multiple times, could you make it happen before 3.1.0? We'd better not to extend original function definitions.

For this case, the timestamp string should implicitly cast to timestamp instead of date.

@yuancu
Copy link
Contributor Author

yuancu commented May 29, 2025

Just wondering how did we decide to fix these cases one by one instead of adding an implicit conversion from String to Timestamp?

+1. @yuancu I have mentioned it multiple times, could you make it happen before 3.1.0? We'd better not to extend original function definitions.

Do you mean when we find a type mismatch, we coerce it to timestamp instead of date?

@LantaoJin
Copy link
Member

Just wondering how did we decide to fix these cases one by one instead of adding an implicit conversion from String to Timestamp?

+1. @yuancu I have mentioned it multiple times, could you make it happen before 3.1.0? We'd better not to extend original function definitions.

Do you mean when we find a type mismatch, we coerce it to timestamp instead of date?

It depends on what the format of string is. 2020-01-01 or 2020-01-01 00:00:00

@yuancu
Copy link
Contributor Author

yuancu commented May 29, 2025

Ok I get it. I'll try to fix it.

Update: The functions are resolved based on the distance between input argument types and expected argument types.
unix_timestamp expects date / timestamp / double or no input.
When the input is of string type, its distance to date / timestamp are the same. It picks the definition that is defined earlier. For now, we prioritize unix_timestamp(timestamp) over unix_timestamp(date) as timestamp allows a wider range of inputs -- date / time / timestamp string can be cast to timestamp.

TODO:

LantaoJin
LantaoJin previously approved these changes Jun 4, 2025
@LantaoJin
Copy link
Member

@yuancu could your resolve the conflicts?

@yuancu
Copy link
Contributor Author

yuancu commented Jun 7, 2025

@yuancu could your resolve the conflicts?

No problem.

LantaoJin
LantaoJin previously approved these changes Jun 9, 2025
Copy link
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, waiting CI pass

@LantaoJin LantaoJin added calcite calcite migration releated PPL Piped processing language labels Jun 9, 2025
@qianheng-aws
Copy link
Collaborator

qianheng-aws commented Jun 10, 2025

@yuancu Do we have issue to track supporting implicit type coercion in v3? If not, please create a one and attach the ignored test there.

@qianheng-aws qianheng-aws merged commit 159b60e into opensearch-project:main Jun 10, 2025
22 checks passed
@yuancu
Copy link
Contributor Author

yuancu commented Jun 10, 2025

@yuancu Do we have issue to track supporting implicit type coercion in v3? If not, please create a one and attach the ignored test there.

#3761 is created.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
calcite calcite migration releated PPL Piped processing language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Return type of UNIX_TIMESTAMP is double but its precision is missing
4 participants