Skip to content

🔈 [RUM-10284] collect telemetry before starting the session manager #3602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jun 11, 2025

Conversation

BenoitZugmeyer
Copy link
Member

@BenoitZugmeyer BenoitZugmeyer commented Jun 5, 2025

Motivation

We observed that in some condition, the SessionManager crashes. While the exception is caught and buffered in-memory for telemetry, the telemetry event itself is not sent to Datadog.

The goal of this PR is to be able to cover more ground in our telemetry monitoring and caught exceptions that are coming from the Session Manager.

Changes

Changes are kind of significative, so bare with me. As always, please review commit by commit.

The overall strategy is first to refactor Telemetry code so both Logs and RUM behave the same. This implies using a separate batch for telemetry, which was not the case for RUM.

With telemetry logic refactored, we can do some modifications that allows to send telemetry earlier, without waiting for startLogs or startRum to finish.

Test instructions

  • Open the sandbox in Chrome

  • Using chrome devtools, edit the session cookie to add &lock=1 at the end. This makes the Logs and RUM SDKs crash (specific changes for this crash will come in later PRs).

  • Refresh the page

  • Check that telemetry errors are listed in the devtools extension.

  • Also, check that those telemetry errors are sent to Datadog intake in the Network tab.

Checklist

  • Tested locally
  • Tested on staging
  • Added unit tests for this change.
  • Added e2e/integration tests for this change.

@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/common-telemetry-batch branch from 5b7bde8 to 54b02ba Compare June 5, 2025 17:37
@codecov-commenter
Copy link

codecov-commenter commented Jun 5, 2025

Codecov Report

Attention: Patch coverage is 97.61905% with 1 line in your changes missing coverage. Please review.

Project coverage is 92.44%. Comparing base (e5f565e) to head (7139ac7).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
packages/core/src/domain/telemetry/telemetry.ts 96.15% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3602      +/-   ##
==========================================
+ Coverage   92.37%   92.44%   +0.07%     
==========================================
  Files         323      322       -1     
  Lines        8132     8129       -3     
  Branches     1842     1840       -2     
==========================================
+ Hits         7512     7515       +3     
+ Misses        620      614       -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link

cit-pr-commenter bot commented Jun 5, 2025

Bundles Sizes Evolution

📦 Bundle Name Base Size Local Size 𝚫 𝚫% Status
Rum 146.83 KiB 147.40 KiB 585 B +0.39%
Rum Recorder 18.02 KiB 18.02 KiB 0 B 0.00%
Rum Profiler 4.63 KiB 4.63 KiB 0 B 0.00%
Logs 51.70 KiB 51.91 KiB 210 B +0.40%
Flagging 0 B 935 B 935 B N/A%
Rum Slim 106.45 KiB 106.96 KiB 524 B +0.48%
Worker 23.59 KiB 23.59 KiB 0 B 0.00%
🚀 CPU Performance
Action Name Base Average Cpu Time (ms) Local Average Cpu Time (ms) 𝚫
addglobalcontext 0.015 0.013 -0.002
addaction 0.039 0.041 0.002
addtiming 0.010 0.009 -0.001
adderror 0.038 0.038 -0.000
startstopsessionreplayrecording 0.002 0.002 0.000
startview 0.010 0.010 -0.000
logmessage 0.047 0.042 -0.004
🧠 Memory Performance
Action Name Base Consumption Memory (bytes) Local Consumption Memory (bytes) 𝚫 (bytes)
addglobalcontext 26.61 KiB 26.96 KiB 366 B
addaction 51.16 KiB 54.76 KiB 3.60 KiB
addtiming 26.09 KiB 25.95 KiB -135 B
adderror 58.89 KiB 59.04 KiB 147 B
startstopsessionreplayrecording 23.99 KiB 24.92 KiB 955 B
startview 428.01 KiB 432.01 KiB 4.01 KiB
logmessage 57.78 KiB 60.00 KiB 2.22 KiB

🔗 RealWorld

@@ -45,8 +50,8 @@ export const enum TelemetryService {
}

export interface Telemetry {
setContextProvider: (provider: () => Context) => void
observable: Observable<TelemetryEvent & Context>
setContextProvider: (key: string, contextProvider: () => ContextValue | undefined) => void
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 thought: ‏Using a new hook like assembleTelemetry you could get rid of this setContextProvider(No need to do it right now but just to keep in mind)

@BenoitZugmeyer BenoitZugmeyer changed the title 🔈 collect telemetry before starting the session manager 🔈 [RUM-10284] collect telemetry before starting the session manager Jun 6, 2025
@BenoitZugmeyer

This comment was marked as outdated.

@dd-devflow

This comment was marked as outdated.

dd-devflow bot added a commit that referenced this pull request Jun 6, 2025
@BenoitZugmeyer

This comment was marked as outdated.

@dd-devflow

This comment was marked as outdated.

@dd-devflow dd-devflow bot added the staging-23 label Jun 6, 2025
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/common-telemetry-batch branch 2 times, most recently from c0ec56e to 6f805b9 Compare June 6, 2025 12:53
@BenoitZugmeyer BenoitZugmeyer force-pushed the benoit/common-telemetry-batch branch from 6f805b9 to 6c68f88 Compare June 6, 2025 13:04
@BenoitZugmeyer BenoitZugmeyer marked this pull request as ready for review June 6, 2025 13:33
@BenoitZugmeyer BenoitZugmeyer requested a review from a team as a code owner June 6, 2025 13:33
Copy link
Collaborator

@bcaudan bcaudan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

telemetry.setContextProvider(() => ({
foo: 'bar',
}))
telemetry.setContextProvider('foo', () => 'bar')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 suggestion: what about adding a test to illustrate "the progressively set context" case?‏

@BenoitZugmeyer
Copy link
Member Author

/merge

@dd-devflow
Copy link
Contributor

dd-devflow bot commented Jun 11, 2025

View all feedbacks in Devflow UI.

2025-06-11 07:16:08 UTC ℹ️ Start processing command /merge


2025-06-11 07:16:12 UTC ℹ️ MergeQueue: pull request added to the queue

The expected merge time in main is approximately 27m (p90).


2025-06-11 07:36:58 UTC ℹ️ MergeQueue: This merge request was merged

@dd-mergequeue dd-mergequeue bot merged commit 41d37d6 into main Jun 11, 2025
20 checks passed
@dd-mergequeue dd-mergequeue bot deleted the benoit/common-telemetry-batch branch June 11, 2025 07:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants