Skip to content

Session Replay Transport for Rollbar.js #1182

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from
Draft

Conversation

matux
Copy link
Contributor

@matux matux commented May 9, 2025

Description of the change

Description by Claude

Goal

Implement a robust transport mechanism for session replay data in the Rollbar.js SDK by:

  • Creating a seamless flow between error occurrences and their associated session recordings
  • Implementing reliable transport for spans containing replay data
  • Establishing deterministic testing structures

Key Components

ReplayMap

  • Manages mapping between error occurrences and session recordings
  • Handles coordination of when recordings are dumped and sent to backend
  • Stores spans temporarily until errors are reported and processed

Queue Integration

  • Adds replay ID to error items during processing
  • Handles API responses to determine when to send associated replays
  • Returns Promise-based responses for better async flow control

API Extensions

  • Implemented postSpans method alongside existing postItem
  • Maintains consistent error handling between items and spans
  • Uses the session endpoint for sending recording data

Major Changes

Code Improvements

  1. Promise-based Flow Control
    - Updated Queue._handleReplayResponse to return meaningful Promises
    - Improved error handling with explicit return values
    - Maintained backward compatibility with existing code
  2. ReplayMap Implementation
    - Fixed context access pattern (contextManager.active() vs context().active())
    - Improved span export mechanism to properly integrate with spanExportQueue
    - Added robust error handling throughout

Testing Infrastructure

  1. Test Organization
    - Separated unit and integration tests into dedicated directories
    - Created proper index files for clean imports
    - Added dedicated test commands for each test type
  2. Test Reliability
    - Replaced flaky setTimeout-based assertions with async/await
    - Implemented spies on Promise-returning methods to await actual completion
    - Created more deterministic test fixtures
  3. Mock Improvements
    - Updated MockTracing to match current implementation
    - Improved SpanExporter mock to better simulate actual behavior
    - Created precise assertions that verify exact data passed between components

Integration Flow

  1. Error occurs in application code
  2. Queue adds a replayId to the error item
  3. Error is sent to Rollbar API
  4. On successful API response, Queue triggers ReplayMap.send()
  5. ReplayMap retrieves spans from memory and sends via API.postSpans()
  6. Spans are transmitted, containing session replay events

Benefits of Implementation

  • Reliability: Deterministic, Promise-based flow prevents race conditions
  • Testability: Clear separation of unit and integration tests
  • Maintainability: Explicit interfaces between components
  • Performance: Efficient handling of potentially large replay data
  • Error Handling: Robust recovery from transport failures

This implementation ensures session replay data is correctly associated with errors and reliably transmitted to the Rollbar backend, providing crucial context for debugging production issues.

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Maintenance
  • New release

Related issues

Checklists

Development

  • Lint rules pass locally
  • The code changed/added as part of this pull request has been covered with tests
  • All tests related to the changed code pass in development

matux added 7 commits May 8, 2025 22:33
- Update Queue._handleReplayResponse to return meaningful Promises
- Replace setTimeout in tests with async/await for deterministic behavior
- Fix replayMap test to properly mock context management and span export queue
- Update test expectations to match modern implementation

This change maintains backward compatibility while making tests more deterministic and less flaky by using Promises instead of arbitrary timeouts.
@matux matux requested review from brianr and waltjones May 9, 2025 03:24
@matux matux self-assigned this May 9, 2025
matux added 5 commits May 9, 2025 00:26
- Update _postPromise to use object destructuring for labeled parameters
- Replace if/else in callback with ternary operator for more concise code
- Maintain backward compatibility with existing code
- All tests passing (unit and integration)
navigator: {
userAgent: 'Mozilla/5.0 Test',
},
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests run in the context of a headless chrome instance, and usually tests that interact with window state are better using the real browser window unless there's a specific reason to mock.

Example:
https://github.com/rollbar/rollbar.js/blob/master/test/tracing/tracing.test.js#L35

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, will look into it, thanks.

}

recordingSpan.setAttribute('rollbar.replay.id', replayId);
this.#exporter.export([recordingSpan.span]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When recorder.dump() calls span.end(), this is already done. https://github.com/rollbar/rollbar.js/blob/master/src/tracing/spanProcessor.js#L11-L14

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc refers to exporter.export here. That's my mistake.

Copy link
Contributor Author

@matux matux May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, because the act of ending the span makes the span processor export the span right? Oversight.

Edit: corrections cause on phone

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ending the span makes the span processor export the span right?

👍 Yes


recordingSpan.setAttribute('rollbar.replay.id', replayId);
this.#exporter.export([recordingSpan.span]);
const spans = spanExportQueue.slice();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exporter needs a method to map spans in the queue to a payload and return that payload.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This is something we need to talk about because I don't fully grasp it, yet.

This is because we don't export the ReadableSpan like we said yesterday, right? We may it into our API payload.

Edit: phone messes up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants