Scrubbing PII from Datadog RUM Error Payloads
Datadog RUM transmits JavaScript error events, network request metadata, and session replay data to Datadog’s ingest servers. Without explicit redaction, PII embedded in error messages, query strings, or stack frame variable names leaves the browser unfiltered. This page details the beforeSend hook, regex-based scrubbing patterns for emails and tokens, and how to sanitize stack frames before any payload is transmitted. It is a companion to Integrating Observability SDKs: Sentry, Datadog RUM, and OpenTelemetry and the broader Core JavaScript Error Handling & Boundaries reference.
Symptom / Trigger
PII leaking into Datadog RUM most commonly surfaces through a compliance audit or a data subject access request (DSAR) that reveals personally identifiable data in the Datadog Error Tracking console or Session Replay viewer. Common leak vectors:
// Error message logged in RUM Error Tracking:
TypeError: Cannot read property 'balance' of undefined
for user jane.doe@example.com (session: tok_live_abc123XYZ)
// URL recorded in a RUM resource event:
GET /api/balance?email=jane.doe@example.com&api_key=sk-prod-abcdef1234567890
The second example, with an API key and an email in the query string, is especially dangerous because Datadog RUM records the full URL of every network resource when trackResources: true is set.
Root Cause Explanation
Datadog RUM captures error events with a message field derived directly from error.message and a stack field from error.stack. Neither is sanitized by default. Application code that formats error messages with user data, or third-party libraries that embed request details in exception strings, will transmit that data verbatim.
// Broken pattern: embeds the user's email directly in a thrown error message
async function fetchUserBalance(email) {
  const res = await fetch(`/api/balance?email=${email}`);
  if (!res.ok) {
    // This message string goes directly into RUM's error.message field
    throw new Error(`Failed to fetch balance for ${email}: HTTP ${res.status}`);
  }
}
When this throws, RUM captures error.message containing the email address. No default Datadog configuration intercepts or masks this.
Step-by-Step Fix
1. Install and configure the beforeSend hook for RUM error events
The beforeSend callback in @datadog/browser-rum receives every event object before it is sent. Mutate the event in place to scrub it, then return true to let it through; return false to drop the event entirely.
import { datadogRum } from '@datadog/browser-rum';

// Shared regex patterns: define once, reference everywhere
const EMAIL_PATTERN = /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g;
const TOKEN_PATTERN = /\b(tok_(?:live|test)_|sk-|Bearer\s)[A-Za-z0-9\-_]{8,}/g;
// 13 to 16 digits with optional single separators between digits only,
// so a space or dash after the final digit is not consumed
const CREDIT_CARD_PATTERN = /\b\d(?:[ -]?\d){12,15}\b/g;

// Exported so the helper can be unit-tested in isolation
export function scrubString(str) {
  if (typeof str !== 'string') return str;
  return str
    .replace(EMAIL_PATTERN, '[email]')
    .replace(TOKEN_PATTERN, '[token]')
    .replace(CREDIT_CARD_PATTERN, '[card]');
}

datadogRum.init({
  // ...other config...
  beforeSend(event) {
    // Scrub error message and stack trace strings
    if (event.type === 'error') {
      if (event.error?.message) {
        event.error.message = scrubString(event.error.message);
      }
      if (event.error?.stack) {
        event.error.stack = scrubString(event.error.stack);
      }
      // Scrub source URL if the filename contains a query string with PII
      if (event.error?.source_file) {
        event.error.source_file = scrubString(event.error.source_file);
      }
    }
    return true; // keep the event; return false only to drop it
  },
});
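The scrubString helper applies the patterns one after another. Sharing module-level regexes with the g flag is safe here because String.prototype.replace resets lastIndex to 0 before it starts matching. The stateful lastIndex behaviour only causes trouble when the same regex object is reused with test() or exec(), so avoid those methods on these shared constants.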
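The lastIndex behaviour can be checked directly. The following is a minimal sketch that contrasts test() and replace() on one shared global regex; the sample string and variable names are illustrative only.

```javascript
// A module-level regex with the g flag, shared across calls.
const EMAIL_PATTERN = /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g;
const sample = 'contact: jane@example.com';

// test() is stateful on a global regex: a successful match leaves lastIndex
// just past the match, so the next call resumes searching from there.
const firstTest = EMAIL_PATTERN.test(sample);   // true
const secondTest = EMAIL_PATTERN.test(sample);  // false: search resumed at the end of the string

// replace() resets lastIndex to 0 before matching, so a stale value is harmless.
EMAIL_PATTERN.lastIndex = 12;
const scrubbed = sample.replace(EMAIL_PATTERN, '[email]');

console.log(firstTest, secondTest, scrubbed); // true false contact: [email]
```

If a boolean check is ever needed alongside scrubbing, a separate regex without the g flag avoids the alternating true/false result shown here.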
2. Scrub PII from resource URLs
Resource events record the full URL of every fetch and XHR request. URLs containing email addresses in query parameters or path segments must be normalized before they reach Datadog.
beforeSend(event) {
  // Scrub resource event URLs
  if (event.type === 'resource' && event.resource?.url) {
    try {
      const url = new URL(event.resource.url);
      // Mask any query parameter whose key suggests PII
      const sensitiveParams = ['email', 'token', 'api_key', 'access_token', 'auth'];
      sensitiveParams.forEach(key => {
        if (url.searchParams.has(key)) {
          // Replace the value but keep the key for debugging.
          // The brackets are serialized as %5Bredacted%5D.
          url.searchParams.set(key, '[redacted]');
        }
      });
      event.resource.url = url.toString();
    } catch {
      // Malformed URL: scrub the raw string as a fallback
      event.resource.url = scrubString(event.resource.url);
    }
  }
  return true;
},
Using URL.searchParams.set rather than deleting the parameter preserves the query structure for debugging (you can see which parameter contained PII without seeing the value).
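To see the difference between masking and deleting, here is a small standalone sketch. The redactUrl helper, its mode argument, and the example.com URL are hypothetical names introduced for illustration; only the URL and URLSearchParams calls are standard.

```javascript
// Query parameter names treated as sensitive (same list as in the step above).
const SENSITIVE_PARAMS = ['email', 'token', 'api_key', 'access_token', 'auth'];

// mode 'mask' keeps the key and replaces the value; mode 'drop' removes the pair.
function redactUrl(rawUrl, mode = 'mask') {
  const url = new URL(rawUrl);
  for (const key of SENSITIVE_PARAMS) {
    if (!url.searchParams.has(key)) continue;
    if (mode === 'mask') {
      url.searchParams.set(key, '[redacted]');
    } else {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}

const raw = 'https://app.example.com/api/balance?email=jane%40example.com&page=2';
const masked = redactUrl(raw, 'mask');
const dropped = redactUrl(raw, 'drop');

console.log(masked);  // https://app.example.com/api/balance?email=%5Bredacted%5D&page=2
console.log(dropped); // https://app.example.com/api/balance?page=2
```

Note that URL serialization percent-encodes the brackets, so the recorded value is %5Bredacted%5D rather than a literal [redacted]. A placeholder made only of letters, such as REDACTED, would survive serialization unchanged.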
3. Redact PII from custom attributes and user context
datadogRum.setUser and datadogRum.addAction can introduce PII into event context. Sanitize before setting:
function setRumUser(user) {
  datadogRum.setUser({
    id: user.id,                 // safe: opaque identifier
    name: user.displayName,      // safe if display names are not PII in your context
    // Deliberately omit: email, phone, ssn, address
    plan: user.subscriptionPlan, // safe: categorical value
  });
}
Never pass user.email directly to setUser unless your Datadog organization has PII scrubbing enabled at the account level (Settings → Security → Sensitive Data Scanner). The SDK-level beforeSend hook does not intercept data set via setUser — that data bypasses the event pipeline and is attached at the session level.
4. Mask stack frame file paths containing user-specific routing segments
Some application architectures embed user IDs or tokens in URL path segments, which then appear in stack frame file names when errors are thrown inside dynamically loaded routes.
const USER_PATH_PATTERN = /\/users\/[a-zA-Z0-9\-]{8,}(?=\/|$)/g;
const SESSION_PATH_PATTERN = /\/sessions\/[a-zA-Z0-9]{16,}(?=\/|$)/g;

function scrubStackFrames(stack) {
  if (!stack) return stack;
  return stack
    .replace(USER_PATH_PATTERN, '/users/[id]')
    .replace(SESSION_PATH_PATTERN, '/sessions/[token]');
}

// Inside beforeSend:
if (event.error?.stack) {
  event.error.stack = scrubStackFrames(event.error.stack);
}
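The sketch below runs the two path patterns against a made-up stack trace to show what they do and do not catch. The stack contents and example.com URLs are hypothetical.

```javascript
const USER_PATH_PATTERN = /\/users\/[a-zA-Z0-9\-]{8,}(?=\/|$)/g;
const SESSION_PATH_PATTERN = /\/sessions\/[a-zA-Z0-9]{16,}(?=\/|$)/g;

function scrubStackFrames(stack) {
  if (!stack) return stack;
  return stack
    .replace(USER_PATH_PATTERN, '/users/[id]')
    .replace(SESSION_PATH_PATTERN, '/sessions/[token]');
}

// Hypothetical two-frame stack trace.
const stack = [
  'Error: render failed',
  '    at loadProfile (https://app.example.com/users/3f9a1c2b-7d4e/profile.js:10:5)',
  '    at resume (https://app.example.com/sessions/abcdef0123456789abcd:12:3)',
].join('\n');

const scrubbedStack = scrubStackFrames(stack);
console.log(scrubbedStack);
// The first frame becomes .../users/[id]/profile.js:10:5
// The second frame is left unchanged: the identifier is followed by ':'
// rather than '/', so the lookahead does not match.
```

Because the lookahead only accepts a slash or the end of the string, an identifier that is the last path segment of a frame slips through. Widening it, for example to (?=[\/:?)]|$), is one option, but check any change against the stack formats your own browsers produce.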
5. Test scrubbing logic with a dedicated unit test
Regex scrubbing bugs are easy to introduce and hard to spot in production. Write a unit test that confirms every sensitive pattern is redacted correctly before the beforeSend hook is deployed.
// scrub.test.js
import { scrubString } from './rum-scrubbing';

test('redacts email addresses', () => {
  expect(scrubString('Error for user jane.doe@example.com')).toBe('Error for user [email]');
});

test('redacts Stripe live tokens', () => {
  expect(scrubString('token tok_live_abc123XYZdef456 expired')).toBe('token [token] expired');
});

test('leaves non-PII strings unchanged', () => {
  const safe = 'TypeError: Cannot read property length of undefined';
  expect(scrubString(safe)).toBe(safe);
});

test('handles null and undefined gracefully', () => {
  expect(scrubString(null)).toBeNull();
  expect(scrubString(undefined)).toBeUndefined();
});
The scrubString function must handle null and undefined inputs without throwing, because event.error?.message can be undefined on non-standard thrown values (for example, throw 42 or throw { code: 500 }).
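As an illustration of how subtle these regex bugs can be, consider a card pattern written as \b(?:\d[ -]?){13,16}\b. It allows the optional separator after the final digit, so a trailing space is consumed together with the number. The sketch below reproduces that and compares it with an alternative pattern that only permits separators between digits. The constant names and the sample message are assumptions made for this example.

```javascript
// Separator may follow the last digit, so a trailing space can be swallowed.
const CARD_LOOSE = /\b(?:\d[ -]?){13,16}\b/g;
// Alternative (assumption): separators are only allowed between digits.
const CARD_STRICT = /\b\d(?:[ -]?\d){12,15}\b/g;

const message = 'card 4111 1111 1111 1111 declined';

const looseResult = message.replace(CARD_LOOSE, '[card]');
const strictResult = message.replace(CARD_STRICT, '[card]');

console.log(looseResult);  // card [card]declined
console.log(strictResult); // card [card] declined
```

Both patterns also match any run of 13 to 16 digits, such as a millisecond epoch timestamp, so expect some false positives in messages and stack traces. Validating candidates with a Luhn checksum is one common way to narrow this down.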
6. Enable Datadog Sensitive Data Scanner as a defense-in-depth layer
Client-side beforeSend scrubbing is your primary control, but it runs in an environment you do not control. A browser extension, a Content Security Policy violation, or a JavaScript parse error before instrumentation loads can leave beforeSend unwired. Datadog’s Sensitive Data Scanner applies server-side redaction rules to every ingested event.
Configure the scanner in Datadog: Organization Settings → Sensitive Data Scanner → Add Scanning Group. Create a rule matching the error.message attribute path with Datadog’s built-in “Email Address” pattern. This does not replace beforeSend — it is a backstop for events that slip through.
The latency of Sensitive Data Scanner redaction means events are briefly visible in the Datadog UI before redaction completes (typically within seconds). For applications with strict data residency requirements, combine beforeSend with Datadog’s EU region endpoint (site: 'datadoghq.eu') and confirm that the Sensitive Data Scanner is enabled within the EU-region organization.
Verification
After deploying the beforeSend hook, trigger a test error that contains known PII and confirm the scrubbed version appears in the Datadog RUM console.
// Test: throw an error with a synthetic email and token
try {
  throw new Error('Test PII: jane.doe@example.com token=tok_live_abcDEFghi123');
} catch (err) {
  datadogRum.addError(err, { source: 'custom', context: 'pii-scrub-test' });
}
// Expected in Datadog Error Tracking:
// Test PII: [email] token=[token]
Then verify resource events by opening DevTools → Network, submitting a request to a URL with an email query parameter, and confirming that the URL recorded in Datadog shows email=%5Bredacted%5D, which is the percent-encoded form of [redacted] that URL serialization produces.
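The hook can also be exercised locally, without sending anything to Datadog, by passing it hand-built objects. The sketch below assumes events carry only the fields used in the steps above (type, error.message, error.stack, resource.url); real RUM events contain many more fields, and the example.com URL is made up.

```javascript
const EMAIL_PATTERN = /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/g;
const TOKEN_PATTERN = /\b(tok_(?:live|test)_|sk-|Bearer\s)[A-Za-z0-9\-_]{8,}/g;
const SENSITIVE_PARAMS = ['email', 'token', 'api_key', 'access_token', 'auth'];

function scrubString(str) {
  if (typeof str !== 'string') return str;
  return str.replace(EMAIL_PATTERN, '[email]').replace(TOKEN_PATTERN, '[token]');
}

// Same shape as a beforeSend callback: mutates the event, returns true to keep it.
function beforeSend(event) {
  if (event.type === 'error' && event.error) {
    event.error.message = scrubString(event.error.message);
    event.error.stack = scrubString(event.error.stack);
  }
  if (event.type === 'resource' && event.resource?.url) {
    try {
      const url = new URL(event.resource.url);
      for (const key of SENSITIVE_PARAMS) {
        if (url.searchParams.has(key)) url.searchParams.set(key, '[redacted]');
      }
      event.resource.url = url.toString();
    } catch {
      // Not parseable as an absolute URL: fall back to string scrubbing
      event.resource.url = scrubString(event.resource.url);
    }
  }
  return true;
}

// Hand-built stand-ins for RUM events; only the fields the hook reads are present.
const errorEvent = {
  type: 'error',
  error: { message: 'Test PII: jane@example.com token=tok_live_abcDEFghi123' },
};
const resourceEvent = {
  type: 'resource',
  resource: { url: 'https://app.example.com/api/balance?email=jane%40example.com' },
};

beforeSend(errorEvent);
beforeSend(resourceEvent);

console.log(errorEvent.error.message);   // Test PII: [email] token=[token]
console.log(resourceEvent.resource.url); // https://app.example.com/api/balance?email=%5Bredacted%5D
```

A local check like this confirms the scrubbing logic only; it does not prove the hook is wired into datadogRum.init, so the console verification above is still needed.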
Edge Cases & Gotchas
- beforeSend does not intercept session replay data. User inputs masked by defaultPrivacyLevel: 'mask-user-input' are handled separately by the replay SDK. If you use defaultPrivacyLevel: 'allow', sensitive form values will appear in replays regardless of beforeSend logic.
- A regex with the g flag keeps lastIndex between test() and exec() calls and resets it to 0 only when a match fails. Module-level patterns are shared across beforeSend invocations, so use them only with String.prototype.replace, which resets lastIndex before matching. If you define regex literals inside beforeSend, they are recreated on each call: safe, but slightly slower under high event volume.
- datadogRum.addError bypasses some automatic error enrichment. Custom errors added via addError may lack the source_file and source_line fields that automatic error capture populates. Scrubbing logic must account for undefined fields.
- Scrubbing changes the error fingerprint. Datadog groups errors by message similarity. If beforeSend replaces a specific email with [email], all errors that previously had unique fingerprints due to different email values will collapse into a single group. This is usually desirable because it groups related errors together, but it can mask error rate spikes if the collapsed group was previously spread across many fingerprints.
FAQ
Does Datadog RUM support server-side PII scrubbing as an alternative to beforeSend?
Yes. Datadog’s Sensitive Data Scanner (available in organization settings) applies regex-based redaction rules to ingested data before it is indexed. However, data-at-rest scanning runs after transmission — the payload has already crossed the network boundary. For GDPR compliance, apply beforeSend client-side scrubbing as the primary control and treat Sensitive Data Scanner as a defense-in-depth backstop.
Can I drop entire error events for known-noisy browser errors in beforeSend?
Yes. Return false from beforeSend to suppress a specific event. This is more targeted than the excludedActivityUrls configuration option, which excludes matching requests from page activity calculations rather than dropping events. Use a denylist of known-noisy error message patterns and drop them explicitly.
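A denylist of this kind could look like the sketch below. The three patterns are common examples of noisy browser messages, not a list taken from Datadog, and the function names are hypothetical; tune the list to what your own application actually reports.

```javascript
// Hypothetical denylist of noisy messages. No g flag, so test() keeps no state.
const NOISY_ERROR_PATTERNS = [
  /^ResizeObserver loop limit exceeded$/,
  /^ResizeObserver loop completed with undelivered notifications\.?$/,
  /^Script error\.?$/,
];

function isNoisyError(message) {
  return typeof message === 'string' &&
    NOISY_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

// beforeSend-shaped function: false drops the event, true keeps it.
function beforeSend(event) {
  if (event.type === 'error' && isNoisyError(event.error?.message)) {
    return false;
  }
  return true;
}

console.log(beforeSend({ type: 'error', error: { message: 'Script error.' } }));     // false
console.log(beforeSend({ type: 'error', error: { message: 'TypeError: x is y' } })); // true
```

Anchoring each pattern with ^ and $ keeps the denylist narrow, so an unrelated error that merely mentions one of these phrases is still reported.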
Does scrubbing in beforeSend affect Datadog’s automatic session replay masking?
No. Session replay masking and beforeSend operate on completely separate data streams. beforeSend applies only to RUM event payloads (errors, actions, resources, long tasks). Session replay frames are processed by the @datadog/browser-rum-recorder module and are governed by defaultPrivacyLevel and element-level data-dd-privacy attributes, not beforeSend.