Dedicated Instance - Disaster Recovery

Multi-region failover, also referred to as Disaster Recovery (DR), extends Monte Carlo’s dedicated hosting model to support continuity during regional cloud outages. When DR is enabled, the platform can fail over from the primary cloud region to a secondary one during a significant regional outage, allowing operations to resume with minimal disruption. This configuration provides an additional layer of resilience for organizations that require high availability or business continuity assurance.

Architecture and Configuration

In a DR-enabled setup, Monte Carlo provisions a secondary instance of your dedicated environment in another supported cloud region. Critical data resources, including monitoring metadata and configuration state, are continuously synchronized to the secondary instance. Core platform components use near real-time replication, while less critical or vendor-managed services are mirrored through periodic snapshots.

The specific failover strategy for each resource is determined by factors such as vendor support, recovery priority, and operational overhead.
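
As an illustration only, the split between near real-time replication and periodic snapshots can be sketched as a simple selection rule. The resource names, fields, and the rule itself are hypothetical and are not Monte Carlo's actual implementation. In practice the choice also weighs recovery priority and operational overhead, which this sketch leaves out.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    NEAR_REAL_TIME = "near real-time replication"
    PERIODIC_SNAPSHOT = "periodic snapshot"


@dataclass(frozen=True)
class Resource:
    name: str                          # hypothetical label, for illustration
    core: bool                         # is this a core platform component?
    vendor_supports_replication: bool  # does the vendor offer replication?


def choose_strategy(resource: Resource) -> Strategy:
    """Assumed rule: core components whose vendor supports replication are
    replicated in near real time; everything else falls back to snapshots."""
    if resource.core and resource.vendor_supports_replication:
        return Strategy.NEAR_REAL_TIME
    return Strategy.PERIODIC_SNAPSHOT


if __name__ == "__main__":
    examples = [
        Resource("monitoring-metadata", core=True, vendor_supports_replication=True),
        Resource("vendor-managed-service", core=False, vendor_supports_replication=False),
    ]
    for res in examples:
        print(f"{res.name}: {choose_strategy(res).value}")
```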

Failover Process

Monte Carlo’s engineering team coordinates the failover process based on incident severity and service impact. Not every outage triggers a regional failover; most transient or localized disruptions are mitigated through Monte Carlo’s built-in resiliency mechanisms. A failover is initiated only during a significant or sustained regional outage.

During failover and recovery, customers should expect a brief downtime window while services transition between regions.
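
Customer-side automation that calls the platform can ride out this downtime window by retrying with exponential backoff. The following is a minimal sketch, not a Monte Carlo client. The function name `call_with_backoff` and its default values are assumptions, and `operation` stands in for whatever request your integration makes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,       # assumed default, tune to your tolerance
    base_delay: float = 1.0,     # seconds before the first retry
    max_delay: float = 60.0,     # upper bound on any single wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on ConnectionError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Wait 1x, 2x, 4x, ... the base delay, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so that the retry schedule can be tested without actually waiting.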

Testing

Once per year, customers may request a failover simulation with advance notice. The simulation validates readiness and confirms that networking and integrations perform as expected during a regional transition.
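
During a simulation, one simple way to confirm network reachability from your side is a TCP connection check against the endpoints your integrations depend on. The sketch below uses only the Python standard library. The function name `check_endpoints` is hypothetical, and the hosts and ports to test are placeholders you would supply yourself. A successful TCP connection shows only that the path is open, not that the integration works end to end.

```python
import socket
from typing import Dict, Iterable, Tuple

Endpoint = Tuple[str, int]


def check_endpoints(
    endpoints: Iterable[Endpoint], timeout: float = 3.0
) -> Dict[Endpoint, bool]:
    """Return a mapping of (host, port) to whether a TCP connection succeeded."""
    results: Dict[Endpoint, bool] = {}
    for host, port in endpoints:
        try:
            # create_connection resolves the host and opens a TCP connection.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[(host, port)] = False
    return results
```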

Scope

Disaster Recovery covers core Monte Carlo application services, including monitoring, detection, and alerting infrastructure. However, certain components do not participate in failover, including the agents, the data store, and data sharing.