VMware to Azure disaster recovery architecture

Architectural components

Azure

Configuration server machine

Recommended to install on a VM running on VMware; it can be deployed from an OVF template.

Hosts all of the on-premises Site Recovery components:

configuration server

process server

master target server

VMware servers

Managing the hosts through vCenter is recommended.

support.fgvpn.synology.com

Replicated machines

Mobility Service (agent) is installed on each VMware VM that you replicate.

Set up outbound network connectivity (Firewall)

Storage

Allows data to be written from the VM to the cache storage account in the source region.

Azure Active Directory

Provides authorization and authentication to the Site Recovery service URLs.

Replication

Allows the VM to communicate with the Site Recovery service.

Service Bus

Allows the VM to write Site Recovery monitoring and diagnostics data.
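The four outbound categories above can be checked from a replicated machine with a small probe. This is only a minimal sketch: the hostnames in `ENDPOINTS` are placeholders (assumptions, not taken from these notes) and should be replaced with the endpoints that the current Site Recovery documentation lists for your environment. `tcp_connect` and `probe` are hypothetical helper names.

```python
import socket
from typing import Callable, Dict

# Category -> placeholder hostname. These hostnames are illustrative
# assumptions; substitute the real endpoints for your environment.
ENDPOINTS: Dict[str, str] = {
    "Storage": "storage.example.invalid",
    "Azure Active Directory": "login.example.invalid",
    "Replication": "replication.example.invalid",
    "Service Bus": "servicebus.example.invalid",
}


def tcp_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if an outbound TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def probe(
    endpoints: Dict[str, str],
    port: int = 443,
    connect: Callable[[str, int], bool] = tcp_connect,
) -> Dict[str, bool]:
    """Check every category's endpoint and report whether it is reachable."""
    return {category: connect(host, port) for category, host in endpoints.items()}
```

Calling `probe(ENDPOINTS)` on the VM returns one boolean per category; a `False` entry suggests that the firewall is blocking that outbound path. The `connect` parameter is injectable so the logic can be exercised without network access.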

Normal Replication process

  1. Basic replication policy
  1. RPO threshold

If the recovery point objective (RPO) exceeds the configured threshold, an email notification is sent.

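  1. Recovery point retention

When an outage occurs, this setting specifies how far back in time you can go to recover. The maximum retention period is 24 hours on premium storage and 72 hours on standard storage. (Possibly incorrect?)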

App-consistent snapshots.

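App-consistent snapshots can be taken every 1 to 12 hours. The snapshots are standard Azure blob snapshots. The Mobility agent running on the VM requests a VSS snapshot according to this setting and marks that point in time as app-consistent in the replication stream.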
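The three policy settings above (RPO threshold, recovery point retention, app-consistent snapshot frequency) can be modeled in a few lines. The following is a rough sketch, not the Site Recovery API: `ReplicationPolicy`, `rpo_breached`, and `prune` are hypothetical names, and the 1 to 12 hour range is taken from the notes above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class ReplicationPolicy:
    rpo_threshold: timedelta         # alert when the newest recovery point is older than this
    retention: timedelta             # how far back recovery points are kept
    app_consistent_every_hours: int  # snapshot frequency, 1 to 12 hours per the notes

    def __post_init__(self) -> None:
        if not 1 <= self.app_consistent_every_hours <= 12:
            raise ValueError("app-consistent frequency must be between 1 and 12 hours")


def rpo_breached(policy: ReplicationPolicy, latest_point: datetime, now: datetime) -> bool:
    """True when the newest recovery point is older than the RPO threshold."""
    return now - latest_point > policy.rpo_threshold


def prune(policy: ReplicationPolicy, points: List[datetime], now: datetime) -> List[datetime]:
    """Keep only the recovery points that fall inside the retention window."""
    cutoff = now - policy.retention
    return [p for p in points if p >= cutoff]
```

In this model a breach of `rpo_threshold` is what would trigger the email notification, while `retention` only bounds how far back a recovery can reach.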

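  1. Replication traffic over a site-to-site VPN is not supported; Azure ExpressRoute with Microsoft peering can be used instead.
  1. Initial replication ensures that all data on the machine is sent to Azure when replication is enabled. After initial replication completes, delta changes start replicating to Azure. Tracked changes for a machine are sent to the process server.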
  1. Communication flow
  1. VM --> on-premises configuration server over HTTPS 443 inbound, for replication management
  1. Configuration server --> Azure over HTTPS 443 outbound
  1. VMs send replication data to the process server (same machine as the configuration server) on port HTTPS 9443 (the port can be modified)
  1. The process server receives the replication data, optimizes and encrypts it, and sends it to Azure storage over port 443 outbound
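The communication flow above translates directly into firewall rules. The sketch below encodes the four flows as data and derives the ports each host needs; `Flow`, `FLOWS`, and `ports_to_open` are hypothetical names, the purpose strings are paraphrased from the notes, and 9443 is only the default port mentioned above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Flow:
    source: str
    destination: str
    port: int
    purpose: str


# The four flows listed in the notes, in the same order.
FLOWS = [
    Flow("replicated VM", "configuration server", 443, "replication management"),
    Flow("configuration server", "Azure", 443, "communication with Azure"),
    Flow("replicated VM", "process server", 9443, "replication data"),
    Flow("process server", "Azure storage", 443, "optimized, encrypted replication data"),
]


def ports_to_open(flows: List[Flow], host: str, direction: str) -> List[int]:
    """Return the sorted, de-duplicated ports a host needs for one direction."""
    if direction not in ("inbound", "outbound"):
        raise ValueError("direction must be 'inbound' or 'outbound'")
    ports = {
        f.port
        for f in flows
        if (f.destination if direction == "inbound" else f.source) == host
    }
    return sorted(ports)
```

For example, `ports_to_open(FLOWS, "process server", "inbound")` yields the replication data port, and the outbound query for the same host yields the port used to reach Azure storage.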

Resync process

  1. At times, during initial replication or while transferring delta changes, there can be network connectivity issues between the source machine and the process server, or between the process server and Azure.

Site Recovery marks a machine for resynchronization when:

  1. the machine undergoes a forced shutdown

  1. the machine undergoes configuration changes such as disk resizing

Resynchronization sends only delta data to Azure. Data transfer between on-premises and Azure is minimized by computing checksums of the data on the source machine and of the data stored in Azure.
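The checksum comparison can be illustrated with a block-level diff. The notes do not state which hash, block size, or protocol Site Recovery actually uses, so SHA-256, the tiny `BLOCK_SIZE`, and the function names below are assumptions chosen purely for illustration.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # deliberately tiny for the example; real block sizes are far larger


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Checksum every fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def blocks_to_resend(
    source: bytes, target_checksums: List[str], block_size: int = BLOCK_SIZE
) -> Dict[int, bytes]:
    """Return only the source blocks whose checksum differs from the target's."""
    delta: Dict[int, bytes] = {}
    for index, digest in enumerate(block_checksums(source, block_size)):
        # A block is resent if the target lacks it or holds different content.
        if index >= len(target_checksums) or target_checksums[index] != digest:
            start = index * block_size
            delta[index] = source[start:start + block_size]
    return delta
```

Only the checksums travel in both directions; block contents are sent just for the mismatches, which is what keeps the resynchronization transfer small.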

Consistency

App-consistent

Crash Consistent

An app-consistent snapshot contains all the information in a crash-consistent snapshot, plus all the data in memory and transactions in progress.

A crash consistent snapshot captures data that was on the disk when the snapshot was taken. It doesn't include anything in memory.

Site Recovery creates crash-consistent recovery points every five minutes by default. This setting can't be modified.

Today, most apps can recover well from crash-consistent points.


Crash-consistent recovery points are usually sufficient for the replication of operating systems, and apps such as DHCP servers and print servers.
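Choosing between the two kinds of recovery point at failover time can be sketched as follows. This is an illustrative model only: `Consistency`, `RecoveryPoint`, and `pick_recovery_point` are hypothetical names, not part of any Site Recovery SDK.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterable, Optional


class Consistency(Enum):
    CRASH = "crash-consistent"
    APP = "app-consistent"


@dataclass(frozen=True)
class RecoveryPoint:
    taken_at: datetime
    consistency: Consistency


def pick_recovery_point(
    points: Iterable[RecoveryPoint], require_app_consistent: bool
) -> Optional[RecoveryPoint]:
    """Return the newest usable point, or None if no point qualifies.

    An app-consistent point also satisfies a crash-consistent request,
    because it contains everything a crash-consistent point does.
    """
    candidates = [
        p for p in points
        if not require_app_consistent or p.consistency is Consistency.APP
    ]
    return max(candidates, key=lambda p: p.taken_at, default=None)
```

Workloads that recover well from crash-consistent points would pass `require_app_consistent=False` and get the most recent point; workloads that need flushed, application-aware state accept an older point in exchange for app consistency.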

App-consistent snapshots use the Volume Shadow Copy Service (VSS)

1) Azure Site Recovery uses the copy-only backup (VSS_BT_COPY) method, which does not change Microsoft SQL Server's transaction log backup time and sequence number.

2) When a snapshot is initiated, VSS performs a copy-on-write (COW) operation on the volume.

3) Before it performs the COW, VSS informs every app on the machine that it needs to flush its memory-resident data to disk.

4) VSS then allows the backup/disaster recovery app (in this case Site Recovery) to read the snapshot data and proceed.
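The flush, copy-on-write, and read sequence described above can be shown with a toy in-memory model. This is not the VSS API; `Volume`, `Snapshot`, and the flush callbacks are hypothetical stand-ins that only demonstrate the copy-on-write idea.

```python
from typing import Callable, Dict, Iterable, List


class Snapshot:
    """Point-in-time view of a volume; old blocks are saved only when overwritten."""

    def __init__(self, volume: "Volume") -> None:
        self._volume = volume
        self._saved: Dict[int, bytes] = {}

    def preserve(self, index: int, old_block: bytes) -> None:
        # Keep the first (snapshot-time) version of the block only.
        self._saved.setdefault(index, old_block)

    def read(self, index: int) -> bytes:
        # Unchanged blocks are read straight from the live volume.
        return self._saved.get(index, self._volume.blocks[index])


class Volume:
    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        self._snapshots: List[Snapshot] = []

    def write(self, index: int, data: bytes) -> None:
        # Copy-on-write: save the old block for every snapshot before overwriting.
        for snap in self._snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data

    def snapshot(self, flush_callbacks: Iterable[Callable[["Volume"], None]] = ()) -> Snapshot:
        # Apps flush memory-resident data to disk before the snapshot exists.
        for flush in flush_callbacks:
            flush(self)
        snap = Snapshot(self)
        self._snapshots.append(snap)
        return snap
```

A reader such as a disaster recovery app calls `Snapshot.read` and sees the flushed, snapshot-time contents even while the live volume keeps receiving writes.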

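Question: snapshots appear to be taken by the Mobility agent rather than by VMware's own guest tools combined with a VMware snapshot. If so, app-consistent snapshots would not be available on non-Windows machines.

  1. Notify apps to flush their data to disk
  2. COW on the volume
  3. Let the DR app access the snapshot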