Network Migration Stream
Context
Context -> High Level Design (Target State) -> Design Considerations (Target State) -> High Level Network Migration Strategy
Understand current state and therefore provide a future state view and transition plan from a network perspective.
Captures current state connectivity, use cases and challenges.
Current State Connectivity
What We Discovered
TEG has two DC environments, one in Mascot and another in Kidman Park, both operated by Hostworks.
Discovery -> Domain Controllers, Firewalls, Internet Gateways, Load Balancers, Platforms (i.e. Origin Server) and other Information Systems and Applications are hosted at the DCs and need to be accounted for in the Target State.
Connectivity into the DCs - We've identified multiple patterns. These patterns classify the method by which sites connect into the DCs.
Telstra MPLS Network - Services TEG Offices, Sites and Venues, Aegis Call Centre, Dispatch and Temporary Sites (via 4G)
Spark MPLS Network - Services TEG Offices in New Zealand
Site to Site VPN - International Sites and Third Party Sites
Remote Access - Users using Cisco AnyConnect
Functionality
Security and Access Control of the North/South Flows (between Internal and External Networks) are enforced by the ASA 5520 Firewalls.
The ASAs also terminate the Site to Site VPNs and enable Remote Access using the Cisco AnyConnect client. The ASAs also leverage the Active Directory hosted on the Domain Controllers to enable Role Based Access Control for the VPN.
The Internet Gateways enable Internet Access for Sites and Venues connected into the WAN.
Connectivity into Hostworks enables Aspect, the client for point-of-sale, to communicate with the Origin Server in the DC.
Workshops - Some of the challenges taken into consideration for the design were -
DC Environment and MPLS Networks aren't easily scalable and lack cost efficiencies across all sites.
Temporary Sites utilising 4G may experience issues with reliability, being dependent on location.
Most sites and venues only have one route, which provides no resiliency. Also, third-party sites aren't regulated regarding hardware and network requirements; therefore, service levels are not guaranteed to be equal across all sites.
Lack of consolidation around coordination and logistics of deploying venue-owned site and temporary site links.
Links are organised by relaying between multiple parties.
High Level Design. Captures the connectivity patterns, expected performance, operations and security views for the future state.
Understood - At the DC in Mascot, a Direct Connect into AWS provisioned by Hostworks is present. However, as TEG is moving out of the DC, another service needs to be selected. As such, we've proposed Telstra's Cloud Gateway for the Australian WAN Network, to reduce complexity and timelines for the Migration. The vendor can be reviewed as required.
As we're leveraging the existing patterns, TEG Offices and TEG Sites/Venues will connect into AWS through Direct Connect at a COLO, where the Direct Connect and Telstra MPLS Termination will reside.
For Temporary Sites that utilise 4G, we have proposed one of two methods.
Leverage the Direct Connect through Telstra's MPLS Network.
New Service, IPSec VPN through the Internet via 4G. However, this will require an assessment to identify the requirements.
As with Telstra's Cloud Gateway, Spark was identified as providing a similar service to enable connectivity into AWS. Again, the vendor can be revised if required.
Further, sites and venues that previously had site to site VPNs and users requiring remote access will utilize the same patterns to connect into AWS.
For connectivity in the future state, the current patterns will be leveraged to accelerate and simplify the planning and migration activities.
Working with Peter Stojkovski and Chris Morgan from Deloitte, and David from TEG
On the right, we have connectivity patterns displayed for the future state. For context for the Network Migration, we've also classified the sites by tier. We'll look to go through the reasoning later on but -
Tier 1
Represents TEG Offices, Sites/Venues, Aegis Call Centre and Dispatch
Tier 2
Represents TEG Office in New Zealand and International Sites. Also, some Third Party Sites and Venues.
Tier 3
Represents Temporary Sites and Venues, and some Third Party Sites and Venues.
Performance
Following the like-for-like approach, we focused on enabling a consistent quality of experience based on the Current State. As such, the bandwidth and latency are expected to be the same as the current state. However, moving to AWS will allow for horizontal scalability for future growth, with the ability to increase and decrease performance on-demand in AWS. This will allow flexibility to adapt to changing consumption patterns and application requirements.
Security
For the Future State, continuing with Cisco and its ASAv to succeed the ASAs will ensure configuration and feature parity, reducing migration complexity and operational upskilling.
Adopting the ASAv enables functions such as -
Deep Packet Inspection
Encryption
Policy Enforcement
VPN Management
For the model selection, there are only two ASAv models that are supported by AWS - the ASAv10 and ASAv30.
On analysis, the ASAv30 was better aligned to succeed the ASA 5520 in the Current State. On paper, the ASAv30 matches, and in most cases exceeds, the performance of the ASA 5520, catering for additional demand.
Operations. It's noted that automation offers an opportunity to optimise process for -
Migration
Where Ansible Playbooks and Python can be leveraged as Automation Toolsets to assist with the Migration and Validation Phases.
BAU
Automation is becoming a core capability for businesses. Some benefits to be realised are -
Through DevOps, methods and tools can be built to take advantage of the environment, enabling infrastructure and resources to be deployed and destroyed at will, allowing dynamic scaling to meet demand.
By adopting Infrastructure as Code, rapid re-deployment and flexibility in deployment patterns can be enabled, simplifying change and rollback processes.
APIs on the network infrastructure can also be leveraged to perform tasks such as automated changes from a click of a button, avoiding the traditional approach of performing changes on a per-device basis.
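As a rough illustration of how Python could assist the Validation Phase mentioned above, the sketch below checks whether key services are reachable over TCP from a migrated site. It is a minimal sketch only: the function names and the idea of a (label, host, port) check list are assumptions made for illustration, not part of the agreed toolset.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def validate_site(checks):
    """Run every (label, host, port) check and return the labels that failed.

    An empty list means every endpoint answered on its TCP port.
    """
    return [label for label, host, port in checks if not tcp_reachable(host, port)]
```

A post-cutover run might pass a list such as `[("Origin Server", "<origin host>", 443)]`, where the host and port are placeholders to be replaced with the real endpoints for each site, and treat a non-empty result as a trigger for rollback or investigation.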
Network Functions
For the use cases identified in the current state, an equivalent appliance or service was adopted for the future state.
Stateful Firewalling, Remote Access and Site-to-Site VPN Termination are currently performed by the ASA 5520. They will be performed by the ASAv30 in the future state.
Active Directory will be replaced with AWS' Directory Service to deliver Authentication and Authorisation of Users.
AWS Route 53 for DNS Lookup and Resolution.
Replacing the Foundry Load Balancers for the Origin Platform with AWS ELBs.
Pricing
We were able to provide an indicative monthly cost for the network.
For the Direct Connect, AWS will charge ~$1,500/mo for two DXs (one for Australia, another for NZ), with a Port Speed of 1G, at 10G/Hr.
Additionally, Telstra will charge $2,650/mo to enable the Cloud Gateway.
Spark wasn't able to respond to us on pricing, so we weren't able to provide an estimate.
Between FY2016 and 2018, TEG consumed between 4-6G of Internet/Hr. We've accounted for 10G/Hr, which carries a cost of $1,200/mo.
Internet Service through AWS
Accounted for 2 ASAv30 for HA/DR, which will cost $3,400/month.
Assuming the ELB processes 5G/Hr, it will cost $100/mo.
Lastly, Route53 for DNS. The cost appeared to be negligible.
Together, with the exception of the Spark DX Service, we estimate it would cost ~$8,800 AUD each month.
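The monthly total can be checked by summing the line items quoted above. This is a small sketch under two assumptions: every figure is in AUD, and Route 53 is treated as zero because its cost was described as negligible. The dictionary keys are labels chosen for illustration.

```python
# Indicative monthly line items as quoted in these notes (assumed to be AUD).
MONTHLY_COSTS_AUD = {
    "AWS Direct Connect (2 x DX, 1G ports, 10G/Hr)": 1500,
    "Telstra Cloud Gateway": 2650,
    "Internet service through AWS (10G/Hr)": 1200,
    "2 x ASAv30 (HA/DR)": 3400,
    "ELB (5G/Hr)": 100,
    "Route 53": 0,  # assumption: negligible cost treated as zero
}
# The Spark DX service is excluded because no pricing was received.


def total_monthly_cost(items):
    """Sum the monthly cost of every line item."""
    return sum(items.values())


print(total_monthly_cost(MONTHLY_COSTS_AUD))  # 8850
```

The sum comes to $8,850, which is consistent with the rounded ~$8,800 figure.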
Design Considerations
SD-WAN
Whilst retaining the Telstra and Spark MPLS Networks for the WAN reduces business impact and minimises risk, SD-WAN aligns better to TEG's aspirations to be -
Service Provider and Hardware Vendor Agnostic
Cost Effective
Globally Scalable
Cloud Based Web Proxy
Discovery -> No Web Proxy.
Should Be Considered
Engage Telstra for New Zealand's MPLS Network
Secondary Connections for Priority Sites
For the SD-WAN Solution
The WAN Overlay (WAN Provider), Field Services (Network Teams) and Carriage Services (Service Providers) can be de-coupled. Presently, TEG is locked-in with Telstra for the Telstra MPLS Network, as it's a proprietary solution.
On the left-hand side, in the context of TEG -
Tier 1 Sites
The MPLS Network is the Primary and the Internet is the Secondary, reducing cost.
Tier 2 Sites
Centralised Perimeter Protection Plane for Remote Users and Sites
Network Migration Strategy
An approach to the delivery of the migration, moving sites and venues from current state to future state.
The migration strategy feeds into the outcomes of the Network Migration Stream for the EPICs
The approach we've taken accounts for the challenges and considerations of the current state.
Read Challenges/Considerations
The approach, we've called 'Agile (Iterative) Approach', enables the migration process to be refined, enabling the transition of complex sites and venues to occur quickly and seamlessly.
Meaning, continual progression of the project with gaps and issues addressed in future iterations.
The migration can occur quicker, as each phase doesn't rely on the completion of the previous one, i.e. phases can execute in parallel.
Allows capacity to change directions if needed.
As represented at a high level, a feedback loop is present to enable on-going innovation. We'll be going into more detail in the coming slides.
May introduce more unknowns.
This is the approach proposed for the Network Migration.
Perspective of TEG. This is the proposed strategy.
Note that the commencement of the Network Migration depends on the Core Infrastructure in AWS being established first.
Utilising the Master Services Spreadsheet to identify locations, we were able to estimate the effort and time to complete the Network Migration.
For this estimation, we assumed
All offices were Tier 1
With a run rate of one migration per day
All sites and venues were Tier 2
With a run rate of two migrations per day
All agencies (i.e. third party sites) were Tier 3
With a run rate of three migrations per day
As such, we're looking at approximately 50 change windows to complete the migration.
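The estimate above can be reproduced with a short calculation: divide the number of sites in each tier by that tier's run rate and round up. The sketch below assumes one change window per day and that tiers are migrated one after another. The site counts are placeholders only, as the real counts come from the Master Services Spreadsheet and are not reproduced here.

```python
import math

# Run rates from the assumptions above: migrations per day for each tier.
RUN_RATE_PER_DAY = {"Tier 1": 1, "Tier 2": 2, "Tier 3": 3}


def change_windows(site_counts, run_rates=RUN_RATE_PER_DAY):
    """Estimate change windows, assuming one window per day and sequential tiers."""
    return sum(
        math.ceil(count / run_rates[tier]) for tier, count in site_counts.items()
    )


# Placeholder counts for illustration only, not TEG's actual site numbers.
example_counts = {"Tier 1": 5, "Tier 2": 30, "Tier 3": 45}
print(change_windows(example_counts))  # 5 + 15 + 15 = 35
```

Substituting the actual counts per tier gives the number of change windows for the migration. Running tiers in parallel would reduce the elapsed time but not the number of windows.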