NGS Data Platform Modernization - Coggle Diagram
NGS Data Platform Modernization
Objectives
Transition from vendor-operated Azure SQL DB + ADF + Power BI stack to a self-managed Microsoft Fabric-based platform
Achieve trust, transparency, and lineage with end-to-end observability
Ensure business continuity with dual-run and KPI parity
Build a governed analytics ecosystem (RLS, OLS, certified semantic model)
Maintain cost discipline with predictable spend and usage guardrails
Scope of work
Stream A - Migration of Historical Data
Profile & migrate 10 years of historical data (only valuable datasets)
Retire redundant objects and plan cold archive policy
Implement idempotent loads (MERGE/UPSERT), de-duplication
Validate row counts & KPI parity → dual-run & cutover plan
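The idempotent load and de-duplication items in Stream A can be illustrated with a minimal in-memory sketch. It is not the platform's actual MERGE implementation: the function name `upsert`, the key column `id`, and the version column `updated_at` are all hypothetical, and a real load would run as a MERGE statement against the target table.

```python
from typing import Any, Dict, Iterable, Tuple

Row = Dict[str, Any]


def upsert(
    target: Dict[Tuple, Row],
    batch: Iterable[Row],
    key_cols: Tuple[str, ...],
    version_col: str,
) -> Dict[Tuple, Row]:
    """Merge `batch` into `target`, keeping one row per business key.

    A row replaces the stored row only when its version is strictly newer,
    so duplicates inside the batch collapse and re-running the same batch
    leaves the target unchanged (idempotent).
    """
    for row in batch:
        key = tuple(row[c] for c in key_cols)
        current = target.get(key)
        if current is None or row[version_col] > current[version_col]:
            target[key] = dict(row)  # copy so later edits to the batch do not leak in
    return target


# Hypothetical batch containing an update and an exact duplicate.
batch = [
    {"id": 1, "updated_at": 1, "amount": 10},
    {"id": 1, "updated_at": 2, "amount": 12},
    {"id": 2, "updated_at": 1, "amount": 5},
    {"id": 2, "updated_at": 1, "amount": 5},
]
table = upsert({}, batch, key_cols=("id",), version_col="updated_at")
```

Running `upsert` a second time with the same batch produces the same table, which is the property that lets a failed pipeline run be retried safely.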
Stream B - Implementation of New Platform
Deliver Functional Specs (B1.1–B1.5):
Modern data architecture & ingestion (ADLS Gen2 / OneLake)
Data quality & governance framework
Central semantic model & Power BI integration
DS/ML enablement (optional feature store)
Scale for 3–5 new sources/year, 1–5 M rows/year/fact
Stream C - Operations & Maintenance
Post-go-live support with SLA targets (99.9 % uptime)
Continuous optimization, training & governance oversight
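To make the 99.9 % uptime target concrete, the arithmetic below converts an uptime percentage into an allowed downtime budget. The function name and the 30-day and 365-day measurement windows are assumptions; the source does not say over which period the SLA is measured.

```python
def downtime_budget_minutes(uptime_pct: float, period_days: int) -> float:
    """Return the downtime (in minutes) allowed by an uptime target over a period."""
    if not 0 <= uptime_pct <= 100:
        raise ValueError("uptime_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return (100 - uptime_pct) / 100 * total_minutes


# Assumed measurement windows: a 30-day month and a 365-day year.
monthly_budget = downtime_budget_minutes(99.9, 30)
yearly_budget = downtime_budget_minutes(99.9, 365)
```

Under these assumptions the budget is roughly 43 minutes per month, or a little under 9 hours per year.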
Functional Specifications
Data Architecture & Ingestion
Layered Zones: Raw → Validated → Curated
Open formats (Delta or Iceberg) and auto-compaction strategy
Connectors: SFTP, REST, JDBC with schema contracts & late-arrival handling
Git-versioned pipelines with idempotent re-runs and central DAG monitoring
Serve via SQL endpoints + governed presentation marts
Integrate RBAC/ABAC and Power BI Direct Lake optimization
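The schema contract and late-arrival handling mentioned for the connectors could look roughly like the sketch below. Everything here is illustrative: the `CONTRACT` columns, the function names, and the watermark plus allowed-lateness policy are assumptions, not requirements stated in the specification.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List

# Hypothetical contract for one source: column name -> expected Python type.
CONTRACT = {"order_id": int, "amount": float, "event_time": datetime}


def contract_violations(record: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """List every way a record deviates from its schema contract."""
    problems = []
    for col, expected in contract.items():
        if col not in record:
            problems.append(f"missing column: {col}")
        elif not isinstance(record[col], expected):
            problems.append(
                f"{col}: expected {expected.__name__}, got {type(record[col]).__name__}"
            )
    for col in record:
        if col not in contract:
            problems.append(f"unexpected column: {col}")
    return problems


def classify_arrival(event_time: datetime, watermark: datetime,
                     allowed_lateness: timedelta) -> str:
    """Classify a record relative to the watermark of already-processed data."""
    if event_time >= watermark:
        return "on_time"
    if event_time >= watermark - allowed_lateness:
        return "late_accepted"  # reprocess the affected partition
    return "too_late"  # route to manual review or a correction load
```

Records with violations would stay in the Raw zone or go to quarantine rather than being promoted to Validated; late but accepted records trigger an idempotent re-run of the affected partition.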
DQ & Governance
DQ SLAs per dataset; profiling, drift detection & quarantine
Policy-as-Code for RLS/OLS/masking & classification (PII labels)
Automated lineage from source → Power BI (measure level)
Integrated catalog with business glossary & DQ dashboards
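As a rough illustration of the quarantine item above, the sketch below splits a batch into rows that pass every data quality rule and rows that are quarantined together with the names of the rules they failed. The rule names, the example columns, and the `apply_rules` helper are hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Rule = Tuple[str, Callable[[Row], bool]]

# Hypothetical rules for one dataset; each returns True when the row is acceptable.
RULES: List[Rule] = [
    ("amount_non_negative", lambda r: r.get("amount") is not None and r["amount"] >= 0),
    ("customer_id_present", lambda r: bool(r.get("customer_id"))),
]


def apply_rules(rows: List[Row], rules: List[Rule]) -> Tuple[List[Row], List[Dict[str, Any]]]:
    """Return (passed_rows, quarantined) where quarantined entries record failed rules."""
    passed: List[Row] = []
    quarantined: List[Dict[str, Any]] = []
    for row in rows:
        failed = [name for name, check in rules if not check(row)]
        if failed:
            quarantined.append({"row": row, "failed_rules": failed})
        else:
            passed.append(row)
    return passed, quarantined


def pass_rate(passed_count: int, total_count: int) -> float:
    """Share of rows that passed, usable as a per-dataset DQ SLA metric."""
    return passed_count / total_count if total_count else 1.0
```

Keeping the failed rule names next to each quarantined row makes it possible to feed the DQ dashboards directly from the quarantine table.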
Semantic Layer & BI
Central certified Power BI model (thin report pattern)
Import / Incremental Refresh or Direct Lake mode
Targets: ≤ 3 s render time (95th pct); ≤ 20 min refresh
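One way to check the render-time target is to compute the 95th percentile of measured render times and compare it with the 3 s limit. The sketch uses the nearest-rank percentile definition, which is an assumption; the specification does not say which percentile method or sampling window applies.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct % of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_render_target(render_seconds: Sequence[float], limit_s: float = 3.0) -> bool:
    """True when the 95th percentile render time is within the limit."""
    return percentile_nearest_rank(render_seconds, 95) <= limit_s
```

With 20 samples the 95th percentile is the 19th smallest value, so a single slow render does not fail the target but two do.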
Data Science & ML
Notebook environment (Python/SQL) with RBAC & auto-suspend compute
Batch scoring, model versioning, governed promotion
Migration & Cutover
Historical backfill → dual-run → parity validation → rollback plan (≤ 3 days)
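Parity validation during the dual-run could be sketched as a comparison of KPI values computed on the legacy and the new platform. The KPI names, the `parity_report` helper, and the relative tolerance are assumptions; the acceptable tolerance would need to be agreed per KPI.

```python
import math
from typing import Dict


def parity_report(legacy: Dict[str, float], new: Dict[str, float],
                  rel_tol: float = 1e-6) -> Dict[str, str]:
    """Label each KPI as 'match', 'mismatch', or 'missing' across the two platforms."""
    report: Dict[str, str] = {}
    for name in sorted(set(legacy) | set(new)):
        if name not in legacy or name not in new:
            report[name] = "missing"
        elif math.isclose(legacy[name], new[name], rel_tol=rel_tol):
            report[name] = "match"
        else:
            report[name] = "mismatch"
    return report


def ready_for_cutover(report: Dict[str, str]) -> bool:
    """Cutover gate: every KPI must be present on both sides and match."""
    return all(status == "match" for status in report.values())
```

A failed gate keeps both platforms running in parallel, so the rollback path stays available until parity is reached.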
General Specifications
Data Privacy & Security – PDPA compliance, SingCERT patching, AES-256 encryption
Environments & Hosting – Dedicated Azure cloud in Singapore (APAC fallback with PDPA justification)
VAPT & Assurance – Quarterly VA, annual PT (SOC 2/ISO 27001)
Change Mgmt & KT – Training, SOPs, architecture docs, CI/CD guide
Support & SLAs – ITIL 4 aligned, P1 ≤ 4 h, P2 ≤ 8 h workaround, monthly reports
Cost Governance & Portability – Budget thresholds, auto-suspend, open formats
Project Delivery – Named PM + core team; weekly WIP & bi-weekly steering
POC Stage – Time-boxed demo of end-to-end ingestion → Power BI semantic model
Acceptance & Handover – Pipelines, DQ/Lineage, Semantic Model, DR drill, Cost Dashboard, Docs, KT sessions
Evaluation Criteria
Competitive Pricing (detailed breakdown) 40 %
Strategic & Technical Proposal Quality 60 % (total) – includes team structure & track record
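The 40/60 weighting can be expressed as a simple weighted sum. The sketch assumes both criteria are scored on a common 0 to 100 scale; how the pricing score itself is derived is not stated in the source, and the function name is hypothetical.

```python
PRICE_WEIGHT = 40    # percent, from the evaluation criteria
QUALITY_WEIGHT = 60  # percent, from the evaluation criteria


def weighted_score(price_score: float, quality_score: float) -> float:
    """Combine two 0-100 scores using the 40/60 weighting."""
    for score in (price_score, quality_score):
        if not 0 <= score <= 100:
            raise ValueError("scores must be between 0 and 100")
    return (PRICE_WEIGHT * price_score + QUALITY_WEIGHT * quality_score) / 100


# Hypothetical bid: 80 on pricing, 90 on proposal quality.
example_total = weighted_score(80, 90)
```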