Multi-Tenant ISV Platform Modernisation

Modernising a legacy .NET monolith to .NET 8 on Azure App Service — with full Azure DevOps CI/CD, zero-downtime deployment slot swapping, and production-safe rollback across six live instances.

Multi-Tenant ISV · .NET 8 · Azure DevOps · Azure App Service

Legacy Code, Risky Deployments

A multi-tenant independent software vendor ran a centralised API platform serving dozens of distributor tenants — B2B ERP workflows, branded B2C microsite storefronts, payment processing, and desktop sync clients — from a single legacy .NET codebase on Azure App Service.

The platform had grown to 400+ API endpoints but was still carrying patterns from its .NET Framework origins: legacy package references, ad-hoc HttpClient instantiation causing socket exhaustion under load, and no build-time guardrails against async deadlocks. The Azure App Service runtime stack had not been consistently configured for the modernised target framework.

Deployments were manual and risky. Six production App Service instances sat behind public load-balanced domains, yet there was no blue-green process, no automated health verification after release, and no way to roll back cleanly if one instance updated while another failed. SQL schema changes ran separately from application deploys. Every production release needed assurance that the entire fleet would end up on the same version.


.NET 8 Runtime & Reliability Fixes

Before pipelines could be trusted in production, the application itself needed to run correctly on the modern stack and behave reliably under tenant load.

Runtime stack upgrade

The API was brought onto .NET 8 with the ASP.NET Core minimal hosting model. Every deploy now enforces the correct Azure App Service runtime stack (DOTNETCORE|8.0) on both the staging slot and production — verified again immediately after each slot swap so a misconfigured instance cannot reach live traffic.
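One way to enforce and re-check the runtime stack is through the Azure CLI. The sketch below is an illustration in Python rather than the project's actual YAML templates. It assumes a Linux App Service plan (where the stack is exposed as `linuxFxVersion`), a logged-in `az` on the PATH, and hypothetical resource names.

```python
import json
import subprocess
from typing import List, Optional

EXPECTED_STACK = "DOTNETCORE|8.0"


def set_stack_command(resource_group: str, app: str,
                      slot: Optional[str] = None) -> List[str]:
    """Build the `az` call that sets the runtime stack on an app or one slot."""
    cmd = ["az", "webapp", "config", "set",
           "--resource-group", resource_group, "--name", app,
           "--linux-fx-version", EXPECTED_STACK]
    return cmd + (["--slot", slot] if slot else [])


def show_stack_command(resource_group: str, app: str,
                       slot: Optional[str] = None) -> List[str]:
    """Build the `az` call that prints the configured stack as a JSON string."""
    cmd = ["az", "webapp", "config", "show",
           "--resource-group", resource_group, "--name", app,
           "--query", "linuxFxVersion", "--output", "json"]
    return cmd + (["--slot", slot] if slot else [])


def stack_matches(az_json_output: str) -> bool:
    """True when the CLI output is the expected stack (case-insensitive)."""
    value = json.loads(az_json_output)
    return isinstance(value, str) and value.upper() == EXPECTED_STACK.upper()


def verify_stack(resource_group: str, app: str,
                 slot: Optional[str] = None) -> bool:
    """Run the show command and compare its output with the expected stack."""
    result = subprocess.run(show_stack_command(resource_group, app, slot),
                            capture_output=True, text=True, check=True)
    return stack_matches(result.stdout)
```

Calling `verify_stack("my-rg", "erp-api", "staging")` before the swap and `verify_stack("my-rg", "erp-api")` after it mirrors the two checks described above; both names are placeholders.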

Tenancy remains unchanged at the architecture level: one shared application resolves the current tenant from JWT claims or public microsite request keys, then routes to that tenant’s isolated SQL database at runtime.
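The tenant lookup can be sketched roughly as follows. This is a minimal Python illustration, not the platform's C# code: the `tenant_id` claim name, the `X-Microsite-Key` header, and the in-memory lookup tables are all hypothetical stand-ins for whatever the real application uses.

```python
from typing import Mapping, Optional

# Hypothetical lookup tables; a real system would load these from configuration.
MICROSITE_KEYS = {"key-acme-shop": "acme"}
TENANT_DATABASES = {
    "acme": "Server=sql.example;Database=tenant_acme",
    "globex": "Server=sql.example;Database=tenant_globex",
}


class UnknownTenantError(Exception):
    """Raised when a request cannot be mapped to a known tenant."""


def resolve_tenant(jwt_claims: Optional[Mapping[str, str]],
                   headers: Mapping[str, str]) -> str:
    """Return the tenant id from validated JWT claims, else from a microsite key."""
    if jwt_claims and "tenant_id" in jwt_claims:
        return jwt_claims["tenant_id"]
    key = headers.get("X-Microsite-Key")
    if key in MICROSITE_KEYS:
        return MICROSITE_KEYS[key]
    raise UnknownTenantError("no tenant claim or microsite key on request")


def connection_string_for(tenant_id: str) -> str:
    """Map a tenant id to that tenant's isolated database."""
    try:
        return TENANT_DATABASES[tenant_id]
    except KeyError:
        raise UnknownTenantError(tenant_id) from None
```

Keeping resolution separate from the connection lookup means an unknown tenant fails before any database is touched.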

Bug fixes & hardening

  • Replaced new HttpClient() anti-patterns across payment and integration flows with IHttpClientFactory, eliminating socket exhaustion and stale DNS under concurrent tenant traffic.
  • Added build-time async safety analyzers that fail the build on blocking calls (.Result, .Wait()) and async void handlers.
  • Centralised exception handling in middleware for consistent JSON error responses and per-tenant DB logging.
  • Added a lightweight /health endpoint with no database dependency. It returns a build checksum that the pipeline compares against the artifact, so probes confirm the correct bits are live, not just that something responded.

Azure DevOps Pipeline Architecture

We built four separate environment pipelines with reusable YAML templates — each pipeline scoped to its own Azure service connection and resources, following least-privilege branch promotion from development through to production.

Four pipelines, one promotion path

  • Development — build, unit tests, and security scan only; no deployment
  • Testing — deploys to two App Service instances with full slot-swap flow
  • Staging — same pattern with enhanced security scanning before production
  • Production — six App Service instances across three public API surfaces, manual approval gates, and blocks on critical vulnerability findings
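A gate that blocks on critical findings could look something like the sketch below. It assumes the scan is exported in Trivy's JSON report format (a top-level `Results` list whose entries carry a `Vulnerabilities` list with a `Severity` field); how the real pipeline applies its gate is not described here, and the function names are illustrative.

```python
import json


def count_critical(trivy_report_json: str) -> int:
    """Count findings with Severity CRITICAL in a Trivy JSON report."""
    report = json.loads(trivy_report_json)
    total = 0
    # Both keys may be missing or null when a target has no findings.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") == "CRITICAL":
                total += 1
    return total


def gate_passes(trivy_report_json: str, max_critical: int = 0) -> bool:
    """True when the number of critical findings is within the allowed limit."""
    return count_critical(trivy_report_json) <= max_critical
```

A pipeline step would exit non-zero when `gate_passes` returns False, which stops the run before the approval gate.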

Fifteen reusable templates cover build, Trivy and .NET package vulnerability scanning, slot preparation, deploy, sequential swap, CORS configuration, health checks with build checksum verification, and rollback. SQL migrations run through a separate pipeline family with versioned scripts and migration history tracking.
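The idea of versioned scripts with history tracking can be illustrated with a small sketch. It uses Python's built-in `sqlite3` purely so the example is self-contained; the real migrations target per-tenant SQL databases, and the `migration_history` table name and zero-padded version strings are assumptions.

```python
import sqlite3
from typing import Iterable, List, Tuple


def apply_migrations(conn: sqlite3.Connection,
                     migrations: Iterable[Tuple[str, str]]) -> List[str]:
    """Apply (version, sql) pairs not yet recorded; return the versions applied."""
    conn.isolation_level = None  # manage transactions explicitly below
    conn.execute(
        "CREATE TABLE IF NOT EXISTS migration_history ("
        "version TEXT PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM migration_history")}
    applied = []
    for version, sql in sorted(migrations):  # zero-padded versions sort correctly
        if version in done:
            continue
        conn.execute("BEGIN")
        try:
            conn.execute(sql)  # one statement per script in this sketch
            conn.execute(
                "INSERT INTO migration_history (version) VALUES (?)", (version,)
            )
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")  # script and history row fail together
            raise
        applied.append(version)
    return applied
```

Running the same list twice applies nothing the second time, which is what makes re-running a migration pipeline safe.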

Branch promotion: dev → test → stage → prod (PR-only, no direct commits). Below is how a single production release flows through the pipeline.

Pipeline run diagram for one production release (legend: Succeeded · Manual validation · Health + checksum verify · Triggers downstream pipeline)

Build: Build & Package, 2m 48s (1 job completed · 1 artifact · Trivy + NuGet scan)

Approve: Approval Gate, 20s (manual validation passed) · Batch 1 (4 apps) → Batch 2 (2 apps)

Phase 1, deploy to staging (parallel; DOTNETCORE|8.0 on each staging slot): ERP API 1 (41s), ERP API 2 (39s), Microsite 1 (44s), Microsite 2 (42s), Payments API (38s), Sync API (40s)

Phase 2, health + checksum (parallel; GET /health on staging slot, API checksum = build artifact): ERP API 1 (17s), ERP API 2 (15s), Microsite 1 (18s), Microsite 2 (16s), Payments API (14s), Sync API (15s)

Phase 3, sequential slot swap (staging → production): ERP API 1 (32s), ERP API 2 (30s), Microsite 1 (34s), Microsite 2 (31s), each with CORS applied; Payments API (29s), Sync API (31s), each a tagged build

Phase 4, health + checksum (per swap; availabilityState Normal, GET /health, API checksum = build artifact): ERP API 1 (30s), ERP API 2 (28s), Microsite 1 (31s), Microsite 2 (29s), Payments API (30s), Sync API (30s)

Phase 5, fleet health: Final Health Check, 1m 12s (1 job completed · 3 API surfaces · checksum verified each probe)

SQL migration: Trigger SQL Migration, 28s (separate pipeline family · versioned scripts · history tracked)
Detailed pipeline steps & rollback logic

Each production release:

1. BUILD
   Restore → compile (.NET 8) → publish → zip artifact
   Security scan (Trivy + NuGet vulnerabilities)
   Copy health-check + CORS config into artifact (expected build checksum)

2. APPROVAL GATE (Batch 1: 4 apps, then Batch 2: 2 apps)

3. PHASE 1: PARALLEL DEPLOY TO STAGING SLOTS
   All target App Services simultaneously:
     Start/create staging slot
     Set runtime stack DOTNETCORE|8.0
     Deploy new build to staging slot
   (Production traffic untouched throughout)

4. PHASE 2: HEALTH + CHECKSUM (parallel, staging)
   For every deployed instance, before any swap:
     GET /health on the staging slot
     Compare API checksum to expected value in build artifact
     Fail pipeline if mismatch; no swap proceeds

5. PHASE 3: SEQUENTIAL SLOT SWAP (one app at a time)
   For each App Service, in dependency order:
     Swap staging → production
     Apply environment CORS rules (100+ tenant microsite origins)
     Tag resource: CurrentBuildId, PreviousBuildId, SwapStatus
     Stop staging slot (cost saving)

6. PHASE 4: HEALTH + CHECKSUM (per swap, production)
   Immediately after each swap:
     Verify Azure availabilityState = Normal
     GET /health on the now-live production slot
     Compare API checksum to build artifact; fail if mismatch

7. PHASE 5: FINAL FLEET HEALTH CHECK
   JSON-driven probes against public load-balanced endpoints
   10 iterations per endpoint: /health checksum + root on each API surface
   Measure response time and success rate across the full fleet

8. ON FAILURE: INTELLIGENT ROLLBACK
   If app N fails after app N+1 already swapped:
     Read Azure resource tags on the next app
     If tags match current build → roll back N+1 to PreviousBuildId
   Keeps all instances on a consistent version
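The tag-based rollback decision in step 8 can be sketched as a small pure function. This is an illustration under assumptions, not the pipeline's actual template: it generalises "the next app" to every app whose tags are supplied, the app names are made up, and reading the tags from Azure is left out.

```python
from typing import Dict, Mapping


def plan_rollback(failed_build_id: str,
                  tags_by_app: Mapping[str, Mapping[str, str]]) -> Dict[str, str]:
    """Return {app: build_to_restore} for each app already on the failed build."""
    plan: Dict[str, str] = {}
    for app, tags in tags_by_app.items():
        on_failed_build = tags.get("CurrentBuildId") == failed_build_id
        previous = tags.get("PreviousBuildId")
        # Only apps that already swapped to the failed build need restoring,
        # and only when a previous build was recorded at swap time.
        if on_failed_build and previous:
            plan[app] = previous
    return plan


# Example: the second app swapped to build 512 before a failure was detected.
tags = {
    "erp-api-2": {"CurrentBuildId": "512", "PreviousBuildId": "508"},
    "payments-api": {"CurrentBuildId": "508", "PreviousBuildId": "501"},
}
rollback_plan = plan_rollback("512", tags)
```

Apps still on the older build are left alone, so the fleet converges on one version rather than being rolled back twice.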

Verification at Every Stage

Health and checksum verification are dedicated pipeline phases — not folded into deploy or swap. Staging is verified before any traffic moves; production is re-verified after every swap.

Application endpoint

A dedicated GET /health controller returns HTTP 200 with a plain-text liveness message and a checksum of the deployed API build. It deliberately avoids touching tenant databases so a single slow tenant cannot fail a deployment probe. The pipeline reads that checksum in a separate verify phase and asserts it matches the value baked into the build artifact — catching wrong builds, partial deploys, or stale slots even when the endpoint returns 200.
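The checksum comparison might be sketched as below. Two details are assumptions, since the case study does not give them: the checksum is taken to be a SHA-256 digest of the artifact, and the plain-text response is taken to contain a line of the form `checksum: <hex>`.

```python
import hashlib
from typing import Optional


def artifact_checksum(artifact_bytes: bytes) -> str:
    """Hex SHA-256 digest of the build artifact (assumed checksum scheme)."""
    return hashlib.sha256(artifact_bytes).hexdigest()


def parse_health_checksum(body: str) -> Optional[str]:
    """Extract the value of a 'checksum: <hex>' line, or None if absent."""
    for line in body.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "checksum":
            return value.strip().lower()
    return None


def verify_health(status_code: int, body: str, expected_checksum: str) -> bool:
    """Pass only on HTTP 200 with a reported checksum equal to the expected one."""
    if status_code != 200:
        return False
    reported = parse_health_checksum(body)
    return reported is not None and reported == expected_checksum.lower()
```

A response of 200 with a missing or different checksum fails the check, which is the case a plain liveness probe would miss.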

Pipeline-driven checks

  • Phase 2 — staging verify — dedicated parallel phase after all deploys complete. Each instance gets GET /health on its staging slot; the pipeline fails if the API checksum does not match the build. No swap starts until every instance passes.
  • Phase 4 — production verify — dedicated step after each sequential swap. Queries Azure availabilityState, calls /health on the live production slot, and re-verifies the checksum before the next app swaps.
  • Phase 5 — fleet check — environment-specific JSON config lists every public endpoint across all three API surfaces; each is probed 10 times with timeout, status code, and checksum validation.
  • Configs in artifact — health and CORS JSON files ship with the build so the same artifact self-describes the expected checksum and how it should be verified after deploy.
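The fleet check in Phase 5 could be sketched along these lines. The config shape (an `endpoints` list), the substring match on the checksum, and the function names are assumptions made for illustration; the probe is injectable so the loop can be exercised without a network.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, Tuple

ProbeResult = Tuple[int, str]  # (HTTP status, response body)


def http_probe(url: str, timeout: float = 5.0) -> ProbeResult:
    """Issue one GET; urllib errors and timeouts are reported as status 0."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except OSError:  # URLError, HTTPError and socket timeouts derive from OSError
        return 0, ""


def check_fleet(config_json: str, expected_checksum: str,
                probe: Callable[[str], ProbeResult] = http_probe,
                iterations: int = 10) -> Dict[str, Dict[str, float]]:
    """Probe each configured endpoint repeatedly; report success rate and timing."""
    config = json.loads(config_json)
    report: Dict[str, Dict[str, float]] = {}
    for url in config["endpoints"]:
        successes, elapsed = 0, 0.0
        for _ in range(iterations):
            start = time.perf_counter()
            status, body = probe(url)
            elapsed += time.perf_counter() - start
            if status == 200 and expected_checksum in body:
                successes += 1
        report[url] = {
            "success_rate": successes / iterations,
            "avg_seconds": elapsed / iterations,
        }
    return report
```

A final pipeline step could then fail the release if any endpoint's `success_rate` falls below 1.0 or its `avg_seconds` exceeds an agreed budget.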

What We Used

Application

.NET 8 / ASP.NET Core · C# · Multi-tenant SQL (per-tenant DB) · JWT + public microsite auth · Azure Blob Storage · IHttpClientFactory

Platform & DevOps

Azure App Service · Deployment slots · Azure DevOps Pipelines · YAML templates · Trivy security scanning · Azure CLI · Resource tagging · SQL migration pipelines

Safe, Repeatable Production Releases

The ISV can promote code through four environments with automated gates, deploy to six production instances without downtime, and recover cleanly when something goes wrong.

  • 6 production App Service instances
  • 4 isolated environment pipelines
  • 0 downtime during slot swap
  • 400+ API endpoints on .NET 8

  • Production releases run through parallel deploy + sequential swap, so the entire fleet is no longer taken offline at once
  • Tag-based rollback keeps all six instances on the same build version if a mid-fleet deploy fails
  • Runtime stack enforced on every slot: .NET 8 configured at deploy time and re-verified after swap
  • HttpClient and async deadlock fixes removed a class of intermittent production failures under tenant concurrency
  • Separate SQL migration pipelines decouple schema changes from application deploys with history tracking
  • 100+ tenant microsite CORS origins managed automatically on every swap, with no manual App Service config after release

Need Safer Deployments?

Whether you’re modernising a legacy .NET platform, standing up Azure DevOps from scratch, or need zero-downtime releases across multiple App Service instances — we design pipeline architecture that fits how your product actually runs in production.
