The Latency Bar We Set for Ourselves
When we started building Profiden's verification orchestration layer, we set an internal SLA: p95 latency of under 3 seconds for synchronous identity and address checks, and under 5 seconds for composite checks that involve multiple parallel data sources. That sounds achievable until you realise the data sources you are calling include government APIs with wildly variable response times, third-party bureau APIs with their own SLA commitments, and edge cases involving network routing to regional data centres.
This post is a technical deep-dive into how we hit those targets in production, and where we still fall short.
The Problem: Government APIs Are Not Designed for Real-Time Products
The honest truth about Indian government data APIs — UIDAI for Aadhaar, NSDL for PAN, DigiLocker for documents — is that they were not designed with the latency requirements of real-time onboarding flows in mind. Response times are variable: usually fast, occasionally very slow, sometimes unavailable. Building a product that promises sub-3-second responses while depending on these APIs requires layers of resilience.
Architecture Pattern 1: Parallel Fan-Out
When a verification request comes in for an identity document, we do not call data sources sequentially. We fan out all applicable checks simultaneously and wait for the fastest authoritative response.
For a PAN verification, for example, we fire requests to the NSDL API, a commercial PAN validation database, and our own cached result store at the same time. The first response that crosses our confidence threshold wins. The others are cancelled. This gives us the speed of the fastest source while maintaining fallback resilience.
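The fan-out described above can be sketched with Python's asyncio. This is a minimal illustration under stated assumptions, not our production code: `CheckResult`, `fan_out`, `fake_source`, the 0.9 confidence threshold and the source names are all hypothetical, and the real sources are replaced by timed stubs.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional, Sequence


@dataclass
class CheckResult:
    source: str
    verified: bool
    confidence: float  # hypothetical score in [0, 1]


Source = Callable[[], Awaitable[CheckResult]]


async def fan_out(sources: Sequence[Source], threshold: float = 0.9,
                  timeout_s: float = 3.0) -> Optional[CheckResult]:
    """Start every source at once; return the first result at or above threshold."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    pending = {asyncio.ensure_future(src()) for src in sources}
    try:
        while pending:
            remaining = deadline - loop.time()
            if remaining <= 0:
                return None  # overall time budget exhausted
            done, pending = await asyncio.wait(
                pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED)
            for task in done:
                if task.exception() is not None:
                    continue  # a failed source just drops out of the race
                result = task.result()
                if result.confidence >= threshold:
                    return result
        return None  # every source answered, none was confident enough
    finally:
        # Cancel whatever is still in flight and wait for the cancellations.
        for task in pending:
            task.cancel()
        await asyncio.gather(*pending, return_exceptions=True)


async def fake_source(name: str, delay_s: float, confidence: float) -> CheckResult:
    """Stand-in for a real upstream call (assumption for the demo)."""
    await asyncio.sleep(delay_s)
    return CheckResult(name, True, confidence)


if __name__ == "__main__":
    demo = [
        lambda: fake_source("nsdl", 0.5, 0.99),
        lambda: fake_source("commercial_db", 0.05, 0.95),
        lambda: fake_source("cache", 0.01, 0.5),
    ]
    print(asyncio.run(fan_out(demo)))
```

In the demo, the cache stub answers first but falls below the threshold, so the race continues until the commercial stub responds, and the slower NSDL stub is then cancelled.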
Architecture Pattern 2: Predictive Cache Warming
Many identity checks repeat work we have already done. Our Aadhaar check results, for example, are valid for 24 hours under our data retention policy, and a large proportion of checks are for candidates who have already been through a check with another Profiden client. Rather than re-calling the UIDAI API for every check, we warm our result cache based on consented re-use rules and serve cached results where available.
The cache hit rate for repeat identity verifications sits around 34% in production, which meaningfully reduces both our latency and the load on upstream government APIs.
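The serving side of this pattern can be sketched as a read-through cache with a 24-hour TTL and a consent gate. This is a simplified illustration, not the predictive warming logic itself. `ResultCache`, `cache_key`, `verify` and the choice to key on a hash of the identifier are assumptions made for the example.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # the 24-hour validity mentioned above


def cache_key(id_type: str, id_number: str) -> str:
    """Hypothetical key scheme: store a hash, never the raw identifier."""
    return hashlib.sha256(f"{id_type}:{id_number}".encode()).hexdigest()


class ResultCache:
    def __init__(self, ttl_s: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (stored_at, result)
        self.hits = 0
        self.misses = 0

    def get(self, key: str, consent_to_reuse: bool) -> Optional[Any]:
        entry = self._entries.get(key)
        if consent_to_reuse and entry is not None:
            stored_at, result = entry
            if self._clock() - stored_at < self._ttl_s:
                self.hits += 1
                return result
            del self._entries[key]  # expired, drop it
        self.misses += 1
        return None

    def put(self, key: str, result: Any) -> None:
        self._entries[key] = (self._clock(), result)


def verify(cache: ResultCache, key: str, consent_to_reuse: bool,
           call_upstream: Callable[[str], Any]) -> Any:
    """Serve from cache when allowed, otherwise call upstream once."""
    cached = cache.get(key, consent_to_reuse)
    if cached is not None:
        return cached
    result = call_upstream(key)
    if consent_to_reuse:
        cache.put(key, result)  # only retain results that may be re-used
    return result
```

The `hits` and `misses` counters are what a hit-rate figure like the one above would be computed from: `hits / (hits + misses)`.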
Architecture Pattern 3: Circuit Breakers and Adaptive Routing
When a data source starts responding slowly — say UIDAI is running at 8 seconds instead of 1 — we do not want that slow source to block the entire verification workflow. We implement circuit breakers that:
- Track p95 latency for each upstream source on a 60-second rolling window
- When latency exceeds the threshold, route new requests to alternative sources automatically
- Probe the degraded source every 30 seconds and re-route traffic back when it recovers
This pattern keeps our p95 response time stable even when one or two upstream sources are degraded.
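A rough sketch of such a breaker follows, using the 60-second window and 30-second probe interval from the list above. The class name, the 3-second default threshold, the nearest-rank p95 calculation and the source names in the usage note are assumptions for illustration. As a simplification, any sample recorded while the breaker is open is treated as a probe result.

```python
import math
import time
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class LatencyBreaker:
    def __init__(self, threshold_s: float = 3.0, window_s: float = 60.0,
                 probe_interval_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._threshold_s = threshold_s
        self._window_s = window_s
        self._probe_interval_s = probe_interval_s
        self._clock = clock
        self._samples: Deque[Tuple[float, float]] = deque()  # (timestamp, latency)
        self._open = False
        self._last_probe_at = 0.0

    def p95(self) -> Optional[float]:
        """Nearest-rank 95th percentile over the samples currently held."""
        if not self._samples:
            return None
        latencies = sorted(latency for _, latency in self._samples)
        return latencies[math.ceil(0.95 * len(latencies)) - 1]

    def record(self, latency_s: float) -> None:
        now = self._clock()
        if self._open:
            # Simplification: a sample seen while open counts as a probe.
            if latency_s <= self._threshold_s:
                self._open = False
                self._samples.clear()
                self._samples.append((now, latency_s))
            return
        self._samples.append((now, latency_s))
        while now - self._samples[0][0] > self._window_s:
            self._samples.popleft()  # keep only the rolling window
        if self.p95() > self._threshold_s:
            self._open = True
            self._last_probe_at = now

    def allow_request(self) -> bool:
        """True for normal traffic, or for one probe per probe interval."""
        if not self._open:
            return True
        now = self._clock()
        if now - self._last_probe_at >= self._probe_interval_s:
            self._last_probe_at = now
            return True
        return False


def choose_source(breakers: Dict[str, LatencyBreaker]) -> Optional[str]:
    """Pick the first source, in preference order, whose breaker allows traffic."""
    for name, breaker in breakers.items():
        if breaker.allow_request():
            return name
    return None
```

With a dict such as `{"uidai": ..., "alternative": ...}` ordered by preference, `choose_source` skips a degraded primary except for the occasional probe, and traffic returns to it once a probe comes back under the threshold.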
Architecture Pattern 4: Asynchronous Offloading for Slow Checks
Not every check can be synchronous. Criminal record searches, education verification at universities without digital APIs, and multi-jurisdiction court searches are inherently asynchronous — they take hours or days, not seconds. We handle these by returning a partial response immediately (with the fast checks resolved) and delivering the slow checks via webhook when they complete.
The webhook delivery system uses a persistent queue with exactly-once delivery semantics. Failed deliveries are retried with exponential backoff up to 72 hours, ensuring that slow results are never silently dropped.
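The retry schedule can be sketched as capped exponential backoff bounded by a 72-hour budget. The base delay, growth factor, per-delay cap, `deliver` helper and `Idempotency-Key` header name are all assumptions for illustration. Retries alone give at-least-once delivery; in this sketch the stable key is what lets a receiver discard duplicates.

```python
import time
from typing import Callable, Dict, Iterator

MAX_TOTAL_S = 72 * 60 * 60  # the 72-hour retry budget mentioned above


def retry_delays(base_s: float = 30.0, factor: float = 2.0,
                 max_delay_s: float = 6 * 60 * 60,
                 max_total_s: float = MAX_TOTAL_S) -> Iterator[float]:
    """Yield waits that grow exponentially, capped, until the budget is spent."""
    elapsed = 0.0
    delay = base_s
    while elapsed + delay <= max_total_s:
        yield delay
        elapsed += delay
        delay = min(delay * factor, max_delay_s)


def deliver(event_id: str, payload: dict,
            send: Callable[[dict, Dict[str, str]], bool],
            sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try once, then retry on the backoff schedule. Returns True on success."""
    # Same key on every attempt so the receiver can de-duplicate.
    headers = {"Idempotency-Key": event_id}
    if send(payload, headers):
        return True
    for delay in retry_delays():
        sleep(delay)
        if send(payload, headers):
            return True
    return False  # budget exhausted; the caller should alert or dead-letter
```

Passing `send` and `sleep` in as parameters keeps the sketch independent of any HTTP client and makes the schedule testable without waiting. A production version would typically add jitter to the delays and persist attempt state in the queue.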
What the Numbers Look Like
In production across our current client base:
- Aadhaar OTP eKYC: p50 = 1.1s, p95 = 2.6s
- PAN validation: p50 = 0.4s, p95 = 1.2s
- Composite identity + address: p50 = 1.8s, p95 = 4.1s
- EPFO employment lookup: p50 = 2.4s, p95 = 6.8s (asynchronous for p95 outliers)
We still miss our 5-second p95 target on composite checks during peak load periods; that is an ongoing area of work. The bottleneck is EPFO API variability, not our own infrastructure.