Thursday, September 19, 2024

Fortifying your API Gateway: Defending Hundreds of Thousands of Requests per Second Against Potential Exploitations | by Ziheng Wang


Photo by Immo Wegmann on Unsplash

Disney+Hotstar is the largest OTT provider in India and powers the Disney+ app in the MENA, SEA, and SAARC regions.

One of the key challenges faced by the platform is authenticating requests to origin APIs while also preventing potential exploits by hackers or malicious users. Authentication exploits can result in financial losses, availability issues, and a negative impact on the user experience.

In this blog, we’ll explore our journey of building a centralized and robust authentication mechanism using Emissary, the open-source Kubernetes-native API gateway (formerly known as Ambassador). We’ll discuss how our solution has evolved and how it can effectively authenticate requests from hundreds of thousands of Hotstar users.

Let’s walk through our previous-generation solution for request authentication and see how requests flow through our systems.

At Disney+Hotstar, we use JWT tokens for request authentication. Previously, our services were exposed to the client via an AWS Application Load Balancer (ALB), which meant that all requests hit the origin without authentication. As a result, each origin service had to integrate with our in-house token SDK to authenticate and decode the JWT token.

Limitations & Challenges

  • Auth is every service’s responsibility: In this setup, each client-facing service was required to have a thorough understanding of authentication, which created a potential security risk. Additionally, distributing token secrets to numerous services violated the principle of “Least Privilege”. Any oversight in this process could lead to security breaches.
  • Inconsistencies due to SDK versions: Different versions of the token library across services made it difficult to roll out token upgrades, including signing key rotation, across teams and services.
  • Inconsistent error responses: Unauthenticated error responses could differ from service to service, making it hard to maintain the business contract between clients and services.

Given these limitations, we decided to revisit our design and find a solution that would allow us to close these gaps.

To mitigate these challenges, we opted for a single point of authentication at the ingress that could safeguard all external-facing APIs. After careful consideration, we chose Emissary Ingress, which is based on the high-performance Envoy proxy and offers a range of flexible plugins such as ExtAuth, RateLimit, Tracing, and more. This option was well suited to our use cases and gave us a high degree of extensibility.

Architecture

Centralized Gateway Authentication Workflow

To achieve granular control over APIs, we implemented authentication checks as plugins in the Emissary API Gateway. This allowed us to invoke a plugin only when specific criteria were met in the incoming request path, ensuring that each API had the appropriate level of authentication. As a result, we not only improved our security measures but also gained greater flexibility for customized authentication.
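The idea of invoking a plugin only when the request path matches certain criteria can be sketched in a few lines. This is plain Python for illustration, not Emissary configuration; the path prefixes, the header names, and the functions `token_auth`, `partner_auth`, and `select_plugin` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A plugin takes request headers and returns True when the request is allowed.
AuthPlugin = Callable[[Dict[str, str]], bool]


def token_auth(headers: Dict[str, str]) -> bool:
    # Placeholder: a real plugin would verify the JWT, not just its presence.
    return headers.get("authorization", "").startswith("Bearer ")


def partner_auth(headers: Dict[str, str]) -> bool:
    # Placeholder for a hypothetical third-party integration.
    return "x-partner-key" in headers


# Ordered (prefix, plugin) rules. None means the path needs no authentication.
# More specific prefixes must come before more general ones.
RULES: List[Tuple[str, Optional[AuthPlugin]]] = [
    ("/v1/partner/", partner_auth),
    ("/v1/public/", None),
    ("/v1/", token_auth),
]


def select_plugin(path: str) -> Optional[AuthPlugin]:
    """Return the plugin of the first rule whose prefix matches the path."""
    for prefix, plugin in RULES:
        if path.startswith(prefix):
            return plugin
    # Fail closed: paths without a rule still get the default token check.
    return token_auth


def is_allowed(path: str, headers: Dict[str, str]) -> bool:
    plugin = select_plugin(path)
    return True if plugin is None else plugin(headers)
```

Falling back to `token_auth` for unmatched paths is a design choice made in this sketch so that a missing rule never silently exposes an API.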

  • Token authentication: Basic authentication of the JWT token by checking the token signature, expiration time, and other relevant information.
  • Third-party auth integration: Pluggable authentication for requests from third-party platforms, allowing flexible customization of the authentication process.
  • Silent token refresh: The token’s life cycle is managed entirely by a single Authentication service, transparent to origin services and the client.
  • User session identity: To avoid the need to pass user tokens around and validate them in multiple services, we introduced a new identity structure called the “Envelope”. It is generated once at the Gateway and can be consumed by all origin services on the request chain for common data access.

These improvements allow us to securely manage user identity tokens and protect origin services from invalid requests.
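As an illustration of the basic token check (signature and expiration time), here is a minimal sketch using only the Python standard library. It assumes an HS256-signed JWT, which the post does not specify; `sign_hs256` and `verify_hs256` are hypothetical helper names, and a production gateway would rely on a vetted JWT library rather than hand-written code.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a token, used here only to produce test input."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now=None):
    """Return the claims if signature and expiry are valid, otherwise None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Pin the algorithm instead of trusting whatever the header claims.
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
        ).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
        if not isinstance(claims, dict):
            return None
    except (ValueError, TypeError):
        return None  # malformed token
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return None  # missing or expired "exp" claim
    return claims
```

Every failure path returns the same `None`, which makes it straightforward for the gateway to map all rejections to one consistent unauthenticated response.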

Next, we’ll dive into how we solved four major challenges with the Gateway Authentication solution.

To securely propagate user identity information to our services without relying on potentially fragile JWT-based propagation, we introduced a new identity structure called the “Envelope”. It is modeled as a Protocol Buffer and provides a uniform and secure way to propagate personal identity information to origin services.

Envelope proto structure

  • The Envelope can serve all the information contained in the token and also provides flexibility for serving enriched data based on our business requirements.
  • Each Envelope is a short-lived identity token scoped to the lifetime of the client request. It is consumed and propagated only among internal services in our system.
  • Downstream services can conveniently read the properties in the Envelope by integrating with our Envelope SDK, which is available in multiple languages.

Centralized Data Enrichment Workflow

There are several use cases where downstream services need relevant customer data to serve a rich user experience. User cohorts are one such piece of data and play a critical role in the Hotstar ecosystem. We use cohort data to bucket groups of customers that show similar patterns, so that we can design effective engagement strategies for each cohort.

Let’s take a practical example. We tag users who prefer watching certain sports into one cohort and send them a push notification whenever a tournament relevant to them is being streamed on Hotstar. This ensures that our customers don’t miss out on their favorite content.

Another use case is customers whose subscription plan has just expired or is about to expire. They are tagged into another cohort and are periodically reminded to renew their subscription.

We recognized that we could significantly improve the system’s NFRs (non-functional requirements) by enriching these properties once at the edge, while generating the Envelope, instead of having each downstream service fetch them separately.
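To make the "enrich once at the edge" idea concrete, the sketch below builds an Envelope-like object from validated token claims plus a single cohort lookup. The real Envelope is a Protocol Buffer whose schema is not shown in this post, so the dataclass, its field names, the 30-second lifetime, and `build_envelope` are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Envelope:
    # Hypothetical fields; the real proto schema is not public in this post.
    user_id: str
    subscription_plan: str
    cohorts: List[str] = field(default_factory=list)
    issued_at: float = 0.0
    ttl_seconds: float = 30.0  # assumed: short-lived, scoped to one request

    def is_expired(self, now: float) -> bool:
        return now >= self.issued_at + self.ttl_seconds


def build_envelope(
    claims: dict,
    cohort_lookup: Callable[[str], Iterable[str]],
    now: Optional[float] = None,
) -> Envelope:
    """Build the Envelope at the gateway from already-validated claims.

    The cohort lookup runs once here, so downstream services read
    `envelope.cohorts` instead of each calling the cohort store themselves.
    """
    now = time.time() if now is None else now
    user_id = claims["sub"]
    return Envelope(
        user_id=user_id,
        subscription_plan=claims.get("plan", "free"),
        cohorts=list(cohort_lookup(user_id)),
        issued_at=now,
    )
```

A downstream service would then check something like `"cricket-fans" in envelope.cohorts` (a made-up cohort name) without any additional network call.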

Context

At times, it’s necessary to block user sessions by invalidating their tokens, for example when users log out or when they’re flagged as malicious by our RiskEngine (read more in our RiskEngine blog). To accomplish this, we need a contract between the Authentication Service and other components of our system for token force blocks.

There are also cases where certain events, such as a user purchasing a new subscription plan or upgrading their current plan, require asynchronous updates to their token properties. This is where token force refresh becomes necessary. By implementing token force block and force refresh, we ensure that our system stays secure and our users’ access stays up to date.

Solution

Two approaches come to mind naturally: storing the invalidation and refresh list in a data store like Redis, or caching it locally. Both have drawbacks in production. With the first approach, checking Redis on every request becomes costly as traffic volume grows. With the second, memory usage is a serious concern, because the set stored locally could be very large.

To address these concerns, we introduced a Bloom filter that listens to token lifecycle events to keep itself up to date. A Bloom filter is a space-efficient probabilistic data structure used to check whether an element is a member of a set. A lookup can only return either “possibly in set” or “definitely not in set”.

“Possibly in set” means that the Bloom filter may report a user session as blocked when it actually is not (a false positive). Blocking the wrong user would hurt our user experience, so on a positive result we still perform a deep check in Redis to rule out false positives. Since blocked and refreshed sessions make up a tiny percentage of total traffic, the vast majority of requests are filtered out by the Bloom filter without querying Redis.
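The two-tier check described above can be sketched as follows. This is a minimal illustration, not the production implementation: an in-memory `set` stands in for Redis, the Bloom filter is hand-rolled with double hashing over SHA-256, and the class and method names (`BloomFilter`, `SessionBlocklist`, `on_block_event`) are hypothetical.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class SessionBlocklist:
    def __init__(self, expected_items: int = 1000, false_positive_rate: float = 0.01):
        self.filter = BloomFilter(expected_items, false_positive_rate)
        self.store = set()    # stand-in for the authoritative Redis set
        self.deep_checks = 0  # counts how often the "Redis" lookup was needed

    def on_block_event(self, session_id: str) -> None:
        # Called when a token lifecycle event marks a session as blocked.
        self.store.add(session_id)
        self.filter.add(session_id)

    def is_blocked(self, session_id: str) -> bool:
        if not self.filter.might_contain(session_id):
            return False  # "definitely not in set": no Redis call needed
        self.deep_checks += 1
        return session_id in self.store  # deep check rules out false positives
```

The `deep_checks` counter makes the benefit visible: for sessions that were never blocked it stays close to zero, so the authoritative store is only consulted for the small fraction of requests that the filter flags.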

By using a Bloom filter, we were able to reduce our space consumption by 40x.
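For intuition on where savings of this order come from, the standard Bloom filter sizing formula gives the number of bits needed per stored item for a target false-positive rate, independent of how long each item is. The sketch below compares that with storing raw identifiers. The 36-byte identifier and the 1% false-positive rate are illustrative assumptions, not the parameters behind the 40x figure, and the comparison ignores the overhead of the set data structure itself.

```python
import math


def bloom_bits_per_item(false_positive_rate: float) -> float:
    """Optimal bits per item: -ln(p) / (ln 2)^2."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def space_ratio(raw_bytes_per_item: int, false_positive_rate: float) -> float:
    """How many times smaller the filter is than storing raw identifiers."""
    return raw_bytes_per_item * 8 / bloom_bits_per_item(false_positive_rate)


if __name__ == "__main__":
    # Assumed example: 36-byte session identifiers, 1% false positives.
    print(f"{bloom_bits_per_item(0.01):.2f} bits per item")
    print(f"{space_ratio(36, 0.01):.1f}x smaller than raw identifiers")
```

Longer identifiers or a looser false-positive target push the ratio higher, which is why the deep check in Redis pairs well with a filter that is allowed to be slightly wrong.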

In this blog, we talked about why we re-architected the user authentication flow and how we built a new-age authentication system from scratch. It takes care of user token validation and refresh, user logouts, subscription changes, and user data enrichment in the Envelope, and it applies security features at the gateway. We also covered interesting sub-problems around easing space constraints in order to perform authentication checks at high scale.

Want to build stuff like this? We’re hiring, and we’re always looking for smart engineers who love solving hard problems. Check out open roles at https://tech.hotstar.com.
