We’re facing a classic technical evolution dilemma. A stable Ruby on Rails monolith, running for years, carries our core business logic, including a complex and deeply coupled user authentication and authorization system. To meet new performance challenges and improve business agility, the team decided to introduce new microservices with a diverse technology stack: a high-concurrency push service for external clients built with Axum (Rust); an internal data processing service using Quarkus (JVM) to leverage the rich Java ecosystem and native compilation capabilities; and a lightweight BFF layer built with Fastify (Node.js).
All external traffic is routed through a Kong API Gateway. The initial approach was naive and straightforward:
# kong.yml (declarative configuration) snippet
services:
- name: legacy-rails-service
  url: http://rails-app:3000
- name: new-axum-service
  url: http://axum-app:8000
routes:
- name: rails-route
  service: legacy-rails-service
  paths:
  - /rails
- name: axum-route
  service: new-axum-service
  paths:
  - /axum-push
This simple proxy model pushes the responsibility of authentication and authorization down to each individual service. The problem quickly became apparent: should the Axum, Quarkus, and Fastify development teams each implement logic to call the Rails authentication endpoint to verify user identity? This is not only redundant work but also spreads fragile coupling throughout the entire system. Any change to the Rails authentication logic would be a deployment catastrophe.
Defining the Problem: Unified Authentication & Seamless Migration
The core architectural goal became clear: we need a unified authentication and authorization layer that executes at the API gateway level, transparent to downstream microservices.
The specific requirements were as follows:
- Centralized Processing: All requests entering the microservice cluster must complete authentication and authorization checks at the Kong layer.
- Legacy System Compatibility: The authentication logic must reuse the existing Rails application’s session or token validation mechanism to avoid rewriting the core auth module during the initial migration phase.
- High Performance: For new high-performance services (especially Axum and Fastify), the latency introduced by the authentication layer must be minimal.
- Downstream Transparency: After successful authentication, downstream services should be able to obtain user information in a standardized way (e.g., via HTTP headers) without needing to know the details of the authentication process.
- High Availability: Any instability or temporary unavailability of the Rails monolith should not cripple the entire authentication system, especially for users with already established sessions.
Option A: Per-Service Middleware Validation
This is the most obvious solution. Each microservice implements an authentication middleware within its own framework.
For example, in the Fastify service, the code might look like this:
// fastify-auth-middleware.js
const axios = require('axios');

const RAILS_AUTH_ENDPOINT = 'http://rails-app:3000/internal/auth/verify';

async function authMiddleware(request, reply) {
  const authToken = request.headers['authorization'];
  if (!authToken) {
    reply.code(401).send({ error: 'Missing authorization token' });
    return;
  }
  try {
    // Make a validation call to Rails for every single request
    const response = await axios.post(RAILS_AUTH_ENDPOINT, {}, {
      headers: { 'Authorization': authToken }
    });
    if (response.status === 200 && response.data.userId) {
      // Validation passed, inject user info into the request
      request.user = { id: response.data.userId, roles: response.data.roles };
    } else {
      // Note: axios rejects non-2xx responses by default, so most
      // validation failures actually land in the catch block below.
      reply.code(401).send({ error: 'Invalid token' });
    }
  } catch (error) {
    // Rails service is unavailable or validation failed
    console.error('Auth verification failed:', error.message);
    reply.code(503).send({ error: 'Auth service unavailable' });
  }
}

// Register in the application
// server.addHook('preHandler', authMiddleware);
Pros:
- Intuitive to implement; each team can work within their familiar tech stack.
Cons:
- Code Redundancy: Every service needs to implement nearly identical logic.
- Tight Coupling: All new services become directly dependent on an internal Rails API, creating a difficult-to-manage dependency web.
- Performance Bottleneck: Each external request triggers an additional internal network call (Service -> Rails). Under high concurrency, this dramatically increases response latency and puts immense pressure on the Rails application.
- Inconsistent Policies: Caching, retry, and timeout strategies are difficult to keep uniform across different languages and frameworks, leading to inconsistent system behavior.
This approach is unacceptable in a real-world project: it divides the problem only by multiplying complexity across every service.
Option B: A Standalone Authentication Microservice
The second option is to build a new, dedicated microservice for authentication. This service would encapsulate calls to the Rails authentication logic.
graph TD
    subgraph "Request Flow"
        Client -->|Request with Token| Kong
        Kong -->|Forward| Auth_Service
        Auth_Service -->|Verify Token| Rails_App
        Rails_App -->|Validation Result| Auth_Service
        Auth_Service -->|User Info or 401| Kong
        Kong -->|Proxy with User Header| Upstream_Service(Axum/Quarkus/Fastify)
    end
Pros:
- Separation of Concerns: Authentication logic is cleanly isolated, adhering to the single-responsibility principle of microservices.
- Decoupling: Upstream services no longer depend directly on Rails but on a well-defined authentication service interface.
Cons:
- Operational Complexity: Introduces a new, critical service to maintain. Its own high availability becomes a single point of failure for the entire system.
- Performance Not Fully Resolved: It still introduces an extra network hop (Client -> Kong -> Auth Service -> Upstream). While it might be better than Option A (e.g., caching can be implemented within the Auth Service), the request path is still elongated.
- Migration Pains: Building this service requires development resources, and it still needs a way to interact with the Rails session store (e.g., a shared database or internal API calls), which is a thorny technical problem in itself.
The Final Choice: A Custom Stateful Plugin for Kong
We ultimately chose to solve the problem at its source: the API gateway. By writing a custom Lua plugin for Kong, we can handle all authentication logic at the traffic ingress point.
Reasoning:
- Optimal Performance: Plugin code runs inside Kong's Nginx worker processes and is executed by LuaJIT, delivering extremely high performance. Validation calls can use OpenResty's non-blocking cosocket API (e.g., via lua-resty-http), which is far lighter than a full application-level HTTP client.
- Centralized Control: All authentication logic, caching policies, and timeout settings are consolidated in one place, making them easy to manage and audit.
- Complete Downstream Transparency: Microservice developers don't need to worry about authentication details at all. They can simply trust the X-User-ID and other headers injected by Kong.
- Resilient by Design: The plugin can implement a powerful internal caching mechanism. Even if the Rails authentication service is temporarily unavailable, requests can still be processed successfully as long as the user's authentication information is in the cache, significantly improving system resilience.
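The caching idea behind that resilience argument can be sketched in a few lines of framework-agnostic JavaScript (this is an illustration of the pattern, not plugin code; `verifyRemote` stands in for the live call to Rails):

```javascript
// Sketch of the cache-first authentication flow.
// Within the TTL, an already-validated token never touches Rails at all.
async function authenticate(token, cache, verifyRemote, ttlMs, now = Date.now()) {
  const hit = cache.get(token);
  if (hit && hit.expiresAt > now) {
    return hit.user; // cache hit: zero load on the Rails monolith
  }
  const user = await verifyRemote(token); // cache miss: one live validation call
  cache.set(token, { user, expiresAt: now + ttlMs });
  return user;
}
```

Note that this also bounds the blast radius of a Rails outage: only requests whose token has fallen out of the cache are affected.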
Core Implementation: The kong-legacy-auth Plugin
We named the plugin kong-legacy-auth. A Kong plugin typically consists of two files: handler.lua (core logic) and schema.lua (configuration definition).
1. schema.lua: Defining Plugin Configuration
This file defines the plugin’s configurable options, allowing operators to pass in parameters dynamically when enabling the plugin.
-- kong/plugins/kong-legacy-auth/schema.lua
local typedefs = require "kong.db.schema.typedefs"

return {
  name = "kong-legacy-auth",
  fields = {
    -- This plugin is global and not tied to a specific Consumer.
    { consumer = typedefs.no_consumer },
    { protocols = typedefs.protocols_http },
    { config = {
        type = "record",
        fields = {
          -- Internal Rails endpoint used to validate tokens or sessions.
          { auth_url = { type = "string", required = true } },
          -- HTTP method to use when calling the validation endpoint.
          { auth_method = { type = "string", default = "POST", one_of = { "GET", "POST" } } },
          -- Timeout in milliseconds for the validation call (default 2 seconds).
          { timeout = { type = "number", default = 2000 } },
          -- Cache TTL in seconds for successful results; 0 disables caching (default 5 minutes).
          { cache_ttl = { type = "number", default = 300 } },
          -- Header injected with the user ID for downstream requests.
          { user_id_header = { type = "string", default = "X-User-ID" } },
          -- Header injected with comma-separated user roles for downstream requests.
          { user_roles_header = { type = "string", default = "X-User-Roles" } },
        },
      },
    },
  },
}
This schema clearly defines the plugin’s behavior and promotes good maintainability.
2. handler.lua: The Core Authentication Logic
This is the heart of the plugin, where all the processing logic resides. We primarily leverage the access phase, which executes before Kong proxies the request to an upstream service.
-- kong/plugins/kong-legacy-auth/handler.lua
local http = require "resty.http"
local cjson = require "cjson.safe"

-- Kong >= 2.x handlers are plain tables; the old BasePlugin mixin is deprecated.
local LegacyAuthHandler = {
  -- Higher priorities run earlier within each phase; 1000 places this ahead
  -- of most request-transforming plugins.
  PRIORITY = 1000,
  VERSION = "1.0.0",
}

-- Live validation against Rails. Runs only on a cache miss; kong.cache
-- stores whatever this function returns.
local function fetch_user_data(conf, authorization_header)
  -- Use resty.http for network calls, standard practice in OpenResty.
  local httpc = http.new()
  httpc:set_timeout(conf.timeout)

  local res, err = httpc:request_uri(conf.auth_url, {
    method = conf.auth_method,
    headers = {
      ["Authorization"] = authorization_header,
      ["Content-Type"] = "application/json",
    },
  })

  -- Transport-level failure (timeout, connection refused, ...):
  -- return nil + err so the failure is NOT cached.
  if not res then
    return nil, "failed to request auth service: " .. (err or "unknown error")
  end

  -- Explicit rejection: cache it too, so repeated requests with the same
  -- bad token do not hammer the Rails application.
  if res.status == 401 or res.status == 403 then
    return { status = res.status }
  end

  if res.status ~= 200 then
    return nil, "auth service returned unexpected status: " .. res.status
  end

  local body, json_err = cjson.decode(res.body)
  if not body or not body.user_id then
    return nil, "invalid auth response: " .. (json_err or "missing user_id")
  end

  return { status = 200, id = body.user_id, roles = body.roles or {} }
end

-- The core access-phase handler, executed before Kong proxies upstream.
function LegacyAuthHandler:access(conf)
  local authorization_header = kong.request.get_header("Authorization")
  if not authorization_header then
    return kong.response.exit(401, { message = "Authorization header is missing" })
  end

  -- 1. Cache-first strategy. A crucial detail: the cache key must be unique
  -- and unambiguous, so keying on the token itself is a good choice.
  -- kong.cache:get invokes fetch_user_data only on a miss and caches the
  -- result for cache_ttl seconds.
  local user_data, err
  if conf.cache_ttl > 0 then
    local cache_key = "legacy_auth:" .. authorization_header
    user_data, err = kong.cache:get(cache_key, { ttl = conf.cache_ttl },
                                    fetch_user_data, conf, authorization_header)
  else
    -- 2. Caching disabled: always perform live validation.
    user_data, err = fetch_user_data(conf, authorization_header)
  end

  -- 3. Robust error handling: transport and protocol errors surface here.
  -- In a real project, add more granular classification (timeout vs. refused).
  if err then
    kong.log.err(err)
    return kong.response.exit(503, { message = "Authentication service unavailable" })
  end

  -- 4. Explicit authentication/authorization failure.
  if user_data.status ~= 200 then
    return kong.response.exit(user_data.status, { message = "Invalid credentials" })
  end

  -- 5. Authentication successful: inject user info for downstream services.
  kong.service.request.set_header(conf.user_id_header, tostring(user_data.id))
  if type(user_data.roles) == "table" and #user_data.roles > 0 then
    kong.service.request.set_header(conf.user_roles_header, table.concat(user_data.roles, ","))
  end
end

return LegacyAuthHandler
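With both files in place, the plugin can be enabled globally, for instance via Kong's declarative configuration (the auth_url below is an assumed internal endpoint, not a value from the actual codebase):

```yaml
# kong.yml — enabling the custom plugin for all routes
plugins:
- name: kong-legacy-auth
  config:
    auth_url: http://rails-app:3000/internal/auth/verify
    auth_method: POST
    timeout: 2000
    cache_ttl: 300
```

Remember that custom plugins must also be registered before Kong will load them, e.g. via the `plugins` property in kong.conf or the `KONG_PLUGINS=bundled,kong-legacy-auth` environment variable, with the plugin's directory reachable on Kong's Lua path.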
Simplification for Downstream Services
With this plugin in place, the code for downstream microservices becomes extremely simple. They no longer need to care about the complex authentication process and can simply trust the user information passed by the gateway.
For the Axum service, the code to get the user ID looks like this:
// axum-service/src/main.rs
use axum::{
    async_trait,
    extract::FromRequestParts,
    http::{request::Parts, StatusCode},
    response::{IntoResponse, Response},
    routing::get,
    Router,
};
use std::net::SocketAddr;

// Define an extractor to safely pull user information out of the headers.
struct AuthenticatedUser {
    id: i64,
    roles: Vec<String>,
}

#[async_trait]
impl<S> FromRequestParts<S> for AuthenticatedUser
where
    S: Send + Sync,
{
    type Rejection = (StatusCode, &'static str);

    async fn from_request_parts(parts: &mut Parts, _state: &S) -> Result<Self, Self::Rejection> {
        let headers = &parts.headers;

        // Extract the user ID from the header set by the Kong plugin.
        let user_id_str = headers
            .get("X-User-ID")
            .and_then(|value| value.to_str().ok())
            .ok_or((StatusCode::UNAUTHORIZED, "Missing or invalid X-User-ID header"))?;
        let id = user_id_str
            .parse::<i64>()
            .map_err(|_| (StatusCode::UNAUTHORIZED, "Invalid user ID format"))?;

        // Extract role information.
        let roles = headers
            .get("X-User-Roles")
            .and_then(|value| value.to_str().ok())
            .map(|s| s.split(',').map(String::from).collect())
            .unwrap_or_else(Vec::new);

        Ok(AuthenticatedUser { id, roles })
    }
}

// Business handler.
async fn protected_route(user: AuthenticatedUser) -> Response {
    // Business logic can use user.id and user.roles directly,
    // for example to check whether the user may perform this action.
    if !user.roles.contains(&"admin".to_string()) {
        return (StatusCode::FORBIDDEN, "Admin role required").into_response();
    }
    format!("Hello, admin user with ID: {}!", user.id).into_response()
}

// ... main function to start the server
This pattern dramatically reduces the cognitive load for microservice developers.
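The Fastify BFF side is even shorter. A hypothetical helper (illustrative, not from the actual codebase) that trusts the gateway-injected headers might look like:

```javascript
// Parse the identity that the kong-legacy-auth plugin injected upstream.
// Returns null when the headers are absent (i.e. the request bypassed Kong).
function userFromHeaders(headers) {
  const id = headers['x-user-id']; // Node lower-cases incoming header names
  if (!id) return null;
  const roles = (headers['x-user-roles'] || '')
    .split(',')
    .filter(Boolean); // drop the empty string produced by a missing header
  return { id: Number(id), roles };
}
```

Compare this with the 50-line validation middleware from Option A: the authentication call, retry policy, and error handling have all moved to the gateway.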
Architectural Limitations and Future Evolution
While this custom Kong plugin elegantly solves the immediate problem of unified authentication in a heterogeneous environment, it is not a silver bullet.
First, the Rails monolith remains the single source of truth for authentication, making it a potential bottleneck and point of failure. Our caching strategy (cache_ttl) effectively mitigates this, turning a hard dependency on Rails into a soft one, but this is a fault-tolerance measure, not a fundamental cure. The pressure on the Rails app will still exist during moments of high new user traffic or when caches are invalidated en masse.
Second, writing and deploying authentication logic in Lua within Kong introduces a degree of tech stack complexity. The team must be capable of developing, testing, and maintaining Lua code. If complex business authorization logic (like Attribute-Based Access Control, or ABAC) were implemented entirely in the plugin, it would become bloated and difficult to debug.
The long-term evolutionary path should be to gradually extract the authentication and user models from Rails into a completely independent, highly available identity provider. At that point, the kong-legacy-auth plugin’s mission will be complete, and it can be replaced by standard OIDC or JWT plugins. But until then, as a pragmatic and cost-effective transitional architecture, it buys the team a valuable window for refactoring while ensuring the rapid iteration of new business features.
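When that day comes, the swap should be a configuration change rather than a code change, for example replacing the custom plugin with Kong's bundled jwt plugin (illustrative values; the jwt plugin additionally requires JWT credentials to be provisioned for Consumers):

```yaml
# kong.yml — the eventual end state: standard JWT validation at the gateway
plugins:
- name: jwt
  config:
    claims_to_verify:
    - exp
```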