From Local Hack to Production-Ready: How We Solved BrainGrid's MCP Multi-Tenant Authentication Problem
We had to solve the MCP multi-tenant authentication problem to make our agent accessible to our customers. Here's how we did it.
You've built an amazing MCP server. It works perfectly on your laptop. Your AI assistant can create Jira tickets, query your database, deploy to production - life is good. Then your teammate asks: "Hey, can I use this too?"
Even better, you want to ship your MCP server as a product to your customers. Now you need to support multiple tenants, each with their own API keys and authentication.
Suddenly, you're in hell.
The Problem Nobody Talks About
Here's what happens when you try to share your MCP server with your customers:
Option 1: The "Just Install It" Approach
```text
# Your instructions to teammates:
1. Clone the repo
2. Install dependencies
3. Set up your API keys
4. Configure your environment
5. Run the server locally
6. Oh, and update these keys when they expire...
7. And don't forget to pull the latest changes...
8. BTW, it might conflict with your other Node versions...
```
Result: three hours later, half your customers have given up and the other half are debugging npm issues.
Option 2: The "Let's Host It" Nightmare
You deploy to a serverless platform like Cloud Run or Vercel. Five minutes later:
customer: "It's asking me to authenticate... again"
you: "Yeah, just refresh and login again"
customer: "I just did. It's asking again."
you: "Oh, that's because Cloud Run scales to zero and..."
customer: "I don't care why. I just want to create a ticket."
The core issue? Serverless platforms don't do sessions. Every request could hit a different instance. Your carefully crafted auth flow becomes a game of authentication whack-a-mole.
Why This Matters More Than You Think
This isn't just an annoyance. It's the difference between:
- A tool only you use vs A tool your entire customer base adopts
- "Cool prototype" vs "Critical infrastructure"
- Weekend project vs Production-ready product customers actually use
We learned this the hard way at BrainGrid. Our MCP server transformed how our team worked with AI - but only after we solved the authentication puzzle were we ready to ship it to our customers.
What You'll Learn
This guide shows you exactly how we transformed our MCP server from a local development tool into a production-ready service that:
- Authenticates once, works everywhere - No more login fatigue
- Scales from 1 to 1000 users - Same performance whether it's just you or the whole company
- Costs pennies to run - Efficient caching means minimal cloud costs
- Works with existing auth - Integrates with WorkOS, Auth0, or any OAuth provider
- Deploys in minutes - One command to go from local to remote
We'll cover the exact architecture, the gotchas we discovered, and the code that makes it all work. No theory, no fluff - just battle-tested solutions from our production deployment serving hundreds of developers.
Ready to make your MCP server something your customers will actually want to use? Let's dive in.
How we got there
- Initial Setup: From Local to Remote
- The Serverless Challenge
- Technical Solution: Redis Session Store
- Production Deployment Strategies
- Monitoring and Debugging
- Performance Optimization
- The Paradigm Shift
Initial Setup: From Local to Remote
Step 1: Basic MCP Server Configuration
Start with a standard MCP server setup using FastMCP. The key is understanding the dual nature of MCP servers - they need to work both locally for development and remotely for customers to use.
```typescript
import { FastMCP } from 'fastmcp';
import { z } from 'zod';

// Define your tool schemas
const CreateRequirementSchema = z.object({
  message: z.string().describe("The requirement description"),
  repositories: z.string().optional().describe("Comma-separated list of repos")
});

const server = new FastMCP({
  name: 'braingrid-server',
  version: '1.0.0'
});

// Add your tools
server.addTool({
  name: 'create_requirement',
  description: 'Create a new requirement in BrainGrid',
  parameters: CreateRequirementSchema,
  execute: async (args, context) => {
    // Tool implementation
    // Note: context.session contains user auth info when hosted
    const apiClient = new BrainGridApiClient(config, context?.session);
    return await apiClient.createRequirement(args);
  }
});

// Local development (stdio transport)
await server.start({ transportType: 'stdio' });
```
Step 2: Switching to httpStream for Remote Hosting
To deploy on Cloud Run or Vercel, switch to httpStream transport. This requires careful consideration of how your tools will handle authentication:
```typescript
// Detect transport type from environment
const transportType = process.env.MCP_TRANSPORT || 'stdio';

// httpStream configuration for serverless
if (transportType === 'httpStream') {
  await server.start({
    transportType: 'httpStream',
    httpStream: {
      port: parseInt(process.env.PORT || '8080'),
      endpoint: '/mcp'
    }
  });
} else {
  // Local stdio transport
  await server.start({ transportType: 'stdio' });
}
```
Step 3: Implementing OAuth with WorkOS
MCP requires specific OAuth implementation patterns. The key insight is that MCP clients expect a particular discovery flow:
```typescript
const serverOptions = {
  name: 'braingrid-server',
  version: '1.0.0',
  authenticate: authenticateRequest,
  oauth: {
    enabled: true,
    protectedResource: {
      resource: 'https://mcp.braingrid.ai',
      authorizationServers: ['https://auth.workos.com'],
      bearerMethodsSupported: ['header'],
    },
    // This is crucial for MCP client compatibility
    authorizationServer: {
      issuer: 'https://auth.workos.com',
      authorizationEndpoint: 'https://auth.workos.com/oauth2/authorize',
      tokenEndpoint: 'https://auth.workos.com/oauth2/token',
      jwksUri: 'https://auth.workos.com/oauth2/jwks', // Note: Not /.well-known/jwks.json
      responseTypesSupported: ['code'],
      grantTypesSupported: ['authorization_code', 'refresh_token'],
      codeChallengeMethodsSupported: ['S256'],
      tokenEndpointAuthMethodsSupported: ['none'],
      scopesSupported: ['email', 'offline_access', 'openid', 'profile'],
    }
  }
};
```
Key implementation detail: The WWW-Authenticate header must be properly formatted for MCP clients:
```typescript
// MCP session structure - what gets passed to your tools
interface MCPSession {
  userId: string;
  email: string;
  organizationId: string;
  scopes: string[];
  token: string;
}

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const authHeader = request.headers.authorization;

  if (!authHeader) {
    // MCP clients expect this specific format
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate':
          'Bearer error="unauthorized", ' +
          'error_description="Authorization needed", ' +
          'resource_metadata="https://mcp.braingrid.ai/.well-known/oauth-protected-resource"'
      }
    });
  }

  // Extract bearer token
  const bearerMatch = authHeader.match(/^Bearer (.+)$/);
  if (!bearerMatch) {
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate':
          'Bearer error="invalid_token", ' +
          'error_description="Invalid authorization header format"'
      }
    });
  }

  const token = bearerMatch[1];

  // Validate JWT and return session
  return await validateAndCreateSession(token);
}
```
Step 4: Handling Dual Transport Modes
Your MCP server needs to support both local and remote authentication patterns:
```typescript
export class BrainGridApiClient {
  private auth?: AuthHandler;
  private session?: MCPSession;
  private readonly config: { apiUrl: string; organizationId?: string };

  constructor(config: { apiUrl: string; organizationId?: string }, session?: MCPSession) {
    this.config = config;
    this.session = session;

    // Only create AuthHandler for local mode
    if (!session) {
      this.auth = new AuthHandler(config);
    }
  }

  private async getHeaders(): Promise<Record<string, string>> {
    if (this.session) {
      // Remote mode - use session token
      return {
        'Authorization': `Bearer ${this.session.token}`,
        'X-Organization-Id': this.session.organizationId,
        'Content-Type': 'application/json',
      };
    } else if (this.auth) {
      // Local mode - use stored auth
      return this.auth.getOrganizationHeaders();
    }
    throw new Error('No authentication method available');
  }
}
```
The Serverless Challenge
Serverless platforms like Cloud Run and Vercel share fundamental characteristics that create unique challenges for stateful applications:
1. Instance Lifecycle Management
Serverless instances have unpredictable lifecycles:
- Cold starts: New instances spin up on demand
- Scale to zero: Instances terminate after inactivity
- Horizontal scaling: Multiple instances serve concurrent requests
- No sticky sessions: Requests can hit any instance
This creates specific challenges for MCP servers:
```typescript
// This approach fails in serverless:
class NaiveMCPServer {
  private sessions = new Map<string, MCPSession>(); // ❌ Lost on instance restart

  async authenticate(token: string): Promise<MCPSession> {
    // Check memory cache
    if (this.sessions.has(token)) {
      return this.sessions.get(token)!;
    }

    // Validate and cache
    const session = await validateJWT(token);
    this.sessions.set(token, session); // ❌ Only exists on this instance
    return session;
  }
}
```
2. JWT Validation Overhead
Without session persistence, your MCP server performs full JWT validation on every request:
```typescript
async function validateJWT(token: string): Promise<MCPSession> {
  // Step 1: Fetch JWKS (Network call ~50ms)
  const jwks = await fetchJWKS('https://auth.workos.com/oauth2/jwks');

  // Step 2: Verify signature (CPU intensive ~10ms)
  const verified = await jose.jwtVerify(token, jwks);

  // Step 3: Check claims (CPU ~5ms)
  if (verified.payload.iss !== 'https://auth.workos.com') {
    throw new Error('Invalid issuer');
  }

  // Step 4: Extract session data
  return {
    userId: verified.payload.sub,
    email: verified.payload.email,
    organizationId: verified.payload.org_id,
    scopes: verified.payload.scopes,
    token: token
  };
}
```
This adds 50-100ms to every request and increases costs significantly.
3. Re-authentication Fatigue
The user experience without session persistence:
Timeline of a frustrated developer:
0:00 - Connect to MCP server ✓
0:01 - Authenticate via WorkOS ✓
0:02 - Create requirement ✓
0:05 - (Cloud Run scales instance to zero)
0:10 - Try to update task ✗ "Please authenticate again"
0:11 - Re-authenticate 😤
0:12 - Update task ✓
0:15 - (New instance due to load)
0:16 - Try to commit ✗ "Please authenticate again"
0:17 - Rage quit
Technical Solution: Redis Session Store with Encryption
Architecture Overview
The solution implements a multi-tier caching strategy with security at its core:
```text
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│   Request   │─────▶│   Memory    │─────▶│    Redis    │
│             │      │    Cache    │      │    Cache    │
└─────────────┘      └─────────────┘      └─────────────┘
                           │ miss                │ miss
                           ▼                     ▼
                    ┌─────────────┐       ┌─────────────┐
                    │     JWT     │       │     JWT     │
                    │  Validation │       │  Validation │
                    └─────────────┘       └─────────────┘
```
Implementation Details
Session Store with AES-256-GCM Encryption
The session store encrypts sensitive session data with AES-256-GCM authenticated encryption:
```typescript
import { Redis } from 'ioredis';
import crypto from 'crypto';
import { MCPSession } from './types.js';
import { logger } from './logger.js';

export class SessionStore {
  private redis: Redis | null = null;
  private encryptionKey: Buffer | null = null;
  private algorithm: crypto.CipherGCMTypes = 'aes-256-gcm';
  private ttl = 604800; // 7 days; overridden by SESSION_CACHE_TTL below
  private keyPrefix = 'mcp:session:';

  constructor() {
    // Only initialize for httpStream transport
    if (process.env.MCP_TRANSPORT !== 'httpStream') {
      logger.debug('SessionStore not initialized - stdio transport');
      return;
    }

    const redisUrl = process.env.REDIS_URL;
    const encryptionKeyHex = process.env.ENCRYPTION_KEY;

    if (!redisUrl || !encryptionKeyHex) {
      logger.warn('Session persistence disabled - missing configuration');
      return;
    }

    // Validate encryption key length
    if (encryptionKeyHex.length !== 64) {
      throw new Error('ENCRYPTION_KEY must be 32 bytes (64 hex characters)');
    }

    this.encryptionKey = Buffer.from(encryptionKeyHex, 'hex');
    this.ttl = parseInt(process.env.SESSION_CACHE_TTL || '604800', 10);

    // Configure Redis with production-ready settings
    this.redis = new Redis(redisUrl, {
      // Retry strategy with exponential backoff
      retryStrategy: (times) => {
        const delay = Math.min(times * 50, 2000);
        logger.debug(`Redis retry attempt ${times}, delay: ${delay}ms`);
        return delay;
      },
      // Reconnect on READONLY errors (Redis failover)
      reconnectOnError: (err) => {
        const shouldReconnect = err.message.includes('READONLY');
        if (shouldReconnect) {
          logger.warn('Redis READONLY error, reconnecting...');
        }
        return shouldReconnect;
      },
      // Connection settings
      connectTimeout: 10000,
      maxRetriesPerRequest: 3,
      enableReadyCheck: true,
      enableOfflineQueue: false, // Fail fast in production
    });

    // Monitor Redis connection health
    this.redis.on('connect', () => logger.info('Redis connected'));
    this.redis.on('ready', () => logger.info('Redis ready'));
    this.redis.on('error', (err) => logger.error({ err }, 'Redis error'));
    this.redis.on('close', () => logger.warn('Redis connection closed'));
  }

  /**
   * Check if session store is available
   */
  isAvailable(): boolean {
    return this.redis !== null &&
      this.redis.status === 'ready' &&
      this.encryptionKey !== null;
  }

  /**
   * Store encrypted session with automatic expiration
   */
  async storeSession(session: MCPSession): Promise<void> {
    if (!this.isAvailable()) {
      logger.debug('Session store unavailable, skipping storage');
      return;
    }

    try {
      // Generate unique IV for each encryption
      const iv = crypto.randomBytes(16);
      const cipher = crypto.createCipheriv(this.algorithm, this.encryptionKey!, iv);

      // Encrypt session data
      const sessionJson = JSON.stringify(session);
      const encrypted = Buffer.concat([
        cipher.update(sessionJson, 'utf8'),
        cipher.final()
      ]);

      // Get authentication tag for GCM
      const authTag = cipher.getAuthTag();

      // Combine components: IV (16) + AuthTag (16) + Encrypted Data
      const combined = Buffer.concat([iv, authTag, encrypted]);
      const encoded = combined.toString('base64');

      // Store with TTL
      const key = `${this.keyPrefix}${session.userId}`;
      await this.redis!.setex(key, this.ttl, encoded);

      logger.debug({
        userId: session.userId,
        keySize: encoded.length,
        ttl: this.ttl
      }, 'Session stored successfully');
    } catch (error) {
      logger.error({ error }, 'Failed to store session');
      // Don't throw - graceful degradation
    }
  }

  /**
   * Retrieve and decrypt session
   */
  async getSession(userId: string): Promise<MCPSession | null> {
    if (!this.isAvailable()) {
      return null;
    }

    const startTime = Date.now();

    try {
      const key = `${this.keyPrefix}${userId}`;
      const encoded = await this.redis!.get(key);

      if (!encoded) {
        logger.debug({ userId }, 'Session not found in cache');
        return null;
      }

      // Decode and extract components
      const combined = Buffer.from(encoded, 'base64');
      const iv = combined.slice(0, 16);
      const authTag = combined.slice(16, 32);
      const encrypted = combined.slice(32);

      // Decrypt with authentication
      const decipher = crypto.createDecipheriv(this.algorithm, this.encryptionKey!, iv);
      decipher.setAuthTag(authTag);

      const decrypted = Buffer.concat([
        decipher.update(encrypted),
        decipher.final()
      ]);

      const session = JSON.parse(decrypted.toString('utf8')) as MCPSession;

      const elapsed = Date.now() - startTime;
      logger.debug({ userId, elapsed }, 'Session retrieved from cache');

      return session;
    } catch (error) {
      if (error instanceof Error &&
          error.message.includes('Unsupported state or unable to authenticate data')) {
        logger.error({ userId }, 'Session decryption failed - possible tampering');
      } else {
        logger.error({ error, userId }, 'Failed to retrieve session');
      }
      return null;
    }
  }

  /**
   * Remove session (for logout)
   */
  async removeSession(userId: string): Promise<void> {
    if (!this.isAvailable()) return;

    try {
      const key = `${this.keyPrefix}${userId}`;
      await this.redis!.del(key);
      logger.debug({ userId }, 'Session removed');
    } catch (error) {
      logger.error({ error, userId }, 'Failed to remove session');
    }
  }

  /**
   * Clean shutdown
   */
  async close(): Promise<void> {
    if (this.redis) {
      await this.redis.quit();
      this.redis = null;
    }
  }
}

// Singleton instance
export const sessionStore = new SessionStore();
```
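Before wiring the store into the middleware, it's worth sanity-checking the encrypt-store-decrypt round trip. A minimal smoke-test sketch with placeholder values (the userId, email, and token below are hypothetical):

```typescript
// Hypothetical smoke test: verifies ENCRYPTION_KEY and REDIS_URL are wired up.
import { sessionStore } from './session-store.js';
import { MCPSession } from './types.js';

const session: MCPSession = {
  userId: 'user_01ABC',            // placeholder
  email: 'dev@example.com',        // placeholder
  organizationId: 'org_01XYZ',     // placeholder
  scopes: ['openid', 'profile'],
  token: 'eyJ...',                 // a real bearer token in practice
};

await sessionStore.storeSession(session);
const restored = await sessionStore.getSession('user_01ABC');
console.assert(restored?.email === 'dev@example.com', 'session round trip failed');
```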
Optimized Authentication Middleware
The authentication middleware implements a fast-path/slow-path pattern:
```typescript
import { IncomingMessage } from 'http';
import crypto from 'crypto';
import { decodeJwt, createRemoteJWKSet, jwtVerify } from 'jose';
import { sessionStore } from './session-store.js';
import { logger } from './logger.js';
import { MCPSession } from './types.js';

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const requestId = crypto.randomUUID();
  const startTime = Date.now();

  logger.debug({
    requestId,
    method: request.method,
    url: request.url
  }, 'Authentication request started');

  try {
    // Extract bearer token
    const token = extractBearerToken(request);
    if (!token) {
      throw new UnauthorizedError('No bearer token provided');
    }

    // Fast path: Try to decode JWT for userId
    let userId: string | null = null;
    let tokenExp: number | null = null;

    try {
      const decoded = decodeJwt(token);
      userId = decoded.sub || null;
      tokenExp = decoded.exp || null;

      // Quick expiration check
      if (tokenExp && tokenExp < Date.now() / 1000) {
        logger.debug({ requestId, userId }, 'Token expired, skipping cache');
        userId = null; // Force validation
      }
    } catch (error) {
      logger.debug({ requestId }, 'Failed to decode JWT for cache lookup');
    }

    // Try cache if we have a userId
    if (userId && sessionStore.isAvailable()) {
      const cached = await sessionStore.getSession(userId);
      if (cached && cached.token === token) {
        const elapsed = Date.now() - startTime;
        logger.info({
          requestId,
          userId,
          elapsed,
          source: 'cache'
        }, 'Authentication successful (cached)');
        return cached;
      }
    }

    // Slow path: Full JWT validation
    logger.debug({ requestId }, 'Cache miss, performing JWT validation');
    const session = await validateJWTWithWorkOS(token);

    // Store for next time
    if (sessionStore.isAvailable()) {
      await sessionStore.storeSession(session);
    }

    const elapsed = Date.now() - startTime;
    logger.info({
      requestId,
      userId: session.userId,
      elapsed,
      source: 'jwt'
    }, 'Authentication successful (validated)');

    return session;
  } catch (error) {
    const elapsed = Date.now() - startTime;
    logger.error({
      requestId,
      error: error instanceof Error ? error.message : 'Unknown error',
      elapsed
    }, 'Authentication failed');

    // Return proper HTTP response for MCP
    if (error instanceof UnauthorizedError) {
      throw new Response(null, {
        status: 401,
        headers: {
          'WWW-Authenticate':
            `Bearer error="unauthorized", ` +
            `error_description="${error.message}", ` +
            `resource_metadata="${getResourceMetadataUrl()}"`
        }
      });
    }

    throw error;
  }
}

function extractBearerToken(request: IncomingMessage): string | null {
  const authHeader = request.headers.authorization;
  if (!authHeader) return null;

  const match = authHeader.match(/^Bearer (.+)$/);
  return match ? match[1] : null;
}

class UnauthorizedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'UnauthorizedError';
  }
}

function getResourceMetadataUrl(): string {
  const host = process.env.MCP_HOST || 'https://mcp.braingrid.ai';
  return `${host}/.well-known/oauth-protected-resource`;
}

// JWT validation with WorkOS
const jwksCache = new Map<string, ReturnType<typeof createRemoteJWKSet>>();

async function validateJWTWithWorkOS(token: string): Promise<MCPSession> {
  const issuer = process.env.WORKOS_ISSUER || 'https://auth.workos.com';

  try {
    // Get or create JWKS
    let jwks = jwksCache.get(issuer);
    if (!jwks) {
      jwks = createRemoteJWKSet(new URL(`${issuer}/oauth2/jwks`));
      jwksCache.set(issuer, jwks);
    }

    // Verify JWT with options
    const verifyOptions: any = {
      issuer,
      algorithms: ['RS256'],
    };

    // Only check audience if configured
    if (process.env.WORKOS_CLIENT_ID) {
      verifyOptions.audience = process.env.WORKOS_CLIENT_ID;
    }

    const { payload } = await jwtVerify(token, jwks, verifyOptions);

    // Validate required claims
    if (!payload.sub || !payload.email || !payload.org_id) {
      throw new Error('Missing required JWT claims');
    }

    // Create session from JWT claims
    return {
      userId: payload.sub,
      email: payload.email as string,
      organizationId: payload.org_id as string,
      scopes: Array.isArray(payload.scopes) ? payload.scopes : [],
      token,
    };
  } catch (error) {
    logger.error({
      error: error instanceof Error ? error.message : 'Unknown error'
    }, 'JWT validation failed');
    throw error;
  }
}
```
Graceful Degradation
The implementation handles Redis failures gracefully by simply returning null and forcing re-authentication. This is intentional - in a serverless environment, there's no point in falling back to in-memory caching since each instance has its own memory. Better to fail fast and have the user re-authenticate than to create inconsistent state.
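A minimal sketch of that fail-open contract (SessionStore.getSession already swallows errors internally; this hypothetical wrapper just makes the behavior explicit for callers):

```typescript
// Fail open: any cache error degrades to the JWT slow path, never to
// instance-local state that other instances could not see.
import { sessionStore } from './session-store.js';
import { MCPSession } from './types.js';

async function sessionOrNull(userId: string): Promise<MCPSession | null> {
  try {
    return await sessionStore.getSession(userId);
  } catch {
    return null; // Redis outage or tampered payload - force re-validation.
  }
}
```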
Production Deployment Strategies
Cloud Run Configuration
Start with a multi-stage Dockerfile that builds the TypeScript and ships a minimal production image:
```dockerfile
# Multi-stage build for optimization
FROM node:22-alpine AS builder
WORKDIR /app

# Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./

# Install dependencies
RUN npm install -g pnpm && pnpm install --frozen-lockfile

# Copy source code
COPY . .

# Build TypeScript
RUN pnpm run build

# Production stage
FROM node:22-alpine
WORKDIR /app

# Install production dependencies only
COPY package*.json ./
COPY pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --prod --frozen-lockfile

# Copy built application from the builder stage
COPY --from=builder /app/dist ./dist

# Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=httpStream

# Health check
HEALTHCHECK \
  CMD node -e "require('http').get('http://localhost:8080/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"

EXPOSE 8080
CMD ["node", "dist/server.js"]
```
Deploy with proper configuration:
```bash
#!/bin/bash
# deploy-cloud-run.sh

PROJECT_ID="your-project-id"
SERVICE_NAME="braingrid-mcp-server"
REGION="us-central1"
REDIS_URL="rediss://<your-redis-instance>"

# Build and push image
gcloud builds submit --tag gcr.io/${PROJECT_ID}/${SERVICE_NAME}

# Deploy to Cloud Run
gcloud run deploy ${SERVICE_NAME} \
  --image gcr.io/${PROJECT_ID}/${SERVICE_NAME} \
  --platform managed \
  --region ${REGION} \
  --allow-unauthenticated \
  --set-env-vars "MCP_TRANSPORT=httpStream" \
  --set-env-vars "BRAINGRID_ENV=production" \
  --set-env-vars "REDIS_URL=${REDIS_URL}" \
  --set-secrets "ENCRYPTION_KEY=mcp-encryption-key:latest" \
  --cpu 1 \
  --memory 512Mi \
  --min-instances 1 \
  --max-instances 100 \
  --concurrency 80 \
  --timeout 300
```
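The deploy script assumes the mcp-encryption-key secret already exists in Secret Manager. One way to generate a key of the right shape - 32 random bytes, hex-encoded to the 64 characters SessionStore validates - is a Node one-liner; a sketch:

```typescript
// One-off ENCRYPTION_KEY generation: 32 bytes -> 64 hex characters,
// matching the length check in SessionStore. Store the output as a secret;
// never commit it or pass it around in plaintext.
import crypto from 'crypto';

console.log(crypto.randomBytes(32).toString('hex'));
```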
Vercel Configuration
For Vercel deployment, create vercel.json:
{ "version": 2, "builds": [ { "src": "dist/server.js", "use": "@vercel/node" } ], "routes": [ { "src": "/health", "dest": "/dist/server.js" }, { "src": "/mcp", "dest": "/dist/server.js" }, { "src": "/.well-known/oauth-protected-resource", "dest": "/dist/server.js" } ], "env": { "MCP_TRANSPORT": "httpStream", "NODE_ENV": "production" } }
Monitoring and Debugging
Structured Logging
Implement comprehensive logging for production debugging:
```typescript
import pino from 'pino';
import crypto from 'crypto';
import { IncomingMessage, ServerResponse } from 'http';

// Configure structured logging
export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'production' ? undefined : {
    target: 'pino-pretty',
    options: {
      colorize: true,
      translateTime: 'HH:MM:ss Z',
      ignore: 'pid,hostname'
    }
  },
  formatters: {
    level: (label) => {
      return { level: label };
    }
  },
  serializers: {
    req: (req) => ({
      method: req.method,
      url: req.url,
      headers: {
        ...req.headers,
        authorization: req.headers.authorization ? '[REDACTED]' : undefined
      }
    }),
    err: pino.stdSerializers.err
  }
});

// Request tracking middleware
export function requestLogging() {
  return (req: IncomingMessage, res: ServerResponse, next: () => void) => {
    const start = Date.now();
    const requestId = crypto.randomUUID();

    // Attach to request
    (req as any).requestId = requestId;

    // Log request
    logger.info({
      requestId,
      req,
      type: 'request'
    }, 'Incoming request');

    // Log response
    res.on('finish', () => {
      const elapsed = Date.now() - start;
      logger.info({
        requestId,
        statusCode: res.statusCode,
        elapsed,
        type: 'response'
      }, 'Request completed');
    });

    next();
  };
}
```
Metrics Collection
For production deployments, export metrics to your observability platform:
```typescript
// Example: Exporting MCP tool call metrics to DataDog
import { StatsD } from 'node-dogstatsd';

const dogstatsd = new StatsD({
  host: process.env.DD_AGENT_HOST || 'localhost',
  port: 8125,
  prefix: 'mcp.server.',
  tags: [`env:${process.env.BRAINGRID_ENV || 'development'}`]
});

// Track tool usage
export function recordToolCall(toolName: string, duration: number, success: boolean) {
  // Record timing metric
  dogstatsd.timing('tool.call.duration', duration, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);

  // Increment counter
  dogstatsd.increment('tool.call.count', 1, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);
}

// In your tool implementation:
server.addTool({
  name: 'create_requirement',
  execute: async (args, context) => {
    const startTime = Date.now();
    try {
      const result = await apiClient.createRequirement(args);
      recordToolCall('create_requirement', Date.now() - startTime, true);
      return result;
    } catch (error) {
      recordToolCall('create_requirement', Date.now() - startTime, false);
      throw error;
    }
  }
});
```
Performance Optimization
Connection Pooling
Optimize Redis connections for serverless:
```typescript
import { Redis } from 'ioredis';
import { logger } from './logger.js';

// Redis connection pool for serverless
export class RedisConnectionPool {
  private static instance: Redis | null = null;

  static getInstance(): Redis | null {
    if (!this.instance && process.env.REDIS_URL) {
      this.instance = new Redis(process.env.REDIS_URL, {
        // Connection pool settings
        maxRetriesPerRequest: 3,
        enableReadyCheck: true,
        lazyConnect: true, // Important for serverless

        // Serverless-optimized timeouts
        connectTimeout: 5000,
        commandTimeout: 5000,

        // Connection reuse
        keepAlive: 30000,
        noDelay: true,

        // Handle connection errors gracefully
        retryStrategy: (times) => {
          if (times > 3) return null; // Stop retrying
          return Math.min(times * 100, 3000);
        }
      });

      // Ensure connection is established
      this.instance.connect().catch((err: Error) => {
        logger.error({ err: err.message }, 'Redis connection failed');
        this.instance = null;
      });
    }
    return this.instance;
  }

  static async close(): Promise<void> {
    if (this.instance) {
      await this.instance.quit();
      this.instance = null;
    }
  }
}
```
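Callers must tolerate a null pool - no REDIS_URL configured, or the initial connect failed. A hypothetical usage sketch, reusing the mcp:session: key prefix from SessionStore:

```typescript
// Hypothetical caller: treat a missing pool as a cache miss and fall back
// to direct JWT validation, exactly like the graceful-degradation path above.
async function lookupCachedSession(userId: string): Promise<string | null> {
  const redis = RedisConnectionPool.getInstance();
  if (!redis) {
    return null; // No Redis available - take the JWT slow path.
  }
  return redis.get(`mcp:session:${userId}`);
}
```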
Request Batching
Optimize for concurrent requests:
```typescript
export class BatchedJWTValidator {
  private readonly pendingValidations = new Map<string, Promise<MCPSession>>();

  async validateToken(token: string): Promise<MCPSession> {
    // Check if validation is already in progress
    if (this.pendingValidations.has(token)) {
      logger.debug('Reusing pending validation');
      return this.pendingValidations.get(token)!;
    }

    // Start new validation
    const validationPromise = this.performValidation(token)
      .finally(() => {
        // Clean up after completion
        this.pendingValidations.delete(token);
      });

    this.pendingValidations.set(token, validationPromise);
    return validationPromise;
  }

  private async performValidation(token: string): Promise<MCPSession> {
    // Actual JWT validation logic
    return validateJWTWithWorkOS(token);
  }
}
```
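Wiring it in is a small substitution in the middleware's slow path: the direct validateJWTWithWorkOS call goes through a shared validator instance instead. A sketch, assuming the authenticateRequest and sessionStore shown earlier:

```typescript
// One module-level validator per instance, so concurrent requests carrying
// the same token share a single JWKS fetch and signature verification.
const jwtValidator = new BatchedJWTValidator();

async function validateAndCacheSession(token: string): Promise<MCPSession> {
  const session = await jwtValidator.validateToken(token);
  if (sessionStore.isAvailable()) {
    await sessionStore.storeSession(session);
  }
  return session;
}
```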
Conclusion
Hosting MCP servers in serverless environments is challenging, but the patterns we've covered make it possible to build production-ready solutions that scale.
The key technical takeaways:
- Session persistence is non-negotiable - Without Redis or similar external storage, your users face constant re-authentication
- Security can't be an afterthought - Proper encryption (AES-256-GCM) and secure token handling are essential
- Fast-path optimization matters - JWT validation is expensive; caching authenticated sessions dramatically improves performance
- Graceful degradation over complex fallbacks - When Redis fails, force re-authentication rather than trying clever in-memory solutions
- Observable systems are debuggable systems - Export metrics to DataDog or your platform of choice
By solving these challenges, we transformed our MCP server from a local development tool into infrastructure that our entire team relies on. The same patterns apply whether you're building tools for internal use or creating MCP servers for the broader community.
The future of development involves AI assistants that understand context and can take meaningful actions. Making that future accessible to teams - not just individual developers - requires solving the infrastructure challenges we've outlined here.
About the Author
Tyler Wells is the Co-founder and CTO of BrainGrid, where he's building the future of AI-assisted software development. With over 25 years of experience in distributed systems and developer tools, Tyler focuses on making complex technology accessible to engineering teams.
Want to discuss MCP server architecture or share your experiences? Find me on X or connect on LinkedIn.
Interested in turning half-baked thoughts into crystal-clear, AI-ready specs and tasks that your IDE can nail, the first time? Check out BrainGrid - Follow us on X for updates.
Ready to stop babysitting your AI coding assistant?
Join the waitlist for BrainGrid and experience AI-powered planning that makes coding agents actually work.