From Local Hack to Production-Ready: How We Solved BrainGrid's MCP Multi-Tenant Authentication Problem
We had to solve the MCP multi-tenant authentication problem to make our agent accessible to our customers. Here's how we did it.
You've built an amazing MCP server. It works perfectly on your laptop. Your AI assistant can create Jira tickets, query your database, deploy to production - life is good. Then your teammate asks: "Hey, can I use this too?"
Even better, you want to ship your MCP as a product for your customers. You now need to support multiple tenants, each with their own API keys and authentication.
Suddenly, you're in hell.
The Problem Nobody Talks About
Here's what happens when you try to share your MCP server with your customers:
Option 1: The "Just Install It" Approach
```
# Your instructions to teammates:
1. Clone the repo
2. Install dependencies
3. Set up your API keys
4. Configure your environment
5. Run the server locally
6. Oh, and update these keys when they expire...
7. And don't forget to pull the latest changes...
8. BTW, it might conflict with your other Node versions...
```
Result: 3 hours later, half your customers gave up, and the other half are debugging npm issues.
Option 2: The "Let's Host It" Nightmare
You deploy to a serverless platform like Cloud Run or Vercel. Five minutes later:
customer: "It's asking me to authenticate... again"
you: "Yeah, just refresh and login again"
customer: "I just did. It's asking again."
you: "Oh, that's because Cloud Run scales to zero and..."
customer: "I don't care why. I just want to create a ticket."
The core issue? Serverless platforms don't do sessions. Every request could hit a different instance. Your carefully crafted auth flow becomes a game of authentication whack-a-mole.
Why This Matters More Than You Think
This isn't just an annoyance. It's the difference between:
- A tool only you use vs A tool your entire customer base adopts
- "Cool prototype" vs "Critical infrastructure"
- Weekend project vs Production-ready product customers actually use
We learned this the hard way at BrainGrid. Our MCP server transformed how our team worked with AI - but only after we solved the authentication puzzle were we ready to ship it to our customers.
What You'll Learn
This guide shows you exactly how we transformed our MCP server from a local development tool into a production-ready service that:
- Authenticates once, works everywhere - No more login fatigue
- Scales from 1 to 1000 users - Same performance whether it's just you or the whole company
- Costs pennies to run - Efficient caching means minimal cloud costs
- Works with existing auth - Integrates with WorkOS, Auth0, or any OAuth provider
- Deploys in minutes - One command to go from local to remote
We'll cover the exact architecture, the gotchas we discovered, and the code that makes it all work. No theory, no fluff - just battle-tested solutions from our production deployment serving hundreds of developers.
Ready to make your MCP server something your customers will actually want to use? Let's dive in.
How we got there
- Initial Setup: From Local to Remote
- The Serverless Challenge
- Technical Solution: Redis Session Store
- Production Deployment Strategies
- Monitoring and Debugging
- Performance Optimization
- The Paradigm Shift
Initial Setup: From Local to Remote
Step 1: Basic MCP Server Configuration
Start with a standard MCP server setup using FastMCP. The key is understanding the dual nature of MCP servers - they need to work both locally for development and remotely for customers to use.
```typescript
import { FastMCP } from 'fastmcp';
import { z } from 'zod';

// Define your tool schemas
const CreateRequirementSchema = z.object({
  message: z.string().describe("The requirement description"),
  repositories: z.string().optional().describe("Comma-separated list of repos")
});

const server = new FastMCP({
  name: 'braingrid-server',
  version: '1.0.0'
});

// Add your tools
server.addTool({
  name: 'create_requirement',
  description: 'Create a new requirement in BrainGrid',
  parameters: CreateRequirementSchema,
  execute: async (args, context) => {
    // Tool implementation
    // Note: context.session contains user auth info when hosted
    const apiClient = new BrainGridApiClient(config, context?.session);
    return await apiClient.createRequirement(args);
  }
});

// Local development (stdio transport)
await server.start({ transportType: 'stdio' });
```
Step 2: Switching to httpStream for Remote Hosting
To deploy on Cloud Run or Vercel, switch to httpStream transport. This requires careful consideration of how your tools will handle authentication:
```typescript
// Detect transport type from environment
const transportType = process.env.MCP_TRANSPORT || 'stdio';

// httpStream configuration for serverless
if (transportType === 'httpStream') {
  await server.start({
    transportType: 'httpStream',
    httpStream: {
      port: parseInt(process.env.PORT || '8080'),
      endpoint: '/mcp'
    }
  });
} else {
  // Local stdio transport
  await server.start({ transportType: 'stdio' });
}
```
Step 3: Implementing OAuth with WorkOS
MCP requires specific OAuth implementation patterns. The key insight is that MCP clients expect a particular discovery flow:
```typescript
const serverOptions = {
  name: 'braingrid-server',
  version: '1.0.0',
  authenticate: authenticateRequest,
  oauth: {
    enabled: true,
    protectedResource: {
      resource: 'https://mcp.braingrid.ai',
      authorizationServers: ['https://auth.workos.com'],
      bearerMethodsSupported: ['header'],
    },
    // This is crucial for MCP client compatibility
    authorizationServer: {
      issuer: 'https://auth.workos.com',
      authorizationEndpoint: 'https://auth.workos.com/oauth2/authorize',
      tokenEndpoint: 'https://auth.workos.com/oauth2/token',
      jwksUri: 'https://auth.workos.com/oauth2/jwks', // Note: Not /.well-known/jwks.json
      responseTypesSupported: ['code'],
      grantTypesSupported: ['authorization_code', 'refresh_token'],
      codeChallengeMethodsSupported: ['S256'],
      tokenEndpointAuthMethodsSupported: ['none'],
      scopesSupported: ['email', 'offline_access', 'openid', 'profile'],
    }
  }
};
```
Key implementation detail: The WWW-Authenticate header must be properly formatted for MCP clients:
```typescript
// MCP session structure - what gets passed to your tools
interface MCPSession {
  userId: string;
  email: string;
  organizationId: string;
  scopes: string[];
  token: string;
}

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const authHeader = request.headers.authorization;

  if (!authHeader) {
    // MCP clients expect this specific format
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Bearer error="unauthorized", ' +
          'error_description="Authorization needed", ' +
          'resource_metadata="https://mcp.braingrid.ai/.well-known/oauth-protected-resource"'
      }
    });
  }

  // Extract bearer token
  const bearerMatch = authHeader.match(/^Bearer (.+)$/);
  if (!bearerMatch) {
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Bearer error="invalid_token", ' +
          'error_description="Invalid authorization header format"'
      }
    });
  }

  const token = bearerMatch[1];
  // Validate JWT and return session
  return await validateAndCreateSession(token);
}
```
Step 4: Handling Dual Transport Modes
Your MCP server needs to support both local and remote authentication patterns:
```typescript
export class BrainGridApiClient {
  private auth?: AuthHandler;
  private session?: MCPSession;
  private readonly config: { apiUrl: string; organizationId?: string };

  constructor(config: { apiUrl: string; organizationId?: string }, session?: MCPSession) {
    this.config = config;
    this.session = session;

    // Only create AuthHandler for local mode
    if (!session) {
      this.auth = new AuthHandler(config);
    }
  }

  private async getHeaders(): Promise<Record<string, string>> {
    if (this.session) {
      // Remote mode - use session token
      return {
        'Authorization': `Bearer ${this.session.token}`,
        'X-Organization-Id': this.session.organizationId,
        'Content-Type': 'application/json',
      };
    } else if (this.auth) {
      // Local mode - use stored auth
      return this.auth.getOrganizationHeaders();
    }
    throw new Error('No authentication method available');
  }
}
```
The Serverless Challenge
Serverless platforms like Cloud Run and Vercel share fundamental characteristics that create unique challenges for stateful applications:
1. Instance Lifecycle Management
Serverless instances have unpredictable lifecycles:
- Cold starts: New instances spin up on demand
- Scale to zero: Instances terminate after inactivity
- Horizontal scaling: Multiple instances serve concurrent requests
- No sticky sessions: Requests can hit any instance
This creates specific challenges for MCP servers:
```typescript
// This approach fails in serverless:
class NaiveMCPServer {
  private sessions = new Map<string, MCPSession>(); // ❌ Lost on instance restart

  async authenticate(token: string): Promise<MCPSession> {
    // Check memory cache
    if (this.sessions.has(token)) {
      return this.sessions.get(token)!;
    }

    // Validate and cache
    const session = await validateJWT(token);
    this.sessions.set(token, session); // ❌ Only exists on this instance
    return session;
  }
}
```
2. JWT Validation Overhead
Without session persistence, your MCP server performs full JWT validation on every request:
```typescript
async function validateJWT(token: string): Promise<MCPSession> {
  // Step 1: Fetch JWKS (Network call ~50ms)
  const jwks = await fetchJWKS('https://auth.workos.com/oauth2/jwks');

  // Step 2: Verify signature (CPU intensive ~10ms)
  const verified = await jose.jwtVerify(token, jwks);

  // Step 3: Check claims (CPU ~5ms)
  if (verified.payload.iss !== 'https://auth.workos.com') {
    throw new Error('Invalid issuer');
  }

  // Step 4: Extract session data
  return {
    userId: verified.payload.sub,
    email: verified.payload.email,
    organizationId: verified.payload.org_id,
    scopes: verified.payload.scopes,
    token: token
  };
}
```
This adds 50-100ms to every request and increases costs significantly.
3. Re-authentication Fatigue
The user experience without session persistence:
Timeline of a frustrated developer:
```
0:00 - Connect to MCP server ✅
0:01 - Authenticate via WorkOS ✅
0:02 - Create requirement ✅
0:05 - (Cloud Run scales instance to zero)
0:10 - Try to update task ❌ "Please authenticate again"
0:11 - Re-authenticate 😤
0:12 - Update task ✅
0:15 - (New instance due to load)
0:16 - Try to commit ❌ "Please authenticate again"
0:17 - Rage quit
```
Technical Solution: Redis Session Store with Encryption
Architecture Overview
The solution implements a multi-tier caching strategy with security at its core:
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Request   │────▶│   Memory    │────▶│    Redis    │
│             │     │   Cache     │     │    Cache    │
└─────────────┘     └─────────────┘     └─────────────┘
                           │                    │
                           ▼                    ▼
                    ┌─────────────┐     ┌─────────────┐
                    │     JWT     │     │     JWT     │
                    │ Validation  │     │ Validation  │
                    └─────────────┘     └─────────────┘
```
Implementation Details
Session Store with AES-256-GCM Encryption
The session store encrypts sensitive session data with AES-256-GCM, an authenticated encryption mode that detects tampering as well as protecting confidentiality:
```typescript
import { Redis } from 'ioredis';
import crypto from 'crypto';
import { MCPSession } from './types.js';
import { logger } from './logger.js';

export class SessionStore {
  private redis: Redis | null = null;
  private encryptionKey: Buffer | null = null;
  private algorithm = 'aes-256-gcm';
  private readonly ttl: number = 604800; // 7 days default (seconds)
  private keyPrefix = 'mcp:session:';

  constructor() {
    // Only initialize for httpStream transport
    if (process.env.MCP_TRANSPORT !== 'httpStream') {
      logger.debug('SessionStore not initialized - stdio transport');
      return;
    }

    const redisUrl = process.env.REDIS_URL;
    const encryptionKeyHex = process.env.ENCRYPTION_KEY;

    if (!redisUrl || !encryptionKeyHex) {
      logger.warn('Session persistence disabled - missing configuration');
      return;
    }

    // Validate encryption key length
    if (encryptionKeyHex.length !== 64) {
      throw new Error('ENCRYPTION_KEY must be 32 bytes (64 hex characters)');
    }

    this.encryptionKey = Buffer.from(encryptionKeyHex, 'hex');
    this.ttl = parseInt(process.env.SESSION_CACHE_TTL || '604800', 10);

    // Configure Redis with production-ready settings
    this.redis = new Redis(redisUrl, {
      // Retry strategy with exponential backoff
      retryStrategy: (times) => {
        const delay = Math.min(times * 50, 2000);
        logger.debug(`Redis retry attempt ${times}, delay: ${delay}ms`);
        return delay;
      },
      // Reconnect on READONLY errors (Redis failover)
      reconnectOnError: (err) => {
        const shouldReconnect = err.message.includes('READONLY');
        if (shouldReconnect) {
          logger.warn('Redis READONLY error, reconnecting...');
        }
        return shouldReconnect;
      },
      // Connection settings
      connectTimeout: 10000,
      maxRetriesPerRequest: 3,
      enableReadyCheck: true,
      enableOfflineQueue: false, // Fail fast in production
    });

    // Monitor Redis connection health
    this.redis.on('connect', () => logger.info('Redis connected'));
    this.redis.on('ready', () => logger.info('Redis ready'));
    this.redis.on('error', (err) => logger.error({ err }, 'Redis error'));
    this.redis.on('close', () => logger.warn('Redis connection closed'));
  }

  /**
   * Check if session store is available
   */
  isAvailable(): boolean {
    return this.redis !== null &&
      this.redis.status === 'ready' &&
      this.encryptionKey !== null;
  }

  /**
   * Store encrypted session with automatic expiration
   */
  async storeSession(session: MCPSession): Promise<void> {
    if (!this.isAvailable()) {
      logger.debug('Session store unavailable, skipping storage');
      return;
    }

    try {
      // Generate unique IV for each encryption
      const iv = crypto.randomBytes(16);
      const cipher = crypto.createCipheriv(this.algorithm, this.encryptionKey!, iv);

      // Encrypt session data
      const sessionJson = JSON.stringify(session);
      const encrypted = Buffer.concat([
        cipher.update(sessionJson, 'utf8'),
        cipher.final()
      ]);

      // Get authentication tag for GCM
      const authTag = cipher.getAuthTag();

      // Combine components: IV (16) + AuthTag (16) + Encrypted Data
      const combined = Buffer.concat([iv, authTag, encrypted]);
      const encoded = combined.toString('base64');

      // Store with TTL
      const key = `${this.keyPrefix}${session.userId}`;
      await this.redis!.setex(key, this.ttl, encoded);

      logger.debug({
        userId: session.userId,
        keySize: encoded.length,
        ttl: this.ttl
      }, 'Session stored successfully');
    } catch (error) {
      logger.error({ error }, 'Failed to store session');
      // Don't throw - graceful degradation
    }
  }

  /**
   * Retrieve and decrypt session
   */
  async getSession(userId: string): Promise<MCPSession | null> {
    if (!this.isAvailable()) {
      return null;
    }

    const startTime = Date.now();
    try {
      const key = `${this.keyPrefix}${userId}`;
      const encoded = await this.redis!.get(key);

      if (!encoded) {
        logger.debug({ userId }, 'Session not found in cache');
        return null;
      }

      // Decode and extract components
      const combined = Buffer.from(encoded, 'base64');
      const iv = combined.slice(0, 16);
      const authTag = combined.slice(16, 32);
      const encrypted = combined.slice(32);

      // Decrypt with authentication
      const decipher = crypto.createDecipheriv(this.algorithm, this.encryptionKey!, iv);
      decipher.setAuthTag(authTag);

      const decrypted = Buffer.concat([
        decipher.update(encrypted),
        decipher.final()
      ]);

      const session = JSON.parse(decrypted.toString('utf8')) as MCPSession;

      const elapsed = Date.now() - startTime;
      logger.debug({ userId, elapsed }, 'Session retrieved from cache');

      return session;
    } catch (error) {
      if (error instanceof Error && error.message.includes('Unsupported state or unable to authenticate data')) {
        logger.error({ userId }, 'Session decryption failed - possible tampering');
      } else {
        logger.error({ error, userId }, 'Failed to retrieve session');
      }
      return null;
    }
  }

  /**
   * Remove session (for logout)
   */
  async removeSession(userId: string): Promise<void> {
    if (!this.isAvailable()) return;

    try {
      const key = `${this.keyPrefix}${userId}`;
      await this.redis!.del(key);
      logger.debug({ userId }, 'Session removed');
    } catch (error) {
      logger.error({ error, userId }, 'Failed to remove session');
    }
  }

  /**
   * Clean shutdown
   */
  async close(): Promise<void> {
    if (this.redis) {
      await this.redis.quit();
      this.redis = null;
    }
  }
}

// Singleton instance
export const sessionStore = new SessionStore();
```
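The store above rejects any `ENCRYPTION_KEY` that isn't exactly 64 hex characters. A quick way to generate a valid one - a sketch using Node's built-in `crypto` module, nothing BrainGrid-specific:

```typescript
import crypto from 'crypto';

// 32 random bytes, hex-encoded to the 64 characters SessionStore expects.
// Paste the result into your .env file or secret manager.
const encryptionKey = crypto.randomBytes(32).toString('hex');
console.log(encryptionKey);
```

Generate the key once and store it in a secret manager; rotating it invalidates all cached sessions, which simply forces users through one extra authentication.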
Optimized Authentication Middleware
The authentication middleware implements a fast-path/slow-path pattern:
```typescript
import { IncomingMessage } from 'http';
import crypto from 'crypto';
import { decodeJwt, createRemoteJWKSet, jwtVerify, JWTPayload } from 'jose';
import { sessionStore } from './session-store.js';
import { logger } from './logger.js';
import { MCPSession } from './types.js';

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const requestId = crypto.randomUUID();
  const startTime = Date.now();

  logger.debug({
    requestId,
    method: request.method,
    url: request.url
  }, 'Authentication request started');

  try {
    // Extract bearer token
    const token = extractBearerToken(request);
    if (!token) {
      throw new UnauthorizedError('No bearer token provided');
    }

    // Fast path: Try to decode JWT for userId
    let userId: string | null = null;
    let tokenExp: number | null = null;

    try {
      const decoded = decodeJwt(token);
      userId = decoded.sub || null;
      tokenExp = decoded.exp || null;

      // Quick expiration check
      if (tokenExp && tokenExp < Date.now() / 1000) {
        logger.debug({ requestId, userId }, 'Token expired, skipping cache');
        userId = null; // Force validation
      }
    } catch (error) {
      logger.debug({ requestId }, 'Failed to decode JWT for cache lookup');
    }

    // Try cache if we have a userId
    if (userId && sessionStore.isAvailable()) {
      const cached = await sessionStore.getSession(userId);
      if (cached && cached.token === token) {
        const elapsed = Date.now() - startTime;
        logger.info({
          requestId,
          userId,
          elapsed,
          source: 'cache'
        }, 'Authentication successful (cached)');

        return cached;
      }
    }

    // Slow path: Full JWT validation
    logger.debug({ requestId }, 'Cache miss, performing JWT validation');
    const session = await validateJWTWithWorkOS(token);

    // Store for next time
    if (sessionStore.isAvailable()) {
      await sessionStore.storeSession(session);
    }

    const elapsed = Date.now() - startTime;
    logger.info({
      requestId,
      userId: session.userId,
      elapsed,
      source: 'jwt'
    }, 'Authentication successful (validated)');

    return session;
  } catch (error) {
    const elapsed = Date.now() - startTime;
    logger.error({
      requestId,
      error: error instanceof Error ? error.message : 'Unknown error',
      elapsed
    }, 'Authentication failed');

    // Return proper HTTP response for MCP
    if (error instanceof UnauthorizedError) {
      throw new Response(null, {
        status: 401,
        headers: {
          'WWW-Authenticate': `Bearer error="unauthorized", ` +
            `error_description="${error.message}", ` +
            `resource_metadata="${getResourceMetadataUrl()}"`
        }
      });
    }

    throw error;
  }
}

function extractBearerToken(request: IncomingMessage): string | null {
  const authHeader = request.headers.authorization;
  if (!authHeader) return null;

  const match = authHeader.match(/^Bearer (.+)$/);
  return match ? match[1] : null;
}

class UnauthorizedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'UnauthorizedError';
  }
}

function getResourceMetadataUrl(): string {
  const host = process.env.MCP_HOST || 'https://mcp.braingrid.ai';
  return `${host}/.well-known/oauth-protected-resource`;
}

// JWT validation with WorkOS
const jwksCache = new Map<string, ReturnType<typeof createRemoteJWKSet>>();

async function validateJWTWithWorkOS(token: string): Promise<MCPSession> {
  const issuer = process.env.WORKOS_ISSUER || 'https://auth.workos.com';

  try {
    // Get or create JWKS
    let jwks = jwksCache.get(issuer);
    if (!jwks) {
      jwks = createRemoteJWKSet(new URL(`${issuer}/oauth2/jwks`));
      jwksCache.set(issuer, jwks);
    }

    // Verify JWT with options
    const verifyOptions: any = {
      issuer,
      algorithms: ['RS256'],
    };

    // Only check audience if configured
    if (process.env.WORKOS_CLIENT_ID) {
      verifyOptions.audience = process.env.WORKOS_CLIENT_ID;
    }

    const { payload } = await jwtVerify(token, jwks, verifyOptions);

    // Validate required claims
    if (!payload.sub || !payload.email || !payload.org_id) {
      throw new Error('Missing required JWT claims');
    }

    // Create session from JWT claims
    return {
      userId: payload.sub,
      email: payload.email as string,
      organizationId: payload.org_id as string,
      scopes: Array.isArray(payload.scopes) ? payload.scopes : [],
      token,
    };
  } catch (error) {
    logger.error({ error: error instanceof Error ? error.message : 'Unknown error' }, 'JWT validation failed');
    throw error;
  }
}
```
Graceful Degradation
The implementation handles Redis failures gracefully by simply returning null and forcing re-authentication. This is intentional - in a serverless environment, there's no point in falling back to in-memory caching since each instance has its own memory. Better to fail fast and have the user re-authenticate than to create inconsistent state.
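The fail-fast pattern distills to a small wrapper - a sketch for illustration, not BrainGrid's actual code - where any store error becomes a cache miss that sends the request down the full-validation path:

```typescript
// Hypothetical helper: convert any session-store failure into a cache miss
// (null). Upstream code then falls through to full JWT validation instead of
// surfacing an infrastructure error to the user.
async function degradeToCacheMiss<T>(op: () => Promise<T>): Promise<T | null> {
  try {
    return await op();
  } catch (error) {
    // Log and continue; a Redis hiccup should cost one re-validation, not a 500
    console.warn('Session store error, treating as cache miss:', error);
    return null;
  }
}

// Usage sketch: a failing Redis read simply yields null
degradeToCacheMiss(async () => {
  throw new Error('ECONNREFUSED');
}).then((result) => console.log(result)); // null
```

In the middleware above, the same effect is achieved by `getSession` catching its own errors; the wrapper just makes the contract explicit.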
Production Deployment Strategies
Cloud Run Configuration
Create a comprehensive deployment configuration:
```dockerfile
# Multi-stage build for optimization
FROM node:22-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./

# Install dependencies
RUN npm install -g pnpm && pnpm install --frozen-lockfile

# Copy source code
COPY . .

# Build TypeScript
RUN pnpm run build

# Production stage
FROM node:22-alpine

WORKDIR /app

# Install production dependencies only
COPY package*.json ./
COPY pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --prod --frozen-lockfile

# Copy built application from the builder stage
COPY --from=builder /app/dist ./dist

# Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=httpStream

# Health check
HEALTHCHECK \
  CMD node -e "require('http').get('http://localhost:8080/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"

EXPOSE 8080

CMD ["node", "dist/server.js"]
```
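The HEALTHCHECK above probes a `/health` route that the MCP transport doesn't necessarily provide, so you need to serve one yourself. Here's a minimal sketch (the path comes from the Dockerfile; how you mount it alongside the `/mcp` endpoint depends on your server setup):

```typescript
import http from 'http';

// Minimal standalone health endpoint to satisfy the Dockerfile's HEALTHCHECK
// probe. In a real deployment you'd register this route on the same server
// that handles /mcp rather than running a second listener.
function createHealthServer(): http.Server {
  return http.createServer((req, res) => {
    if (req.url === '/health') {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ status: 'ok', uptime: process.uptime() }));
    } else {
      res.writeHead(404);
      res.end();
    }
  });
}

// e.g. createHealthServer().listen(parseInt(process.env.PORT || '8080', 10));
```

Keep the handler dependency-free: a health check that touches Redis or WorkOS will flap whenever a dependency blips, causing Cloud Run to restart healthy instances.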
Deploy with proper configuration:
```bash
#!/bin/bash
# deploy-cloud-run.sh

PROJECT_ID="your-project-id"
SERVICE_NAME="braingrid-mcp-server"
REGION="us-central1"
REDIS_URL="rediss://<your-redis-instance>"

# Build and push image
gcloud builds submit --tag gcr.io/${PROJECT_ID}/${SERVICE_NAME}

# Deploy to Cloud Run
gcloud run deploy ${SERVICE_NAME} \
  --image gcr.io/${PROJECT_ID}/${SERVICE_NAME} \
  --platform managed \
  --region ${REGION} \
  --allow-unauthenticated \
  --set-env-vars "MCP_TRANSPORT=httpStream" \
  --set-env-vars "BRAINGRID_ENV=production" \
  --set-env-vars "REDIS_URL=${REDIS_URL}" \
  --set-secrets "ENCRYPTION_KEY=mcp-encryption-key:latest" \
  --cpu 1 \
  --memory 512Mi \
  --min-instances 1 \
  --max-instances 100 \
  --concurrency 80 \
  --timeout 300
```
Vercel Configuration
For Vercel deployment, create vercel.json:
```json
{
  "version": 2,
  "builds": [
    {
      "src": "dist/server.js",
      "use": "@vercel/node"
    }
  ],
  "routes": [
    { "src": "/health", "dest": "/dist/server.js" },
    { "src": "/mcp", "dest": "/dist/server.js" },
    { "src": "/.well-known/oauth-protected-resource", "dest": "/dist/server.js" }
  ],
  "env": {
    "MCP_TRANSPORT": "httpStream",
    "NODE_ENV": "production"
  }
}
```
Monitoring and Debugging
Structured Logging
Implement comprehensive logging for production debugging:
```typescript
import pino from 'pino';
import crypto from 'crypto';
import { IncomingMessage, ServerResponse } from 'http';

// Configure structured logging
export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'production' ? undefined : {
    target: 'pino-pretty',
    options: {
      colorize: true,
      translateTime: 'HH:MM:ss Z',
      ignore: 'pid,hostname'
    }
  },
  formatters: {
    level: (label) => {
      return { level: label };
    }
  },
  serializers: {
    req: (req) => ({
      method: req.method,
      url: req.url,
      headers: {
        ...req.headers,
        authorization: req.headers.authorization ? '[REDACTED]' : undefined
      }
    }),
    err: pino.stdSerializers.err
  }
});

// Request tracking middleware
export function requestLogging() {
  return (req: IncomingMessage, res: ServerResponse, next: () => void) => {
    const start = Date.now();
    const requestId = crypto.randomUUID();

    // Attach to request
    (req as any).requestId = requestId;

    // Log request
    logger.info({
      requestId,
      req,
      type: 'request'
    }, 'Incoming request');

    // Log response
    res.on('finish', () => {
      const elapsed = Date.now() - start;
      logger.info({
        requestId,
        statusCode: res.statusCode,
        elapsed,
        type: 'response'
      }, 'Request completed');
    });

    next();
  };
}
```
Metrics Collection
For production deployments, export metrics to your observability platform:
```typescript
// Example: Exporting MCP tool call metrics to DataDog
import { StatsD } from 'node-dogstatsd';

const dogstatsd = new StatsD({
  host: process.env.DD_AGENT_HOST || 'localhost',
  port: 8125,
  prefix: 'mcp.server.',
  tags: [`env:${process.env.BRAINGRID_ENV || 'development'}`]
});

// Track tool usage
export function recordToolCall(toolName: string, duration: number, success: boolean) {
  // Record timing metric
  dogstatsd.timing('tool.call.duration', duration, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);

  // Increment counter
  dogstatsd.increment('tool.call.count', 1, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);
}

// In your tool implementation:
server.addTool({
  name: 'create_requirement',
  execute: async (args, context) => {
    const startTime = Date.now();
    try {
      const result = await apiClient.createRequirement(args);
      recordToolCall('create_requirement', Date.now() - startTime, true);
      return result;
    } catch (error) {
      recordToolCall('create_requirement', Date.now() - startTime, false);
      throw error;
    }
  }
});
```
Performance Optimization
Connection Pooling
Optimize Redis connections for serverless:
```typescript
// Redis connection pool for serverless
export class RedisConnectionPool {
  private static instance: Redis | null = null;

  static getInstance(): Redis | null {
    if (!this.instance && process.env.REDIS_URL) {
      this.instance = new Redis(process.env.REDIS_URL, {
        // Connection pool settings
        maxRetriesPerRequest: 3,
        enableReadyCheck: true,
        lazyConnect: true, // Important for serverless

        // Serverless-optimized timeouts
        connectTimeout: 5000,
        commandTimeout: 5000,

        // Connection reuse
        keepAlive: 30000,
        noDelay: true,

        // Handle connection errors gracefully
        retryStrategy: (times) => {
          if (times > 3) return null; // Stop retrying
          return Math.min(times * 100, 3000);
        }
      });

      // Ensure connection is established
      this.instance.connect().catch((err: Error) => {
        logger.error({ err: err.message }, 'Redis connection failed');
        this.instance = null;
      });
    }

    return this.instance;
  }

  static async close(): Promise<void> {
    if (this.instance) {
      await this.instance.quit();
      this.instance = null;
    }
  }
}
```
Request Batching
Optimize for concurrent requests:
```typescript
export class BatchedJWTValidator {
  private readonly pendingValidations = new Map<string, Promise<MCPSession>>();

  async validateToken(token: string): Promise<MCPSession> {
    // Check if validation is already in progress
    if (this.pendingValidations.has(token)) {
      logger.debug('Reusing pending validation');
      return this.pendingValidations.get(token)!;
    }

    // Start new validation
    const validationPromise = this.performValidation(token)
      .finally(() => {
        // Clean up after completion
        this.pendingValidations.delete(token);
      });

    this.pendingValidations.set(token, validationPromise);
    return validationPromise;
  }

  private async performValidation(token: string): Promise<MCPSession> {
    // Actual JWT validation logic
    return validateJWTWithWorkOS(token);
  }
}
```
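To see the deduplication in action, here's a self-contained sketch - a stubbed validator stands in for `validateJWTWithWorkOS` - showing that concurrent requests carrying the same token share a single validation:

```typescript
// Simplified stand-in types for demonstration only
interface Session { userId: string }

class DedupingValidator {
  private pending = new Map<string, Promise<Session>>();
  validations = 0; // counts how many real validations actually ran

  async validateToken(token: string): Promise<Session> {
    const existing = this.pending.get(token);
    if (existing) return existing;

    const p = this.perform(token).finally(() => this.pending.delete(token));
    this.pending.set(token, p);
    return p;
  }

  private async perform(token: string): Promise<Session> {
    this.validations++;
    // Simulate slow JWT verification (JWKS fetch + signature check)
    await new Promise((r) => setTimeout(r, 10));
    return { userId: `user-for-${token}` };
  }
}

async function demo() {
  const v = new DedupingValidator();
  // Three concurrent requests with the same bearer token...
  await Promise.all([v.validateToken('abc'), v.validateToken('abc'), v.validateToken('abc')]);
  // ...trigger only one underlying validation
  console.log(v.validations); // 1
}

demo();
```

The `finally` cleanup matters: without it, a rejected validation would stay cached and every later request with that token would receive the stale rejection.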
Conclusion
Hosting MCP servers in serverless environments is challenging, but the patterns we've covered make it possible to build production-ready solutions that scale.
The key technical takeaways:
- Session persistence is non-negotiable - Without Redis or similar external storage, your users face constant re-authentication
- Security can't be an afterthought - Proper encryption (AES-256-GCM) and secure token handling are essential
- Fast-path optimization matters - JWT validation is expensive; caching authenticated sessions dramatically improves performance
- Graceful degradation over complex fallbacks - When Redis fails, force re-authentication rather than trying clever in-memory solutions
- Observable systems are debuggable systems - Export metrics to DataDog or your platform of choice
By solving these challenges, we transformed our MCP server from a local development tool into infrastructure that our entire team relies on. The same patterns apply whether you're building tools for internal use or creating MCP servers for the broader community.
The future of development involves AI assistants that understand context and can take meaningful actions. Making that future accessible to teams - not just individual developers - requires solving the infrastructure challenges we've outlined here.
About the Author
Tyler Wells is the Co-founder and CTO of BrainGrid, where we're building the future of AI-assisted software development. With over 25 years of experience in distributed systems and developer tools, Tyler focuses on making complex technology accessible to engineering teams.
Want to discuss MCP server architecture or share your experiences? Find me on X or connect on LinkedIn.
Interested in turning half-baked thoughts into crystal-clear, AI-ready specs and tasks that your IDE can nail, the first time? Check out BrainGrid - Follow us on X for updates.
Ready to build without the back-and-forth?
Turn messy thoughts into engineering-grade prompts that coding agents can nail, the first time.
Get Started