Engineering

From Local Hack to Production-Ready: How We Solved BrainGrid's MCP Multi-Tenant Authentication Problem

We had to solve the MCP multi-tenant authentication problem to make our agent accessible to our customers. Here's how we did it.

Tyler Wells, Co-founder & CTO
19 min read

You've built an amazing MCP server. It works perfectly on your laptop. Your AI assistant can create Jira tickets, query your database, deploy to production - life is good. Then your teammate asks: "Hey, can I use this too?"

Even better, you want to ship your MCP as a product for your customers. You now need to support multiple tenants, each with their own API keys and authentication.

Suddenly, you're in hell.

The Problem Nobody Talks About

Here's what happens when you try to share your MCP server with your customers:

Option 1: The "Just Install It" Approach

# Your instructions to teammates:
1. Clone the repo
2. Install dependencies
3. Set up your API keys
4. Configure your environment
5. Run the server locally
6. Oh, and update these keys when they expire...
7. And don't forget to pull the latest changes...
8. BTW, it might conflict with your other Node versions...

Result: 3 hours later, half your customers have given up and the other half are debugging npm issues.

Option 2: The "Let's Host It" Nightmare

You deploy to a serverless platform like Cloud Run or Vercel. Five minutes later:

customer: "It's asking me to authenticate... again"
you: "Yeah, just refresh and login again"
customer: "I just did. It's asking again."
you: "Oh, that's because Cloud Run scales to zero and..."
customer: "I don't care why. I just want to create a ticket."

The core issue? Serverless platforms don't do sessions. Every request could hit a different instance. Your carefully crafted auth flow becomes a game of authentication whack-a-mole.

Why This Matters More Than You Think

This isn't just an annoyance. It's the difference between:

  • A tool only you use vs. a tool your entire customer base adopts
  • "Cool prototype" vs. "critical infrastructure"
  • Weekend project vs. production-ready product customers actually use

We learned this the hard way at BrainGrid. Our MCP server transformed how our team worked with AI - but only after we solved the authentication puzzle were we ready to ship it to our customers.

What You'll Learn

This guide shows you exactly how we transformed our MCP server from a local development tool into a production-ready service that:

  • Authenticates once, works everywhere - No more login fatigue
  • Scales from 1 to 1000 users - Same performance whether it's just you or the whole company
  • Costs pennies to run - Efficient caching means minimal cloud costs
  • Works with existing auth - Integrates with WorkOS, Auth0, or any OAuth provider
  • Deploys in minutes - One command to go from local to remote

We'll cover the exact architecture, the gotchas we discovered, and the code that makes it all work. No theory, no fluff - just battle-tested solutions from our production deployment serving hundreds of developers.

Ready to make your MCP server something your customers will actually want to use? Let's dive in.

How we got there

  1. Initial Setup: From Local to Remote
  2. The Serverless Challenge
  3. Technical Solution: Redis Session Store
  4. Production Deployment Strategies
  5. Monitoring and Debugging
  6. Performance Optimization
  7. The Paradigm Shift

Initial Setup: From Local to Remote

Step 1: Basic MCP Server Configuration

Start with a standard MCP server setup using FastMCP. The key is understanding the dual nature of MCP servers - they need to work both locally for development and remotely for customers to use.

import { FastMCP } from 'fastmcp';
import { z } from 'zod';

// Define your tool schemas
const CreateRequirementSchema = z.object({
  message: z.string().describe("The requirement description"),
  repositories: z.string().optional().describe("Comma-separated list of repos")
});

const server = new FastMCP({
  name: 'braingrid-server',
  version: '1.0.0'
});

// Add your tools
server.addTool({
  name: 'create_requirement',
  description: 'Create a new requirement in BrainGrid',
  parameters: CreateRequirementSchema,
  execute: async (args, context) => {
    // Tool implementation
    // Note: context.session contains user auth info when hosted
    const apiClient = new BrainGridApiClient(config, context?.session);
    return await apiClient.createRequirement(args);
  }
});

// Local development (stdio transport)
await server.start({ transportType: 'stdio' });

Step 2: Switching to httpStream for Remote Hosting

To deploy on Cloud Run or Vercel, switch to httpStream transport. This requires careful consideration of how your tools will handle authentication:

// Detect transport type from environment
const transportType = process.env.MCP_TRANSPORT || 'stdio';

// httpStream configuration for serverless
if (transportType === 'httpStream') {
  await server.start({
    transportType: 'httpStream',
    httpStream: {
      port: parseInt(process.env.PORT || '8080'),
      endpoint: '/mcp'
    }
  });
} else {
  // Local stdio transport
  await server.start({ transportType: 'stdio' });
}

Step 3: Implementing OAuth with WorkOS

MCP requires specific OAuth implementation patterns. The key insight is that MCP clients expect a particular discovery flow:

const serverOptions = {
  name: 'braingrid-server',
  version: '1.0.0',
  authenticate: authenticateRequest,
  oauth: {
    enabled: true,
    protectedResource: {
      resource: 'https://mcp.braingrid.ai',
      authorizationServers: ['https://auth.workos.com'],
      bearerMethodsSupported: ['header'],
    },
    // This is crucial for MCP client compatibility
    authorizationServer: {
      issuer: 'https://auth.workos.com',
      authorizationEndpoint: 'https://auth.workos.com/oauth2/authorize',
      tokenEndpoint: 'https://auth.workos.com/oauth2/token',
      jwksUri: 'https://auth.workos.com/oauth2/jwks', // Note: Not /.well-known/jwks.json
      responseTypesSupported: ['code'],
      grantTypesSupported: ['authorization_code', 'refresh_token'],
      codeChallengeMethodsSupported: ['S256'],
      tokenEndpointAuthMethodsSupported: ['none'],
      scopesSupported: ['email', 'offline_access', 'openid', 'profile'],
    }
  }
};
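With this configuration in place, an MCP client that receives a 401 fetches the protected-resource metadata document to discover the authorization server. The response it expects looks roughly like this (field names follow RFC 9728; the values mirror the config above):

```json
{
  "resource": "https://mcp.braingrid.ai",
  "authorization_servers": ["https://auth.workos.com"],
  "bearer_methods_supported": ["header"]
}
```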

Key implementation detail: The WWW-Authenticate header must be properly formatted for MCP clients:

// MCP session structure - what gets passed to your tools
interface MCPSession {
  userId: string;
  email: string;
  organizationId: string;
  scopes: string[];
  token: string;
}

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const authHeader = request.headers.authorization;

  if (!authHeader) {
    // MCP clients expect this specific format
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Bearer error="unauthorized", ' +
          'error_description="Authorization needed", ' +
          'resource_metadata="https://mcp.braingrid.ai/.well-known/oauth-protected-resource"'
      }
    });
  }

  // Extract bearer token
  const bearerMatch = authHeader.match(/^Bearer (.+)$/);
  if (!bearerMatch) {
    throw new Response(null, {
      status: 401,
      headers: {
        'WWW-Authenticate': 'Bearer error="invalid_token", ' +
          'error_description="Invalid authorization header format"'
      }
    });
  }

  const token = bearerMatch[1];
  // Validate JWT and return session
  return await validateAndCreateSession(token);
}

Step 4: Handling Dual Transport Modes

Your MCP server needs to support both local and remote authentication patterns:

export class BrainGridApiClient {
  private auth?: AuthHandler;
  private session?: MCPSession;
  private readonly config: { apiUrl: string; organizationId?: string };

  constructor(config: { apiUrl: string; organizationId?: string }, session?: MCPSession) {
    this.config = config;
    this.session = session;

    // Only create AuthHandler for local mode
    if (!session) {
      this.auth = new AuthHandler(config);
    }
  }

  private async getHeaders(): Promise<Record<string, string>> {
    if (this.session) {
      // Remote mode - use session token
      return {
        'Authorization': `Bearer ${this.session.token}`,
        'X-Organization-Id': this.session.organizationId,
        'Content-Type': 'application/json',
      };
    } else if (this.auth) {
      // Local mode - use stored auth
      return this.auth.getOrganizationHeaders();
    }
    throw new Error('No authentication method available');
  }
}

The Serverless Challenge

Serverless platforms like Cloud Run and Vercel share fundamental characteristics that create unique challenges for stateful applications:

1. Instance Lifecycle Management

Serverless instances have unpredictable lifecycles:

  • Cold starts: New instances spin up on demand
  • Scale to zero: Instances terminate after inactivity
  • Horizontal scaling: Multiple instances serve concurrent requests
  • No sticky sessions: Requests can hit any instance

This creates specific challenges for MCP servers:

// This approach fails in serverless:
class NaiveMCPServer {
  private sessions = new Map<string, MCPSession>(); // ❌ Lost on instance restart

  async authenticate(token: string): Promise<MCPSession> {
    // Check memory cache
    if (this.sessions.has(token)) {
      return this.sessions.get(token)!;
    }

    // Validate and cache
    const session = await validateJWT(token);
    this.sessions.set(token, session); // ❌ Only exists on this instance
    return session;
  }
}

2. JWT Validation Overhead

Without session persistence, your MCP server performs full JWT validation on every request:

async function validateJWT(token: string): Promise<MCPSession> {
  // Step 1: Fetch JWKS (Network call ~50ms)
  const jwks = await fetchJWKS('https://auth.workos.com/oauth2/jwks');

  // Step 2: Verify signature (CPU intensive ~10ms)
  const verified = await jose.jwtVerify(token, jwks);

  // Step 3: Check claims (CPU ~5ms)
  if (verified.payload.iss !== 'https://auth.workos.com') {
    throw new Error('Invalid issuer');
  }

  // Step 4: Extract session data
  return {
    userId: verified.payload.sub,
    email: verified.payload.email,
    organizationId: verified.payload.org_id,
    scopes: verified.payload.scopes,
    token: token
  };
}

This adds 50-100ms to every request and increases costs significantly.
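To put that in perspective, here is a rough back-of-envelope comparison (the numbers are illustrative assumptions, not benchmarks):

```typescript
// Illustrative cost of full JWT validation vs. a cached session lookup.
const requestsPerDay = 100_000;
const fullValidationMs = 75; // JWKS fetch + signature verify (midpoint of 50-100ms)
const cachedLookupMs = 3;    // single Redis GET + AES-GCM decrypt (assumed)

const extraSecondsPerDay =
  (requestsPerDay * (fullValidationMs - cachedLookupMs)) / 1000;
console.log(`Extra billable compute per day without caching: ${extraSecondsPerDay}s`);
// 100,000 requests x 72ms = 7,200 seconds of extra compute per day
```

Two hours of pure validation overhead per day, billed at serverless rates, is what the caching layer later in this post eliminates.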

3. Re-authentication Fatigue

The user experience without session persistence:

Timeline of a frustrated developer:
0:00 - Connect to MCP server βœ“
0:01 - Authenticate via WorkOS βœ“
0:02 - Create requirement βœ“
0:05 - (Cloud Run scales instance to zero)
0:10 - Try to update task βœ— "Please authenticate again"
0:11 - Re-authenticate 😀
0:12 - Update task βœ“
0:15 - (New instance due to load)
0:16 - Try to commit βœ— "Please authenticate again"
0:17 - Rage quit

Technical Solution: Redis Session Store with Encryption

Architecture Overview

The solution implements a multi-tier caching strategy with security at its core:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Request   │────▢│   Memory    │────▢│    Redis    β”‚
β”‚             β”‚     β”‚   Cache     β”‚     β”‚    Cache    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                            β”‚                    β”‚
                            β–Ό                    β–Ό
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚     JWT     β”‚     β”‚     JWT     β”‚
                    β”‚ Validation  β”‚     β”‚ Validation  β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜     β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Implementation Details

Session Store with AES-256-GCM Encryption

The session store encrypts sensitive session data with AES-256-GCM, an authenticated encryption mode that also detects tampering:

import { Redis } from 'ioredis';
import crypto from 'crypto';
import { MCPSession } from './types.js';
import { logger } from './logger.js';

export class SessionStore {
  private redis: Redis | null = null;
  private encryptionKey: Buffer | null = null;
  private algorithm = 'aes-256-gcm';
  private readonly ttl: number = 604800; // 7-day default, overridden in constructor
  private keyPrefix = 'mcp:session:';

  constructor() {
    // Only initialize for httpStream transport
    if (process.env.MCP_TRANSPORT !== 'httpStream') {
      logger.debug('SessionStore not initialized - stdio transport');
      return;
    }

    const redisUrl = process.env.REDIS_URL;
    const encryptionKeyHex = process.env.ENCRYPTION_KEY;

    if (!redisUrl || !encryptionKeyHex) {
      logger.warn('Session persistence disabled - missing configuration');
      return;
    }

    // Validate encryption key length
    if (encryptionKeyHex.length !== 64) {
      throw new Error('ENCRYPTION_KEY must be 32 bytes (64 hex characters)');
    }

    this.encryptionKey = Buffer.from(encryptionKeyHex, 'hex');
    this.ttl = parseInt(process.env.SESSION_CACHE_TTL || '604800', 10);

    // Configure Redis with production-ready settings
    this.redis = new Redis(redisUrl, {
      // Retry strategy with exponential backoff
      retryStrategy: (times) => {
        const delay = Math.min(times * 50, 2000);
        logger.debug(`Redis retry attempt ${times}, delay: ${delay}ms`);
        return delay;
      },
      // Reconnect on READONLY errors (Redis failover)
      reconnectOnError: (err) => {
        const shouldReconnect = err.message.includes('READONLY');
        if (shouldReconnect) {
          logger.warn('Redis READONLY error, reconnecting...');
        }
        return shouldReconnect;
      },
      // Connection settings
      connectTimeout: 10000,
      maxRetriesPerRequest: 3,
      enableReadyCheck: true,
      enableOfflineQueue: false, // Fail fast in production
    });

    // Monitor Redis connection health
    this.redis.on('connect', () => logger.info('Redis connected'));
    this.redis.on('ready', () => logger.info('Redis ready'));
    this.redis.on('error', (err) => logger.error({ err }, 'Redis error'));
    this.redis.on('close', () => logger.warn('Redis connection closed'));
  }

  /**
   * Check if session store is available
   */
  isAvailable(): boolean {
    return this.redis !== null &&
           this.redis.status === 'ready' &&
           this.encryptionKey !== null;
  }

  /**
   * Store encrypted session with automatic expiration
   */
  async storeSession(session: MCPSession): Promise<void> {
    if (!this.isAvailable()) {
      logger.debug('Session store unavailable, skipping storage');
      return;
    }

    try {
      // Generate unique IV for each encryption
      const iv = crypto.randomBytes(16);
      const cipher = crypto.createCipheriv(this.algorithm, this.encryptionKey!, iv);

      // Encrypt session data
      const sessionJson = JSON.stringify(session);
      const encrypted = Buffer.concat([
        cipher.update(sessionJson, 'utf8'),
        cipher.final()
      ]);

      // Get authentication tag for GCM
      const authTag = cipher.getAuthTag();

      // Combine components: IV (16) + AuthTag (16) + Encrypted Data
      const combined = Buffer.concat([iv, authTag, encrypted]);
      const encoded = combined.toString('base64');

      // Store with TTL
      const key = `${this.keyPrefix}${session.userId}`;
      await this.redis!.setex(key, this.ttl, encoded);

      logger.debug({
        userId: session.userId,
        keySize: encoded.length,
        ttl: this.ttl
      }, 'Session stored successfully');
    } catch (error) {
      logger.error({ error }, 'Failed to store session');
      // Don't throw - graceful degradation
    }
  }

  /**
   * Retrieve and decrypt session
   */
  async getSession(userId: string): Promise<MCPSession | null> {
    if (!this.isAvailable()) {
      return null;
    }

    const startTime = Date.now();
    try {
      const key = `${this.keyPrefix}${userId}`;
      const encoded = await this.redis!.get(key);

      if (!encoded) {
        logger.debug({ userId }, 'Session not found in cache');
        return null;
      }

      // Decode and extract components
      const combined = Buffer.from(encoded, 'base64');
      const iv = combined.subarray(0, 16);
      const authTag = combined.subarray(16, 32);
      const encrypted = combined.subarray(32);

      // Decrypt with authentication
      const decipher = crypto.createDecipheriv(this.algorithm, this.encryptionKey!, iv);
      decipher.setAuthTag(authTag);

      const decrypted = Buffer.concat([
        decipher.update(encrypted),
        decipher.final()
      ]);

      const session = JSON.parse(decrypted.toString('utf8')) as MCPSession;

      const elapsed = Date.now() - startTime;
      logger.debug({ userId, elapsed }, 'Session retrieved from cache');

      return session;
    } catch (error) {
      if (error instanceof Error && error.message.includes('Unsupported state or unable to authenticate data')) {
        logger.error({ userId }, 'Session decryption failed - possible tampering');
      } else {
        logger.error({ error, userId }, 'Failed to retrieve session');
      }
      return null;
    }
  }

  /**
   * Remove session (for logout)
   */
  async removeSession(userId: string): Promise<void> {
    if (!this.isAvailable()) return;

    try {
      const key = `${this.keyPrefix}${userId}`;
      await this.redis!.del(key);
      logger.debug({ userId }, 'Session removed');
    } catch (error) {
      logger.error({ error, userId }, 'Failed to remove session');
    }
  }

  /**
   * Clean shutdown
   */
  async close(): Promise<void> {
    if (this.redis) {
      await this.redis.quit();
      this.redis = null;
    }
  }
}

// Singleton instance
export const sessionStore = new SessionStore();
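The store above rejects any ENCRYPTION_KEY that is not exactly 32 bytes (64 hex characters). One way to generate a compatible key with Node's built-in crypto module (a local sketch, not a secrets-management recommendation):

```typescript
import crypto from 'crypto';

// 32 random bytes, hex-encoded to the 64 characters the SessionStore expects
const encryptionKey = crypto.randomBytes(32).toString('hex');
console.log(encryptionKey.length); // 64
console.log(encryptionKey);
```

In production, store the result in a secret manager (as the Cloud Run deployment later in this post does) rather than a plain environment file.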

Optimized Authentication Middleware

The authentication middleware implements a fast-path/slow-path pattern:

import { IncomingMessage } from 'http';
import crypto from 'crypto';
import { decodeJwt, createRemoteJWKSet, jwtVerify } from 'jose';
import { sessionStore } from './session-store.js';
import { logger } from './logger.js';
import { MCPSession } from './types.js';

export async function authenticateRequest(request: IncomingMessage): Promise<MCPSession> {
  const requestId = crypto.randomUUID();
  const startTime = Date.now();

  logger.debug({
    requestId,
    method: request.method,
    url: request.url
  }, 'Authentication request started');

  try {
    // Extract bearer token
    const token = extractBearerToken(request);
    if (!token) {
      throw new UnauthorizedError('No bearer token provided');
    }

    // Fast path: Try to decode JWT for userId
    let userId: string | null = null;
    let tokenExp: number | null = null;

    try {
      const decoded = decodeJwt(token);
      userId = decoded.sub || null;
      tokenExp = decoded.exp || null;

      // Quick expiration check
      if (tokenExp && tokenExp < Date.now() / 1000) {
        logger.debug({ requestId, userId }, 'Token expired, skipping cache');
        userId = null; // Force validation
      }
    } catch (error) {
      logger.debug({ requestId }, 'Failed to decode JWT for cache lookup');
    }

    // Try cache if we have a userId
    if (userId && sessionStore.isAvailable()) {
      const cached = await sessionStore.getSession(userId);
      if (cached && cached.token === token) {
        const elapsed = Date.now() - startTime;
        logger.info({
          requestId,
          userId,
          elapsed,
          source: 'cache'
        }, 'Authentication successful (cached)');

        return cached;
      }
    }

    // Slow path: Full JWT validation
    logger.debug({ requestId }, 'Cache miss, performing JWT validation');
    const session = await validateJWTWithWorkOS(token);

    // Store for next time
    if (sessionStore.isAvailable()) {
      await sessionStore.storeSession(session);
    }

    const elapsed = Date.now() - startTime;
    logger.info({
      requestId,
      userId: session.userId,
      elapsed,
      source: 'jwt'
    }, 'Authentication successful (validated)');

    return session;
  } catch (error) {
    const elapsed = Date.now() - startTime;
    logger.error({
      requestId,
      error: error instanceof Error ? error.message : 'Unknown error',
      elapsed
    }, 'Authentication failed');

    // Return proper HTTP response for MCP
    if (error instanceof UnauthorizedError) {
      throw new Response(null, {
        status: 401,
        headers: {
          'WWW-Authenticate': `Bearer error="unauthorized", ` +
            `error_description="${error.message}", ` +
            `resource_metadata="${getResourceMetadataUrl()}"`
        }
      });
    }

    throw error;
  }
}

function extractBearerToken(request: IncomingMessage): string | null {
  const authHeader = request.headers.authorization;
  if (!authHeader) return null;

  const match = authHeader.match(/^Bearer (.+)$/);
  return match ? match[1] : null;
}

class UnauthorizedError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'UnauthorizedError';
  }
}

function getResourceMetadataUrl(): string {
  const host = process.env.MCP_HOST || 'https://mcp.braingrid.ai';
  return `${host}/.well-known/oauth-protected-resource`;
}

// JWT validation with WorkOS
const jwksCache = new Map<string, ReturnType<typeof createRemoteJWKSet>>();

async function validateJWTWithWorkOS(token: string): Promise<MCPSession> {
  const issuer = process.env.WORKOS_ISSUER || 'https://auth.workos.com';

  try {
    // Get or create JWKS
    let jwks = jwksCache.get(issuer);
    if (!jwks) {
      jwks = createRemoteJWKSet(new URL(`${issuer}/oauth2/jwks`));
      jwksCache.set(issuer, jwks);
    }

    // Verify JWT with options
    const verifyOptions: any = {
      issuer,
      algorithms: ['RS256'],
    };

    // Only check audience if configured
    if (process.env.WORKOS_CLIENT_ID) {
      verifyOptions.audience = process.env.WORKOS_CLIENT_ID;
    }

    const { payload } = await jwtVerify(token, jwks, verifyOptions);

    // Validate required claims
    if (!payload.sub || !payload.email || !payload.org_id) {
      throw new Error('Missing required JWT claims');
    }

    // Create session from JWT claims
    return {
      userId: payload.sub,
      email: payload.email as string,
      organizationId: payload.org_id as string,
      scopes: Array.isArray(payload.scopes) ? payload.scopes : [],
      token,
    };
  } catch (error) {
    logger.error({ error: error instanceof Error ? error.message : 'Unknown error' }, 'JWT validation failed');
    throw error;
  }
}

Graceful Degradation

The implementation handles Redis failures gracefully by simply returning null and forcing re-authentication. This is intentional - in a serverless environment, there's no point in falling back to in-memory caching since each instance has its own memory. Better to fail fast and have the user re-authenticate than to create inconsistent state.
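The pattern can be distilled into a small helper (an illustrative sketch: in the middleware above, `fromCache` would be `sessionStore.getSession` and `recompute` the full `validateJWTWithWorkOS` call):

```typescript
// Cache-or-recompute with graceful degradation: any cache failure
// (outage, decryption error, or a plain miss) is treated as a miss,
// so the caller always falls through to full validation instead of failing.
async function withCacheFallback<T>(
  fromCache: () => Promise<T | null>,
  recompute: () => Promise<T>,
): Promise<T> {
  try {
    const hit = await fromCache();
    if (hit !== null) return hit;
  } catch {
    // Redis unreachable or tampered data - fall through to the slow path
  }
  return recompute();
}
```

The slow path costs 50-100ms, but it is always correct; that trade is the whole design.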

Production Deployment Strategies

Cloud Run Configuration

Create a comprehensive deployment configuration:

# Multi-stage build for optimization
FROM node:22-alpine AS builder

WORKDIR /app

# Copy package files
COPY package*.json ./
COPY pnpm-lock.yaml ./

# Install dependencies
RUN npm install -g pnpm && pnpm install --frozen-lockfile

# Copy source code
COPY . .

# Build TypeScript
RUN pnpm run build

# Production stage
FROM node:22-alpine

WORKDIR /app

# Install production dependencies only
COPY package*.json ./
COPY pnpm-lock.yaml ./
RUN npm install -g pnpm && pnpm install --prod --frozen-lockfile

# Copy built application
COPY --from=builder /app/dist ./dist

# Set environment
ENV NODE_ENV=production
ENV MCP_TRANSPORT=httpStream

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:8080/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"

EXPOSE 8080

CMD ["node", "dist/server.js"]

Deploy with proper configuration:

#!/bin/bash
# deploy-cloud-run.sh

PROJECT_ID="your-project-id"
SERVICE_NAME="braingrid-mcp-server"
REGION="us-central1"
REDIS_URL="rediss://<your-redis-instance>"

# Build and push image
gcloud builds submit --tag gcr.io/${PROJECT_ID}/${SERVICE_NAME}

# Deploy to Cloud Run
gcloud run deploy ${SERVICE_NAME} \
  --image gcr.io/${PROJECT_ID}/${SERVICE_NAME} \
  --platform managed \
  --region ${REGION} \
  --allow-unauthenticated \
  --set-env-vars "MCP_TRANSPORT=httpStream" \
  --set-env-vars "BRAINGRID_ENV=production" \
  --set-env-vars "REDIS_URL=${REDIS_URL}" \
  --set-secrets "ENCRYPTION_KEY=mcp-encryption-key:latest" \
  --cpu 1 \
  --memory 512Mi \
  --min-instances 1 \
  --max-instances 100 \
  --concurrency 80 \
  --timeout 300

Vercel Configuration

For Vercel deployment, create vercel.json:

{
  "version": 2,
  "builds": [
    {
      "src": "dist/server.js",
      "use": "@vercel/node"
    }
  ],
  "routes": [
    {
      "src": "/health",
      "dest": "/dist/server.js"
    },
    {
      "src": "/mcp",
      "dest": "/dist/server.js"
    },
    {
      "src": "/.well-known/oauth-protected-resource",
      "dest": "/dist/server.js"
    }
  ],
  "env": {
    "MCP_TRANSPORT": "httpStream",
    "NODE_ENV": "production"
  }
}

Monitoring and Debugging

Structured Logging

Implement comprehensive logging for production debugging:

import crypto from 'crypto';
import pino from 'pino';

// Configure structured logging
export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  transport: process.env.NODE_ENV === 'production' ? undefined : {
    target: 'pino-pretty',
    options: {
      colorize: true,
      translateTime: 'HH:MM:ss Z',
      ignore: 'pid,hostname'
    }
  },
  formatters: {
    level: (label) => {
      return { level: label };
    }
  },
  serializers: {
    req: (req) => ({
      method: req.method,
      url: req.url,
      headers: {
        ...req.headers,
        authorization: req.headers.authorization ? '[REDACTED]' : undefined
      }
    }),
    err: pino.stdSerializers.err
  }
});

// Request tracking middleware
import { IncomingMessage, ServerResponse } from 'http';

export function requestLogging() {
  return (req: IncomingMessage, res: ServerResponse, next: () => void) => {
    const start = Date.now();
    const requestId = crypto.randomUUID();

    // Attach to request
    (req as any).requestId = requestId;

    // Log request
    logger.info({
      requestId,
      req,
      type: 'request'
    }, 'Incoming request');

    // Log response
    res.on('finish', () => {
      const elapsed = Date.now() - start;
      logger.info({
        requestId,
        statusCode: res.statusCode,
        elapsed,
        type: 'response'
      }, 'Request completed');
    });

    next();
  };
}

Metrics Collection

For production deployments, export metrics to your observability platform:

// Example: Exporting MCP tool call metrics to DataDog
import { StatsD } from 'node-dogstatsd';

const dogstatsd = new StatsD({
  host: process.env.DD_AGENT_HOST || 'localhost',
  port: 8125,
  prefix: 'mcp.server.',
  tags: [`env:${process.env.BRAINGRID_ENV || 'development'}`]
});

// Track tool usage
export function recordToolCall(toolName: string, duration: number, success: boolean) {
  // Record timing metric
  dogstatsd.timing('tool.call.duration', duration, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);

  // Increment counter
  dogstatsd.increment('tool.call.count', 1, [
    `tool:${toolName}`,
    `status:${success ? 'success' : 'failure'}`
  ]);
}

// In your tool implementation:
server.addTool({
  name: 'create_requirement',
  execute: async (args, context) => {
    const startTime = Date.now();
    try {
      const result = await apiClient.createRequirement(args);
      recordToolCall('create_requirement', Date.now() - startTime, true);
      return result;
    } catch (error) {
      recordToolCall('create_requirement', Date.now() - startTime, false);
      throw error;
    }
  }
});

Performance Optimization

Connection Pooling

Optimize Redis connections for serverless:

// Redis connection pool for serverless
export class RedisConnectionPool {
  private static instance: Redis | null = null;

  static getInstance(): Redis | null {
    if (!this.instance && process.env.REDIS_URL) {
      this.instance = new Redis(process.env.REDIS_URL, {
        // Connection pool settings
        maxRetriesPerRequest: 3,
        enableReadyCheck: true,
        lazyConnect: true, // Important for serverless

        // Serverless-optimized timeouts
        connectTimeout: 5000,
        commandTimeout: 5000,

        // Connection reuse
        keepAlive: 30000,
        noDelay: true,

        // Handle connection errors gracefully
        retryStrategy: (times) => {
          if (times > 3) return null; // Stop retrying
          return Math.min(times * 100, 3000);
        }
      });

      // Ensure connection is established
      this.instance.connect().catch((err: Error) => {
        logger.error({ err: err.message }, 'Redis connection failed');
        this.instance = null;
      });
    }

    return this.instance;
  }

  static async close(): Promise<void> {
    if (this.instance) {
      await this.instance.quit();
      this.instance = null;
    }
  }
}

Request Batching

Optimize for concurrent requests:

export class BatchedJWTValidator {
  private readonly pendingValidations = new Map<string, Promise<MCPSession>>();

  async validateToken(token: string): Promise<MCPSession> {
    // Check if validation is already in progress
    if (this.pendingValidations.has(token)) {
      logger.debug('Reusing pending validation');
      return this.pendingValidations.get(token)!;
    }

    // Start new validation
    const validationPromise = this.performValidation(token)
      .finally(() => {
        // Clean up after completion
        this.pendingValidations.delete(token);
      });

    this.pendingValidations.set(token, validationPromise);
    return validationPromise;
  }

  private async performValidation(token: string): Promise<MCPSession> {
    // Actual JWT validation logic
    return validateJWTWithWorkOS(token);
  }
}

Conclusion

Hosting MCP servers in serverless environments is challenging, but the patterns we've covered make it possible to build production-ready solutions that scale.

The key technical takeaways:

  1. Session persistence is non-negotiable - Without Redis or similar external storage, your users face constant re-authentication
  2. Security can't be an afterthought - Proper encryption (AES-256-GCM) and secure token handling are essential
  3. Fast-path optimization matters - JWT validation is expensive; caching authenticated sessions dramatically improves performance
  4. Graceful degradation over complex fallbacks - When Redis fails, force re-authentication rather than trying clever in-memory solutions
  5. Observable systems are debuggable systems - Export metrics to DataDog or your platform of choice

By solving these challenges, we transformed our MCP server from a local development tool into infrastructure that our entire team relies on. The same patterns apply whether you're building tools for internal use or creating MCP servers for the broader community.

The future of development involves AI assistants that understand context and can take meaningful actions. Making that future accessible to teams - not just individual developers - requires solving the infrastructure challenges we've outlined here.


About the Author

Tyler Wells is the Co-founder and CTO of BrainGrid, where he's building the future of AI-assisted software development. With over 25 years of experience in distributed systems and developer tools, Tyler focuses on making complex technology accessible to engineering teams.

Want to discuss MCP server architecture or share your experiences? Find me on X or connect on LinkedIn.

Interested in turning half-baked thoughts into crystal-clear, AI-ready specs and tasks that your IDE can nail, the first time? Check out BrainGrid - Follow us on X for updates.

Ready to build without the back-and-forth?

Turn messy thoughts into engineering-grade prompts that coding agents can nail, the first time.

Get Started