bun's fetch() will silently kill your production app
this is a story about a few days of debugging a fetch timeout issue in production. if you’re running bun 1.3.x in docker with outgoing HTTPS calls, this might save you a few sleepless nights
background
API server running in docker swarm, multiple replicas behind nginx. the app makes outgoing HTTPS calls to several external services: some hosted on cloud run, plus push notifications and an SMS gateway
originally on bun 1.2.18 with compiled binary (bun build --compile). functionally fine, fetch to external services worked without issues
but bun 1.2.18 had a GC HeapHelper bug where each process ate ~28-50% CPU even when idle. running 5 replicas meant 150-250% CPU just for garbage collection doing nothing
eventually upgraded to bun 1.3.13 to fix the GC CPU issue. CPU dropped immediately
this is what CPU and memory looked like on bun 1.2.18

the symptom
after upgrading to bun 1.3.13, outgoing external API calls started timing out at exactly 15 seconds. not always, intermittently. sometimes 50% of requests would time out, sometimes 0%
the tricky part: it only happened under real traffic load. every isolated test passed perfectly
the debugging journey
it’s the third-party service
first instinct, the external API is slow. but curl from the same container? 65-200ms. every time. so it’s not the service
it’s DNS
searched bun issues, found oven-sh/bun#10731 about c-ares DNS issues. tried:
- BUN_DNS_RESOLVER=getaddrinfo at build time
- setDefaultResultOrder('ipv4first') in code
- disabled IPv6 via sysctl
DNS from container was fine (1-20ms). none of these fixed it
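for reference, the in-code attempt plus a quick way to time a lookup can be sketched roughly like this. setDefaultResultOrder and dns.promises.lookup are real node APIs; timeLookup and its injectable lookup parameter are hypothetical names made up for illustration

```typescript
import { promises as dnsPromises, setDefaultResultOrder } from 'node:dns'
import { performance } from 'node:perf_hooks'

// prefer A records over AAAA when a hostname resolves to both
setDefaultResultOrder('ipv4first')

type LookupFn = (hostname: string) => Promise<{ address: string; family: number }>

// hypothetical helper: time a single lookup so DNS can be ruled in or out
async function timeLookup(
  hostname: string,
  lookup: LookupFn = (h) => dnsPromises.lookup(h),
): Promise<{ address: string; family: number; ms: number }> {
  const start = performance.now()
  const { address, family } = await lookup(hostname)
  return { address, family, ms: performance.now() - start }
}
```

if the reported ms stays in the low double digits while requests still time out, the resolver is probably not the bottleneck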
it’s the compiled binary
we were using bun build --compile. installed bun runtime in the same container and ran isolated tests, results were fine. switched to runtime mode
but under real traffic it still happened
it’s connection pooling
found oven-sh/bun#9034 about keep-alive reuse on dead sockets. added keepalive: false to all fetch calls
didn’t fix it
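a minimal sketch of applying that option everywhere instead of editing each call site. withoutKeepAlive and fetchFresh are hypothetical names; the only thing taken from bun is that fetch accepts keepalive: false to skip connection reuse

```typescript
// hypothetical helper: force `keepalive: false` while keeping the caller's other options
function withoutKeepAlive(init: RequestInit = {}): RequestInit {
  return { ...init, keepalive: false }
}

// hypothetical wrapper around the global fetch()
async function fetchFresh(url: string, init?: RequestInit): Promise<Response> {
  return fetch(url, withoutKeepAlive(init))
}
```

this only rules connection reuse in or out. in our case it changed nothing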
it’s IPv6
discovered cloud run target had no AAAA record. curl IPv6 test returned HTTP 000, 1-6 seconds. bun was trying IPv6 first!
disabled IPv6, added ipv4first. helped a little, timeouts went from 50% to ~5%. but still happened
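checking for a missing AAAA record can also be done from code. a rough sketch, assuming node:dns behaves under bun the way it does under node: resolve6 is a real API, hasAaaaRecord and the injectable resolver are hypothetical

```typescript
import { promises as dnsPromises } from 'node:dns'

type Resolve6 = (hostname: string) => Promise<string[]>

// hypothetical helper: true when the host publishes at least one AAAA record
async function hasAaaaRecord(
  hostname: string,
  resolve6: Resolve6 = (h) => dnsPromises.resolve6(h),
): Promise<boolean> {
  try {
    return (await resolve6(hostname)).length > 0
  } catch (err) {
    const code = (err as { code?: string }).code
    // ENODATA / ENOTFOUND: the resolver has no AAAA answer for this name
    if (code === 'ENODATA' || code === 'ENOTFOUND') return false
    throw err
  }
}
```

a false here means any client that tries IPv6 first is waiting on something that can never connect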
it needs HTTP/2
found oven-sh/bun#13586, where a bun maintainer confirmed that bun’s fetch uses HTTP/1.1, while some servers respond faster over HTTP/2. tried {protocol: "http2"} (experimental in 1.3.14)
isolated test worked great. under load still timed out. experimental feature wasn’t stable
the breakthrough
installed bun runtime in a container and ran a comparison test:
// bun's fetch()
const r = await fetch(url, { headers })
// vs node:https
const r = await httpsRequest(url, { headers })
both returned identical results in isolated tests. but here’s what happened under real traffic:
| client | isolated | under load |
|---|---|---|
| fetch() | 65ms | 15,000ms timeout |
| node:https | 55ms | 55ms |
| curl | 65ms | 65ms |
node:https and curl were completely unaffected by concurrent load. only bun’s fetch() degraded
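a comparison like this can be sketched as a small timing harness. everything here (timeCall, compareClients, the Timing shape) is hypothetical and only illustrates the idea of timing several clients against the same target at the same moment

```typescript
import { performance } from 'node:perf_hooks'

type Client = () => Promise<unknown>
interface Timing { ms: number; ok: boolean }

// time one call; a rejection is recorded instead of thrown, so a timeout
// in one client does not hide the numbers from the others
async function timeCall(call: Client): Promise<Timing> {
  const start = performance.now()
  try {
    await call()
    return { ms: performance.now() - start, ok: true }
  } catch {
    return { ms: performance.now() - start, ok: false }
  }
}

// run every named client once, in parallel
async function compareClients(clients: Record<string, Client>): Promise<Record<string, Timing>> {
  const results: Record<string, Timing> = {}
  await Promise.all(
    Object.entries(clients).map(async ([name, call]) => {
      results[name] = await timeCall(call)
    }),
  )
  return results
}
```

the important part is where it runs: called as compareClients({ fetch: () => fetch(url, { headers }), https: () => httpsRequest(url, { headers }) }) from inside the API process while it is serving real traffic, not from a standalone script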
this is what outgoing and incoming latency looked like during the issue:


something interesting here: outgoing hit 15 seconds but incoming P99 only showed ~3-4 seconds. the request that triggered the outgoing fetch should also take 15 seconds. the reason is that the incoming histogram (from hono-prometheus) only has buckets up to 10 seconds, then jumps to +Inf. histogram_quantile estimates by interpolating inside a bucket and never returns more than the highest finite bound, so anything slower than 10s gets under-reported. our outgoing histogram has a custom bucket at 15 seconds, so it shows the actual value
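to see what a 10-second top bucket does to the number, here is a simplified sketch of the estimate as the prometheus docs describe it for classic histograms: linear interpolation inside the bucket that contains the rank, and if the rank lands in the +Inf bucket, the upper bound of the second-highest bucket is returned. estimateQuantile and the bucket counts below are made up for illustration, and the sketch assumes at least two sorted buckets and a non-zero total

```typescript
// cumulative bucket: `count` observations took <= `le` seconds
interface Bucket { le: number; count: number }

// simplified estimate; buckets must be sorted by `le` and end with +Inf
function estimateQuantile(q: number, buckets: Bucket[]): number {
  const last = buckets.length - 1
  const rank = q * buckets[last].count
  const i = buckets.findIndex((b) => b.count >= rank)
  // rank falls in the +Inf bucket: the highest finite bound is reported
  if (i === last) return buckets[last - 1].le
  const lowerBound = i === 0 ? 0 : buckets[i - 1].le
  const lowerCount = i === 0 ? 0 : buckets[i - 1].count
  const { le, count } = buckets[i]
  return lowerBound + (le - lowerBound) * ((rank - lowerCount) / (count - lowerCount))
}

// made-up counts: 100 requests, 3 of them slower than 10s
const topBucket10: Bucket[] = [
  { le: 1, count: 90 }, { le: 5, count: 95 }, { le: 10, count: 97 },
  { le: Infinity, count: 100 },
]
const topBucket15: Bucket[] = [
  { le: 1, count: 90 }, { le: 5, count: 95 }, { le: 10, count: 97 },
  { le: 15, count: 100 }, { le: Infinity, count: 100 },
]

console.log(estimateQuantile(0.99, topBucket10)) // 10
console.log(estimateQuantile(0.99, topBucket15)) // about 13.33
```

same traffic, different buckets: without a finite bucket above 10s the P99 can never read higher than 10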
the pattern
the degradation was progressive, not sudden:
- 0-10 min: ~100ms (normal)
- 10-20 min: ~2-4s (degrading)
- 20-30 min: ~8-15s (timeout)
and it affected all external HTTPS calls made via fetch(). internal HTTP calls and the worker process were unaffected
the last point was key. same bun version, same fetch(), same docker container, but the worker (which processes jobs sequentially, no Bun.serve()) never had this issue
root cause
bun 1.3.x has a regression in its native fetch() implementation when used for outgoing HTTPS requests under concurrent Bun.serve() incoming load. the internal TLS connection pool degrades over time
this doesn’t happen with:
- node:https (different networking stack in bun)
- curl (completely separate)
- bun 1.2.18 (different fetch internals)
- sequential processing (no concurrent Bun.serve() pressure)
the fix
replace fetch() with node:https for external HTTPS calls:
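import https from 'node:https'
// minimal shapes for the helper's input and output (assumed; adjust to your own types)
interface RequestOptions {
  method?: string
  headers?: Record<string, string>
  body?: string
  timeout?: number // milliseconds
}
interface HttpResponse {
  status: number
  data: string
}
function httpsRequest(url: string, options: RequestOptions): Promise<HttpResponse> {
  return new Promise((resolve, reject) => {
    const u = new URL(url)
    const req = https.request({
      hostname: u.hostname,
      port: u.port || 443, // keep a non-default port if the URL has one
      path: u.pathname + u.search,
      method: options.method || 'GET',
      headers: options.headers || {},
      family: 4, // force IPv4
    }, (res) => {
      let data = ''
      res.setEncoding('utf8') // decode as text so multi-byte characters are not split across chunks
      res.on('data', (chunk) => data += chunk)
      res.on('end', () => resolve({ status: res.statusCode || 0, data }))
    })
    req.on('error', reject)
    // abort when the socket has been idle for longer than the timeout (default 15s)
    req.setTimeout(options.timeout || 15000, () => {
      req.destroy(new Error('Request timeout'))
    })
    if (options.body) req.write(options.body)
    req.end()
  })
}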
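most of our external calls send JSON, so a thin wrapper on top of the helper keeps call sites short. this is only a sketch: postJson and the type names are hypothetical, and the request function is passed in as a parameter so the block stands alone

```typescript
// shapes mirroring the helper above; redefined here so the sketch is self-contained
interface JsonRequestOptions {
  method?: string
  headers?: Record<string, string>
  body?: string
  timeout?: number
}
interface TextResponse { status: number; data: string }
type RequestFn = (url: string, options: JsonRequestOptions) => Promise<TextResponse>

// hypothetical wrapper: POST a JSON payload and parse a JSON reply
async function postJson<T>(request: RequestFn, url: string, payload: unknown): Promise<T> {
  const body = JSON.stringify(payload)
  const res = await request(url, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // byte length, not string length, so multi-byte characters are counted correctly
      'Content-Length': String(Buffer.byteLength(body)),
    },
    body,
  })
  if (res.status < 200 || res.status >= 300) {
    throw new Error(`POST ${url} failed with status ${res.status}`)
  }
  return JSON.parse(res.data) as T
}
```

usage would look like postJson(httpsRequest, url, payload). the explicit Content-Length is a design choice: without it node falls back to chunked transfer encoding, which some gateways may not like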
after deploying this change, external API calls went from 15s timeout to consistent 50-200ms. zero degradation over time
after (using node:https):


lessons learned
- isolated tests lie. the issue only appeared under real concurrent load. no amount of bun script.js testing could reproduce it
- curl from container proves the network is fine. if curl works but your runtime doesn’t, the problem is in the runtime
- worker vs API comparison was the breakthrough. same code, same container, different concurrency model. if one works and the other doesn’t, it’s a concurrency bug
- node:https is a valid escape hatch in bun. bun implements node.js APIs using a different code path than native fetch(). when one breaks, the other might work
- monitor outgoing latency separately. we had prometheus histograms for outgoing calls. without this, we’d never have noticed the progressive degradation
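on that last point, a minimal sketch of tracking outgoing latency with buckets that reach past the timeout. this is an in-memory stand-in, not a real prometheus client: OutgoingLatency, track and the bucket bounds are hypothetical, and only the 15s bound comes from this story

```typescript
import { performance } from 'node:perf_hooks'

// upper bounds in seconds; 15 matches the timeout we kept hitting,
// the other values are illustrative
const BOUNDS = [0.1, 0.5, 1, 2.5, 5, 10, 15, 30]

class OutgoingLatency {
  // counts[i] = observations <= BOUNDS[i]; the last slot is the +Inf bucket
  readonly counts: number[] = new Array(BOUNDS.length + 1).fill(0)

  observe(seconds: number): void {
    BOUNDS.forEach((le, i) => {
      if (seconds <= le) this.counts[i] += 1
    })
    this.counts[BOUNDS.length] += 1
  }

  // time any outgoing call; failures and timeouts are recorded too
  async track<T>(call: () => Promise<T>): Promise<T> {
    const start = performance.now()
    try {
      return await call()
    } finally {
      this.observe((performance.now() - start) / 1000)
    }
  }
}
```

wrapping every external call in track() is what makes a slow drift from 100ms to several seconds visible before it turns into timeouts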
relevant bun issues:
- oven-sh/bun#9034 bad networking performance with connection reuse
- oven-sh/bun#13586 fetch response delay (HTTP/1.1 vs HTTP/2)
- oven-sh/bun#17525 fetch hangs when bun.serve running
- oven-sh/bun#7260 fetch hangs randomly with HTTPS
if you’re hitting similar issues, try node:https before blaming your infrastructure