feat: add cross-platform HTTP/1.1 parser and local proxy server for iOS and Android #367

Open
jkmassel wants to merge 4 commits into trunk from jkmassel/swift-http-proxy-server

@jkmassel (Contributor) commented Mar 11, 2026

What?

Adds a hardened, RFC-conformant HTTP/1.1 request parser and local proxy server for both iOS (Swift) and Android (Kotlin), with shared cross-platform test fixtures to guarantee behavioral parity.

Why?

GutenbergKit's native integration embeds a web editor that communicates with native networking through an in-process HTTP server. Because the server is exposed to JavaScript running in the WebView, the parser must be hardened against malformed and adversarial input per RFC 7230/9110/9112 — a lenient parser could enable request smuggling, header injection, or denial-of-service via resource exhaustion.

How?

Swift (iOS)

  • GutenbergKitHTTP module — Incremental, stateful parser (HTTPRequestParser) that buffers to a temporary file on disk so memory stays flat regardless of body size. Strict RFC conformance: rejects obs-fold, whitespace before colon, conflicting Content-Length, Transfer-Encoding, invalid UTF-8 (round-trip validated), lone surrogates, overlong encodings.
  • HTTPServer — Local HTTP/1.1 server on Network.framework with async handler API, connection limits, read timeouts, and constant-time bearer token authentication.
  • Multipart parsing — RFC 7578 multipart/form-data support with lazy body references (file slices, not copies).
  • RequestBody — Abstracts over in-memory and file-backed storage with InputStream and async data access.
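The conflicting Content-Length rejection listed above can be illustrated with a minimal sketch. It is not the GutenbergKitHTTP implementation: `ContentLengthError` and `resolveContentLength` are hypothetical names, the 4 GB limit is assumed here to mean 4 GiB, and accepting repeated identical values is an assumption about policy rather than something the PR states.

```swift
// Hypothetical error type for this sketch; not taken from the PR.
enum ContentLengthError: Error, Equatable {
    case invalid
    case conflicting
    case tooLarge
}

/// Resolves zero or more Content-Length header values to a single Int64.
/// Returns nil when the header is absent.
func resolveContentLength(
    _ values: [String],
    maxBodySize: Int64 = 4 * 1024 * 1024 * 1024  // assumed: "4 GB" read as 4 GiB
) throws -> Int64? {
    var resolved: Int64? = nil
    for raw in values {
        // Accept ASCII digits only; Int64("+5") would otherwise parse.
        let isAllDigits = !raw.isEmpty && raw.utf8.allSatisfy { (0x30...0x39).contains($0) }
        guard isAllDigits, let length = Int64(raw) else {
            throw ContentLengthError.invalid
        }
        // Two different values are a smuggling risk, so reject them.
        if let previous = resolved, previous != length {
            throw ContentLengthError.conflicting
        }
        resolved = length
    }
    if let length = resolved, length > maxBodySize {
        throw ContentLengthError.tooLarge
    }
    return resolved
}
```

Values that overflow `Int64` fail the `Int64(raw)` conversion and are reported as `invalid` in this sketch.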

Kotlin (Android)

  • Pure-Kotlin HTTP parser (org.wordpress.gutenberg.http) — Feature-identical port of the Swift parser with no native dependencies. Includes HeaderValue, HTTPRequestSerializer, HTTPRequestParser (with disk-backed buffering via Buffer/TempFileOwner), ParsedHTTPRequest, MultipartPart, and RequestBody.
  • HttpServer — Local HTTP/1.1 server with connection limits, read timeouts, pure-Kotlin response serialization, and proper 400 responses on premature connection close.

Shared cross-platform test fixtures

  • 163 JSON test fixtures in test-fixtures/http/ covering header value extraction (20 cases), request parsing (58 basic + 42 error + 4 incremental), and multipart parsing (32 field-based + 7 error).
  • 8 dedicated UTF-8 edge cases — overlong encodings, lone surrogates, truncated sequences, code points above U+10FFFF.
  • 2 whitespace-before-colon cases — space and tab before colon, returning the specific whitespaceBeforeColon error (request smuggling vector per RFC 7230 §3.2.4).
  • Both platforms load the same fixture files, guaranteeing identical parse behavior.
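As a rough illustration of the whitespace-before-colon cases above, the check can be sketched as follows. Only the `whitespaceBeforeColon` error name comes from the PR; `HeaderLineError`, its other cases, and `parseHeaderLine` are hypothetical, and the real parser presumably performs further validation that is omitted here.

```swift
import Foundation

// Hypothetical error type; only `whitespaceBeforeColon` is named in the PR.
enum HeaderLineError: Error, Equatable {
    case missingColon
    case emptyName
    case whitespaceBeforeColon
}

/// Splits one header field line into (name, value), rejecting a space or tab
/// between the field name and the colon (RFC 7230 §3.2.4).
func parseHeaderLine(_ line: String) throws -> (name: String, value: String) {
    guard let colon = line.firstIndex(of: ":") else {
        throw HeaderLineError.missingColon
    }
    let name = line[line.startIndex..<colon]
    guard let last = name.last else {
        throw HeaderLineError.emptyName
    }
    if last == " " || last == "\t" {
        throw HeaderLineError.whitespaceBeforeColon
    }
    // Optional whitespace around the value is allowed and trimmed.
    let value = line[line.index(after: colon)...]
        .trimmingCharacters(in: CharacterSet(charactersIn: " \t"))
    return (String(name), value)
}
```

With this sketch, `Content-Length : 0` throws `whitespaceBeforeColon`, while `Host: localhost` parses normally.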

Key design decisions

  • 64-bit Content-Length on both platforms (Swift Int64, Kotlin Long) with a 4 GB default max body size.
  • UTF-8 round-trip validation on both platforms to reject silently-accepted malformed sequences.
  • Disk-backed buffering with configurable inMemoryBodyThreshold (512 KB default) — bodies below threshold stay in memory, larger ones reference the temp file directly as a slice.
  • Multipart part bodies are lazy file-slice references for file-backed sources, avoiding copies during parsing.
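The threshold decision described above might look roughly like the sketch below. `BodyStorage`, `BodyStorageError`, and `makeBodyStorage` are hypothetical names used for illustration; only `inMemoryBodyThreshold` and its 512 KB default come from the PR, and the actual `RequestBody` type may be structured quite differently.

```swift
import Foundation

// Hypothetical storage descriptor; not the library's RequestBody type.
enum BodyStorage: Equatable {
    case inMemory(Data)
    case fileSlice(url: URL, offset: UInt64, length: Int64)
}

enum BodyStorageError: Error {
    case shortRead
}

/// Chooses between loading the body into memory and referencing a slice of
/// the temp file, based on the body length.
func makeBodyStorage(
    tempFile: URL,
    bodyOffset: UInt64,
    bodyLength: Int64,
    inMemoryBodyThreshold: Int64 = 512 * 1024
) throws -> BodyStorage {
    // Large bodies are never copied: the caller gets a reference to the slice.
    guard bodyLength < inMemoryBodyThreshold else {
        return .fileSlice(url: tempFile, offset: bodyOffset, length: bodyLength)
    }
    let handle = try FileHandle(forReadingFrom: tempFile)
    defer { try? handle.close() }
    try handle.seek(toOffset: bodyOffset)
    let data = try handle.read(upToCount: Int(bodyLength)) ?? Data()
    guard data.count == Int(bodyLength) else {
        throw BodyStorageError.shortRead
    }
    return .inMemory(data)
}
```

Keeping the file-backed case as an (offset, length) pair is what lets multipart parts point into the same temp file without copying.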

Testing Instructions

Automated (CI runs these on every push)

  1. swift test — runs 820+ tests including 163 fixture-based cross-platform tests
  2. cd android && ./gradlew :Gutenberg:test — runs all Android unit tests including fixture tests

On-device (manual)

  1. iOS: Open ios/Demo-iOS/Gutenberg.xcodeproj in Xcode, run on a device, navigate to the Media Proxy Server screen — it prints the server URL
  2. Android: Build and install the demo app, open the Media Proxy Server activity — it prints the server URL
  3. Send adversarial requests to verify hardening (all should return appropriate 4xx errors):
    # Missing Host header → 400
    printf "GET / HTTP/1.1\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Transfer-Encoding (rejected) → 400
    printf "GET / HTTP/1.1\r\nHost: localhost\r\nTransfer-Encoding: chunked\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Oversized headers → 431
    printf "GET / HTTP/1.1\r\nHost: localhost\r\nX-Long: $(python3 -c 'print("X"*70000)')\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Conflicting Content-Length → 400
    printf "POST / HTTP/1.1\r\nHost: localhost\r\nContent-Length: 5\r\nContent-Length: 10\r\n\r\nhello" | nc -w 3 <device-ip> <port>
    
    # Whitespace before colon (smuggling vector) → 400
    printf "GET / HTTP/1.1\r\nHost: localhost\r\nContent-Length : 0\r\n\r\n" | nc -w 3 <device-ip> <port>
    
    # Premature connection close → 400
    printf "\r\n\r\n" | nc -w 3 <device-ip> <port>
  4. Instrumented tests: cd android && ./gradlew :Gutenberg:connectedDebugAndroidTest -Pandroid.testInstrumentationRunnerArguments.class=org.wordpress.gutenberg.http.InstrumentedFixtureTests runs all 163 fixture cases on a connected device

🤖 Generated with Claude Code

@jkmassel added the [Type] Enhancement label Mar 11, 2026
@jkmassel force-pushed the jkmassel/swift-http-proxy-server branch 3 times, most recently from cce3ff6 to 57d1b7c, March 12, 2026 22:36
@jkmassel changed the title from "Add GutenbergKitHTTP: HTTP/1.1 parser and local proxy server" to "feat: add cross-platform HTTP/1.1 parser and local proxy server for iOS and Android" Mar 14, 2026
@jkmassel force-pushed the jkmassel/swift-http-proxy-server branch 16 times, most recently from 29f9c27 to 0a09c12, March 17, 2026 17:49
jkmassel and others added 4 commits March 17, 2026 17:32
Add a hardened, RFC-conformant HTTP/1.1 request parser and local proxy
server for iOS. The parser is exposed to JavaScript running in the
WebView, so it enforces strict validation per RFC 7230/9110/9112 to
prevent request smuggling, header injection, and resource exhaustion.

Includes:
- HTTPRequestParser: Incremental parser with disk-backed buffering
- HTTPRequestSerializer: Stateless header parsing with full RFC validation
- HeaderValue: RFC 2045 parameter extraction with quoted string handling
- MultipartPart: RFC 7578 multipart/form-data with lazy file-slice refs
- RequestBody: In-memory and file-backed storage with InputStream access
- HTTPServer: Local server on Network.framework with async handler API,
  connection limits, read timeouts (408 per RFC 9110 §15.5.9), and
  constant-time bearer token auth
- HTTPResponse: Response serialization with header sanitization,
  Content-Length always derived from actual body size

All size-related types use Int64 to match Kotlin's Long and prevent
ambiguity on future platforms.

820+ tests covering RFC 7230, 7578, 8941, 9110, 9112, and 9651.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…tures

Add a pure-Kotlin implementation of the HTTP/1.1 request parser that
mirrors the Swift GutenbergKitHTTP library, enabling HTTP parsing on
Android without native dependencies.

Includes HeaderValue, HTTPRequestSerializer, HTTPRequestParser (with
disk-backed buffering), ParsedHTTPRequest, MultipartPart, RequestBody,
and HttpServer with bearer token authentication, constant-time token
comparison, status code clamping, and Content-Length always derived
from actual body size — matching the Swift server's security model.

Both platforms are validated against 101 shared JSON test fixtures in
test-fixtures/http/ covering header value extraction, request parsing
(basic, error, incremental), multipart parsing (field-based, raw-body,
error), and 8 dedicated UTF-8 edge cases (overlong encodings, lone
surrogates, truncated sequences).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add demo app screens that start the local HTTP server and display its
address for manual testing with curl or a browser. Enables on-device
validation of the parser against adversarial inputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Run the shared JSON test fixtures on an actual Android device via
connectedDebugAndroidTest, validating the pure-Kotlin HTTP parser
under ART in addition to the JVM unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jkmassel requested a review from dcalhoun March 18, 2026 16:07
@jkmassel self-assigned this Mar 18, 2026
@jkmassel force-pushed the jkmassel/swift-http-proxy-server branch from 0a09c12 to e8cda3c, March 18, 2026 16:15
@jkmassel marked this pull request as ready for review March 18, 2026 16:38
@dcalhoun (Member) left a comment

@jkmassel the testing instructions succeeded for me.

I also had Claude swiftly rebase #357 atop this work and update it to use the HTTP server library in this work. Good news is that it seems to work. The result of the work is in the feat/leverage-host-media-processing-stacked branch (here is the current diff).

I did encounter a couple of issues that I noted in inline comments below. I worked around them in the feat/leverage-host-media-processing-stacked branch, but we might address them in the library instead. WDYT?


// Check auth before consuming body to avoid buffering
// up to maxRequestBodySize for unauthenticated clients.
if requiresAuthentication {
@dcalhoun (Member) commented:

When requiresAuthentication is enabled, auth is checked on every request including OPTIONS. In practice this makes the server incompatible with any browser client using fetch(): browsers send a CORS preflight OPTIONS before the actual request, and the Fetch spec forbids auth headers on preflight — so it always gets rejected before the real request can be made.

Since OPTIONS responses contain no sensitive information (just allowed methods/headers), there's no security value in authenticating them. A simple exemption here would make the built-in auth usable from browser contexts without requiring callers to opt out entirely:

if requiresAuthentication && partial.method.uppercased() != "OPTIONS" {
    guard authenticate(partial, token: token) else {
        throw HTTPServerError.authenticationFailed
    }
}

Without this, callers who need to support browser clients must set requiresAuthentication: false and reimplement auth in their handler, losing the library's constant-time comparison and other protections.

Note: This applies to the Android implementation as well.

@jkmassel (Contributor, Author) commented:

Ah yep, this is a sensible change. c68a223 skips auth for OPTIONS.

Comment on lines +448 to +463
/// Validates the proxy bearer token from the `Proxy-Authorization` header
/// (RFC 9110 §11.7.1). Using `Proxy-Authorization` keeps the client's
/// `Authorization` header available for upstream credentials.
private static func authenticate(_ request: ParsedHTTPRequest, token: String) -> Bool {
    guard let proxyAuth = request.header("Proxy-Authorization") else {
        return false
    }

    let prefix = "Bearer "
    guard proxyAuth.prefix(prefix.count).caseInsensitiveCompare(prefix) == .orderedSame else {
        return false
    }

    let provided = String(proxyAuth.dropFirst(prefix.count))
    return constantTimeEqual(provided, token)
}
@dcalhoun (Member) commented:

The built-in auth uses Proxy-Authorization, which is a forbidden request header in the Fetch spec. WebKit's fetch() (and all standards-compliant browsers) silently strip it before sending — meaning any browser client using the built-in auth will always get a 407, with no error surfaced to the caller.

The RFC 9110 §11.7.1 rationale (keeping Authorization free for upstream credentials) is sound, but Proxy-Authorization is the wrong vehicle when the client is a browser. A custom header like X-GBKit-Token avoids the forbidden-header restriction while preserving the same separation of concerns.

Proposal: change the built-in auth to use a custom header (e.g. X-GBKit-Token) so it works correctly from browser contexts out of the box.

Note: This applies to the Android implementation as well.

@jkmassel (Contributor, Author) commented:

TIL about forbidden request headers. The reasoning seems sound though! 0f83848 adds the Relay-Authorization header (using X- as a prefix is deprecated).
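
The `constantTimeEqual` helper called by `authenticate` is not shown in this thread. One way such a helper could look is sketched below; the body is an assumption for illustration, not the code in this PR, and Swift gives no hard guarantee that the optimizer preserves constant-time behavior.

```swift
/// Compares two strings byte by byte without returning early on the first
/// mismatch, so timing does not reveal how many leading bytes matched.
/// Sketch only; the PR's actual helper may differ.
func constantTimeEqual(_ provided: String, _ expected: String) -> Bool {
    let p = Array(provided.utf8)
    let e = Array(expected.utf8)
    // Start with the length difference so unequal lengths can never match.
    var diff = p.count ^ e.count
    // The loop length depends only on the untrusted input, not on the secret.
    for i in 0..<p.count {
        let expectedByte: UInt8 = e.isEmpty ? 0 : e[i % e.count]
        diff |= Int(p[i] ^ expectedByte)
    }
    return diff == 0
}
```

The accumulated `diff` is checked once at the end, which is the property that distinguishes this from a plain `==` comparison.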
