IMG Processing

Every developer hits this wall eventually. You need to resize images, remove backgrounds, or add watermarks. You Google around, find a dozen different libraries, spend hours integrating them, and realize half don’t work the way you expected. Then you need to deploy it somewhere with enough compute power to handle image processing. The rabbit hole goes deep.

I built IMG Processing because I was tired of solving the same problem over and over. A REST API where you upload an image and get back exactly what you need. No infrastructure headaches, no dependency nightmares.

What It Does

IMG Processing handles the common image operations that developers need:

Storage: Unlimited image hosting with direct URLs
Transformations: Resize, crop, rotate, mirror
Editing: Background removal, blur, color modulation
Analysis: Image classification, OCR text extraction, visual Q&A
AI Generation: Create images from text prompts

The API follows REST conventions. You authenticate by sending your API key in the x-api-key header, and responses come back as JSON.

curl -X POST https://api.img-processing.com/v1/images/upload \
  -H "x-api-key: YOUR_API_KEY" \
  -F "file=@image.jpg" \
  -F "name=my-image"
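The same upload can be made from TypeScript with the standard fetch API. This is a minimal sketch rather than official client code: the URL, header, and form field names are copied from the curl example, while the function names are hypothetical. The response is typed as unknown because its JSON shape is not described here.

```typescript
// Endpoint taken from the curl example above.
const UPLOAD_URL = 'https://api.img-processing.com/v1/images/upload';

// Builds the multipart request without sending it, so it can be inspected first.
// `buildUploadRequest` is a hypothetical helper name, not part of any SDK.
function buildUploadRequest(
  apiKey: string,
  file: Blob,
  fileName: string,
  name: string,
): Request {
  const form = new FormData();
  form.append('file', file, fileName); // same field as -F "file=@image.jpg"
  form.append('name', name);           // same field as -F "name=my-image"
  return new Request(UPLOAD_URL, {
    method: 'POST',
    headers: { 'x-api-key': apiKey },
    body: form,
  });
}

// Sends the request and returns the parsed JSON body.
async function uploadImage(
  apiKey: string,
  file: Blob,
  fileName: string,
  name: string,
): Promise<unknown> {
  const response = await fetch(buildUploadRequest(apiKey, file, fileName, name));
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

The Content-Type header is deliberately not set by hand: fetch generates the multipart boundary itself when the body is a FormData instance.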

Architecture

The entire platform runs on Cloudflare’s edge infrastructure. This was a deliberate choice for both performance and cost.

Core Stack:

Hono - Lightweight web framework optimized for edge runtimes
Cloudflare Workers - Serverless compute at the edge
D1 - SQLite database on Cloudflare with Prisma ORM
R2 - Object storage for images (dual buckets: private and public)

The codebase follows a feature-based architecture. Each capability—access, analysis, creation, edition, transformation, and batch operations—lives in its own module. This organization keeps related code together and makes it straightforward to add new image operations.

Image Processing with WebAssembly

Traditional image processing libraries like Sharp rely on native bindings (Sharp wraps libvips), and native addons can't be loaded in the Workers runtime. Instead, I use @cf-wasm/photon, a WebAssembly build of the Photon image library packaged for edge runtimes such as Workers.

WebAssembly runs at near-native speed in Workers. Operations like resize, crop, and color adjustments happen in milliseconds without cold starts. The Lanczos resampling algorithm produces high-quality results, which is especially important when downscaling photographs.
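To make the Lanczos idea concrete, here is a small sketch of the kernel and a one-dimensional resampler. It illustrates the math only and is not Photon's implementation; the function names and the default of three lobes are assumptions made for the example.

```typescript
// Normalized sinc: sin(pi * x) / (pi * x), defined as 1 at x = 0.
function sinc(x: number): number {
  if (x === 0) return 1;
  const px = Math.PI * x;
  return Math.sin(px) / px;
}

// Lanczos kernel with `a` lobes: sinc(x) * sinc(x / a) for |x| < a, else 0.
function lanczos(x: number, a = 3): number {
  return Math.abs(x) < a ? sinc(x) * sinc(x / a) : 0;
}

// Resamples a 1-D signal to `outLength` samples using normalized Lanczos weights.
function resample1d(src: number[], outLength: number, a = 3): number[] {
  const scale = src.length / outLength;
  // When downscaling, stretch the kernel so it also acts as a low-pass filter.
  const stretch = Math.max(scale, 1);
  const radius = a * stretch;
  const out: number[] = [];
  for (let i = 0; i < outLength; i++) {
    // Center of output sample i, expressed in source coordinates.
    const center = (i + 0.5) * scale - 0.5;
    const first = Math.max(0, Math.ceil(center - radius));
    const last = Math.min(src.length - 1, Math.floor(center + radius));
    let sum = 0;
    let weightSum = 0;
    for (let j = first; j <= last; j++) {
      const w = lanczos((j - center) / stretch, a);
      sum += w * src[j];
      weightSum += w;
    }
    // Dividing by the weight sum keeps flat regions flat, including at the edges.
    out.push(weightSum !== 0 ? sum / weightSum : 0);
  }
  return out;
}
```

A two-dimensional resize applies the same pass along rows and then along columns. Stretching the kernel when downscaling is what suppresses the aliasing that simpler filters leave in detailed photographs.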

AI and ML Integration

The analysis and generation features integrate multiple AI services:

Cloudflare AI handles classification and visual question answering directly at the edge. No external API calls for basic inference.
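A classification call through the Workers AI binding might look roughly like the sketch below. The model name is an assumption chosen for illustration (this post does not say which model the service runs), and the AiBinding interface is a hand-written stand-in covering only the run method used here.

```typescript
// Minimal structural type for the Workers AI binding (only what this sketch uses).
interface AiBinding {
  run(model: string, inputs: { image: number[] }): Promise<unknown>;
}

interface Label {
  label: string;
  score: number;
}

// Assumption: an illustrative image classification model, not necessarily the one in production.
const CLASSIFIER_MODEL = '@cf/microsoft/resnet-50';

// Runtime check that a value looks like { label, score }.
function isLabel(value: unknown): value is Label {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as Label).label === 'string' &&
    typeof (value as Label).score === 'number'
  );
}

// Classifies raw image bytes and returns the `topK` highest-scoring labels.
async function classifyImage(
  ai: AiBinding,
  bytes: ArrayBuffer,
  topK = 3,
): Promise<Label[]> {
  const raw = await ai.run(CLASSIFIER_MODEL, {
    image: Array.from(new Uint8Array(bytes)),
  });
  if (!Array.isArray(raw)) {
    throw new Error('Unexpected classifier response');
  }
  return raw
    .filter(isLabel)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Inside a Worker, the ai argument would be the binding from the environment (for example env.AI, if that is how the binding is named in the Worker's configuration).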

AWS Textract powers the OCR text extraction. It handles complex layouts, tables, and handwritten text better than simpler alternatives.

Google GenAI provides the image generation capabilities. Users write a prompt, and the model generates a 1024x1024 image.

[Figure: image generated using the /imagine endpoint]

The background removal service uses a segmentation model that separates foreground from background. Quality competes with RemoveBG at a fraction of the cost.

The TypeScript SDK

Beyond the REST API, I ship a Node.js SDK generated with Stainless. Stainless reads the API's OpenAPI specification and produces idiomatic TypeScript code.

import fs from 'node:fs';
import ImgProcessing from 'img-processing-sdk';

const client = new ImgProcessing({
  apiKey: process.env.IMG_PROCESSING_API_KEY,
});

// Upload an image
const image = await client.images.upload({
  image: fs.createReadStream('/path/to/photo.jpg'),
  name: 'product-photo'
});

// Remove background
const result = await client.images.edition.removeBackground(image.id);

The SDK handles authentication, retries, file uploads, and response parsing. TypeScript definitions provide autocomplete and compile-time checks.
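For anyone calling the REST API directly, the retry part is the piece most often hand-rolled. The sketch below shows one common approach, exponential backoff on retryable status codes. It is not the SDK's actual implementation: the retry count, the delays, and the choice of which statuses to retry are all assumptions.

```typescript
// Assumption: rate-limit (429) and server errors (5xx) are worth retrying.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `send` until it returns a non-retryable response or retries run out.
// Errors thrown by `send` (for example network failures) are not caught here.
async function fetchWithRetry(
  send: () => Promise<Response>,
  maxRetries = 2,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (!isRetryable(response.status) || attempt >= maxRetries) {
      return response;
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

The send callback should build a fresh request on every call, since a request body can only be consumed once.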

Multi-Tenant Architecture

The platform supports workspaces for team collaboration. Each workspace has its own:

API keys (test and live)
Image storage quota
Billing and subscription
Member access controls

Authentication uses Clerk, which handles signup, login, and session management. The @hono/clerk-auth middleware validates tokens on every request.

Rate Limiting and Billing

Cloudflare’s native rate limiting protects the API from abuse. Stripe handles subscriptions with three tiers: Hobby, Pro, and Enterprise. Each tier unlocks higher rate limits and removes watermarks from processed images.

Test API keys add watermarks and expire images after 90 days, but operations don’t count against quotas. Developers can prototype without providing payment details.
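The difference between the two key types can be summarized as a small policy function. This is a sketch with hypothetical type and function names: the test-key behavior follows the description above, while the live-key values (no watermark, no expiry) are assumptions about how the counterpart behaves.

```typescript
type KeyMode = 'test' | 'live';

interface KeyPolicy {
  watermark: boolean;          // whether processed images are watermarked
  countsAgainstQuota: boolean; // whether operations consume the workspace quota
  expiresAt: Date | null;      // when the image is deleted, or null for no expiry
}

const TEST_IMAGE_TTL_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the policy applied to an image created at `now` with a key of the given mode.
function policyFor(mode: KeyMode, now: Date = new Date()): KeyPolicy {
  if (mode === 'test') {
    return {
      watermark: true,
      countsAgainstQuota: false,
      expiresAt: new Date(now.getTime() + TEST_IMAGE_TTL_DAYS * MS_PER_DAY),
    };
  }
  // Assumption: live keys produce clean, non-expiring images that count against quota.
  return { watermark: false, countsAgainstQuota: true, expiresAt: null };
}
```

Keeping these rules in one function means every endpoint applies the same watermark, quota, and expiry decisions for a given key.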

API Documentation

The documentation uses Scalar for an interactive API reference. Users test endpoints directly in the browser. The playground reduced support questions significantly. Seeing a working request is worth more than reading about one.

What I Learned

Building an API product taught me about the gap between “works for me” and “works for everyone”. Edge cases multiply when you have diverse users.

The D1 + R2 + Workers stack keeps costs extremely low. I pay a fraction of what traditional cloud infrastructure would cost, and the performance is better because everything runs at the edge.

The SDK generator was a game changer. Maintaining client libraries manually is tedious and error-prone. Generating them from OpenAPI means the SDK always matches the API.