AI Solutions

AI-Powered Auction Aggregation Platform

A single catalogue for a fragmented market. Veilingen.ai ingests 25,000+ active lots from 18+ Dutch and international auction houses, enriches them with Gemini vision and text models, and surfaces them through a unified discovery experience with creator analytics, saved searches, and a premium tier.

18+ Sources: auction house coverage

Gemini 2.0: vision + categorisation

Stripe Billing: premium subscriptions

PWA: push-ready, offline-first

// Enrichment pipeline

Scrape (Playwright + Selenium) → Sync (batched API ingest) → Classify (Gemini + keyword) → Serve (cached Next.js)

// What we built

Unified catalogue — 14-dimension filtering, including category, creator, period, source, condition and price, so one search covers the entire Dutch auction landscape
Scraper fleet — 18+ independent scrapers in Python (Playwright, Selenium, Requests) run daily via GitHub Actions with batched sync to the API
AI categorisation — 45 main categories and 894 subcategories resolved via hybrid keyword matching (~65%) plus Gemini 2.0 Flash for the long tail (~35%)
Creator analytics — Dedicated maker pages with 10-year price trends, monthly distributions, top sales, and source breakdown charts
Dynamic taxonomy — Admin hub edits categories live; the Gemini system prompt rebuilds on mutation with zero downtime and optional git auto-sync
Saved searches & favourites — Per-user filter presets, wishlist with optimistic UI, search history and trending suggestions
Premium subscriptions — Stripe billing with webhook-driven lifecycle, billing portal, and server-side gating for extended analytics and alerts
Admin hub — Scraper monitoring, taxonomy management, lot quality checks, orphan clustering, and audit logging
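
The dynamic taxonomy item above implies that the classifier prompt is derived from the category tree rather than written by hand. A minimal Python sketch of that idea, under stated assumptions: the `Taxonomy` class, its method names, and the prompt wording are all hypothetical, and the categories live in memory here rather than in MongoDB overrides.

```python
from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    """Hypothetical in-memory taxonomy: main category -> subcategories."""

    categories: dict[str, list[str]] = field(default_factory=dict)
    system_prompt: str = ""

    def __post_init__(self) -> None:
        self.rebuild_prompt()

    def rebuild_prompt(self) -> None:
        # Regenerate the classifier instructions from the current category tree.
        lines = [
            "Classify the auction lot into exactly one category/subcategory pair.",
            "Allowed values:",
        ]
        for main in sorted(self.categories):
            subs = ", ".join(sorted(self.categories[main]))
            lines.append(f"- {main}: {subs}")
        self.system_prompt = "\n".join(lines)

    def add_subcategory(self, main: str, sub: str) -> None:
        # Every mutation rebuilds the prompt, so the next model call sees it.
        subs = self.categories.setdefault(main, [])
        if sub not in subs:
            subs.append(sub)
        self.rebuild_prompt()


taxonomy = Taxonomy({"Furniture": ["Chairs", "Tables"]})
taxonomy.add_subcategory("Furniture", "Cabinets")
print(taxonomy.system_prompt)
```

Because the prompt is plain text rebuilt on each mutation, no redeploy or model retraining is involved in this sketch; the next classification request simply uses the new string.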

// By the numbers

25k+ active lots

18+ auction sources

894 subcategories

10y price history

// Architecture

Scraping layer

Python 3.11 with Playwright and Selenium for JS-heavy houses, plain Requests for API-friendly sources. GitHub Actions schedules daily runs and posts in 25-lot batches to the Node backend.
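
The 25-lot batching can be illustrated with a short Python sketch. Only the batch size comes from the text above; the `sync` helper and the `post_batch` callable (a stand-in for whatever HTTP call the scrapers actually make) are assumptions for illustration.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

BATCH_SIZE = 25  # batch size stated in the scraping layer description


def batched(lots: Iterable[dict], size: int = BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most `size` lots."""
    iterator = iter(lots)
    while batch := list(islice(iterator, size)):
        yield batch


def sync(lots: Iterable[dict], post_batch: Callable[[list[dict]], None]) -> int:
    """Send every batch through `post_batch` and return the number of batches.

    `post_batch` is a hypothetical callable wrapping the HTTP POST to the API.
    """
    sent = 0
    for batch in batched(lots):
        post_batch(batch)
        sent += 1
    return sent


scraped = [{"lot_id": i} for i in range(60)]
received: list[list[dict]] = []
batches_sent = sync(scraped, received.append)
print(batches_sent, [len(b) for b in received])  # 3 [25, 25, 10]
```

Small fixed-size batches keep each request body bounded and mean a failed request affects at most 25 lots, which is one plausible reason for the design.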

API & enrichment

Express on Render with MongoDB Atlas. A text-enrichment job that runs every two hours and a nightly vision job route lots through Gemini 2.0 Flash with confidence scoring and a keyword fallback.
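
One way to combine keyword matching, a model call, and confidence scoring is sketched below in Python. This is an illustration, not the project's code: the keyword dictionary, the 0.7 threshold, the `Classification` shape, and the `fake_model` stub standing in for the Gemini call are all assumptions, and the production backend runs on Node rather than Python.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical keyword dictionary: keyword -> (main category, subcategory).
KEYWORDS = {
    "armchair": ("Furniture", "Chairs"),
    "etching": ("Art", "Prints"),
}

CONFIDENCE_THRESHOLD = 0.7  # assumed cut-off, not a value from the project


@dataclass
class Classification:
    category: str
    subcategory: str
    confidence: float
    method: str  # "keyword", "model" or "unresolved"


def keyword_match(title: str) -> Optional[Classification]:
    # Tokenise on letters so trailing punctuation does not block a match.
    words = set(re.findall(r"[a-z]+", title.lower()))
    for keyword, (category, subcategory) in KEYWORDS.items():
        if keyword in words:
            return Classification(category, subcategory, 1.0, "keyword")
    return None


def classify(title: str, model: Callable[[str], Classification]) -> Classification:
    # Cheap path first: a keyword hit never reaches the model.
    hit = keyword_match(title)
    if hit is not None:
        return hit
    # Ambiguous lots go to the model; low-confidence answers are not trusted.
    guess = model(title)
    if guess.confidence >= CONFIDENCE_THRESHOLD:
        return guess
    return Classification("Uncategorised", "Uncategorised", guess.confidence, "unresolved")


def fake_model(title: str) -> Classification:
    # Stand-in for the real model call, with hard-coded confidences.
    if "vase" in title.lower():
        return Classification("Ceramics", "Vases", 0.9, "model")
    return Classification("Art", "Paintings", 0.4, "model")


for title in ("Oak armchair, c. 1900", "Delft vase", "Unknown object"):
    print(title, "->", classify(title, fake_model).method)
```

In this sketch, lots that end up "unresolved" would be left for a later pass or manual review; how the real pipeline treats them is not stated in the text.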

Discovery app

Next.js 15 on Vercel with a service worker for network-first HTML and cache-first assets. In-memory API caching, dynamic sitemap, and JSON-LD for SEO.

// Engineering challenges

Cost control at AI scale — Naively routing every lot through Gemini would have been uneconomical. A keyword dictionary handles roughly two-thirds of lots for free; only the ambiguous remainder hits the model, bringing per-lot enrichment cost down to ~$0.001.
Taxonomy that evolves without downtime — Auction stock shifts constantly. The hub writes category changes to MongoDB overrides and rebuilds the Gemini system prompt in-process, so a new subcategory is available to the classifier and live within seconds of approval, with no retraining or redeploy.
Heterogeneous sources, one schema — Each auction house exposes lots differently. A shared normaliser and per-source adapter layer turn the differing data shapes of 18+ sources into one canonical lot document with consistent dating, pricing, and condition fields.
Long-horizon price analytics — 10-year rolling price history is precomputed per creator and served via cached routes, so maker pages render instantly even with hundreds of thousands of historical sales behind them.
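
The precomputation behind the price history can be pictured as a grouping step that runs ahead of any page request. The Python sketch below is a simplified illustration: the `(creator, sale_date, price)` tuple shape, the use of a yearly median, and the calendar-year window are assumptions, since the text only states that a 10-year history is precomputed per creator.

```python
from collections import defaultdict
from datetime import date
from statistics import median

WINDOW_YEARS = 10  # window length stated above


def yearly_price_history(sales, today):
    """Return {creator: {year: median price}} for the last WINDOW_YEARS years.

    `sales` is an iterable of (creator, sale_date, price) tuples, a
    hypothetical shape chosen for this example.
    """
    first_year = today.year - WINDOW_YEARS + 1
    buckets = defaultdict(lambda: defaultdict(list))
    for creator, sold_on, price in sales:
        # Drop sales that fall outside the window.
        if first_year <= sold_on.year <= today.year:
            buckets[creator][sold_on.year].append(price)
    return {
        creator: {year: median(prices) for year, prices in sorted(years.items())}
        for creator, years in buckets.items()
    }


sales = [
    ("maker-a", date(2024, 3, 1), 1200.0),
    ("maker-a", date(2024, 9, 15), 1800.0),
    ("maker-a", date(2010, 5, 2), 900.0),  # outside the window, ignored
    ("maker-b", date(2020, 1, 10), 500.0),
]
history = yearly_price_history(sales, today=date(2025, 1, 1))
print(history)  # {'maker-a': {2024: 1500.0}, 'maker-b': {2020: 500.0}}
```

Storing the resulting per-creator dictionary and serving it from a cached route means a maker page reads a small precomputed document instead of scanning every historical sale.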

// Stack

Next.js 15, React 19, TypeScript, Tailwind CSS, Node.js, Express, MongoDB Atlas, Python, Playwright, Selenium, Gemini 2.0 Flash, Stripe, AWS S3, Vercel, Render, GitHub Actions, Sentry, PWA
