Every bot verification challenge is a labeling task. HiveGuard sits in front of your web app: suspicious traffic gets an interactive challenge, and the user's answer labels a training item in your dataset. One feedback loop: protect your site, build your dataset, train a better model.
Runs on-premise, behind a firewall, or on any cloud. MIT-licensed. No per-solve fees, ever.
Starts with heuristic rules, graduates to a trained ML model. Challenge outcomes (solved = human, expired = bot) feed back into a LogisticRegression classifier that improves with every interaction.
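The outcome-to-classifier loop can be sketched with scikit-learn. The two features and their values below are illustrative stand-ins (the real pipeline captures a dozen request features), not HiveGuard's actual training code:

```python
# Minimal sketch of the outcome-to-classifier loop, assuming scikit-learn.
# Feature choice and values are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per finished challenge: [header_count, ms_to_answer]
X = np.array([[14, 3400], [12, 2900], [2, 45], [3, 60]], dtype=float)
y = np.array([0, 0, 1, 1])  # 0 = solved (human), 1 = expired (bot)

model = LogisticRegression().fit(X, y)
risk = model.predict_proba([[2, 50]])[0, 1]  # P(bot) for a new request
```

Each solved or expired challenge appends one labeled row, so the classifier sharpens as traffic flows.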
Image grids, audio snippets, text classification, and more. Each challenge type is pluggable: serve the modality that makes sense for your audience and your labeling pipeline.
Responses from multiple users are merged through majority voting. Ground-truth items verify that the human is real; unknown items accumulate votes until a high-confidence label emerges.
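The merge rule reduces to straightforward majority voting. A minimal sketch; the vote-count and agreement thresholds here are assumptions, not HiveGuard's actual defaults:

```python
from collections import Counter

def consensus_label(votes, min_votes=3, min_agreement=0.8):
    """Return the majority label once enough votes agree, else None."""
    if len(votes) < min_votes:
        return None  # keep collecting votes
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return None  # no high-confidence consensus yet
```

An item stays in rotation, gathering one vote per verified human, until the function returns a label.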
Every challenge outcome becomes training data. Retrain the risk model from the dashboard with one click. The model learns which request patterns (headers, UA, timing) correlate with bots vs. humans.
Run as a reverse proxy, embed the widget standalone, or call the REST API directly. Works on-premise, in the cloud, or air-gapped. Single Docker Compose command to deploy.
Monitor challenge throughput, model accuracy, dataset growth, and consensus convergence. Export labeled datasets for your ML pipeline. Manage datasets, API keys, and proxy rules.
Turn existing web traffic into an annotation workforce. Label images, text, or audio without hiring annotators: every bot check is a labeling task. Scale with your traffic, not your budget.
Self-hosted annotation on your own infrastructure. Data never leaves your network. Deploy on campus servers with Docker; GDPR and ethics-board friendly.
No SaaS dependency. The ML model trains locally from your own traffic data. Works completely offline: risk scoring, challenges, and consensus are all self-contained.
Drop-in reverse proxy that protects login pages, APIs, and forms from automated traffic while building labeled datasets from real user interactions.
Every inbound request is scored by the ML risk model (or heuristic fallback). Requests above the threshold are intercepted and served a verification challenge. Request features are captured for training.
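A hedged sketch of what the heuristic fallback might look like: the feature names, weights, and 0.7 threshold are all illustrative, not HiveGuard's actual rules:

```python
RISK_THRESHOLD = 0.7  # assumed default, tune per deployment

def heuristic_risk(features: dict) -> float:
    """Score a request 0.0-1.0 from a few captured features."""
    score = 0.0
    if not features.get("user_agent"):
        score += 0.4  # missing/empty UA is a strong bot signal
    if features.get("header_count", 0) < 5:
        score += 0.3  # real browsers send many headers
    if features.get("ms_since_page_load", float("inf")) < 200:
        score += 0.3  # instant form submits are rarely human
    return min(score, 1.0)

def should_challenge(features: dict) -> bool:
    return heuristic_risk(features) >= RISK_THRESHOLD
```

Once the trained model is available it replaces `heuristic_risk`; the threshold check and interception stay the same.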
12 features → risk score 0.0–1.0

Each challenge pairs a ground-truth item (known answer, verifies the human) with an unknown item (collects a label). Correct GT answers are recorded as "solved", so the model learns that pattern means human.
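The pairing rule reduces to a small grading function; the names here are hypothetical:

```python
def grade_challenge(gt_answer, user_gt_answer, unknown_label):
    """'solved' verifies the human and releases the unknown item's
    label as one consensus vote; a wrong GT answer discards it."""
    if user_gt_answer == gt_answer:
        return "solved", unknown_label  # human verified, label counted
    return "failed", None               # answer wrong, vote discarded
```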
solved → human · expired → bot

Retrain the model from the dashboard API. The classifier learns which request patterns correlate with bots vs. humans. Unknown-item labels converge via consensus voting and are exported for your ML pipeline.
POST /dashboard/api/ml/train

Five modalities: the exact same widget your users will see.
Pick one and interact with it live.
Three lines of config. One Docker command. Annotation pipeline running.
```yaml
# docker-compose.yml
services:
  hiveguard:
    image: ghcr.io/buiapp/hiveguard-internal:latest
    ports: ["8080:8080"]
    environment:
      UPSTREAM_URL: http://your-app:3000
      OWNER_ORG: acme
      PROTECTED_PATHS: '["/login", "/register", "/api/submit"]'
      DATABASE_URL: postgresql+asyncpg://user:pass@db/hiveguard
      REDIS_URL: redis://redis:6379
    depends_on: [db, redis]
```
```nginx
# nginx.conf - point your domain at HiveGuard
upstream hiveguard {
    server localhost:8080;
}

server {
    listen 443 ssl;
    server_name app.example.com;

    location / {
        proxy_pass http://hiveguard;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```
```html
<!-- 1. Load the widget script once -->
<script src="https://guard.example.com/_hiveguard/widget.js" defer></script>

<!-- 2. Place the mount point -->
<!-- data-modality: pattern | grid | image | text | audio -->
<div
  data-hg-widget
  data-api-key="hg_your_widget_token"
  data-modality="pattern"
  data-callback="onHiveGuardPass"
></div>

<!-- 3. Handle the verified token -->
<script>
  function onHiveGuardPass(token) {
    // pass the token to your server for verification
    document.getElementById('hg-token').value = token;
    document.getElementById('my-form').submit();
  }
</script>
```
```python
# Verify a submitted challenge token server-side (Flask-style handler)
import httpx
from flask import abort, request

response = httpx.post(
    "https://guard.example.com/api/challenge/verify",
    headers={"X-API-Key": "your-api-key"},
    json={"token": request.form["hg-token"]},
)

if response.json()["success"]:
    # human verified: proceed
    create_account(request)
else:
    # bot or failed: reject / re-challenge
    abort(403)
```
Full documentation and OpenAPI spec available in the GitHub repository. Need help? Open an issue or start a discussion.