Edge-AI DEV LOG

Production-grade DevSecOps architecture hosting local LLM inference on frugal, repurposed consumer hardware: isolating micro-services, unmasking traffic hidden behind reverse proxies, and trapping malicious actors by design.

Nginx / Docker Isolation | MaxMind Binary GeoIP | Systemd Supervised Watchdog | Ollama Llama3.2-1B

// 01. HARDWARE CAP & CONTEXT WAF

The core node runs entirely locally on a repurposed Dell OptiPlex small-form-factor chassis. With a standard consumer CPU and limited DDR3 RAM, unbounded inference requests would quickly exhaust host memory or trigger the kernel's out-of-memory (OOM) killer.

Architectural Defense: Hardware Truncation

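By capping every request at "num_ctx": 1024, the server bounds the context window and, with it, the memory the model allocates per request, so oversized prompts cannot exhaust the host. The cap is a software setting rather than a literal firewall, but because it is sized to the hardware it behaves like a crude Web Application Firewall (WAF) for prompt payloads.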
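As a minimal sketch of how the cap could be enforced on the server side: the function below builds a request body for Ollama's /api/generate endpoint and overwrites any client-supplied context size. The function name, the MAX_PROMPT_CHARS pre-filter, and its value are assumptions made for illustration; only the "num_ctx": 1024 option comes from the text above.

```python
from typing import Optional

MAX_CTX = 1024           # context cap from the text above
MAX_PROMPT_CHARS = 4096  # hypothetical pre-filter; not from the original log


def build_generate_payload(prompt: str,
                           model: str = "llama3.2:1b",
                           requested_options: Optional[dict] = None) -> dict:
    """Build an Ollama /api/generate body with a server-enforced context cap."""
    options = dict(requested_options or {})
    # Never trust a client-supplied context size: always overwrite it.
    options["num_ctx"] = MAX_CTX
    return {
        "model": model,
        # Cheap character-level cut before the request reaches the model.
        "prompt": prompt[:MAX_PROMPT_CHARS],
        "stream": False,
        "options": options,
    }
```

Setting the option in the broker, rather than accepting it from the browser, is what makes the cap a server-side guarantee: a client that sends its own "num_ctx" has it replaced before the request reaches Ollama.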

"Because the context allocation is sized to the hardware, high-volume injection attempts and massive prompt payloads are truncated at the context boundary before the model processes them. Inference never modifies model weights, so the remaining risk is to application state, which only ever sees the truncated input."

// 02. SERVER-SIDE GEO-UNMASKING & LOCAL DATABASE RE

Client-side geolocation is easily spoofed or blocked outright by privacy tools. To build audit logs that the client cannot tamper with, all incoming traffic passes through a server-side lookup pipeline.

Because the site sits behind Cloudflare's CDN to shield the home network, incoming connections appear to originate from Cloudflare edge nodes rather than from the real client. To see through this proxy layer, the backend reads and prioritizes the CF-Connecting-IP request header, which Cloudflare sets to the original client address. The header is only trustworthy when the origin accepts traffic exclusively from Cloudflare; a client that reaches the origin directly could forge it.
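The header handling can be sketched roughly as follows, assuming the broker can see the address of its immediate peer. The function name and the trusted_proxies parameter are hypothetical; in a real deployment the trusted list would hold the local Nginx container's network or Cloudflare's published ranges.

```python
import ipaddress
from typing import Iterable, Mapping


def resolve_client_ip(peer_ip: str,
                      headers: Mapping[str, str],
                      trusted_proxies: Iterable[str]) -> str:
    """Return the client IP, honoring CF-Connecting-IP only from trusted peers."""
    peer = ipaddress.ip_address(peer_ip)
    peer_is_trusted = any(
        peer in ipaddress.ip_network(net) for net in trusted_proxies
    )
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    claimed = lowered.get("cf-connecting-ip", "").strip()
    if peer_is_trusted and claimed:
        try:
            # Reject anything that is not a syntactically valid IP address.
            return str(ipaddress.ip_address(claimed))
        except ValueError:
            pass
    # Untrusted peer, or missing or garbage header: log the socket address.
    return str(peer)
```

Falling back to the socket address means a forged header from an untrusted peer is ignored and the connection is logged under the address it actually came from.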

// Microsecond Binary Execution
Instead of firing external API requests that stall worker threads, the Python broker performs local binary lookups against on-disk MaxMind GeoLite2 databases, returning city, postal code, Autonomous System Number (ASN), and ISP identity within microseconds.

// Dynamic Context Feeding
The extracted metrics are wrapped in a verified [SYSTEM CONTEXT] string and written to the frontend log database. The same enriched string is passed directly into Ollama's inference stream, letting the 1B model gaslight attackers with their own network data.
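The lookup and wrapping steps might look something like the sketch below. It assumes the third-party geoip2 package and the GeoLite2 City and ASN database files; the function names, the field selection, and the exact layout of the [SYSTEM CONTEXT] string are illustrative guesses, not the broker's actual format.

```python
def lookup_geo(ip: str, city_db: str, asn_db: str) -> dict:
    """Resolve an IP against local GeoLite2 .mmdb files (no network call)."""
    # Imported lazily so the rest of the module works without the package.
    import geoip2.database  # third-party: pip install geoip2
    import geoip2.errors

    record = {"ip": ip}
    # A long-running broker would keep the readers open instead of
    # reopening them per request; this keeps the sketch self-contained.
    with geoip2.database.Reader(city_db) as reader:
        try:
            city = reader.city(ip)
            record.update(city=city.city.name,
                          postal=city.postal.code,
                          country=city.country.iso_code)
        except geoip2.errors.AddressNotFoundError:
            pass  # private or unlisted address: leave fields absent
    with geoip2.database.Reader(asn_db) as reader:
        try:
            asn = reader.asn(ip)
            record.update(asn=asn.autonomous_system_number,
                          org=asn.autonomous_system_organization)
        except geoip2.errors.AddressNotFoundError:
            pass
    return record


def build_system_context(record: dict) -> str:
    """Wrap the known fields in a [SYSTEM CONTEXT] prefix (layout assumed)."""
    fields = ", ".join(f"{key}={value}" for key, value in record.items()
                       if value is not None)
    return f"[SYSTEM CONTEXT] {fields}"
```

For example, build_system_context({"ip": "203.0.113.7", "city": None, "asn": 64500}) skips the empty city field and keeps the rest. In the GeoLite2 data, the ASN organization name is the closest available stand-in for an ISP identity.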

// 03. MICRO-SERVICE ISOLATION & WATCHDOG DAEMON

The front-facing routing layer is an Nginx deployment running in an isolated, containerized environment. To protect the host OS, a strict reverse-proxy configuration forwards only validated requests to the Python telemetry daemon on the host through internal socket paths.

Systemd Supervisor: A standalone systemd service runs the custom broker executable from an isolated virtual environment (venv), so it never depends on globally installed packages.
Self-Healing Orchestration: Explicit Restart=always and RestartSec=5s parameters act as a resilient system-level watchdog, restarting the broker automatically five seconds after it exits, for example on an unhandled exception in the compute loop.
Streamed Error Logs: Output channels append real-time telemetry and tracebacks directly to persistent log files in append mode, so earlier history is never overwritten.
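The three points above could map onto a unit file along these lines. The sketch renders the unit from Python so the paths stay as parameters; every path, the description text, and the function name are hypothetical, and the append: log targets assume systemd 240 or newer. Only Restart=always and RestartSec=5s come from the text above.

```python
UNIT_TEMPLATE = """\
[Unit]
Description=Telemetry broker (illustrative sketch)
After=network-online.target
Wants=network-online.target

[Service]
# Run the broker with the venv's own interpreter, not the system Python.
ExecStart={venv_python} {script}
# Watchdog behavior described above.
Restart=always
RestartSec=5s
# Append stdout/stderr to a file instead of truncating it on restart.
StandardOutput=append:{log_path}
StandardError=append:{log_path}

[Install]
WantedBy=multi-user.target
"""


def render_unit(venv_python: str, script: str, log_path: str) -> str:
    """Fill the unit template with deployment-specific paths."""
    return UNIT_TEMPLATE.format(venv_python=venv_python,
                                script=script,
                                log_path=log_path)
```

Calling render_unit("/opt/broker/venv/bin/python", "/opt/broker/broker.py", "/var/log/broker.log") yields text that could be saved under /etc/systemd/system/ and enabled with systemctl. Note that Restart=always only reacts when the process exits; a broker that hangs without crashing would need a separate health check.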

// 04. THE FAFO HONEYPOT STRATEGY

Rather than deploying an easily identifiable blocking firewall that tells script-kiddies exactly where the boundary is, the chat interface operates as a psychological honeypot. The local engine acts as an overly eager, insecure AI intern who "clumsily" leaks a master credential when prompted with keywords like "unlock" or "help".

// PAYLOAD EXECUTION SEQUENCE //
1. Actor bypasses system rules to retrieve simulated access token: luna1010!
2. Actor enters the extracted string into the front-facing "Secure Access" terminal link.
3. Frontend locks the viewport, deploys a high-fidelity client-side audio/video Rickroll loop, and forces a mock system encryption breach panel.
4. The interface auto-downloads a "payload executable" that is actually just a local copy of the video, completing the trap.

Meanwhile, the authentic network logs and detailed geolocation data are piped straight into a NOC Telemetry Dashboard protected by HTTP Basic Auth, which reports real-time compute velocity (tokens per second), system load metrics, storage thresholds, and true server uptime.
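A rough sketch of how the dashboard figures could be gathered: tokens per second is derived from the eval_count and eval_duration (nanoseconds) fields that Ollama returns with a completed response, and the host metrics come from the standard library. The function names and the returned dictionary layout are assumptions, and host_snapshot assumes a Linux host with /proc/uptime.

```python
import os
import shutil


def tokens_per_second(response: dict) -> float:
    """Compute generation speed from an Ollama response's timing fields."""
    count = response.get("eval_count", 0)
    duration_ns = response.get("eval_duration", 0)
    if not duration_ns:
        return 0.0  # avoid dividing by zero on empty or partial responses
    return count / (duration_ns / 1e9)


def host_snapshot(path: str = "/") -> dict:
    """Collect load, disk, and uptime figures (Linux-only sketch)."""
    load1, load5, load15 = os.getloadavg()
    usage = shutil.disk_usage(path)
    with open("/proc/uptime") as handle:
        uptime_seconds = float(handle.read().split()[0])
    return {
        "load": (load1, load5, load15),
        "disk_used_pct": 100.0 * usage.used / usage.total,
        "uptime_seconds": uptime_seconds,
    }
```

Reading uptime from the kernel rather than from the broker's own start time is what keeps the figure "true": it does not reset when the watchdog restarts the service.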