Worker Activity Monitor
Desktop App · In Development
Worker Activity Monitor is a production-grade desktop application built for workforce management. It provides real-time, multi-dimensional productivity analysis by combining physical input tracking, application usage analytics, AI-powered attention detection, and intelligent content classification. The system uses a dual-process architecture (Flutter + Python) with a JSON streaming protocol, where the Python sidecar runs MediaPipe Face Mesh locally for gaze detection and head pose estimation — all with zero cloud dependency. A dual-timer system provides per-second polling for real-time UI updates alongside per-minute aggregation for efficient database writes.
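The dual-timer pattern described above can be sketched as follows. This is a minimal illustration, not the app's actual code: the names (`ActivitySample`, `DualTimerAggregator`, the per-tick methods) are hypothetical, and the real app performs this split between Flutter timers and SQLite writes.

```python
# Sketch of the dual-timer pattern: a per-second sampler feeds an in-memory
# buffer for live UI updates, while a per-minute flush collapses the buffer
# into a single aggregated row for an efficient database write.
from dataclasses import dataclass


@dataclass
class ActivitySample:
    mouse_moves: int
    key_presses: int


class DualTimerAggregator:
    def __init__(self) -> None:
        self._buffer: list[ActivitySample] = []

    def on_second_tick(self, sample: ActivitySample) -> ActivitySample:
        """Called every second: buffer the sample and return it for the UI."""
        self._buffer.append(sample)
        return sample

    def on_minute_tick(self) -> dict:
        """Called every minute: reduce buffered samples to one DB-ready row."""
        row = {
            "samples": len(self._buffer),
            "mouse_moves": sum(s.mouse_moves for s in self._buffer),
            "key_presses": sum(s.key_presses for s in self._buffer),
        }
        self._buffer.clear()
        return row
```

The design keeps the write path off the hot per-second loop: the UI reads fresh samples every second, but the database sees at most one row per minute.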
Key Features
- Real-time mouse movement and keyboard input tracking via Win32 FFI with configurable idle thresholds (10-60s)
- Application usage analytics with foreground app detection and friendly name mapping for 25+ applications
- AI-powered eye tracking using local MediaPipe Face Mesh (468-point landmarks) at ~5 FPS for EAR, head pose, iris gaze, and face occlusion detection
- Combined attention status merging physical activity + eye tracking into 4 states: Active, Watching, Suspicious, Idle
- Smart video content detection classifying watching as productive or idle across 30+ streaming/educational platforms
- Typing quality analysis with per-minute keystroke metrics, burst patterns, key ratios, and suspicious activity detection
- Shift lifecycle management with crash-resilient resume, timeline reconstruction, and auto-end at configured time
- Professional PDF reports with shift summary, activity timeline charts, app usage breakdown, and attention metrics
- System tray integration with minimal footprint and full shift control from the taskbar notification area
- Browser URL tracking via COM automation supporting Chrome, Edge, Firefox, Brave, Opera, and Vivaldi
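The merge of physical activity and eye tracking into the four attention states could look roughly like this. The signal names and the exact precedence rules are assumptions for illustration; the app's real state machine also handles camera unavailability and occlusion timing.

```python
def classify_attention(input_active: bool, face_visible: bool,
                       gaze_on_screen: bool, video_playing: bool) -> str:
    """Illustrative merge of input tracking + gaze detection into the four
    states listed above. All thresholds/signals here are assumed, not the
    app's actual rules."""
    if input_active and (not face_visible or gaze_on_screen):
        return "Active"      # typing/mousing; gaze confirms or camera is blocked
    if not input_active and face_visible and gaze_on_screen and video_playing:
        return "Watching"    # no input, but attentively viewing playing content
    if input_active and face_visible and not gaze_on_screen:
        return "Suspicious"  # input arriving while gaze is off-screen
    return "Idle"
```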
Technical Details
Built with Flutter Desktop (Windows) using Provider for state management, fl_chart for data visualization, and PDF generation for reports. The AI layer runs a Python sidecar using MediaPipe FaceLandmarker with a 468-point face mesh for Eye Aspect Ratio calculation, 3D head pose estimation via solvePnP, iris gaze tracking, and distance-adaptive thresholds. System integration uses Win32 API calls via Dart FFI for mouse tracking, keyboard input detection, and foreground window identification, plus COM Automation for browser URL extraction. The dual-process architecture communicates over a JSON streaming protocol via stdout, enabling graceful degradation if Python is unavailable. An SQLite database with 7 progressive schema migrations stores per-second data alongside per-minute aggregates. The app features 9 screens, including a dashboard with live status, reports with date navigation, app deep-dives, a watching timeline, and settings.
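The Eye Aspect Ratio mentioned above is computed from six landmarks around each eye. A minimal sketch, assuming the left-eye landmark indices commonly used with MediaPipe's 468-point mesh (these indices, and the 2D-only distance, are assumptions; the app may use different points or normalization):

```python
import math

# Left-eye landmark indices (p1..p6 of the standard EAR formulation) commonly
# used with MediaPipe's 468-point face mesh -- treat as an assumption.
LEFT_EYE = (33, 160, 158, 133, 153, 144)


def dist(a: tuple, b: tuple) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eye_aspect_ratio(landmarks: dict) -> float:
    """EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|).

    Stays roughly constant while the eye is open and drops toward zero
    during a blink, which makes it a cheap per-frame openness signal.
    """
    p1, p2, p3, p4, p5, p6 = (landmarks[i] for i in LEFT_EYE)
    return (dist(p2, p6) + dist(p3, p5)) / (2.0 * dist(p1, p4))
```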
Challenges & Solutions
The biggest challenge was implementing reliable AI eye tracking without cloud dependencies. MediaPipe Face Mesh runs at ~5 FPS locally, requiring careful optimization of the Python sidecar and efficient JSON streaming to avoid blocking Flutter's UI thread. Combining physical activity data with eye tracking into a unified attention status required designing a state machine that handles edge cases like face occlusion and camera unavailability gracefully. Browser URL extraction via COM automation needed per-browser handling since each browser exposes accessibility differently. The 7-version database migration system had to preserve per-second data granularity while evolving the schema from basic activity logs to full attention and watching session tracking.
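A progressive migration scheme like the 7-version one described above can be sketched with SQLite's `PRAGMA user_version`. The table and column names here are illustrative only, not the app's real schema:

```python
import sqlite3

# Each entry upgrades the schema by one version; PRAGMA user_version records
# how far a given database has been migrated, so reruns are no-ops.
MIGRATIONS = [
    "CREATE TABLE activity (minute TEXT PRIMARY KEY, key_presses INTEGER)",
    "ALTER TABLE activity ADD COLUMN mouse_moves INTEGER DEFAULT 0",
    "ALTER TABLE activity ADD COLUMN attention_state TEXT",
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply any pending migrations and return the resulting schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, sql in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(sql)
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]
```

Because each step only appends to the schema, older per-second rows survive every upgrade, which is the property the text above calls out.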
Role
Full Stack Developer
Duration
Ongoing
Status
In Development