Optimise parameters

2026-03-16 15:20:59 +00:00
parent ee8fa0960d
commit c447cc41ea
2 changed files with 200 additions and 16 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,92 @@
+# Rowing Stats
+
+Extract workout data from photos of Concept2 PM5 rowing machine displays using OpenCV screen detection, Tesseract OCR, and (optionally) Claude's vision API.
+
+## How It Works
+
+Photos go through a three-stage pipeline:
+
+```
+photos/ → crop_to_screen.py → screen_classifier.py → extract_screen_data.py → rowing_results.csv
+```
+
+1. **Screen Detection** (`crop_to_screen.py`) — Finds and perspective-corrects the LCD screen region using OpenCV edge detection, contour filtering, and morphological operations. Candidates are scored by `edge_density × area × rectangularity`.
+2. **Classification** (`screen_classifier.py`) — Filters out non-rowing images. Supports a rule-based feature scorer (no training needed) and a 4-layer CNN with batch norm.
+3. **Data Extraction** (`extract_screen_data.py`) — Extracts time and distance from cropped screen images using Tesseract OCR with multiple preprocessing variants (CLAHE, thresholding, scaling) and majority-vote extraction.
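
The scoring rule from stage 1 can be illustrated with a minimal sketch. The `Candidate` class, its field names, and the assumption that every factor is normalised to the range 0..1 are hypothetical; the README only states that candidates are scored by `edge_density × area × rectangularity`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    """Hypothetical container for the measurements of one detected region."""
    name: str
    edge_density: float    # assumed: fraction of edge pixels inside the region
    area: float            # assumed: region area as a fraction of the image
    rectangularity: float  # assumed: contour area / bounding-rectangle area


def score(candidate: Candidate) -> float:
    """Product of the three factors named in the README."""
    return candidate.edge_density * candidate.area * candidate.rectangularity


def best_candidate(candidates: Iterable[Candidate]) -> Optional[Candidate]:
    """Return the highest-scoring candidate, or None if there are none."""
    return max(candidates, key=score, default=None)


candidates = [
    Candidate("small_sharp", edge_density=0.5, area=0.1, rectangularity=1.0),
    Candidate("large_blurry", edge_density=0.1, area=0.5, rectangularity=0.5),
    Candidate("screen_like", edge_density=0.5, area=0.5, rectangularity=0.5),
]
best = best_candidate(candidates)
print(best.name)
```

Because the factors are multiplied, a region needs a reasonable value on all three to win. A large but featureless region, or a sharp but tiny one, scores low.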
+
+An alternative extractor, `extract_rowing_data.py`, uses Claude Haiku's vision API instead of Tesseract. It serves as a reference for validating OCR accuracy, but each run incurs API costs.
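
On the Tesseract path, the readings from the different preprocessing variants have to be reconciled before they can be compared against any reference. The majority vote mentioned in stage 3 might look roughly like this. The function name and the tie-breaking rule are assumptions, not taken from `extract_screen_data.py`.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(readings: Sequence[Optional[str]]) -> Optional[str]:
    """Return the most common non-empty reading, or None if nothing was read.

    Assumed tie-break: the reading that appeared first wins, which is how
    Counter.most_common orders elements with equal counts.
    """
    cleaned = [r.strip() for r in readings if r and r.strip()]
    if not cleaned:
        return None
    return Counter(cleaned).most_common(1)[0][0]


# One OCR reading per preprocessing variant; None marks a failed read.
time_readings = ["7:32.1", "7:32.1", "1:32.1", None, "7:32.1"]
distance_readings = ["2000", " 2000", "2008"]

print(majority_vote(time_readings))
print(majority_vote(distance_readings))
```

Voting on the raw strings keeps the sketch simple. A real implementation might normalise the readings first (for example, parsing times into seconds) so that formatting differences do not split the vote.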
+
+An Optuna-based hyperparameter tuner, `optimize_crop.py`, searches for better screen detection parameters.
+
+## Setup
+
+### Dependencies
+
+```bash
+pip install anthropic torch torchvision opencv-python Pillow numpy optuna
+```
+
+### API Key
+
+Create a `.env` file with your Anthropic API key:
+
+```
+ANTHROPIC_API_KEY=sk-ant-...
+```
+
+## Usage
+
+### Full pipeline
+
+```bash
+# 1. Crop screens from photos
+python crop_to_screen.py photos/ cropped/
+
+# 2. Classify — keep only rowing displays
+python screen_classifier.py predict --dir cropped/
+
+# 3. Extract workout data via Tesseract OCR
+python extract_screen_data.py --dir cropped/
+
+# 3b. (Test) Extract via Claude API — more expensive, useful for validating OCR accuracy
+python extract_rowing_data.py --dir photos/
+```
+
+### Individual commands
+
+```bash
+# Classify a single image (feature-based or CNN)
+python screen_classifier.py predict --image path/to/img.jpg
+python screen_classifier.py predict --image path/to/img.jpg --mode cnn
+
+# Extract data from a single image (Tesseract OCR)
+python extract_screen_data.py --image path/to/img.jpg
+
+# Extract data from a single image (Claude API — for testing/validation)
+python extract_rowing_data.py --image path/to/img.jpg
+
+# Train the CNN classifier
+python screen_classifier.py train --data-dir train/
+
+# Optimize crop detection parameters
+python optimize_crop.py --n-trials 300 --photos-dir photos/
+```
+
+## Training Data
+
+The CNN classifier trains on labeled images in `train/`:
+
+- `train/0/` — non-rowing images (negatives)
+- `train/1/` — rowing display images (positives)
+
+The trained model is saved as `screen_classifier_model.pth`.
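
To illustrate the directory convention, here is a standard-library sketch that pairs each image with the label implied by its folder. The function name and the accepted file extensions are assumptions. The actual training script may load data differently, for instance through a torchvision dataset class.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to match the real data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(data_dir):
    """Return (path, label) pairs for images under data_dir/0 and data_dir/1."""
    samples = []
    for label in (0, 1):
        class_dir = Path(data_dir) / str(label)
        if not class_dir.is_dir():
            continue
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


# Demo on a throwaway directory that mimics the train/ layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("0/cat.jpg", "1/row_a.jpg", "1/row_b.PNG", "1/notes.txt"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    samples = list_labeled_images(root)

print([(path.name, label) for path, label in samples])
```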
+
+## Validation
+
+Extracted metrics are validated against sensible bounds:
+
+| Metric   | Min       | Max       |
+| -------- | --------- | --------- |
+| Distance | 100 m     | 100,000 m |
+| Time     | 30 s      | 2 h       |
+| Pace     | 1:20/500m | 2:30/500m |
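
The bounds in the table above could be checked with something like the following sketch. The constant and function names are hypothetical. The sketch also assumes pace is derived from time and distance rather than read from the display, which the README does not state.

```python
# Bounds copied from the table; time and pace are expressed in seconds.
MIN_DISTANCE_M, MAX_DISTANCE_M = 100, 100_000
MIN_TIME_S, MAX_TIME_S = 30, 2 * 60 * 60
MIN_PACE_S, MAX_PACE_S = 80, 150  # 1:20 and 2:30 per 500 m


def pace_per_500m(distance_m: float, time_s: float) -> float:
    """Average pace in seconds per 500 m (assumed to be derived, not read)."""
    return time_s / distance_m * 500


def validate(distance_m: float, time_s: float) -> list:
    """Return a list of problems; an empty list means all metrics are in range."""
    problems = []
    if not MIN_DISTANCE_M <= distance_m <= MAX_DISTANCE_M:
        problems.append("distance out of range")
    if not MIN_TIME_S <= time_s <= MAX_TIME_S:
        problems.append("time out of range")
    if distance_m > 0:
        pace = pace_per_500m(distance_m, time_s)
        if not MIN_PACE_S <= pace <= MAX_PACE_S:
            problems.append("pace out of range")
    return problems


print(validate(2000, 450))  # a 2 km piece in 7:30
print(validate(2000, 200))  # implausibly fast, likely a misread digit
print(validate(50, 20))     # too short on every metric
```

The pace check is the most useful of the three for catching OCR errors: a single misread digit in the time or distance often leaves each value inside its own bounds but produces an implausible pace.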