# SEC-MS-02 — Streaming Upload Refactor (Requirements Draft)

**Goal**

Replace the current “single POST with multipart FormData” guest upload with a streaming / chunked pipeline that:

- avoids buffering entire files in PHP memory
- supports larger assets (target 25 MB originals)
- keeps antivirus/EXIF scrubbing and storage accounting intact
- exposes clear retry semantics to the guest PWA

This document captures the scope for SEC-MS-02 and feeds into implementation tickets.

---

## 1. Current State (Baseline)

- Upload endpoint: `POST /api/v1/events/{token}/upload` handled by `EventPublicController::upload`.
- Laravel validation enforces `image|max:6144` (≈6 MB). The entire file is received via `Request::file('photo')`.
- Storage flow: `Storage::disk($hotDisk)->putFile(...)` followed by synchronous thumbnail creation and `event_media_assets` bookkeeping.
- Device rate limiting: simple counter (`guest_name` = device id) per event.
- Security: join token validation + IP rate limiting; antivirus/EXIF cleanup handled asynchronously by `ProcessPhotoSecurityScan` (queued).
- Frontend: guest PWA uses `fetch` + FormData; progress handled by custom XHR queue for UI feedback.

Pain points:

- Upload size ceiling due to PHP `post_max_size` + memory usage.
- Slow devices stall the controller request; no streaming/chunk resume.
- Throttling/locks only consider completed uploads; partial data still consumes bandwidth.

---

## 2. Target Architecture Overview

### 2.1 Session-Based Chunk Upload

1. **Create session**
   - `POST /api/v1/events/{token}/uploads` → returns `upload_id`, `upload_key`, storage target, chunk size.
   - Validate join token + device limits *before* accepting session. Record session in new table `event_upload_sessions`.
2. **Upload chunks**
   - `PUT /api/v1/events/{token}/uploads/{upload_id}/chunk` with headers: `Content-Range`, `Content-Length`, `Upload-Key`.
   - Chunks are written to a hot-storage *stream* destination (e.g. `storage/app/uploads/{upload_id}/chunk_{index}`) by copying the request body stream (`php://input`) via `fopen`, without buffering the whole chunk in memory.
   - Track received ranges in session record; enforce sequential or limited parallel chunks.
3. **Complete upload**
   - `POST /api/v1/events/{token}/uploads/{upload_id}/complete`
   - Assemble chunks → single file (use stream copy to final path), compute checksum, dispatch queue jobs (AV/EXIF, thumbnail).
   - Persist `photos` row + `event_media_assets` references (mirroring current logic).
4. **Abort**
   - `DELETE /api/v1/events/{token}/uploads/{upload_id}` to clean up partial data.

### 2.2 Storage Strategy

- Use `EventStorageManager` hot disk but with temporary “staging” directory.
- After successful assembly, move to final `events/{eventId}/photos/{uuid}.ext`.
- For S3 targets, evaluate direct multipart upload to S3 using pre-signed URLs:
  - Option A (short-term): stream into local disk, then background job pushes to S3.
  - Option B (stretch): delegate chunk upload directly to S3 using `createMultipartUpload`, storing uploadId + partETags.
- Ensure staging cleanup job removes abandoned sessions (cron every hour).

### 2.3 Metadata & Limits

- New table `event_upload_sessions` fields: `id (uuid)`, `event_id`, `join_token_id`, `device_id`, `status (pending|uploading|assembling|failed|completed)`, `total_size`, `received_bytes`, `chunk_size`, `expires_at`, `failure_reason`, timestamps.
- Device/upload limits: enforce daily cap per device via session creation; consider max concurrent sessions per device/token (default 2).
- Maximum file size: 25 MB (configurable via `config/media.php`). Validate at `complete` by comparing expected vs actual bytes.

### 2.4 Validation & Security

- Require `Upload-Key` secret per session (stored hashed) to prevent hijacking.
- Join token + device validations reused; log chunk IP + UA for anomaly detection.
- Abort sessions on repeated integrity failures or mismatched `Content-Range`.
- Update rate limiter to consider `PUT` chunk endpoints separately.

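
The `Content-Range` checks from Sections 2.1 and 2.4 could look roughly like the sketch below. It is written in TypeScript purely to illustrate the logic; the real check would live in the Laravel controller. The `SessionState` shape, the function names, and the strictly sequential policy are assumptions for this example. The error strings reuse the codes proposed for the API (`range_mismatch`, `chunk_out_of_order`).

```typescript
// Hypothetical session snapshot; field names mirror the proposed
// event_upload_sessions columns (total_size, received_bytes, chunk_size).
interface SessionState {
  totalSize: number;
  receivedBytes: number;
  chunkSize: number;
}

type ChunkCheck =
  | { ok: true; start: number; end: number }
  | { ok: false; error: "range_mismatch" | "chunk_out_of_order" };

// Parses "bytes <start>-<end>/<total>". The end offset is inclusive.
function parseContentRange(
  header: string
): { start: number; end: number; total: number } | null {
  const m = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(header.trim());
  if (!m) return null;
  const start = Number(m[1]);
  const end = Number(m[2]);
  const total = Number(m[3]);
  return start <= end && end < total ? { start, end, total } : null;
}

// Validates one chunk request against the session (assumed sequential policy).
function checkChunk(
  session: SessionState,
  contentRange: string,
  contentLength: number
): ChunkCheck {
  const range = parseContentRange(contentRange);
  if (!range) return { ok: false, error: "range_mismatch" };

  const length = range.end - range.start + 1;
  if (
    range.total !== session.totalSize || // client changed the declared size
    length !== contentLength ||          // header and body length disagree
    length > session.chunkSize           // chunk larger than negotiated
  ) {
    return { ok: false, error: "range_mismatch" };
  }
  // The next chunk must begin exactly where the accepted bytes end.
  if (range.start !== session.receivedBytes) {
    return { ok: false, error: "chunk_out_of_order" };
  }
  return { ok: true, start: range.start, end: range.end };
}
```

If limited parallel chunks are allowed instead, the final comparison would be replaced by a lookup against the set of received ranges rather than a single `receivedBytes` offset.
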
### 2.5 API Responses & Errors

- Provide consistent JSON:
  - `201` create: `{ upload_id, upload_key, chunk_size, expires_at }`
  - chunk success: `204`
  - complete: `201 { photo_id, file_path, thumbnail_path }`
  - error codes: `upload_limit`, `chunk_out_of_order`, `range_mismatch`, `session_expired`.
- Document in `docs/prp/03-api.md` + update guest SDK.

### 2.6 Backend Jobs

- Assembly job (if asynchronous) ensures chunk merge is offloaded for large files; update `ProcessPhotoSecurityScan` to depend on final asset record.
- Add metric counters (Prometheus/Laravel events) for chunk throughput, failed sessions, average complete time.

---

## 3. Frontend Changes (Guest PWA)

- Replace current FormData POST with streaming uploader:
  - Request session, slice file into `chunk_size` (default 1 MB) using `Blob.slice`, upload sequentially with retry/backoff.
  - Show granular progress (bytes uploaded / total).
  - Support resume: store `upload_id` & received ranges in IndexedDB; on reconnect query session status from new endpoint `GET /api/v1/events/{token}/uploads/{upload_id}`.
- Ensure compatibility fallback: if the browser lacks required APIs (e.g. old Safari), fall back to the legacy single POST (size-limited) with a warning.
- Update service worker/queue to pause/resume chunk uploads when offline.

---

## 4. Integration & Migration Tasks

1. **Schema**: create `event_upload_sessions` table + indices; optional `event_upload_chunks` if tracking per-part metadata.
2. **Config**: new entries in `config/media.php` for chunk size, staging path, session TTL, max size.
3. **Env**: add `.env` knobs (e.g. `MEDIA_UPLOAD_CHUNK_SIZE=1048576`, `MEDIA_UPLOAD_MAX_SIZE=26214400`).
4. **Cleanup Command**: `php artisan media:prune-upload-sessions` to purge expired sessions & staging files. Hook into cron `/cron/media-prune-sessions.sh`.
5. **Docs**: update PRP (sections 03, 10) and guest PWA README; add troubleshooting guide for chunk upload errors.
6. **Testing**:
   - Unit: session creation, chunk validation, assembly with mocked storage.
   - Feature: end-to-end upload success + failure (PHPUnit).
   - Playwright: simulate chunked upload with network throttling.
   - Load: ensure concurrent uploads do not exhaust disk IO.

---

## 5. Open Questions

- **S3 Multipart vs. Local Assembly**: confirm timeline for direct-to-S3; MVP may prefer local assembly to limit complexity.
- **Encryption**: decide whether staging chunks require at-rest encryption (likely yes if hot disk is shared).
- **Quota Enforcement**: should device/event caps be session-based (limit sessions) or final photo count (existing)? Combine both?
- **Backward Compatibility**: decide when to retire legacy endpoint; temporarily keep `/upload` fallback behind feature flag.

---

## 6. Next Steps

- Finalise design choices (S3 vs local) with Media Services.
- Break down into implementation tasks (backend API, frontend uploader, cron cleanup, observability).
- Schedule dry run in staging with large sample files (20+ MB) and monitor memory/CPU.
- Update SEC-MS-02 ticket checklist with deliverables above.
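
As a starting point for the frontend uploader ticket (Section 3), the chunk planning and retry delay could be sketched as below. This is a minimal illustration, not the agreed design: `planChunks`, `backoffDelayMs`, and the 500 ms base and 15 s cap are hypothetical. It also assumes the session status endpoint reports a chunk-aligned `received_bytes` when resuming.

```typescript
// One planned chunk: `end` is exclusive so it can be passed straight to
// Blob.slice(start, end); `contentRange` uses the inclusive HTTP form.
interface ChunkPlan {
  index: number;
  start: number;
  end: number;
  contentRange: string;
}

// Splits the remaining bytes of a file into chunk-sized ranges.
// `receivedBytes` lets a resumed upload skip what the server already holds
// (assumed to be a multiple of chunkSize).
function planChunks(
  totalSize: number,
  chunkSize: number,
  receivedBytes: number = 0
): ChunkPlan[] {
  if (totalSize <= 0 || chunkSize <= 0 || receivedBytes < 0) {
    throw new RangeError("sizes must be positive");
  }
  const plans: ChunkPlan[] = [];
  for (let start = receivedBytes; start < totalSize; start += chunkSize) {
    const end = Math.min(start + chunkSize, totalSize);
    plans.push({
      index: Math.floor(start / chunkSize),
      start,
      end,
      contentRange: `bytes ${start}-${end - 1}/${totalSize}`,
    });
  }
  return plans;
}

// Exponential backoff for chunk retries: base * 2^attempt, capped.
// Base and cap are placeholder values, not project settings.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 15000
): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

In the PWA, each plan entry would drive one `PUT .../chunk` request with `file.slice(plan.start, plan.end)` as the body and `plan.contentRange` as the `Content-Range` header. A failed request would wait `backoffDelayMs(attempt)` before the next try.
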