Fix tenant event form package selector so it no longer renders empty-value options, handles loading/empty
states, and pulls data from the authenticated /api/v1/tenant/packages endpoint.
(resources/js/admin/pages/EventFormPage.tsx, resources/js/admin/api.ts)
- Harden tenant-admin auth flow: prevent PKCE state loss, scope out StrictMode double-processing, add SPA
routes for /event-admin/login and /event-admin/logout, and tighten token/session clearing semantics (resources/js/admin/auth/{context,tokens}.tsx, resources/js/admin/pages/{AuthCallbackPage,LogoutPage}.tsx,
resources/js/admin/router.tsx, routes/web.php)
28
docs/deployment/join-token-analytics.md
Normal file
@@ -0,0 +1,28 @@
# Join Token Analytics & Alerting (SEC-GT-02)

## Data Sources
- Table `event_join_token_events` captures successes, failures, rate-limit hits, and uploads per join token.
- Each row records route, device id, IP, HTTP status, and context for post-incident drill-downs.
- Logged automatically from `EventPublicController` for `/api/v1/events/*` and `/api/v1/gallery/*`.

- Super Admin: Event resource → “Join Link / QR” modal now summarises total successes/failures, rate-limit hits, 24h volume, and last activity timestamp per token.
- Tenant Admin: identical modal surface so operators can monitor invite health.

## Alert Thresholds (initial)
- **Rate limit spike**: >25 `token_rate_limited` entries for a token within 10 minutes → flag in monitoring (Grafana/Prometheus TODO).
- **Failure ratio**: `failure_count / success_count > 0.5` over a rolling hour triggers a warning for support follow-up.
- **Inactivity**: tokens without access for >30 days should be reviewed; scheduled report TBD.
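
The first two thresholds are plain comparisons, so they can be checked from a shell once the counts are known. The following is a minimal sketch and not part of the application: the function names `rate_limit_alert` and `failure_ratio_alert` are hypothetical, the counts are assumed to come from a query against `event_join_token_events`, and treating "zero successes with at least one failure" as a breach is an assumption the thresholds above do not spell out.

```bash
# Thresholds copied from the list above.
RATE_LIMIT_SPIKE_MAX=25   # token_rate_limited entries per token per 10 minutes
FAILURE_RATIO_MAX=0.5     # failure_count / success_count over a rolling hour

# rate_limit_alert COUNT
# Prints "alert" when COUNT exceeds the spike threshold, otherwise "ok".
rate_limit_alert() {
  if [ "$1" -gt "$RATE_LIMIT_SPIKE_MAX" ]; then
    echo alert
  else
    echo ok
  fi
}

# failure_ratio_alert FAILURES SUCCESSES
# Prints "warn" when the ratio exceeds the threshold, otherwise "ok".
# Assumption: with zero successes, any failure counts as a breach.
failure_ratio_alert() {
  awk -v f="$1" -v s="$2" -v max="$FAILURE_RATIO_MAX" 'BEGIN {
    if (s == 0) { print (f > 0 ? "warn" : "ok"); exit }
    print (f / s > max ? "warn" : "ok")
  }'
}
```

Both comparisons are strict, so exactly 25 rate-limit hits or a ratio of exactly 0.5 still reports `ok`, matching the `>` in the thresholds.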

Rate-limiter knobs (see `.env.example`):
- `JOIN_TOKEN_FAILURE_LIMIT` / `JOIN_TOKEN_FAILURE_DECAY`: repeated invalid attempts before temporary block (default 10 tries per 5 min).
- `JOIN_TOKEN_ACCESS_LIMIT` / `JOIN_TOKEN_ACCESS_DECAY`: successful request ceiling per token/IP (default 120 req per minute).
- `JOIN_TOKEN_DOWNLOAD_LIMIT` / `JOIN_TOKEN_DOWNLOAD_DECAY`: download ceiling per token/IP (default 60 downloads per minute).

## Follow-up Tasks
1. Wire aggregated metrics into Grafana once the metrics pipeline is ready (synthetic monitors pending SEC-GT-03).
2. Implement a scheduled command to email tenants a weekly digest of token activity and stale tokens.
3. Consider anonymising device identifiers before long-term retention (privacy review).

## Runbook Notes
- Analytics table may grow quickly for high-traffic events; plan a nightly prune job (keep 90 days).
- Use `php artisan tinker` to inspect token activity: `EventJoinTokenEvent::where('event_join_token_id', $id)->latest()->limit(20)->get()`.
58
docs/deployment/oauth-key-rotation.md
Normal file
@@ -0,0 +1,58 @@
# OAuth JWT Key Rotation Playbook (Dual-Key)

## Purpose
Ensure marketing/tenant OAuth tokens remain valid during RSA key rotations by keeping the previous signing key available until all legacy tokens expire.

## Prerequisites
- Environment variable `OAUTH_KEY_STORE` points to a shared filesystem (default `storage/app/oauth-keys`).
- `OAUTH_JWT_KID` set to the current signing key id.
- Application deploy tooling able to propagate `.env` changes promptly.
- Operations access to run artisan commands in the target environment.

## Rotation Workflow

1. **Review existing keys**
   ```bash
   php artisan oauth:list-keys
   ```
   Confirm the `current` entry matches `OAUTH_JWT_KID`, and note any legacy KIDs that should remain trusted until rotation completes.

2. **Generate new key pair**
   ```bash
   php artisan oauth:rotate-keys --kid=fotospiel-jwt-$(date +%Y%m%d%H%M)
   ```
   - The command now *copies* the existing key into the `/archive` folder but leaves it in place for token verification.
   - After the command, run `php artisan oauth:list-keys` again to verify both the old and new KIDs exist.

3. **Update environment configuration**
   - Set `OAUTH_JWT_KID` to the newly generated value.
   - Deploy the updated config (restart queue workers/web instances if they cache config).

4. **Smoke test issuance**
   - Request a fresh OAuth token (PKCE flow) and inspect the JWT header: `kid` must match the new value.
   - Use an existing token issued **before** the rotation to hit a tenant API route; it should continue to verify because the old key remains present.

5. **Monitor**
   - Watch application logs for `Invalid token` / `JWT public key not found` errors over the next 24h.
   - Investigate any anomalies before pruning.
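
For the smoke test in step 4, the `kid` can be read from a token without a JWT library, because the header is base64url-encoded JSON. The following is a minimal sketch, assuming a compact three-part JWT and a `base64` binary that accepts `--decode`; the helper names `jwt_header` and `jwt_kid` are illustrative and not part of the project.

```bash
# jwt_header TOKEN
# Prints the decoded JSON header (the first dot-separated segment) of a JWT.
jwt_header() {
  local seg
  seg=${1%%.*}                                  # keep everything before the first "."
  seg=$(printf '%s' "$seg" | tr '_-' '/+')      # base64url alphabet -> standard base64
  case $(( ${#seg} % 4 )) in                    # restore the stripped "=" padding
    2) seg="$seg==" ;;
    3) seg="$seg=" ;;
  esac
  printf '%s' "$seg" | base64 --decode
}

# jwt_kid TOKEN
# Prints the value of the "kid" field from the header, or nothing if absent.
jwt_kid() {
  jwt_header "$1" | sed -n 's/.*"kid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Running `jwt_kid "$ACCESS_TOKEN"` (where `ACCESS_TOKEN` is a hypothetical variable holding a freshly issued token) should print the value now set in `OAUTH_JWT_KID`; for a token issued before the rotation it should print the previous KID. The helper only decodes the header and does not verify the signature.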

## Pruning Legacy Keys
Once the longest combined access-token and refresh-token lifetime has passed (default: 30 days for refresh tokens), prune the legacy signing directory.

```bash
php artisan oauth:prune-keys --days=45 --force
```

- Use `--dry-run` first to see which directories would be removed.
- The prune command never deletes the `current` KID.
- Archived copies remain under `storage/app/oauth-keys/archive/...` for forensics.

## Runbook Summary
| Step | Command | Outcome |
| --- | --- | --- |
| Inspect | `php artisan oauth:list-keys` | Inventory current + legacy keys |
| Rotate | `php artisan oauth:rotate-keys --kid=...` | Creates new key while keeping legacy key active |
| Verify | Issue new token + test old token | Ensures dual-key window works |
| Prune | `php artisan oauth:prune-keys --days=45` | Removes legacy key once safe |

Document completion of `SEC-IO-01` in `docs/todo/security-hardening-epic.md` when the rotation runbook has been rehearsed in staging.
106
docs/deployment/public-api-incident-playbook.md
Normal file
@@ -0,0 +1,106 @@
# Public API Incident Response Playbook (SEC-API-02)

Scope: Guest-facing API endpoints that rely on join tokens and power the guest PWA plus the public gallery. This includes:

- `/api/v1/events/{token}/*` (stats, tasks, uploads, photos)
- `/api/v1/gallery/{token}/*`
- Signed download/asset routes generated via `EventPublicController`

The playbook focuses on abuse, availability loss, and leaked content.

---

## 1. Detection & Alerting

| Signal | Where to Watch | Notes |
| --- | --- | --- |
| 4xx/5xx spikes | Application logs (`storage/logs/laravel.log`), centralized logging | Look for repeated `Join token access denied` / `token_rate_limited` or unexpected 5xx. |
| Rate-limit triggers | Laravel log lines emitted from `EventPublicController::handleTokenFailure` | Contains IP + truncated token preview. |
| CDN/WAF alerts | Reverse proxy (if enabled) | Ensure 429/403 anomalies are forwarded to incident channel. |
| Synthetic monitors | Planned via `SEC-API-03` | Placeholder until monitors exist. |

Manual check commands:

```bash
php artisan log:tail --lines=200 | grep "Join token"
php artisan log:tail --lines=200 | grep "gallery"
```
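
To see which addresses are driving rate-limit hits, the matching log lines can be grouped by IP. This is a rough sketch, assuming each `token_rate_limited` line contains the client IPv4 address somewhere in its text (the exact log format is not specified in this playbook); `rate_limit_hits_by_ip` is a hypothetical helper, not an existing command.

```bash
# rate_limit_hits_by_ip [FILE...]
# Reads log lines from FILE(s) or stdin, keeps those mentioning
# "token_rate_limited", and prints "<ip> <count>" sorted by count, busiest first.
rate_limit_hits_by_ip() {
  grep -h 'token_rate_limited' "$@" \
    | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{ print $2, $1 }'
}
```

For example, `rate_limit_hits_by_ip storage/logs/laravel.log | head` lists the busiest sources first. The pattern matches anything IPv4-shaped, so a line carrying several addresses is counted once per address.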

## 2. Severity Classification

| Level | Criteria | Examples |
| --- | --- | --- |
| SEV-1 | Wide outage (>50% error rate), confirmed data leak or malicious mass-download | Gallery downloads serving wrong event, join-token table compromised. |
| SEV-2 | Localised outage (single tenant/event) or targeted brute force attempting to enumerate tokens | Single event returning 500, repeated `invalid_token` from single IP range. |
| SEV-3 | Minor functional regression or cosmetic issue | Rate limit misconfiguration causing occasional 429 for legitimate users. |

Escalate SEV-1/2 immediately to on-call via Slack `#incident-response` and open PagerDuty incident (if configured).

## 3. Immediate Response Checklist

1. **Confirm availability**
   - `curl -I https://app.test/api/v1/gallery/{known_good_token}`
   - Use a tenant-provided test token to validate the `/events/{token}` flow.
2. **Snapshot logs**
   - Export last 15 minutes from log aggregator or `storage/logs`. Attach to incident ticket.
3. **Assess scope**
   - Identify affected tenant/event IDs via log context.
   - Note IP addresses triggering rate limits.
4. **Decide mitigation**
   - Brute force? → throttle/block offending IPs.
   - Compromised token? → revoke token via Filament or `php artisan tenant:join-tokens:revoke {id}` (once command exists).
   - Endpoint regression? → begin rolling fix or feature flag toggle.
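
The availability check in step 1 can be wrapped so the status code comes back with a coarse label for the incident ticket. This is a sketch under assumptions: the labels and the helper names `classify_status` and `probe` are hypothetical, and the mapping is a generic reading of HTTP status classes rather than a statement about which codes these endpoints return.

```bash
# classify_status CODE
# Maps an HTTP status code to a coarse label.
classify_status() {
  case "$1" in
    2??|3??) echo ok ;;
    429)     echo rate_limited ;;
    4??)     echo client_error ;;
    5??)     echo server_error ;;
    000)     echo unreachable ;;   # curl reports 000 when no response was received
    *)       echo unknown ;;
  esac
}

# probe URL
# Sends a HEAD request (like `curl -I`) and prints "<code> <label>".
probe() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 -I "$1") || true
  echo "$code $(classify_status "$code")"
}
```

Usage follows the checklist: `probe "https://app.test/api/v1/gallery/{known_good_token}"` with a real token substituted for the placeholder.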

## 4. Mitigation Tactics

### 4.1 Abuse / Brute force
- Increase rate-limiter strictness temporarily by editing `config/limiting.php` (if available) or applying a runtime block in the load balancer.
- Use fail2ban/WAF rules to block offending IPs. For quick local action:
  ```bash
  sudo ufw deny from <ip_address>
  ```
- Consider temporarily disabling gallery download by setting `PUBLIC_GALLERY_ENABLED=false` (feature flag planned) and clearing cache.
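
When several addresses need blocking at once, it is safer to generate the `ufw` commands for review than to run them blindly. A minimal sketch, assuming one address per line on stdin; `ufw_deny_commands` is a hypothetical helper that only prints commands and never executes them.

```bash
# ufw_deny_commands
# Reads addresses from stdin (one per line) and prints a "sudo ufw deny from"
# command for each IPv4-shaped entry. Other non-empty lines are reported on
# stderr and skipped, so a typo never reaches the firewall.
ufw_deny_commands() {
  local ip
  while IFS= read -r ip; do
    if printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
      echo "sudo ufw deny from $ip"
    elif [ -n "$ip" ]; then
      echo "skipping invalid address: $ip" >&2
    fi
  done
}
```

The printed commands can be pasted into the incident ticket before being run, which also leaves a record of what to remove once the threat subsides. The check is purely syntactic and does not reject out-of-range octets.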

### 4.2 Token Compromise
- Revoke specific token via Filament “Join Tokens” modal (Event → Join Tokens → revoke).
- Notify tenant with replacement token instructions.
- Audit join-token logs for additional suspicious use and consider rotating all tokens for the event.

### 4.3 Internal Failure (500s)
- Tail logs for stack traces.
- If due to downstream storage, fail closed: return 503 with maintenance banner while running `php artisan storage:diagnostics`.
- Roll back recent deployment or disable new feature flag if traced to release.

## 5. Communication

| Audience | Channel | Cadence |
| --- | --- | --- |
| Internal on-call | Slack `#incident-response`, PagerDuty | Initial alert, hourly updates. |
| Customer Support | Slack `#support` with summary | Once per significant change (mitigation applied, issue resolved). |
| Tenants | Email template “Public gallery disruption” (see `resources/lang/*/emails.php`) | Only for SEV-1 or impactful SEV-2 after mitigation. |

Document timeline, impact, and mitigation in the incident ticket.

## 6. Verification & Recovery

After applying mitigation:

1. Re-run test requests for affected endpoints.
2. Validate join-token creation/revocation via Filament.
3. Confirm error rates return to baseline in monitoring/dashboard.
4. Remove temporary firewall blocks once threat subsides.

## 7. Post-Incident Actions

- File RCA within 48 hours including: root cause, detection gaps, follow-up tasks (e.g., enabling synthetic monitors, adding audit fields).
- Update documentation if new procedures are required (`docs/prp/11-public-gallery.md`, `docs/prp/03-api.md`).
- Schedule backlog items for long-term fixes (e.g., better anomaly alerting, token analytics dashboards).

## 8. References & Tools

- Log aggregation: `storage/logs/laravel.log` (local), Stackdriver/Splunk (staging/prod).
- Rate limit config: `App\Providers\AppServiceProvider` → `RateLimiter::for('tenant-api')` and `EventPublicController::handleTokenFailure`.
- Token management UI: Filament → Events → Join Tokens.
- Signed URL generation: `app/Http/Controllers/Api/EventPublicController` (for tracing download issues).

Keep this document alongside the other deployment runbooks and review quarterly.