Add control room help article and move ops docs
docs/ops/admin-issue-resolution.md
@@ -0,0 +1,22 @@
# Admin Issue Resolution (Ops Playbook)

Internal troubleshooting guide for superadmins and on-call.

## Upload incidents
| Symptom | Likely cause | First action |
| --- | --- | --- |
| Queue stuck >10 min | Workers stalled or storage pressure | Check queue workers and storage health; see `docs/ops/queue-workers.md` and `docs/ops/dr-storage-issues.md` |
| Guests blocked | Per-device limits reached | Confirm limits and whether exceptions are allowed |
| Thumbnails missing | Backfill jobs stalled | Run `php artisan media:backfill-thumbnails --tenant=XYZ` |

## Access issues
- **Admin cannot log in**: verify invite acceptance, check SSO mapping if enforced, re-send invite.
- **Guest cannot join**: confirm event is published and the join link is current.

## Billing and quota blocks
- Check Paddle / RevenueCat status dashboards.
- Confirm webhook freshness and retry failures if needed.
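
As a rough illustration of the webhook freshness check above, the sketch below flags a webhook stream as stale when the last received event is older than a cutoff. The playbook does not define a staleness window, so `MAX_WEBHOOK_AGE` (30 minutes) and the function name `webhook_is_stale` are assumptions for illustration only, not part of the product.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed staleness window; the playbook does not specify one.
MAX_WEBHOOK_AGE = timedelta(minutes=30)


def webhook_is_stale(
    last_received_at: datetime,
    now: Optional[datetime] = None,
    max_age: timedelta = MAX_WEBHOOK_AGE,
) -> bool:
    """Return True when the last webhook is older than max_age.

    Both datetimes are expected to be timezone-aware (UTC recommended).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_received_at > max_age
```

If a check like this reports a stale stream, the next step in the playbook still applies: inspect the provider dashboards and retry failed deliveries.
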

## Communications
- Use the support escalation guide at `docs/ops/support-escalation-guide.md` for customer comms.
- Log all actions and timestamps in a bd issue.
docs/ops/live-ops-control.md
@@ -0,0 +1,34 @@
# Live Ops Control (Ops Playbook)

Use this playbook when supporting an event in real time. This is internal guidance for superadmins/on-call.

## Scope
- Moderation queues and Live Show queues.
- High-volume events with potential backlog or device failures.
- Incident response when content safety or performance is at risk.

## Baseline checks
1. Confirm event status and moderation mode.
2. Verify queue counts and recent upload rate.
3. Check if any trusted devices are bypassing review.

## Triage workflow
- **Queue backlog** (>25 items or >10 min):
  - Increase moderation staffing.
  - Tighten upload visibility rules.
  - Reduce Live Show effects or layout to lower throughput pressure.
- **Offensive content reported**:
  - Hide the item, capture evidence, notify duty officer.
  - Confirm the report appears in the audit log.
- **Live Show empty**:
  - Confirm correct show link and moderation mode.
  - Check whether items are waiting in the queue.
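
The queue backlog trigger above can be expressed as a small check. This is a minimal sketch: the two thresholds come from the playbook, but `QueueSnapshot`, its field names, and reading ">10 min" as the age of the oldest pending item are assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds from the playbook: more than 25 items or more than 10 minutes.
BACKLOG_MAX_ITEMS = 25
BACKLOG_MAX_AGE_MINUTES = 10


@dataclass
class QueueSnapshot:
    """Hypothetical view of a moderation queue; field names are assumptions."""

    pending_items: int
    oldest_item_age_minutes: float


def is_backlogged(snapshot: QueueSnapshot) -> bool:
    """Return True when either playbook threshold is exceeded."""
    return (
        snapshot.pending_items > BACKLOG_MAX_ITEMS
        or snapshot.oldest_item_age_minutes > BACKLOG_MAX_AGE_MINUTES
    )
```

For example, `is_backlogged(QueueSnapshot(pending_items=30, oldest_item_age_minutes=2.0))` is true because the item count alone exceeds the threshold, while a queue with exactly 25 items waiting 10 minutes does not trigger.
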

## Escalation
- Reliability on-call for queue or processing failures.
- Legal duty officer for sensitive content handling.
- Customer Success for comms to organizers.

## After action
- Capture timeline and actions in a bd issue.
- Add follow-ups for any repeated failure modes.