5 · Functional QA & debugger review
Objective — drive every critical journey and role (automate the repetitive smoke tests with Playwright MCP), run live security tests, triage failures by severity, then mine Telescope/Sentry/logs and check for unmerged fixes — the live counterpart to the static audit.
Background
This is the live counterpart to the earlier static audits: you drive the running app instead of reading the code. Automate the repetitive smoke tests; keep human eyes on the journeys that need judgment.
```mermaid
flowchart LR
  Static[Static + security audits] --> Auto[Automated smoke Playwright MCP]
  Auto --> Human[Human judgment journeys]
  Human --> UAT[User acceptance sign-off]
```

1. Run functional QA & runtime testing
Drive the running app across every critical journey and role. Automate the repetitive smoke tests with the Playwright MCP (navigate, fill forms, capture console errors, screenshot at each step); keep human eyes on judgment-heavy flows.
| Suite | Priority | Covers |
|---|---|---|
| Functional testing | MUST | Auth, subscriptions, CRUD, emails, error pages |
| Security testing (runtime) | MUST | Rate limiting, IDOR, session handling, prod config |
| Performance testing | MUST | Page speed, TTFB, caching, load |
| User acceptance testing | MUST | End-to-end user journeys, onboarding |
| Cross-browser | SHOULD | Chrome, Firefox, Safari, Edge |
| Mobile & responsive | SHOULD | 375 / 768 / 1024 / 1440px, touch targets |
| Load testing | SHOULD | Apache Bench / k6 on staging |
Reference test matrices (functional + runtime security)
Authentication — register (valid / duplicate-email / weak-password / missing-fields), login (valid / wrong-password / non-existent-email → generic “Invalid credentials”), remember-me across browser restart, password-reset request → email → reset form → new password works / old fails, logout protects /dashboard.
Subscription — /pricing shows correct plans/prices/currency, checkout with test card 4242 4242 4242 4242 → success, dashboard shows active plan, invoice email received, upgrade / downgrade-at-period-end / cancel, billing history visible.
Core CRUD (per entity) — create / edit / delete (with confirm) / list / search / paginate / export / report generate / report export / custom date range.
Error handling — custom pages:
| Trigger | Expected |
|---|---|
| /nonexistent-page-xyz | Custom 404 page |
| Forced 500 | Custom 500 page |
| Forbidden resource | Custom 403 page |
| Expired session / stale CSRF | Custom 419 page |
| Empty / invalid-email / excessive-length / special-char inputs | Inline validation, handled gracefully |
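The status-code half of the error-page rows above can be spot-checked from a shell before a human confirms that the custom page is the one that renders. This is a minimal sketch, assuming `curl` is installed; `check_status`, `probe_status`, and `BASE_URL` are hypothetical names, not part of the project.

```shell
#!/bin/sh
# Hypothetical helpers: compare an observed HTTP status with the expected one.

# check_status LABEL EXPECTED ACTUAL -> prints PASS/FAIL, returns 0 only on a match
check_status() {
  if [ "$2" = "$3" ]; then
    echo "PASS $1 ($3)"
  else
    echo "FAIL $1 (expected $2, got $3)"
    return 1
  fi
}

# probe_status URL -> prints only the HTTP status code of a GET request
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Usage (BASE_URL is an assumption, e.g. your staging origin):
#   check_status "404 page" 404 "$(probe_status "$BASE_URL/nonexistent-page-xyz")"
```

It only verifies the status code, so the visual check of each custom page still needs a browser.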
Runtime security:
| Test | Expected |
|---|---|
| 10+ rapid failed logins | Rate-limit lockout after threshold |
| Password-reset token after 1 hour | Expired |
| Password-reset token reused | Single-use — rejected on second use |
| Session after password change | Old sessions invalidated |
| User A opens User B’s resource by ID (IDOR) | 403 or redirect |
| Non-admin hits admin route | 403 or redirect |
| Unauthenticated API call | 401 |
| Session cookies | Secure=true, HttpOnly=true |
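The session-cookie row above can be checked mechanically by inspecting the `Set-Cookie` response headers. This is a rough sketch, assuming POSIX `awk` and `curl`; `cookie_flags_ok` and `BASE_URL` are hypothetical names, and the check is deliberately strict in that it fails if any cookie lacks either flag.

```shell
#!/bin/sh
# Hypothetical helper: read raw response headers on stdin and verify that every
# Set-Cookie line carries both the Secure and HttpOnly attributes.
# Exit codes: 0 = all cookies flagged, 1 = a flag is missing, 2 = no cookies seen.
cookie_flags_ok() {
  awk '
    tolower($0) ~ /^set-cookie:/ {
      seen++
      line = tolower($0)
      if (line !~ /;[ \t]*secure/ || line !~ /;[ \t]*httponly/) {
        print "MISSING FLAG: " $0
        bad++
      }
    }
    END {
      if (seen == 0) { print "NO COOKIES SEEN"; exit 2 }
      exit (bad > 0)
    }
  '
}

# Usage (BASE_URL is an assumption): dump headers only, discard the body.
#   curl -s -D - -o /dev/null "$BASE_URL/login" | cookie_flags_ok
```

If some cookies are intentionally readable by JavaScript, filter those out before piping the headers in.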
- Drive each suite, then triage every failure by severity. Use the Stripe test card `4242 4242 4242 4242` for the full payment lifecycle (subscribe → upgrade → downgrade → cancel → billing history). Confirm every transactional email lands in the inbox (mail-tester.com target 9–10/10) and that custom 404/500/403/419 pages render.
- ✅ Every MUST suite passes; the payment lifecycle, emails (9–10/10), and custom error pages all work.
- ✅ Each failure is triaged — CRITICAL blocks launch, MEDIUM has a workaround (fix within a week), LOW is cosmetic.
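The triage rule above can be turned into a simple launch gate. This sketch assumes a hypothetical findings list with one `SEVERITY<TAB>description` line per failure; the format and the `triage_gate` name are illustrative, not something the project defines. It fails when a CRITICAL finding exists or when a line carries no recognized severity.

```shell
#!/bin/sh
# Hypothetical launch gate: reads "SEVERITY<TAB>description" lines on stdin,
# prints a tally, and exits 1 if any finding is CRITICAL or untriaged.
triage_gate() {
  awk -F '\t' '
    $1 == "CRITICAL" || $1 == "MEDIUM" || $1 == "LOW" { count[$1]++; next }
    NF > 0 { print "UNTRIAGED: " $0; untriaged++ }
    END {
      printf "CRITICAL=%d MEDIUM=%d LOW=%d UNTRIAGED=%d\n",
        count["CRITICAL"], count["MEDIUM"], count["LOW"], untriaged
      exit (count["CRITICAL"] > 0 || untriaged > 0)
    }
  '
}

# Usage (findings.tsv is an assumed file name):
#   triage_gate < findings.tsv || echo "Launch blocked"
```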
2. Mine the debugger tools
Mine the tools you wired in earlier for bugs that QA didn’t surface.
| Review | What to find |
|---|---|
| Telescope exceptions | Hidden exceptions (local); categorize by severity |
| Slow queries | > 500ms = CRITICAL, > 100ms = HIGH; fix with eager loading / indexes |
| Sentry (production) | Unresolved issues, prioritized by events & users affected |
| Log file scan | Errors, warnings, deprecations across local + production logs |
| Git unmerged fixes | Fixes stranded on feature branches, never merged to develop |
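The slow-query thresholds in the table above reduce to a small classifier. This is a sketch: the `classify_query_ms` name and the `OK` label for queries at or under 100ms are assumptions, and how you export durations from Telescope is left to you.

```shell
#!/bin/sh
# classify_query_ms MS -> prints CRITICAL (> 500ms), HIGH (> 100ms), or OK.
# Thresholds come from the table above; "OK" is an assumed label for the rest.
classify_query_ms() {
  awk -v ms="$1" 'BEGIN {
    if (ms + 0 > 500)      print "CRITICAL"
    else if (ms + 0 > 100) print "HIGH"
    else                   print "OK"
  }'
}

# Usage: one duration (in milliseconds) per call
#   classify_query_ms 612.4
```

Both bounds are exclusive, matching the `>` in the table, so a query at exactly 500ms is HIGH rather than CRITICAL.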
- Work each review row, categorizing findings by severity and confirming no fix is stranded off `develop`.

```shell
git branch -a --contains <hash> | grep -E 'origin/develop|origin/main|develop|main'
# Expected: develop or main is listed, so the fix is in the release line
```

Also check for dependency bumps stranded off the release line: a `composer.lock` update merged only to a feature branch is a silent regression (e.g. a package fix that never reached `develop`):

```shell
git log --oneline --all -- composer.lock | head -10
# For any bump not obviously on develop/main:
# git branch -a --contains <hash> | grep -E 'origin/develop|origin/main' || echo "STRANDED: bump not in release line"
```

- ✅ Telescope/Sentry/log findings are categorized; slow queries fixed; every fix commit is on `develop`.
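The `grep` pattern in the `git branch -a --contains` check matches any branch whose name merely contains `develop` or `main` (a hypothetical `feature/main-nav` would pass). An ancestry test is a stricter alternative. This is a sketch, assuming the release refs are named as below; `on_release_line` is a hypothetical helper.

```shell
#!/bin/sh
# on_release_line HASH -> succeeds if HASH is an ancestor of any release ref
# that exists in this clone. The ref list is an assumption; adjust as needed.
on_release_line() {
  for ref in origin/develop origin/main develop main; do
    # Skip refs that do not exist in this clone.
    git rev-parse --verify --quiet "$ref^{commit}" >/dev/null || continue
    if git merge-base --is-ancestor "$1" "$ref"; then
      echo "OK: $1 is contained in $ref"
      return 0
    fi
  done
  echo "STRANDED: $1 is not in the release line"
  return 1
}

# Usage: on_release_line <hash>
```

An unknown hash is also reported as stranded (git prints its own error first), so read the output before acting on it.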
3. Run the SHOULD suites — cross-browser, mobile, load
The MUST suites gate launch; these catch what real users hit on other browsers, on phones, and under traffic. Run them on staging, never production.
- Cross-browser: drive the key journeys (homepage, login/register, all form types, modals/dropdowns, AJAX) in Chrome, Firefox, Safari, and Edge, watching the console on each.
- ✅ Every journey renders and works in all four browsers with a clean console; `browserslist` in `package.json` covers `> 1%, last 2 versions, not dead`.
- Mobile & responsive: test each page at 375 / 768 / 1024 / 1440px in DevTools, then on at least one real iPhone + one real Android.
- ✅ No horizontal scroll at any breakpoint; touch targets ≥ 44×44px; passes on a real iPhone (Safari) and a real Android (Chrome).
- Load test (staging only): a quick Apache Bench pass, then a scripted k6 ramp if you need a realistic profile.

```shell
ab -n 500 -c 50 https://<staging-domain>/   # quick baseline
# realistic ramp: brew install k6 && k6 run load-test.js
```

Monitor the server during the run (separate shell on staging):

```shell
top       # CPU: target < 80%
free -m   # Memory: should stay stable, no runaway growth
# In a MySQL client:
# SHOW STATUS LIKE 'Threads_connected';    # not near max_connections
# SHOW GLOBAL STATUS LIKE 'Slow_queries';  # should not climb sharply under load
```

Read the bottleneck:

| Symptom | Diagnosis | Quick fix |
|---|---|---|
| High CPU | Un-cached PHP / hot code path | Enable OPcache |
| Memory grows | Leak / unbounded workers | Fix leak, limit workers |
| Slow DB | Missing indexes | Add indexes, add Redis cache |
| Connection errors | Connection limit reached | Raise `max_connections` |

- ✅ On staging: avg response < 500ms, 95th percentile < 1s, error rate < 0.1% (requests/sec > 100, 0 failed).
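The staging targets above can be checked against Apache Bench's text report instead of by eye. This is a sketch, assuming the usual `ab` summary lines (`Complete requests:`, `Failed requests:`, `Requests per second:`, the first `Time per request:` line, and the `95%` percentile row); `ab_gate` is a hypothetical helper, and it ignores the separate non-2xx count that `ab` reports.

```shell
#!/bin/sh
# Hypothetical gate: reads `ab` output on stdin and checks it against the
# staging targets (mean < 500ms, p95 < 1000ms, error rate < 0.1%, > 100 req/s).
ab_gate() {
  awk '
    /^Complete requests:/   { complete = $3 + 0 }
    /^Failed requests:/     { failed = $3 + 0 }
    /^Requests per second:/ { rps = $4 + 0 }
    # ab prints two "Time per request" lines; the first is the per-request mean.
    /^Time per request:/ && !seen_mean { mean_ms = $4 + 0; seen_mean = 1 }
    $1 == "95%"             { p95_ms = $2 + 0; have_p95 = 1 }
    END {
      if (complete == 0) { print "FAIL: no completed requests parsed"; exit 1 }
      err_pct = 100 * failed / complete
      bad = 0
      if (mean_ms >= 500)                { print "FAIL: mean response " mean_ms "ms"; bad = 1 }
      if (!have_p95 || p95_ms >= 1000)   { print "FAIL: p95 " p95_ms "ms (or not reported)"; bad = 1 }
      if (err_pct >= 0.1)                { print "FAIL: error rate " err_pct "%"; bad = 1 }
      if (rps <= 100)                    { print "FAIL: throughput " rps " req/s"; bad = 1 }
      if (!bad) print "PASS: load targets met"
      exit bad
    }
  '
}

# Usage:
#   ab -n 500 -c 50 https://<staging-domain>/ | ab_gate
```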
Checklist
Do not mark this step done until every box below is checked.
- 🔀 MUST suites pass: functional, runtime security, performance, and UAT all green.
- 👤 Payments + emails verified: test-card lifecycle works; transactional emails land (9–10/10).
- 🤖 Failures triaged: every failure tagged CRITICAL / MEDIUM / LOW.
- 🤖 Debugger review done: Telescope/Sentry/logs mined; slow queries fixed.
- 🤖 No unmerged fixes: every fix commit confirmed on `develop`.
- 🔀 SHOULD suites run (staging): cross-browser (4 browsers), mobile (4 breakpoints + real devices), and load test all pass on staging.
- 🤖 ZajModules verified or skipped: `zajmodsys:list` clean, or no `packages/ZajModSys/` so skipped.