
7 Commits (d5f7f24e0a5180f7f000ae0ec0f25a1f7ce1b525)

Author SHA1 Message Date
Haewon Kam d5f7f24e0a feat: clinic registry DB + pipeline audit P0 fixes
## Clinic Registry
- data/clinic-registry/clinic_registry_working.csv: master DB of channels for 91 clinics
- data/clinic-registry/INFINITH_Outbound_List.csv: BD team outbound list (17 columns)
- data/clinic-registry/update_csv.py: safe CSV update script (fills empty fields only)
- data/clinic-registry/extract_place_ids.py: Naver Place ID extractor
- scripts/import-registry.ts: imports the CSV into the Supabase clinic_registry table
- supabase/migrations/20260406_clinic_registry.sql: clinic_registry table schema

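## Pipeline P0 Bug Fixes (after full audit)
- fix(collect-channel-data): remove erroneous 0-10 scale conversion of the 강남언니 rating
  - Before: any rating ≤ 5 was multiplied by 2, so 4.8/10 was wrongly converted to 9.6/10
  - After: the Firecrawl prompt already asks for a 0-10 scale, so rawValue is trusted as-is
- fix(generate-report): replace the single Perplexity fetch with fetchWithRetry
  - maxRetries:2, backoffMs:[5000,15000], timeoutMs:90s
  - Before: a timeout or 429 failed the entire report generation
  - After: automatic retries recover from transient API errors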
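
A minimal sketch of how a retry wrapper with these settings (maxRetries, backoffMs, timeoutMs) might look. This is not the repository's fetchWithRetry: the signature, the `doFetch` callback, the `isRetryableStatus` helper, and the choice to treat 429 and 5xx as transient are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real fetchWithRetry in the repo may differ.
interface RetryOptions {
  maxRetries: number;  // retries allowed after the first attempt
  backoffMs: number[]; // delay before retry i; last value is reused if the list is short
  timeoutMs: number;   // per-attempt timeout
}

interface ResponseLike {
  status: number;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: 429 (rate limit) and 5xx responses are worth retrying.
function isRetryableStatus(status: number): boolean {
  return status === 429 || status >= 500;
}

// doFetch is expected to reject when the signal aborts, as fetch() does.
async function fetchWithRetry<R extends ResponseLike>(
  doFetch: (signal: AbortSignal) => Promise<R>,
  opts: RetryOptions = { maxRetries: 2, backoffMs: [5000, 15000], timeoutMs: 90_000 },
): Promise<R> {
  let lastError: unknown = new Error("fetchWithRetry: no attempt made");
  for (let attempt = 0; attempt <= opts.maxRetries; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), opts.timeoutMs);
    try {
      const res = await doFetch(controller.signal);
      if (!isRetryableStatus(res.status)) return res;
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network error or timeout abort
    } finally {
      clearTimeout(timer);
    }
    if (attempt < opts.maxRetries) {
      const idx = Math.min(attempt, opts.backoffMs.length - 1);
      await sleep(opts.backoffMs[idx] ?? 0);
    }
  }
  throw lastError; // every attempt failed
}
```

A caller would wrap the real request, for example `fetchWithRetry((signal) => fetch(url, { ...init, signal }))`, so the per-attempt timeout cancels the in-flight request before the next backoff starts.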

## Docs
- docs/PIPELINE_IMPROVEMENT_PLAN.md: marked Sprints 0/1/2 complete + added full audit results
- Added docs/REGISTRY_FUNCTIONAL_SPECS.md, DB_SCHEMA_V3.md, and several other planning documents

## New Components & Features
- supabase/functions/generate-content-plan, adjust-strategy: content plan generation / strategy adjustment
- src/components/plan/EditEntryModal, StrategyAdjustmentSection: plan editing UI
- supabase/functions/_shared/dataQuality, foundingYearExtractor, urlClassifier: data quality utilities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 09:33:25 +09:00
Haewon Kam 79950925a1 fix: add Authorization header to all Edge Function calls + fix Vision Analysis
- All fetch calls to Supabase Edge Functions now include
  Authorization: Bearer <anon_key> (was missing → 401 errors)
- Fix Firecrawl screenshot API: remove invalid screenshotOptions,
  use "screenshot@fullPage" format (v2 API compatibility)
- Fix screenshot response handling: v2 returns URL not base64,
  now downloads and converts to base64 for Gemini Vision
- Add about page to Vision Analysis capture targets
- Add retry utility, channel error tracking, pipeline resume,
  enrichment retry, EmptyState improvements (Sprint 2-3)
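
A small sketch of the header fix described in the first bullet, assuming the standard Supabase Edge Function endpoint layout (`/functions/v1/<name>`). The helper names `edgeFunctionUrl` and `edgeFunctionHeaders` are hypothetical and do not come from the repository.

```typescript
// Hypothetical helpers; names are illustrative, not taken from the codebase.

// Builds the invoke URL for an Edge Function, tolerating a trailing slash.
function edgeFunctionUrl(supabaseUrl: string, functionName: string): string {
  return `${supabaseUrl.replace(/\/+$/, "")}/functions/v1/${functionName}`;
}

// Every call carries the anon key as a Bearer token; omitting it is what
// produced the 401 responses mentioned in the commit message.
function edgeFunctionHeaders(anonKey: string): Record<string, string> {
  return {
    "Content-Type": "application/json",
    Authorization: `Bearer ${anonKey}`,
  };
}
```

Centralizing the headers in one helper means a new fetch call cannot forget the Authorization header, which is the class of bug this commit fixes.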

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-05 10:08:03 +09:00
Haewon Kam 7fe3ff82c9 feat: DB V3 dual-write — clinics + analysis_runs + channel_snapshots
Phase 2-4 of SaaS schema migration. All Edge Functions now write to
BOTH legacy marketing_reports AND new V3 tables:

discover-channels:
  - UPSERT clinics (url-based dedup)
  - INSERT analysis_runs (status: discovering)

collect-channel-data:
  - INSERT channel_snapshots (one per channel — time-series!)
  - INSERT screenshots (evidence rows)
  - UPDATE analysis_runs (raw_channel_data, vision_analysis)

generate-report:
  - UPDATE analysis_runs (report, status: complete)
  - UPDATE clinics (last_analyzed_at, established_year)

Frontend passes clinicId + runId through all 3 phases.
Legacy marketing_reports still written for backward compatibility.
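
One way the generate-report dual-write could be structured, as a sketch only. The table names come from this commit message; the `WriteFn` abstraction, the row shapes, and the policy of treating V3 writes as best-effort while legacy errors propagate are assumptions, not confirmed behavior.

```typescript
// Hypothetical sketch: table names are from the commit message; the types,
// helper names, and error policy are assumed for illustration.
type Row = Record<string, unknown>;
type WriteFn = (table: string, row: Row) => Promise<void>;

interface DualWriteResult {
  legacyOk: boolean;
  v3Errors: string[];
}

async function dualWriteReport(
  write: WriteFn,
  ids: { reportId: string; clinicId: string; runId: string },
  report: Row,
): Promise<DualWriteResult> {
  // Legacy write first: assumed to stay the source of truth, so errors propagate.
  await write("marketing_reports", { id: ids.reportId, report });

  // V3 writes are best-effort here, so a new-schema failure cannot break
  // the legacy flow during the migration window.
  const v3Errors: string[] = [];
  const v3Rows: Array<[string, Row]> = [
    ["analysis_runs", { id: ids.runId, report, status: "complete" }],
    ["clinics", { id: ids.clinicId, last_analyzed_at: new Date().toISOString() }],
  ];
  for (const [table, row] of v3Rows) {
    try {
      await write(table, row);
    } catch (err) {
      const msg = err instanceof Error ? err.message : String(err);
      v3Errors.push(`${table}: ${msg}`);
    }
  }
  return { legacyOk: true, v3Errors };
}
```

Collecting V3 errors instead of throwing lets the caller log divergence between the two schemas without failing the user-facing request.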

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:51:11 +09:00
Haewon Kam 7557ef774c feat: Pipeline V2 — 3-phase analysis with verified channel discovery
Restructured the entire analysis pipeline from AI-guessing social
handles to deterministic 3-phase discovery + collection + generation.

Phase 1 (discover-channels): 3-source channel discovery
  - Firecrawl scrape: extract social links from HTML
  - Perplexity search: find handles via web search
  - URL regex parsing: deterministic link extraction
  - Handle verification: HEAD requests + YouTube API
  - DB: creates row with verified_channels + scrape_data
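
The deterministic URL regex step in Phase 1 can be illustrated with a short sketch. The patterns, the reserved-path list, and the return shape below are assumptions for illustration; the actual `_shared/extractSocialLinks.ts` may cover more platforms and edge cases.

```typescript
// Hypothetical sketch of regex-based social link extraction.
type Platform = "instagram" | "youtube" | "facebook";

const PATTERNS: Record<Platform, RegExp> = {
  instagram: /https?:\/\/(?:www\.)?instagram\.com\/([A-Za-z0-9_.]+)/,
  youtube: /https?:\/\/(?:www\.)?youtube\.com\/(@[A-Za-z0-9_.-]+|channel\/[A-Za-z0-9_-]+)/,
  facebook: /https?:\/\/(?:www\.)?facebook\.com\/([A-Za-z0-9_.-]+)/,
};

// Path segments that are site features rather than account handles (assumed list).
const RESERVED = ["p", "reel", "explore", "sharer", "share", "watch"];

function extractSocialLinks(html: string): Record<Platform, string[]> {
  const found: Record<Platform, string[]> = { instagram: [], youtube: [], facebook: [] };
  for (const platform of Object.keys(PATTERNS) as Platform[]) {
    // Fresh global regex per scan so lastIndex state is never shared.
    const re = new RegExp(PATTERNS[platform].source, "gi");
    let m: RegExpExecArray | null;
    while ((m = re.exec(html)) !== null) {
      const handle = m[1];
      const isReserved = RESERVED.indexOf(handle.toLowerCase()) !== -1;
      const isDuplicate = found[platform].indexOf(handle) !== -1;
      if (!isReserved && !isDuplicate) found[platform].push(handle);
    }
  }
  return found;
}
```

Handles found this way are only candidates; per the commit message, they are still verified afterwards with HEAD requests and the YouTube API.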

Phase 2 (collect-channel-data): 9 parallel data collectors
  - Instagram (Apify), YouTube (Data API v3), Facebook (Apify)
  - 강남언니 (Firecrawl), Naver Blog + Place (Naver API)
  - Google Maps (Apify), Market analysis (Perplexity 4x parallel)
  - DB: stores ALL raw data in channel_data column

Phase 3 (generate-report): AI report from real data
  - Reads channel_data + analysis_data from DB
  - Builds channel summary with real metrics
  - AI generates report using only verified data
  - V1 backwards compatibility preserved (url-based flow)

Supporting changes:
  - DB migration: status, verified_channels, channel_data columns
  - _shared/extractSocialLinks.ts: regex-based social link parser
  - _shared/verifyHandles.ts: multi-platform handle verifier
  - AnalysisLoadingPage: real 3-phase progress + channel panel
  - useReport: channel_data column support + V2 enrichment merge
  - 강남언니 rating: auto-correct 5→10 scale + search fallback
  - KPIDashboard: navigate() instead of <a href>
  - Loading text: 20-30 seconds → 1-2 minutes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 21:49:13 +09:00
Haewon Kam a7d8aeeddc feat: Facebook page data collection via Apify scraper
- enrich-channels: add Facebook Pages Scraper (apify~facebook-pages-scraper)
- Collects: pageName, followers, likes, categories, email, phone, website, intro, rating
- transformReport: merge Facebook data into facebookAudit.pages[] (auto-shows section)
- Frontend: pass facebookHandle through enrichment pipeline
- EnrichChannelsRequest: add facebookHandle parameter

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 16:16:37 +09:00
Haewon Kam 72ea8f4a2d feat: Naver Search API + multi-account Instagram + button UX fix
- Naver Blog search: collect blog post results for clinic name (total count + top 10 posts)
- Naver Place search: collect place info (name, category, address, telephone)
- Multi-account Instagram: AI prompt requests all IG accounts (domestic/overseas)
- enrich-channels: process multiple IG handles with fallback per handle
- transformReport: merge multiple IG accounts into instagramAudit.accounts[]
- generate-report: socialHandles.instagram now array of handles
- Hero/CTA: transition-all → transition-shadow for instant click response
- Hero/CTA: disabled state when URL is empty (opacity-50 + cursor-not-allowed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:34:10 +09:00
Haewon Kam 60cd055042 feat: real API integration + YouTube Data API v3 + progressive loading
- Replace mock useReport() with real Supabase API data pipeline
- Add transformReport.ts to map API responses to MarketingReport type
- Add useEnrichment() hook for background channel data enrichment
- Replace Apify YouTube scraper with YouTube Data API v3
- Add mergeEnrichment() for progressive data loading
- Add EmptyState component for graceful empty data handling
- Add socialHandles to generate-report metadata
- Graceful empty data in ClinicSnapshot, YouTube, Instagram, Facebook
- Add Supabase Edge Functions and DB migrations
- Add developer handoff documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:57:14 +09:00