Commit Graph

20 Commits (e64d168d349bf74bc0a386412414e4459d76a428)

Author SHA1 Message Date
Haewon Kam e64d168d34 feat: Perplexity sonar-pro research agent with structured online presence analysis
Replaced simple "find handles" prompt with comprehensive research agent:
- Model: sonar → sonar-pro (advanced multi-step web search)
- System prompt: full research methodology with 2-3 keyword searches,
  URL fetching, quantitative data extraction
- Output: structured JSON with channels (handles + follower counts +
  subscriber counts) + platforms (강남언니 rating, reviews)
- Research results saved to scrape_data.onlinePresenceResearch for
  downstream use in collect-channel-data and generate-report

Added _shared/researchPrompt.ts with prompt template + builder.
Updated agent documentation in doc/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:31:00 +09:00
Haewon Kam c74832d764 feat: Perplexity Online Presence 종합 분석 + Apify Instagram 검색
B4 Perplexity: rewrote from narrow "find social accounts" to broad
"Online Presence 종합 분석" — finds Instagram, YouTube, Facebook,
TikTok, Naver, Kakao, 강남언니, 바비톡 in one query.

B5 Apify Instagram: generates handle candidates from clinic name
(english name, domain, _official, _ps, _clinic variants) and directly
checks each via Apify instagram-profile-scraper. Finds accounts that
web search misses.

Removed redundant B4b (platform presence) — now merged into B4.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:24:56 +09:00
Haewon Kam 64669888c2 fix: type-safe string handling in extractSocialLinks/mergeSocialLinks
API results may contain null, numbers, or objects instead of strings.
Now coerces all values to strings before processing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:17:49 +09:00
Haewon Kam f224d1788c feat: API-first channel discovery — YouTube API + Naver API + Firecrawl Search + Perplexity
Replaced Perplexity-only approach with 5 parallel direct API searches:

B1. YouTube Data API: search?type=channel&q={clinicName} → find channel
B2a. Naver Blog API: search blog.json → find official Naver blog
B2b. Naver Web API: search webkr.json → find Instagram/YouTube/Facebook URLs
B3. Firecrawl Search: web search → extract social URLs from results
B4. Perplexity: supplement — catch what direct APIs missed

All 5 sources run in parallel after Stage A (Firecrawl scrape for clinicName).
Results merged + deduplicated + verified. Perplexity is now a fallback,
not the primary source.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:15:49 +09:00
Haewon Kam 25aece2366 fix: Perplexity prompt rewrite + clinicName fallback via AI
Perplexity prompts changed from "find verified accounts" (returns all
null) to "search and report what you find" (returns actual handles).
Added clinicName resolution: Firecrawl Korean → English → Perplexity
URL-to-name lookup → domain fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:07:04 +09:00
Haewon Kam 122b1915f0 fix: 2-stage discovery — Firecrawl first for clinicName, then Perplexity
Previously Firecrawl and Perplexity ran in parallel, so Perplexity
received raw URL instead of clinic name → poor search results.

Now:
Stage A: Firecrawl scrape+map (parallel) → extract clinicName from HTML
Stage B: Perplexity searches using extracted clinicName → finds Instagram,
  YouTube, Facebook handles that Firecrawl HTML parsing missed
Stage C: Merge 3 sources + verify all handles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:02:30 +09:00
Haewon Kam df8f84c3b9 fix: YouTube channel ID (UC...) handling + handle-to-channelId resolution
discover-channels: extractHandle('youtube') now detects UC* channel IDs
and returns them without @ prefix (previously @UC... caused verify fail)

verifyHandles: verifyYouTube uses cleanHandle for UC* check, requests
part=id,snippet for richer data

collect-channel-data: if channelId missing but handle present, resolves
via forHandle/forUsername lookup or direct UC* detection before skipping

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 01:00:21 +09:00
Haewon Kam f65f0e85b3 fix: robust handle extraction — reject non-platform URLs, fix type safety
discover-channels: new extractHandle() validates each handle belongs to
its platform (rejects hospital-internal URLs like /idtube/view being
treated as YouTube). Extracts handles from full URLs correctly.

collect-channel-data: explicit Record<string,unknown> typing for DB JSON
fields — fixes TypeScript property access on VerifiedChannels from DB.

verifyHandles: fix TikTok double-URL concatenation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 00:03:26 +09:00
Haewon Kam 5239ad7382 chore: add deno.json for new Edge Functions
Required by Supabase deploy to resolve @supabase/functions-js import.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:04:13 +09:00
Haewon Kam 7557ef774c feat: Pipeline V2 — 3-phase analysis with verified channel discovery
Restructured the entire analysis pipeline from AI-guessing social
handles to deterministic 3-phase discovery + collection + generation.

Phase 1 (discover-channels): 3-source channel discovery
  - Firecrawl scrape: extract social links from HTML
  - Perplexity search: find handles via web search
  - URL regex parsing: deterministic link extraction
  - Handle verification: HEAD requests + YouTube API
  - DB: creates row with verified_channels + scrape_data

Phase 2 (collect-channel-data): 9 parallel data collectors
  - Instagram (Apify), YouTube (Data API v3), Facebook (Apify)
  - 강남언니 (Firecrawl), Naver Blog + Place (Naver API)
  - Google Maps (Apify), Market analysis (Perplexity 4x parallel)
  - DB: stores ALL raw data in channel_data column

Phase 3 (generate-report): AI report from real data
  - Reads channel_data + analysis_data from DB
  - Builds channel summary with real metrics
  - AI generates report using only verified data
  - V1 backwards compatibility preserved (url-based flow)

Supporting changes:
  - DB migration: status, verified_channels, channel_data columns
  - _shared/extractSocialLinks.ts: regex-based social link parser
  - _shared/verifyHandles.ts: multi-platform handle verifier
  - AnalysisLoadingPage: real 3-phase progress + channel panel
  - useReport: channel_data column support + V2 enrichment merge
  - 강남언니 rating: auto-correct 5→10 scale + search fallback
  - KPIDashboard: navigate() instead of <a href>
  - Loading text: 20-30초 → 1-2분

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 21:49:13 +09:00
Haewon Kam e32b8766de feat: prototype gap closure — enrichment diagnosis + brand extraction + plan assets
Phase 1: Data Pipeline Fixes
- Plan page: connect enrichment data for Asset Collection + YouTube Repurpose
- mergeEnrichment: generate 15-20 data-driven diagnosis items from enrichment
  (YouTube Shorts check, IG engagement, FB activity, 강남언니 ratings, GMaps)
- ClinicSnapshot: fill staffCount, nearestStation, certifications from enrichment

Phase 2: AI + Brand Enhancement
- AI prompt: per-channel diagnosis[] array (5-7 items), established, nameEn, newChannelProposals
- scrape-website: Firecrawl branding extraction (colors, fonts, logo, tagline)
- transformPlan: BrandGuide colors/fonts from scraped branding data
- transformPlan: cross-channel brand consistency analysis (name, phone mismatches)
- transformPlan: channel branding rules from enrichment (YT, IG, FB profiles)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:09:15 +09:00
Haewon Kam a7d8aeeddc feat: Facebook page data collection via Apify scraper
- enrich-channels: add Facebook Pages Scraper (apify~facebook-pages-scraper)
- Collects: pageName, followers, likes, categories, email, phone, website, intro, rating
- transformReport: merge Facebook data into facebookAudit.pages[] (auto-shows section)
- Frontend: pass facebookHandle through enrichment pipeline
- EnrichChannelsRequest: add facebookHandle parameter

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 16:16:37 +09:00
Haewon Kam ad625e08ee fix: enrichment pipeline reliability + loading page gradient + button click area
- generate-report: filter empty strings from AI social handles, add saveError logging
- useReport: 3-level fallback for social handles (report > clinicInfo > scrape_data)
- useEnrichment: always trigger enrichment if clinicName exists (not just IG/YT handles)
- Hero: pointer-events-none on decorative blobs (were blocking button clicks)
- AnalysisLoadingPage: warm gradient on INFINITH logo text (#fff3eb → #e4cfff → #f5f9ff)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 16:05:33 +09:00
Haewon Kam 72ea8f4a2d feat: Naver Search API + multi-account Instagram + button UX fix
- Naver Blog search: collect blog post results for clinic name (total count + top 10 posts)
- Naver Place search: collect place info (name, category, address, telephone)
- Multi-account Instagram: AI prompt requests all IG accounts (국내/해외)
- enrich-channels: process multiple IG handles with fallback per handle
- transformReport: merge multiple IG accounts into instagramAudit.accounts[]
- generate-report: socialHandles.instagram now array of handles
- Hero/CTA: transition-all → transition-shadow for instant click response
- Hero/CTA: disabled state when URL is empty (opacity-50 + cursor-not-allowed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:34:10 +09:00
Haewon Kam cf482d1bd7 feat: 강남언니 real-time data collection via Firecrawl scraping
- enrich-channels: add 강남언니 scraping module (search + structured JSON extraction)
- Collects: rating/10, reviews, doctors with ratings, procedures, certifications
- transformReport: merge 강남언니 data into clinicSnapshot + otherChannels
- Updates lead doctor info, certifications, and review counts from real data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:51:47 +09:00
Haewon Kam e5399486f7 fix: Instagram data collection pipeline — handle normalization + DB persistence
- enrich-channels: Instagram fallback — auto-try _ps, .ps, _clinic suffixes when <100 followers
- enrich-channels: YouTube URL normalization via normalizeYouTubeChannel (handles /c/, /user/, @handle)
- enrich-channels: Google Maps multi-query search for better hit rate
- generate-report: AI-found social handles prioritized over Firecrawl scrape
- generate-report: Added socialMedia field to AI prompt for accurate handle discovery
- normalizeHandles: Added normalizeYouTubeChannel for /c/, /user/, /channel/, @handle URLs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:45:00 +09:00
Haewon Kam 200497fa1e feat: P1-5/6/7 — AI KPI targets, website tech audit, dynamic clinic profile
- P1-5: Add kpiTargets schema to AI prompt, use AI-generated goals instead of hardcoded multipliers
- P1-6: Extend website channelAnalysis with trackingPixels, snsLinksOnSite, additionalDomains, mainCTA
- P1-7: ClinicProfilePage fetches data from DB by report ID instead of hardcoded VIEW clinic data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 14:30:03 +09:00
Haewon Kam 7ea9972c7e feat: P1-4 Brand Identity tab — AI-generated brand analysis
- generate-report: add brandIdentity schema to AI prompt (logo, message, tone, positioning, hashtags, channel consistency)
- transformReport: map API brandIdentity array to TransformationProposal component
- ApiReport type: add brandIdentity field

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 14:10:28 +09:00
Haewon Kam bd7bc45192 fix: Instagram data collection pipeline — handle normalization + DB persistence
- Add normalizeInstagramHandle() utility (Edge + browser) to strip URLs, @ prefixes
- generate-report: normalize handles before saving, persist socialHandles in report JSONB
- enrich-channels: normalize Instagram handle before Apify call (defense in depth)
- useReport: recover socialHandles + channelEnrichment from DB on direct URL access
- ReportPage: skip redundant enrichment when data already exists in DB

Fixes: Instagram enrichment failing due to URL-format handles passed to Apify

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 13:34:54 +09:00
Haewon Kam 60cd055042 feat: real API integration + YouTube Data API v3 + progressive loading
- Replace mock useReport() with real Supabase API data pipeline
- Add transformReport.ts to map API responses to MarketingReport type
- Add useEnrichment() hook for background channel data enrichment
- Replace Apify YouTube scraper with YouTube Data API v3
- Add mergeEnrichment() for progressive data loading
- Add EmptyState component for graceful empty data handling
- Add socialHandles to generate-report metadata
- Graceful empty data in ClinicSnapshot, YouTube, Instagram, Facebook
- Add Supabase Edge Functions and DB migrations
- Add developer handoff documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 10:57:14 +09:00