API Timeout and Retry Logic Improvement Plan

Overview

This plan addresses stability problems caused by external API calls that have no timeout configured and no retry logic.

Current State

Module	External API	Timeout	Retry
Lyric	ChatGPT (OpenAI)	❌ Not set (SDK default ~600 s)	❌ None
Song	Suno API	✅ 30-120 s	❌ None
Video	Creatomate API	✅ 30-60 s	❌ None

Planned Changes

1. Set a ChatGPT API Timeout

File: app/utils/chatgpt_prompt.py

Current code:

class ChatgptService:
    def __init__(self):
        self.client = AsyncOpenAI(api_key=apikey_settings.CHATGPT_API_KEY)

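Revised code:

class ChatgptService:
    # Timeout settings (seconds)
    DEFAULT_TIMEOUT = 60.0  # default for each phase (read, write, pool); not a cap on the whole request
    CONNECT_TIMEOUT = 10.0  # connection timeout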

    def __init__(self):
        self.client = AsyncOpenAI(
            api_key=apikey_settings.CHATGPT_API_KEY,
            timeout=httpx.Timeout(
                self.DEFAULT_TIMEOUT,
                connect=self.CONNECT_TIMEOUT,
            ),
        )

Add the required import:

import httpx
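
In httpx, the first argument to `httpx.Timeout` is the default applied to each phase (connect, read, write, pool) rather than a limit on the whole request, so a response that keeps trickling data can run longer than 60 seconds. If a hard overall deadline is wanted, one option is to wrap the call in `asyncio.wait_for`. The sketch below uses only the standard library; `slow_call` and `call_with_deadline` are hypothetical stand-ins and not part of the codebase.

```python
import asyncio


async def slow_call(duration: float) -> str:
    """Hypothetical stand-in for an API call that takes `duration` seconds."""
    await asyncio.sleep(duration)
    return "done"


async def call_with_deadline(duration: float, deadline: float) -> str:
    """Run slow_call, giving up once `deadline` seconds have elapsed in total."""
    try:
        return await asyncio.wait_for(slow_call(duration), timeout=deadline)
    except asyncio.TimeoutError:
        # wait_for cancels the inner coroutine before raising
        return "deadline exceeded"
```

Whether an overall deadline is needed on top of the per-phase timeouts depends on how the lyric endpoint behaves in practice; this is an option to consider, not a required part of the plan.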

2. Create a Retry Utility

File: app/utils/retry.py (new file)

"""
API call retry utility.

Provides retry logic with exponential backoff.
"""

import asyncio
import logging
from functools import wraps
from typing import Callable, Tuple, Type

logger = logging.getLogger(__name__)


class RetryExhaustedError(Exception):
    """Raised when every attempt, including all retries, has failed."""
    def __init__(self, message: str, last_exception: Exception):
        super().__init__(message)
        self.last_exception = last_exception


async def retry_async(
    func: Callable,
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    exponential_base: float = 2.0,
    retry_on: Tuple[Type[Exception], ...] = (Exception,),
    on_retry: Callable[[int, Exception], None] | None = None,
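):
    """
    Run an async function, retrying on failure.

    Args:
        func: Async callable to run (takes no arguments)
        max_retries: Maximum number of retries after the first attempt (default: 3)
        base_delay: Wait before the first retry (seconds)
        max_delay: Upper bound on any single wait (seconds)
        exponential_base: Multiplier for exponential backoff (default: 2.0)
        retry_on: Exception types that trigger a retry
        on_retry: Callback invoked before each retry as (attempt, exception)

    Returns:
        The return value of func

    Raises:
        RetryExhaustedError: If every attempt fails

    Example:
        result = await retry_async(
            lambda: api_call(),
            max_retries=3,
            retry_on=(httpx.TimeoutException, httpx.HTTPStatusError),
        )
    """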
    last_exception = None

    for attempt in range(max_retries + 1):
        try:
            return await func()
        except retry_on as e:
            last_exception = e

            if attempt == max_retries:
                break

            # Compute the exponential backoff delay
            delay = min(base_delay * (exponential_base ** attempt), max_delay)

            logger.warning(
                f"[retry_async] attempt {attempt + 1}/{max_retries + 1} failed, "
                f"retrying in {delay:.1f}s: {type(e).__name__}: {e}"
            )

            if on_retry:
                on_retry(attempt + 1, e)

            await asyncio.sleep(delay)

    # Chain the last failure so the original traceback is preserved
    raise RetryExhaustedError(
        f"All {max_retries + 1} attempts failed",
        last_exception,
    ) from last_exception


def with_retry(
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[Exception], ...] = (Exception,),
):
    """
    Retry decorator.

    Args:
        max_retries: Maximum number of retries
        base_delay: Wait before the first retry (seconds)
        max_delay: Upper bound on any single wait (seconds)
        retry_on: Exception types that trigger a retry

    Example:
        @with_retry(max_retries=3, retry_on=(httpx.TimeoutException,))
        async def call_api():
            ...
    """
    def decorator(func: Callable):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            return await retry_async(
                lambda: func(*args, **kwargs),
                max_retries=max_retries,
                base_delay=base_delay,
                max_delay=max_delay,
                retry_on=retry_on,
            )
        return wrapper
    return decorator
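
To make the backoff formula in `retry_async` concrete, the following sketch lists the waits it would produce for a given configuration. `backoff_delays` is a hypothetical helper written for illustration; it mirrors the `min(base_delay * exponential_base ** attempt, max_delay)` expression and is not part of `retry.py`.

```python
def backoff_delays(
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    exponential_base: float = 2.0,
) -> list[float]:
    """Return the wait before each retry, using the same formula as retry_async."""
    return [
        min(base_delay * (exponential_base ** attempt), max_delay)
        for attempt in range(max_retries)
    ]
```

With the defaults this gives 1 s, 2 s, 4 s; with `base_delay=2.0` it gives 2 s, 4 s, 8 s; and with six retries the last wait is capped at 30 s instead of growing to 32 s.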

3. Apply Retry Logic to the Suno API

File: app/utils/suno.py

Methods to change:

generate() - song generation request
get_task_status() - status lookup
get_lyric_timestamp() - timestamp lookup

Example change (generate method):

# Add to the imports at the top
import httpx
from app.utils.retry import retry_async

# Exceptions that trigger a retry
RETRY_EXCEPTIONS = (
    httpx.TimeoutException,
    httpx.ConnectError,
    httpx.ReadError,
)

async def generate(
    self,
    prompt: str,
    genre: str | None = None,
    callback_url: str | None = None,
) -> str:
    # ... existing payload construction code ...

    async def _call_api():
        async with httpx.AsyncClient() as client:
            response = await client.post(
                f"{self.BASE_URL}/generate",
                headers=self.headers,
                json=payload,
                timeout=30.0,
            )
            response.raise_for_status()
            return response.json()

    # Apply the retry logic
    data = await retry_async(
        _call_api,
        max_retries=3,
        base_delay=1.0,
        retry_on=RETRY_EXCEPTIONS,
    )

    # ... existing response handling code ...
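
Note that `response.raise_for_status()` raises `httpx.HTTPStatusError`, which is not in `RETRY_EXCEPTIONS`, so HTTP 429 or 5xx responses fail immediately under the code above. If retrying those is wanted, one approach is to turn selected status codes into a dedicated exception and add it to the tuple. In this sketch `RETRYABLE_STATUS`, `RetryableStatusError`, and `check_status` are hypothetical names, and the set of status codes is an assumption to adjust per API.

```python
# Assumed set of status codes worth retrying; adjust to the API's documented behavior.
RETRYABLE_STATUS = frozenset({429, 500, 502, 503, 504})


class RetryableStatusError(Exception):
    """Hypothetical exception for responses that may succeed on a later attempt."""

    def __init__(self, status_code: int):
        super().__init__(f"retryable HTTP status {status_code}")
        self.status_code = status_code


def check_status(status_code: int) -> None:
    """Raise RetryableStatusError for retryable codes; do nothing otherwise."""
    if status_code in RETRYABLE_STATUS:
        raise RetryableStatusError(status_code)
```

Inside `_call_api`, `check_status(response.status_code)` would run before `response.raise_for_status()`, with `RetryableStatusError` appended to `retry_on`. Because `generate()` is a POST that starts work, a retry after a read error could submit the job twice if the first request did reach the server; whether that is acceptable depends on the Suno API.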

4. Apply Retry Logic to the Creatomate API

File: app/utils/creatomate.py

Target:

_request() method (the basis of every API call)

Revised code:

# Add to the imports at the top
from app.utils.retry import retry_async

# Exceptions that trigger a retry
RETRY_EXCEPTIONS = (
    httpx.TimeoutException,
    httpx.ConnectError,
    httpx.ReadError,
)

async def _request(
    self,
    method: str,
    url: str,
    timeout: float = 30.0,
    max_retries: int = 3,
    **kwargs,
) -> httpx.Response:
    """Perform an HTTP request (with retry logic)."""
    logger.info(f"[Creatomate] {method} {url}")

    async def _call():
        client = await get_shared_client()
        if method.upper() == "GET":
            response = await client.get(
                url, headers=self.headers, timeout=timeout, **kwargs
            )
        elif method.upper() == "POST":
            response = await client.post(
                url, headers=self.headers, timeout=timeout, **kwargs
            )
        else:
            raise ValueError(f"Unsupported HTTP method: {method}")
        response.raise_for_status()
        return response

    response = await retry_async(
        _call,
        max_retries=max_retries,
        base_delay=1.0,
        retry_on=RETRY_EXCEPTIONS,
    )

    logger.info(f"[Creatomate] Response - Status: {response.status_code}")
    return response

5. Apply Retry Logic to the ChatGPT API

File: app/utils/chatgpt_prompt.py

Revised code:

# Add to the imports at the top
import httpx
from openai import APITimeoutError, APIConnectionError, RateLimitError
from app.utils.retry import retry_async

# Exceptions that trigger a retry
RETRY_EXCEPTIONS = (
    APITimeoutError,
    APIConnectionError,
    RateLimitError,
)

class ChatgptService:
    DEFAULT_TIMEOUT = 60.0
    CONNECT_TIMEOUT = 10.0
    MAX_RETRIES = 3

    def __init__(self):
        self.client = AsyncOpenAI(
            api_key=apikey_settings.CHATGPT_API_KEY,
            timeout=httpx.Timeout(
                self.DEFAULT_TIMEOUT,
                connect=self.CONNECT_TIMEOUT,
            ),
        )

    async def _call_structured_output_with_response_gpt_api(
        self, prompt: str, output_format: dict, model: str
    ) -> dict:
        content = [{"type": "input_text", "text": prompt}]

        async def _call():
            response = await self.client.responses.create(
                model=model,
                input=[{"role": "user", "content": content}],
                text=output_format,
            )
            return json.loads(response.output_text) or {}

        return await retry_async(
            _call,
            max_retries=self.MAX_RETRIES,
            base_delay=2.0,  # longer wait to allow for OpenAI rate limits
            retry_on=RETRY_EXCEPTIONS,
        )
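
The OpenAI Python SDK already retries some failures on its own (the client's `max_retries` option defaults to 2), so wrapping its calls in `retry_async` can multiply the number of requests sent. The arithmetic is sketched below; `total_attempts` is a hypothetical helper for illustration only.

```python
def total_attempts(sdk_max_retries: int, outer_max_retries: int) -> int:
    """Upper bound on requests when an outer retry loop wraps a client that also retries."""
    # Each outer attempt can trigger one initial request plus the client's own retries.
    return (sdk_max_retries + 1) * (outer_max_retries + 1)
```

With the SDK default and `MAX_RETRIES = 3`, the bound is `total_attempts(2, 3)`, that is 12 requests. Passing `max_retries=0` when constructing `AsyncOpenAI` would leave `retry_async` as the only retry layer and bring the bound down to 4.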

Recommended Timeout Settings

API	Purpose	Recommended timeout	Retries	Retry intervals
ChatGPT	Lyric generation	60 s	3	2 s → 4 s → 8 s
Suno	Song generation request	30 s	3	1 s → 2 s → 4 s
Suno	Status lookup	30 s	2	1 s → 2 s
Suno	Timestamp lookup	120 s	2	2 s → 4 s
Creatomate	Template lookup	30 s	2	1 s → 2 s
Creatomate	Render request	60 s	3	1 s → 2 s → 4 s
Creatomate	Status lookup	30 s	2	1 s → 2 s
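
The table can be read as a worst-case time budget per call: every attempt runs to its timeout and every backoff wait is taken. The sketch below encodes the rows and computes that bound. `RetryPolicy`, `POLICIES`, and the dictionary keys are hypothetical names, and treating the timeout as a cap on a whole attempt is a simplification, since httpx applies timeouts per phase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    timeout: float     # seconds, treated here as a cap on one whole attempt
    max_retries: int   # retries after the first attempt
    base_delay: float  # seconds before the first retry, doubling afterwards

    def delays(self) -> list[float]:
        """Backoff waits between attempts (the 30 s cap is never reached here)."""
        return [self.base_delay * 2 ** i for i in range(self.max_retries)]

    def worst_case_seconds(self) -> float:
        """Every attempt times out and every backoff wait is taken."""
        return self.timeout * (self.max_retries + 1) + sum(self.delays())


# One entry per row of the table; the keys are illustrative.
POLICIES = {
    "chatgpt.lyrics": RetryPolicy(60.0, 3, 2.0),
    "suno.generate": RetryPolicy(30.0, 3, 1.0),
    "suno.status": RetryPolicy(30.0, 2, 1.0),
    "suno.timestamp": RetryPolicy(120.0, 2, 2.0),
    "creatomate.templates": RetryPolicy(30.0, 2, 1.0),
    "creatomate.render": RetryPolicy(60.0, 3, 1.0),
    "creatomate.status": RetryPolicy(30.0, 2, 1.0),
}
```

Under these assumptions a lyric request can take up to 254 s (4 × 60 s plus 14 s of waiting) and a timestamp lookup up to 366 s, which is worth checking against any upstream request deadline.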

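Implementation Order

Step 1: Create the retry.py utility
- Implement reusable retry logic
- Write unit tests
Step 2: Set the ChatGPT timeout
- Most urgent issue (currently the 600 s default)
- Apply timeout and retry together
Step 3: Apply retries to the Suno API
- generate(), get_task_status(), get_lyric_timestamp()
Step 4: Apply retries to the Creatomate API
- Modifying _request() covers every call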

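Test Checklist

Verify after each change:

Normal requests behave the same as before
On timeout, an exception is raised within the configured time
Transient errors succeed after a retry
When every retry fails, an appropriate error message is returned
Retry attempts are recorded in the logs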
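
Two of the checklist items (recovery after transient errors, and failure after all retries) can be exercised without any network access. The sketch below is one way to do that with the standard library only. It carries a condensed copy of `retry_async` so that it runs on its own; `Flaky` and the two `run_*` functions are hypothetical test helpers.

```python
import asyncio


class RetryExhaustedError(Exception):
    def __init__(self, message, last_exception):
        super().__init__(message)
        self.last_exception = last_exception


async def retry_async(func, max_retries=3, base_delay=1.0, retry_on=(Exception,)):
    """Condensed copy of the retry.py utility (no max_delay cap, logging, or callback)."""
    last_exception = None
    for attempt in range(max_retries + 1):
        try:
            return await func()
        except retry_on as e:
            last_exception = e
            if attempt == max_retries:
                break
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise RetryExhaustedError("all attempts failed", last_exception) from last_exception


class Flaky:
    """Test double: raises TimeoutError `failures` times, then returns 'ok'."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    async def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("simulated timeout")
        return "ok"


def run_recovering_case():
    """Two transient failures, then success on the third call."""
    flaky = Flaky(failures=2)
    result = asyncio.run(
        retry_async(flaky, max_retries=3, base_delay=0.0, retry_on=(TimeoutError,))
    )
    return result, flaky.calls


def run_exhausted_case():
    """More failures than attempts: RetryExhaustedError carries the last error."""
    flaky = Flaky(failures=10)
    try:
        asyncio.run(
            retry_async(flaky, max_retries=2, base_delay=0.0, retry_on=(TimeoutError,))
        )
    except RetryExhaustedError as exc:
        return type(exc.last_exception).__name__, flaky.calls
    return None, flaky.calls
```

Setting `base_delay=0.0` keeps the tests fast. In the real test suite the functions would import `retry_async` from `app.utils.retry` instead of redefining it.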

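Rollback Plan

If problems occur:

Remove the code that uses retry.py (restore the original direct calls)
Remove the ChatGPT timeout setting (restore the SDK default)

Notes

The OpenAI SDK has some built-in retry logic, but custom control over it is limited
httpx's TimeoutException covers ConnectTimeout, ReadTimeout, WriteTimeout, and PoolTimeout
Rate limit errors (429) need a longer wait before retrying (see the Retry-After header)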
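
For the last note, a `Retry-After` value can be either a number of seconds or an HTTP date. The sketch below shows one way to turn either form into a wait time using only the standard library. `retry_after_seconds` is a hypothetical helper; how the header is read from a failed response depends on the client library and is left out here.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, fallback, now=None):
    """Seconds to wait according to Retry-After, or `fallback` if absent or unparseable."""
    if not header_value:
        return fallback
    value = header_value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return fallback
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no further waiting is required
    return max(0.0, (when - now).total_seconds())
```

The result could replace the computed backoff delay for that attempt, ideally still bounded by `max_delay` so that a very large server-supplied value cannot stall the caller.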
