Rate Limiting

1. Implemented Rate Limiting System

1.1 SimpleRateLimiter Implementation

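import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class SimpleRateLimiter {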
    private final int maxRequests;
    private final long timeWindowMillis;
    private final AtomicInteger currentRequests;
    private final AtomicLong windowStartTime;

    public SimpleRateLimiter(int maxRequests, Duration timeWindow) {
        this.maxRequests = maxRequests;
        this.timeWindowMillis = timeWindow.toMillis();
        this.currentRequests = new AtomicInteger(0);
        this.windowStartTime = new AtomicLong(System.currentTimeMillis());
    }

    public boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long windowStart = windowStartTime.get();

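        if (now - windowStart >= timeWindowMillis) {
            // Start a new time window. compareAndSet lets only one thread perform
            // the reset, so concurrent callers cannot each restart the window.
            // Note: an increment that lands between the CAS and set(1) is still overwritten.
            if (windowStartTime.compareAndSet(windowStart, now)) {
                currentRequests.set(1);
                return true;
            }
        }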

        return currentRequests.incrementAndGet() <= maxRequests;
    }

    public void waitForPermit() throws InterruptedException {
        while (!tryAcquire()) {
            Thread.sleep(100); // wait 100 ms, then retry
        }
    }
}

1.2 Rate Limiting Interceptor

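import java.io.IOException;
import java.time.Duration;

import lombok.extern.slf4j.Slf4j;
import org.springframework.http.HttpRequest;
import org.springframework.http.client.ClientHttpRequestExecution;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.http.client.ClientHttpResponse;

@Slf4j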
public class RateLimitingInterceptor implements ClientHttpRequestInterceptor {
    private final SimpleRateLimiter rateLimiter;

    public RateLimitingInterceptor(double requestsPerMinute) {
        this.rateLimiter = new SimpleRateLimiter((int)requestsPerMinute, Duration.ofMinutes(1));
    }

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
                                      ClientHttpRequestExecution execution) throws IOException {
        try {
            rateLimiter.waitForPermit();

            long startTime = System.currentTimeMillis();
            ClientHttpResponse response = execution.execute(request, body);
            long duration = System.currentTimeMillis() - startTime;

            log.debug("OpenAI API call completed in {}ms", duration);
            return response;

        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Rate limiting interrupted", e);
        }
    }
}
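
The interceptor's constructor casts requestsPerMinute to int, so a fractional setting such as 0.5 would be truncated to 0. One possible way to handle such values is to stretch the window instead of truncating the count. The sketch below is only an illustration: RateSpec and fromRequestsPerMinute are hypothetical names that do not appear in the project.

```java
import java.time.Duration;

// Hypothetical helper (not part of the original project): turns a
// requests-per-minute setting into a (maxRequests, window) pair.
final class RateSpec {
    final int maxRequests;
    final Duration window;

    private RateSpec(int maxRequests, Duration window) {
        this.maxRequests = maxRequests;
        this.window = window;
    }

    static RateSpec fromRequestsPerMinute(double requestsPerMinute) {
        // Rejects zero, negative values, and NaN.
        if (!(requestsPerMinute > 0)) {
            throw new IllegalArgumentException("requestsPerMinute must be positive");
        }
        if (requestsPerMinute >= 1) {
            // Whole requests per minute; the fractional part is dropped,
            // exactly as the (int) cast in the interceptor does.
            return new RateSpec((int) requestsPerMinute, Duration.ofMinutes(1));
        }
        // Below one request per minute: allow a single request per stretched window.
        long windowMillis = (long) Math.ceil(60_000 / requestsPerMinute);
        return new RateSpec(1, Duration.ofMillis(windowMillis));
    }
}
```

With this helper, a setting of 0.5 would map to one request per 120-second window, and the resulting pair could be passed to the SimpleRateLimiter constructor.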

1.3 Usage in the OpenAI API Integration

@Configuration
public class OpenAiConfig {
    @Value("${openai.api.rate-limit.requests-per-minute}")
    private double requestsPerMinute;

    @Bean
    public OpenAiApi openAiApi(@Qualifier("openAiRestTemplate") RestTemplate openaiRestTemplate) {
        return OpenAiApi.builder()
                .requestsPerMinute(requestsPerMinute)
                .build();
    }
}

Why and How Rate Limiting Was Implemented in the Project

Cost Management
- Control costs by capping the number of API calls
- Keep resource usage predictable

@Value("${openai.api.rate-limit.requests-per-minute}")
private double requestsPerMinute; // configurable limit

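Implementation Approach
- Fixed-window counting: the counter resets once the time window elapses (not a true sliding window, since requests are not tracked over a trailing interval)
- Atomic counters (AtomicInteger, AtomicLong) for thread-safe updates
- Configurable time window and request limit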

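public boolean tryAcquire() {
    long now = System.currentTimeMillis();
    long windowStart = windowStartTime.get();
    if (now - windowStart >= timeWindowMillis) {
        // Only the thread that wins the compareAndSet starts the new window.
        if (windowStartTime.compareAndSet(windowStart, now)) {
            currentRequests.set(1);
            return true;
        }
    }
    return currentRequests.incrementAndGet() <= maxRequests;
}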
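
The tryAcquire() excerpt above resets its counter when a window expires, which makes it a fixed-window counter. For comparison, a sliding-window log, which tracks the timestamps of recent requests over a trailing interval, could be sketched as follows. This is a minimal sketch and not the project's code: the class name SlidingWindowLogLimiter and the injected clock are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

// Hypothetical class for comparison; not part of the original project.
class SlidingWindowLogLimiter {
    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clock; // injected so that time can be controlled in tests
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowLogLimiter(int maxRequests, long windowMillis, LongSupplier clock) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    // synchronized keeps the eviction and the size check in one critical section.
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Drop timestamps that have fallen out of the trailing window.
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < maxRequests) {
            timestamps.addLast(now);
            return true;
        }
        return false;
    }
}
```

The trade-off is memory: the log keeps one timestamp per admitted request, whereas the fixed-window counter keeps a single integer but can admit up to twice the limit around a window boundary.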

Handling Concurrency Issues in Rate Limiting

1. Use of Atomic Operations
- Use AtomicInteger and AtomicLong
- Prevent race conditions

private final AtomicInteger currentRequests;
private final AtomicLong windowStartTime;

// inside tryAcquire()
return currentRequests.incrementAndGet() <= maxRequests;
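
Each of the two atomic fields is updated atomically on its own, but a window reset touches both, and the pair is not updated as one unit. One way to stay lock-free while changing both values together is to keep them in a single immutable object behind an AtomicReference and retry with compareAndSet. The following is a sketch under that assumption; CasWindowLimiter, Window, and the injected clock are hypothetical names, not the project's classes.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;

// Hypothetical variant; assumes maxRequests >= 1.
class CasWindowLimiter {
    // Immutable snapshot: window start and count always change together.
    private static final class Window {
        final long start;
        final int count;

        Window(long start, int count) {
            this.start = start;
            this.count = count;
        }
    }

    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clock; // injected so that time can be controlled in tests
    private final AtomicReference<Window> state;

    CasWindowLimiter(int maxRequests, long windowMillis, LongSupplier clock) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.state = new AtomicReference<>(new Window(clock.getAsLong(), 0));
    }

    boolean tryAcquire() {
        while (true) {
            Window current = state.get();
            long now = clock.getAsLong();
            Window next;
            if (now - current.start >= windowMillis) {
                next = new Window(now, 1); // start a fresh window with this request
            } else if (current.count < maxRequests) {
                next = new Window(current.start, current.count + 1);
            } else {
                return false; // limit reached; state is left untouched
            }
            if (state.compareAndSet(current, next)) {
                return true;
            }
            // Another thread changed the state first; retry with the new snapshot.
        }
    }
}
```

Because a rejected call does not modify the state, the counter also stops growing once the limit is reached, unlike an unconditional incrementAndGet().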

2. Thread-Safe Design

- Handles concurrent requests safely
- Implements interrupt handling

public void waitForPermit() throws InterruptedException {
    while (!tryAcquire()) {
        Thread.sleep(100);
    }
}

Responding When the Rate Limit Is Exceeded

1. Wait and Retry
- Use an appropriate wait time
- Retry gradually until a permit is available

public void waitForPermit() throws InterruptedException {
    while (!tryAcquire()) {
        Thread.sleep(100); // wait 100 ms
    }
}
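
The loop above polls at a fixed 100 ms interval. If a more gradual retry is wanted, one common option is exponential backoff with a cap. The sketch below swaps that technique in for illustration only: BackoffWaiter, delayMillis, and the BooleanSupplier standing in for tryAcquire() are hypothetical and not part of the project.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper illustrating capped exponential backoff.
final class BackoffWaiter {
    private BackoffWaiter() {}

    // Delay before the given retry (0-based): initialMillis * 2^attempt, capped at maxMillis.
    static long delayMillis(int attempt, long initialMillis, long maxMillis) {
        long delay = initialMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    // Retries tryAcquire until it succeeds, sleeping longer after each failure.
    // Returns the number of retries that were needed.
    static int waitForPermit(BooleanSupplier tryAcquire, long initialMillis, long maxMillis)
            throws InterruptedException {
        int attempt = 0;
        while (!tryAcquire.getAsBoolean()) {
            Thread.sleep(delayMillis(attempt, initialMillis, maxMillis));
            attempt++;
        }
        return attempt;
    }
}
```

A call such as BackoffWaiter.waitForPermit(rateLimiter::tryAcquire, 100, 1600) would wait 100, 200, 400, 800, and then 1600 ms between attempts.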

2. Exception Handling

- Propagate clear exceptions
- Report the rate-limit status to the client
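
The post does not show how the status reaches the client. As one possible shape, a bounded wait could give up after a timeout and raise a dedicated exception that the caller can translate into a response. Everything named here (RateLimitExceededException, BoundedWaiter, acquireOrThrow) is hypothetical and serves only as a sketch.

```java
import java.util.function.BooleanSupplier;

// Hypothetical exception type carrying how long the caller waited.
class RateLimitExceededException extends RuntimeException {
    private final long waitedMillis;

    RateLimitExceededException(long waitedMillis) {
        super("Rate limit exceeded; gave up after " + waitedMillis + " ms");
        this.waitedMillis = waitedMillis;
    }

    long getWaitedMillis() {
        return waitedMillis;
    }
}

// Hypothetical helper: waits for a permit, but only up to a timeout.
final class BoundedWaiter {
    private BoundedWaiter() {}

    static void acquireOrThrow(BooleanSupplier tryAcquire, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!tryAcquire.getAsBoolean()) {
            // Compare via subtraction so that nanoTime wrap-around is handled.
            if (System.nanoTime() - deadline >= 0) {
                throw new RateLimitExceededException(timeoutMillis);
            }
            Thread.sleep(pollMillis);
        }
    }
}
```

A caller could catch RateLimitExceededException and report it upstream, for example as an error response, instead of blocking indefinitely.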

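Limitations of the Rate Limiting Implementation and Possible Improvements

Current Limitations
- Counters are kept in memory
- Counters are reset when the server restarts
- Distributed environments are not considered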

private final AtomicInteger currentRequests;
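
One possible direction for the limitations listed above is to hide the counter behind a small interface, so that the in-memory version could later be replaced by a store shared across instances. The sketch below shows only the in-memory side. WindowCounterStore, InMemoryCounterStore, and StoreBackedLimiter are hypothetical names, and a shared-store implementation is assumed rather than shown.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.LongSupplier;

// Hypothetical abstraction: a shared store could implement this interface
// so that several application instances see the same counters.
interface WindowCounterStore {
    long incrementAndGet(String key);
}

// In-memory implementation. Old keys are never evicted in this sketch;
// a real store would need to expire them.
class InMemoryCounterStore implements WindowCounterStore {
    private final ConcurrentMap<String, Long> counts = new ConcurrentHashMap<>();

    @Override
    public long incrementAndGet(String key) {
        return counts.merge(key, 1L, Long::sum); // atomic per key
    }
}

class StoreBackedLimiter {
    private final WindowCounterStore store;
    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clock; // injected so that time can be controlled in tests

    StoreBackedLimiter(WindowCounterStore store, int maxRequests,
                       long windowMillis, LongSupplier clock) {
        this.store = store;
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    boolean tryAcquire(String clientId) {
        // Windows are aligned to the clock, so every instance derives the same key.
        long windowIndex = clock.getAsLong() / windowMillis;
        return store.incrementAndGet(clientId + ":" + windowIndex) <= maxRequests;
    }
}
```

Because the key is derived from the clock rather than from local state, the counters would also survive a restart if the store itself is persistent.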
