AI SEO Optimizer Team
SEO & GEO Specialist
Expert in AI platforms, GEO optimization, and digital marketing. Shares content strategies and technical SEO practices for AI search engines on the blog.
AI crawlers have reached 20% of site traffic and GPTBot has grown 305%. How do you keep AI bots under control with robots.txt allow/block rules, crawl-delay optimization, rate limiting, and server load management?
In 2025, AI crawlers reached 20% of Googlebot's traffic volume.
GPTBot (OpenAI): 305% growth from May 2024 to May 2025 (share: 5% → 30%).
The challenge: allow AI bots to earn citation opportunities while keeping server load under control.
For platform-specific optimization strategies to pair with AI bot management, see the technical SEO and AI platforms guide.
In this guide, we take a deep dive into best practices for managing AI bots: robots.txt configuration, rate limiting strategies, and server resource management.
The AI bot landscape (2025):
Purpose: Model training + knowledge base.
User-agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)
Crawl frequency: 2-4x/week (for active sites).
Impact: ChatGPT citations, knowledge base updates.
For ChatGPT-specific citation optimization, see the ChatGPT citation optimization guide.
Purpose: Real-time web search (SearchGPT feature).
User-agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot)
Crawl frequency: Real-time (query-driven).
Impact: SearchGPT citation potential.
Purpose: Browser/API-based user interactions.
User-agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 ChatGPT-User
Crawl frequency: On-demand (user query triggers).
Purpose: Real-time answer engine indexing.
User-agent:
Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://perplexity.ai/bot)
Crawl frequency: 2-7x/week (more frequent for fresh content).
Impact: Perplexity citation (real-time search results).
For Perplexity-specific citation strategies, see the Perplexity AI citation strategies guide.
Note: Perplexity controversy: the company was reported to use stealth crawlers to bypass robots.txt. The behavior subsided after community backlash, but rate limiting still matters.
Purpose: Claude model training + knowledge base.
User-agent:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claude.ai/bot)
Crawl frequency: 1-2x/week (less aggressive).
Impact: Claude citations, especially for technical/academic queries.
For Claude-specific citation strategies and technical content optimization, see the Claude AI citation optimization guide.
Characteristic: prioritizes peer-reviewed content and .edu domains.
Purpose: Gemini + Bard model training (separate from Google Search indexing).
User-agent:
Google-Extended
Crawl frequency: Weekly (independent of Googlebot).
Impact: Gemini citations, Google AI Overviews.
Critical: blocking Google-Extended does not affect Google Search rankings; it is a robots.txt control token handled separately from Googlebot's search indexing.
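A minimal robots.txt sketch of that separation (layout is illustrative): Googlebot keeps crawling for Search while Google-Extended is blocked from using the content for Gemini and AI features.

```txt
# Googlebot continues to crawl and index for Google Search
User-agent: Googlebot
Allow: /

# Google-Extended only controls use of content for Gemini / AI features
User-agent: Google-Extended
Disallow: /
```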
Purpose: Apple Intelligence training.
User-agent:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 Applebot-Extended/1.0
Crawl frequency: Monthly (less aggressive).
Impact: Apple Intelligence (Siri, iOS AI features).
Purpose: ByteDance AI products (TikTok, Doubao LLM).
User-agent:
Mozilla/5.0 (compatible; Bytespider; [email protected])
Crawl frequency: Daily (very aggressive).
Impact: TikTok content recommendations, Doubao LLM (China).
Caution: Bytespider is one of the most aggressive crawlers; rate limiting is critical.
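If you prefer to slow Bytespider down rather than block it outright (the same "allow but slow" option shown in Strategy 2 below), a minimal robots.txt sketch; since Bytespider reportedly does not always honor robots.txt, pair this with server-level rate limiting.

```txt
# Allow Bytespider but throttle it hard
User-agent: Bytespider
Crawl-delay: 5
```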
Use case: you want to maximize AI citations and server capacity is sufficient.
robots.txt:
```txt
# OpenAI (ChatGPT + SearchGPT)
User-agent: GPTBot
Allow: /
Crawl-delay: 1

User-agent: OAI-SearchBot
Allow: /
Crawl-delay: 1

User-agent: ChatGPT-User
Allow: /

# Perplexity AI
User-agent: PerplexityBot
Allow: /
Crawl-delay: 1

# Anthropic (Claude)
User-agent: ClaudeBot
Allow: /
Crawl-delay: 2

# Google (Gemini)
User-agent: Google-Extended
Allow: /
Crawl-delay: 1

# Apple Intelligence
User-agent: Applebot-Extended
Allow: /
Crawl-delay: 3

# ByteDance (if targeting China/TikTok)
User-agent: Bytespider
Allow: /
Crawl-delay: 2
```
Advantages:
Disadvantages:
Use case: allow the important AI platforms, block the less relevant ones.
robots.txt:
```txt
# Priority AI Bots (Allow)
User-agent: GPTBot
Allow: /
Crawl-delay: 1

User-agent: OAI-SearchBot
Allow: /
Crawl-delay: 1

User-agent: PerplexityBot
Allow: /
Crawl-delay: 1

User-agent: ClaudeBot
Allow: /
Crawl-delay: 2

User-agent: Google-Extended
Allow: /
Crawl-delay: 1

# Lower Priority (Block or Aggressive Rate Limit)
User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /
# Or: Crawl-delay: 5 (allow but slow)
```
Advantages:
Disadvantages:
Use case: Proprietary content, paywall, AI training opt-out.
robots.txt:
```txt
# Block all AI bots
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /
```
Use cases:
Note: blocking completely eliminates AI citations.
Use case: allow public content, block private/premium content.
robots.txt:
```txt
User-agent: GPTBot
Allow: /blog/
Allow: /public/
Disallow: /premium/
Disallow: /members-only/
Crawl-delay: 1

User-agent: PerplexityBot
Allow: /blog/
Allow: /docs/
Disallow: /customer-portal/
Crawl-delay: 1
```
Use cases:
Definition: the minimum number of seconds between successive requests from the same bot.
Example:
```txt
User-agent: GPTBot
Crawl-delay: 2
```
Meaning: GPTBot will wait at least 2 seconds between requests (at most 1,800 requests/hour).
| Bot | Recommended Crawl-Delay | Reasoning |
|---|---|---|
| GPTBot | 1-2 | Moderate frequency, high value |
| OAI-SearchBot | 1 | Real-time search, needs fast responses |
| PerplexityBot | 1-2 | Frequent crawls, balance needed |
| ClaudeBot | 2-3 | Less frequent, can be slower |
| Google-Extended | 1 | Google infrastructure, can handle fast |
| Bytespider | 3-5 | Very aggressive, slow down |
Test data (1000-page site):
| Crawl-Delay | Pages/Hour | Server Load | Impact on Citations |
|---|---|---|---|
| 0 (no delay) | Unthrottled (>3600) | Very High | Fastest indexing |
| 1 second | 3600 | Moderate | Fast (recommended) |
| 2 seconds | 1800 | Low | Acceptable |
| 5 seconds | 720 | Very Low | Slow (risky) |
Recommendation: 1-2 seconds = sweet spot (server protection + reasonable crawl speed).
Tactic: adjust the crawl delay based on current server load.
Implementation (nginx + Lua):
```nginx
location / {
    access_by_lua_block {
        local user_agent = ngx.var.http_user_agent
        local server_load = get_server_load()  -- custom helper, see sketch below

        if user_agent:match("GPTBot") then
            if server_load > 80 then
                ngx.sleep(5)   -- high load: 5s delay
            elseif server_load > 50 then
                ngx.sleep(2)   -- medium load: 2s delay
            else
                ngx.sleep(1)   -- low load: 1s delay
            end
        end
    }
}
```
Advantage: adaptive (low traffic = fast crawl, high traffic = slow crawl).
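The get_server_load() call above is a site-specific helper, not an nginx built-in. A minimal sketch of one possible implementation, assuming a Linux host and using the 1-minute load average from /proc/loadavg normalized by CPU count:

```lua
-- Hypothetical helper: rough server load as a percentage (1-min load / cores * 100).
-- Note: uses blocking file I/O; in production, cache the value instead of reading per request.
local function get_server_load()
    local f = io.open("/proc/loadavg", "r")
    if not f then
        return 0  -- fail open: apply no extra delay if load cannot be read
    end
    local load1 = f:read("*n")  -- first field = 1-minute load average
    f:close()

    local cores = 1
    local nproc = io.popen("nproc")
    if nproc then
        cores = tonumber(nproc:read("*l")) or 1
        nproc:close()
    end

    return (load1 / cores) * 100
end
```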
Robots.txt limitations:
Rate limiting benefits:
Rule example (for GPTBot):
CloudFlare Dashboard → Security → WAF → Rate Limiting Rules:
Rule name: GPTBot Rate Limit
If incoming requests match:
- User Agent contains "GPTBot"
Then:
- Allow 60 requests per minute
- Block for 10 minutes if exceeded
Advanced rule (multiple bots):
Rule: AI Bots Aggregate Rate Limit
If incoming requests match:
- User Agent matches regex: (GPTBot|PerplexityBot|ClaudeBot|Google-Extended)
Then:
- Allow 120 requests per minute (total)
- Challenge (CAPTCHA) if exceeded
nginx.conf:
```nginx
# Key requests by client IP only when the user agent is an AI bot;
# non-bot requests get an empty key and are not rate limited.
# (limit_req does not accept a variable zone name, so a map is used instead of if/set.)
map $http_user_agent $ai_bot_key {
    default           "";
    "~*GPTBot"        $binary_remote_addr;
    "~*PerplexityBot" $binary_remote_addr;
}

# 60 requests/minute per bot IP
limit_req_zone $ai_bot_key zone=aibots:10m rate=60r/m;

server {
    location / {
        limit_req zone=aibots burst=10 nodelay;
        # ... rest of config
    }
}
```
Parameters: rate=60r/m allows a sustained 60 requests per minute (about one per second) per bot IP; burst=10 lets short spikes of up to 10 extra requests queue; nodelay serves that burst immediately instead of spacing it out.
.htaccess (mod_ratelimit):
```apache
<IfModule mod_ratelimit.c>
    <If "%{HTTP_USER_AGENT} =~ /GPTBot/">
        SetOutputFilter RATE_LIMIT
        # Limit response bandwidth to 100 KB/s for GPTBot
        SetEnv rate-limit 100
    </If>
</IfModule>
```
Or use mod_evasive (request-based):
```apache
<IfModule mod_evasive20.c>
    DOSHashTableSize  3097
    DOSPageCount      10
    DOSSiteCount      100
    DOSPageInterval   1
    DOSSiteInterval   1
    DOSBlockingPeriod 600
</IfModule>
```
Monitor these for AI bot impact:
| Metric | Tool | Alert Threshold |
|---|---|---|
| CPU usage | htop, CloudWatch | > 80% |
| Memory usage | free, CloudWatch | > 85% |
| Bandwidth | vnstat, GA4 | > daily budget |
| Request rate | nginx logs, CloudFlare | > 1000 req/min |
| TTFB | Pingdom, GTmetrix | > 500ms |
Nginx access log parsing:
```bash
# Count requests by bot (assumes the default "combined" log format,
# where the user agent is the 6th double-quoted field)
awk -F'"' '{print $6}' /var/log/nginx/access.log \
  | grep -oE "(GPTBot|PerplexityBot|ClaudeBot)[^ ;)]*" \
  | sort | uniq -c | sort -rn

# Output example:
#   1247 GPTBot/1.2
#    892 PerplexityBot/1.0
#    234 ClaudeBot/1.0

# Check bot request distribution over time (per hour)
grep "GPTBot" /var/log/nginx/access.log | awk '{print $4}' | cut -d: -f1-2 | uniq -c

# Identify most crawled pages
grep "GPTBot" /var/log/nginx/access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head -20
```
CloudFlare Dashboard → Analytics → Traffic:
Filters:
Metrics:
Bot traffic does not show up in GA4 (bot filtering), but you can see its impact:
Segment:
Correlation: an increase in bot crawls → an increase in referral traffic 2-4 weeks later.
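One way to check that correlation yourself is to line up daily bot hits (from server logs) against daily AI referral sessions (from a GA4 export) and shift one series by a few weeks. A minimal pandas sketch, assuming a hypothetical CSV with columns date, gptbot_hits and ai_referral_sessions:

```python
import pandas as pd

# Hypothetical export: one row per day
df = pd.read_csv("bot_vs_referrals.csv", parse_dates=["date"]).set_index("date")

# Correlate bot crawl volume with referral traffic 2-4 weeks later
for lag_weeks in (2, 3, 4):
    shifted = df["ai_referral_sessions"].shift(-7 * lag_weeks)
    print(f"lag {lag_weeks} weeks: correlation = {df['gptbot_hits'].corr(shifted):.2f}")
```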
Allow AI bots if:
Expected ROI:
Block AI bots if:
Use cases:
Block specific paths:
robots.txt:
```txt
User-agent: GPTBot
Allow: /blog/
Allow: /resources/
Disallow: /customer-data/
Disallow: /api-docs/
```
Use cases:
Problem: some crawlers can spoof their user agent (e.g., the Perplexity controversy).
Server-side detection (nginx + Lua):
```lua
-- Detect suspicious patterns
local user_agent = ngx.var.http_user_agent
local remote_addr = ngx.var.remote_addr

-- Check if the user agent claims to be GPTBot
if user_agent:match("GPTBot") then
    -- Verify the IP is within the operator's published ranges
    local openai_ips = {"66.249.64.0/19", "66.249.64.0/20"}  -- example ranges only
    if not ip_in_range(remote_addr, openai_ips) then          -- custom helper, see sketch below
        ngx.log(ngx.WARN, "Spoofed GPTBot detected: " .. remote_addr)
        ngx.exit(403)  -- block the request
    end
end
```
Official IP ranges:
Note: IP ranges can change. Update them regularly.
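The Lua snippet above assumes an ip_in_range() helper. For reference, a small Python sketch of the same CIDR check using the standard-library ipaddress module (the range shown is a placeholder; pull current ranges from each provider's published list):

```python
import ipaddress

def ip_in_range(ip: str, cidr_ranges: list[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr, strict=False) for cidr in cidr_ranges)

# Placeholder range for illustration only
OPENAI_RANGES = ["66.249.64.0/19"]
print(ip_in_range("66.249.66.1", OPENAI_RANGES))  # True for this example range
```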
Verify bot legitimacy:
```bash
# Check a GPTBot claim: reverse-DNS lookup of the crawler IP (IP is illustrative)
host 66.249.66.1
# Should return a hostname in the bot operator's domain

# Forward-confirm: resolve the returned hostname back to an IP
host <returned-hostname>
# The forward lookup should match the original crawler IP
```
Automated check (Python):
```python
import socket

def verify_bot(ip, expected_domain):
    try:
        # Reverse DNS: IP -> hostname
        hostname = socket.gethostbyaddr(ip)[0]
        # Forward-confirm: the hostname must resolve back to the same IP
        forward_ips = socket.gethostbyname_ex(hostname)[2]
        return expected_domain in hostname and ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False

# Usage (IP is illustrative)
is_legit = verify_bot("66.249.66.1", "openai.com")
```
Example scenario:
Calculation:
Total data/week = 10,000 pages × 500 KB × 3 bots = 15 GB/week
Annual bandwidth = 15 GB × 52 weeks = 780 GB/year
Cost (AWS CloudFront pricing):
Insight: AI bot bandwidth is manageable (for a typical blog).
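To sanity-check that figure, a quick back-of-the-envelope estimate in Python; the $0.085/GB rate is an assumption (roughly the first CloudFront pricing tier), so substitute your own CDN's egress price:

```python
pages_per_week = 10_000   # pages crawled per bot per week (scenario above)
avg_page_kb = 500         # average page weight in KB
bots = 3                  # number of AI bots crawling at this volume

weekly_gb = pages_per_week * avg_page_kb * bots / 1_000_000  # KB -> GB
annual_gb = weekly_gb * 52
cost_per_gb = 0.085       # assumed CDN egress price, USD/GB

print(f"{weekly_gb:.0f} GB/week, {annual_gb:.0f} GB/year, ~${annual_gb * cost_per_gb:.0f}/year")
# -> 15 GB/week, 780 GB/year, ~$66/year
```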
Tactics:
1. Compression (gzip/brotli):
```nginx
gzip on;
gzip_types text/html text/css application/javascript;
gzip_comp_level 6;
```
Impact: 60-70% bandwidth reduction.
2. Conditional requests (ETag):
```nginx
etag on;
```
Benefit: on bot re-crawls, unchanged content returns 304 Not Modified (no data transfer).
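A quick way to confirm conditional requests are working is to replay a request with the ETag you received. A short sketch using Python's requests library (the URL is a placeholder):

```python
import requests

url = "https://example.com/blog/some-post/"  # placeholder URL

first = requests.get(url)
etag = first.headers.get("ETag")

# Replay with If-None-Match: an unchanged page should come back as 304
second = requests.get(url, headers={"If-None-Match": etag} if etag else {})
print(first.status_code, second.status_code)  # expect 200, then 304
```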
3. CDN caching:
```nginx
location ~* \.(jpg|png|css|js)$ {
    expires 30d;
    add_header Cache-Control "public, immutable";
}
```
Benefit: static assets are served from the CDN (reducing origin server load).
For most cases: allow (with rate limiting).
Advantages of allowing:
Block scenarios (rare):
Recommendation: Allow + Crawl-delay: 1-2 + server-level rate limiting.
1-2 seconds is optimal.
Test data:
Exception: 3-5 seconds for Bytespider (very aggressive).
Yes, allow both.
GPTBot:
OAI-SearchBot:
Note: they are two separate crawlers; configure each one individually.
It has decreased, but rate limiting still matters.
History:
Best practice: robots.txt allow + server-level rate limit (60-120 req/min).
CloudFlare WAF + Rate Limiting Rules:
Setup:
Monitoring:
Advantage: CloudFlare can verify crawler IPs automatically (it detects spoofed bots).
No (bot filtering).
GA4 filters out bot traffic (Settings → Data Filters → Bot Filtering).
But: you can still see the impact of AI bots:
Indirect metrics:
Server logs: analyze your nginx/Apache logs to see AI bot activity.
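A small sketch of that analysis in Python, assuming the default nginx "combined" log format where the user agent is the last double-quoted field (the log path is a placeholder):

```python
import re
from collections import Counter

BOTS = ("GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot", "Bytespider")
# Capture the date (e.g. 10/Oct/2025) and the final quoted field (user agent)
LINE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):.*"([^"]*)"$')

counts = Counter()
with open("/var/log/nginx/access.log", encoding="utf-8", errors="replace") as log:
    for raw in log:
        m = LINE.search(raw.rstrip())
        if not m:
            continue
        day, user_agent = m.groups()
        for bot in BOTS:
            if bot in user_agent:
                counts[(day, bot)] += 1

for (day, bot), n in sorted(counts.items()):
    print(f"{day}  {bot:15} {n}")
```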
Test data (average B2B SaaS blog):
| Bot | % of AI Bot Traffic | Aggressiveness |
|---|---|---|
| Bytespider | 40-50% | Very High |
| GPTBot | 25-30% | Moderate |
| PerplexityBot | 15-20% | Moderate-High |
| ClaudeBot | 5-10% | Low |
| Google-Extended | 5-10% | Low-Moderate |
Insight: Bytespider is a bandwidth hog (especially for Asia-Pacific sites).
Recommendation: aggressive rate limiting or a full block for Bytespider (unless you target the TikTok/China market).
AI bots are a critical opportunity in 2025, but they need the right management:
Allow (with limits): robots.txt allow + rate limiting (60-120 req/min)
Crawl-delay of 1-2 seconds: server protection + reasonable crawl speed
Selective blocking: allow public content, block proprietary content
Monitor server impact: CPU, memory, bandwidth, TTFB
Platform priority: GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot > Bytespider
Bandwidth is manageable: <$100/year for a typical blog
Week 1:
Week 2:
Week 3:
Week 4:
Next step: optimize your robots.txt, set up rate limiting, and monitor AI bot impact.
In addition to optimizing AI bot access, see the schema markup for AI visibility guide to help crawlers understand your content better.