robots.txt Validator
Paste your robots.txt to validate syntax, check which crawlers are allowed or blocked, and catch common mistakes that could accidentally block Google from indexing your site.
What is robots.txt?
robots.txt is a plain text file at the root of your website (e.g. yoursite.com/robots.txt) that tells search engine crawlers which pages they should and shouldn't visit. It's the first file bots check when they arrive at your domain. Getting it wrong can accidentally de-index your entire site — or fail to protect private areas from crawlers.
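A minimal, permissive robots.txt looks like this (yoursite.com is a placeholder for your own domain):

```text
User-agent: *
Allow: /
Disallow: /admin/

Sitemap: https://yoursite.com/sitemap.xml
```

Each `User-agent` line starts a group of rules for the named crawler; `*` matches any bot without a more specific group. `Allow: /` is the default behavior, but stating it explicitly makes intent clear.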
Common robots.txt Mistakes
- `Disallow: /` — blocks ALL bots from your entire site. A common deployment accident.
- No `User-agent: *` wildcard — bots without a matching rule group may crawl your site unrestricted.
- Missing `Sitemap` directive — always declare your sitemap URL here so crawlers discover your pages faster.
- Blocking `/api/` routes that your frontend depends on — can break Google's rendering of your pages.
- Using robots.txt to hide sensitive data — the file is public! Use authentication instead.
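As a sketch of what checks like these look like in code (illustrative only, not this tool's actual implementation), here is a minimal linter that parses a robots.txt string and flags a site-wide `Disallow`, a missing wildcard group, and a missing `Sitemap` directive:

```typescript
// Minimal robots.txt lint sketch. A real validator also handles wildcards in
// paths, consecutive User-agent lines sharing one group, and casing edge cases.
type Warning = string;

function lintRobots(txt: string): Warning[] {
  const warnings: Warning[] = [];
  const lines = txt
    .split(/\r?\n/)
    .map((l) => l.split("#")[0].trim()) // strip comments and whitespace
    .filter((l) => l.length > 0);

  let hasWildcardAgent = false;
  let hasSitemap = false;
  let currentAgent = "(none)";

  for (const line of lines) {
    const sep = line.indexOf(":");
    if (sep === -1) {
      warnings.push(`Malformed line (no ':'): "${line}"`);
      continue;
    }
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      currentAgent = value;
      if (value === "*") hasWildcardAgent = true;
    } else if (field === "disallow" && value === "/") {
      warnings.push(`Disallow: / blocks "${currentAgent}" from the entire site`);
    } else if (field === "sitemap") {
      hasSitemap = true;
    }
  }

  if (!hasWildcardAgent) warnings.push("No 'User-agent: *' group: unlisted bots are unconstrained");
  if (!hasSitemap) warnings.push("No Sitemap directive declared");
  return warnings;
}
```

For example, `lintRobots("User-agent: *\nDisallow: /")` reports both the site-wide block and the missing sitemap, while a file with a wildcard group, an `Allow: /`, and a `Sitemap` line comes back clean.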
robots.txt for Next.js / Vercel
In the Next.js App Router, create an `app/robots.ts` file to generate your robots.txt dynamically:

```typescript
import { MetadataRoute } from 'next'

export default function robots(): MetadataRoute.Robots {
  return {
    rules: {
      userAgent: '*',
      allow: '/',
      disallow: ['/admin/', '/api/'],
    },
    sitemap: 'https://yoursite.com/sitemap.xml',
  }
}
```

Popular Bot User-Agents
- `Googlebot` — Google's primary web crawler for Search
- `Googlebot-Image` — Google's image search crawler
- `Bingbot` — Microsoft Bing's search crawler
- `GPTBot` — OpenAI's training data crawler
- `Claude-Web` — Anthropic's web crawler
- `CCBot` — Common Crawl's crawler, used for AI training datasets
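For example, to stay open to search engines while opting out of the AI training crawlers above, a robots.txt can add per-agent groups (a sketch; check each vendor's documentation for its current user-agent string):

```text
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
```

Bots match the most specific group naming them, so `GPTBot` and `CCBot` follow their own `Disallow: /` rules while every other crawler falls through to the wildcard group.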