Robots.txt Best Practices 2026: Safe Rules for SEO and AI Crawlers
A robots.txt file lives at the root of your domain and tells search engine bots which pages they may or may not crawl. It's a powerful rulebook, but also a dangerous one: a single `Disallow: /` blocks every compliant crawler from your entire site and can wipe your pages out of Google's search results.
Best Practices
- Never block CSS or JS: Googlebot renders your page like a real browser. If it cannot fetch your CSS, it may be unable to confirm that your site is mobile-friendly, which can hurt rankings.
- Point to your sitemap: Always include `Sitemap: https://yourdomain.com/sitemap.xml` at the bottom.
- Block admin routes: `Disallow: /wp-admin/` or `Disallow: /api/` saves your server from unnecessary bot traffic.
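Put together, a minimal file following these rules might look like this (the domain and paths are placeholders for your own):

```
User-agent: *
Disallow: /wp-admin/
Disallow: /api/

Sitemap: https://yourdomain.com/sitemap.xml
```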
AI Scraper Bots
In the generative AI era, you may want to block AI bots from stealing your content for training data.
```
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
```
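You can sanity-check rules like these before deploying with Python's standard-library robots.txt parser; the URL below is a placeholder:

```python
import urllib.robotparser

# The AI-bot rules from above, as a string (stands in for your live file).
rules = """
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# AI crawlers are refused everywhere...
print(rp.can_fetch("GPTBot", "https://yourdomain.com/any-page"))    # False
print(rp.can_fetch("CCBot", "https://yourdomain.com/any-page"))     # False
# ...while Googlebot, which matches no group, remains unaffected.
print(rp.can_fetch("Googlebot", "https://yourdomain.com/any-page")) # True
```

Because the groups name specific user agents and there is no `User-agent: *` group, regular search crawlers keep full access.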
Rather than writing the file by hand, use our Robots.txt Generator to construct it safely, then run it through our Robots.txt Validator to make sure no critical SEO routes are accidentally blocked.
Safe deployment pattern
Version-control robots rules and deploy changes through pull requests. Add a pre-release check that compares blocked routes against your indexed URL inventory to prevent accidental de-indexing.
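Such a pre-release check can be sketched in a few lines with Python's standard library; the proposed rules and URL inventory below are illustrative assumptions:

```python
import urllib.robotparser

def find_blocked(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return every URL from the inventory that the proposed rules would block for `agent`."""
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [url for url in urls if not rp.can_fetch(agent, url)]

# Hypothetical proposed rules, including an accidental `Disallow: /blog`.
proposed_rules = """
User-agent: *
Disallow: /wp-admin/
Disallow: /blog
"""

# Hypothetical inventory of currently indexed URLs.
inventory = [
    "https://yourdomain.com/",
    "https://yourdomain.com/blog/robots-txt-guide",
    "https://yourdomain.com/wp-admin/options.php",
]

blocked = find_blocked(proposed_rules, inventory)
# The stray `Disallow: /blog` surfaces the indexed blog URL here,
# so the release gate can fail before the rules go live.
print(blocked)
```

Failing the deploy whenever `blocked` intersects your must-index list catches a Disallow typo before it reaches production.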
GEO extension for AI crawlers
If you restrict AI bots, document why and which bots are affected. Transparent policy pages can reduce confusion for partners and content teams.
Related Reading
Continue with the next most relevant guides in this topical cluster.
Technical SEO for Europe (2026): Hreflang, x-default, and DACH Targeting
Learn how to deploy hreflang correctly across EU markets, avoid duplicate intent issues, and improve geo-targeted visibility.
AI Readiness for Websites: GEO Checklist for ChatGPT and Perplexity
A practical readiness checklist for crawlability, schema, and semantic content design to improve AI engine discoverability.
Answer Engine Optimization (AEO): How to Rank in ChatGPT and Perplexity
Format your pages for answer extraction with clear definitions, structured comparisons, and citation-friendly content blocks.