What is a Robots.txt File?
A robots.txt file is a critical SEO tool that instructs search engine crawlers which pages or sections of your website they can or cannot access. Located at the root of your domain (e.g., https://example.com/robots.txt), it's one of the first files crawlers check before indexing your site.
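A minimal robots.txt is just a few plain-text directives (the paths and sitemap URL below are illustrative):

```
# Apply to all crawlers
User-agent: *
# Keep bots out of the admin area
Disallow: /admin/
# Point crawlers at the sitemap
Sitemap: https://example.com/sitemap.xml
```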
Why Do You Need a Robots.txt File?
Control Crawling
Prevent bots from accessing sensitive areas like admin panels, login pages, or internal APIs
Save Crawl Budget
Direct crawlers to your important pages instead of wasting resources on low-value URLs
Block AI Scrapers
Stop AI bots like GPTBot and CCBot from using your content for training data
Improve SEO
Help search engines focus on the pages you want ranked in search results
How to Use This Generator
Quick Start with Presets
Click one of the preset buttons at the top to load a common configuration instantly:
Standard
Allow All
Block All
Block AI Bots
E-commerce
Blank
Building Custom Rules
Choose a User-Agent
Select from the dropdown (e.g., Googlebot, Bingbot) or enter a custom bot name to target specific crawlers
Add Rules
Set paths as Allow or Disallow. Use the path suggestions for common directories like /admin/, /wp-content/, or /api/
Set Crawl-Delay
Optionally specify how many seconds a bot should wait between requests to reduce server load
Add Sitemaps
Enter your sitemap URLs (e.g., https://example.com/sitemap.xml) to help crawlers discover your content efficiently
Copy or Download
Use the Copy button to paste directly into your file, or download the ready-to-upload robots.txt file
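Following the steps above, a finished configuration might look like this (example paths and URLs):

```
User-agent: Bingbot
Allow: /blog/
Disallow: /api/
Disallow: /admin/
Crawl-delay: 5

Sitemap: https://example.com/sitemap.xml
```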
Import an Existing File
Click Import Existing, paste your current robots.txt content, and click Apply. The tool will parse it into editable rule groups so you can make changes visually without manual syntax editing.
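As a rough illustration of what such an importer does, here is a simplified parser sketch in Python. This is a hypothetical helper, not the tool's actual code, and it skips some edge cases (for example, consecutive User-agent lines sharing one rule group):

```python
def parse_robots(text: str) -> list[dict]:
    """Group robots.txt directives under their User-agent lines."""
    groups: list[dict] = []
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        if ":" not in line:
            continue  # skip blanks and malformed lines
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            current = {"user_agent": value, "rules": []}
            groups.append(current)
        elif field in ("allow", "disallow") and current is not None:
            current["rules"].append((field, value))
        elif field == "crawl-delay" and current is not None:
            current["crawl_delay"] = float(value)
        # Sitemap lines are global; a real importer would collect them separately.
    return groups
```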
Features
Visual Rule Builder
Build robots.txt rules through an intuitive, no-code interface. Each rule group includes a user-agent selector, allow/disallow paths, and an optional crawl-delay setting. Add or remove groups and rules with a single click — no syntax knowledge required.
Manual Editing
- Memorize syntax rules
- Risk of formatting errors
- Time-consuming testing
- Difficult to visualize structure
Visual Interface
- Point-and-click configuration
- Automatic syntax validation
- Real-time preview
- Clear rule organization
Comprehensive Bot Library
Choose from 19 pre-configured user-agents covering major search engines, social platforms, and AI crawlers:
Major Search Engines
- Googlebot — Google's primary web crawler
- Bingbot — Microsoft Bing's crawler
- Yandex — Russia's leading search engine
- Baiduspider — China's dominant search crawler
- DuckDuckBot — Privacy-focused search engine
- Slurp — Yahoo's web crawler
Social Media Crawlers
- facebookexternalhit — Facebook link preview crawler
- Twitterbot — Twitter card and preview bot
- LinkedInBot — LinkedIn content crawler
AI Training Bots
- GPTBot — OpenAI's web crawler for ChatGPT training
- ChatGPT-User — ChatGPT browsing feature bot
- Google-Extended — Google's AI training crawler
- CCBot — Common Crawl data collection bot
- anthropic-ai — Anthropic's Claude AI crawler
- Claude-Web — Claude web browsing bot
- Bytespider — ByteDance's AI training crawler
Specialized Crawlers
- Googlebot-Image — Google's image indexing bot
- Googlebot-News — Google News crawler
- Custom bot name — Enter any user-agent string
Smart Path Suggestions
When typing a path, the tool suggests common directories and patterns to speed up configuration. Supports wildcard patterns for advanced control.
Common Directories
/admin/, /wp-admin/, /api/, /cart/, /checkout/, /private/, /tmp/
Wildcard Patterns
/*.pdf$ (block PDFs), /search?* (block search queries), /*?sort=* (block sorting parameters)
Live Preview with Syntax Highlighting
See your robots.txt output update in real-time as you make changes. Directives, values, and sitemap URLs are color-coded for easy reading and validation. Instantly spot errors or formatting issues before downloading.
Import and Edit
Paste an existing robots.txt file to parse it into visual rule groups. Edit the rules in the user-friendly interface, add new directives, or reorganize existing ones. Export the updated version when you're done — perfect for maintaining and optimizing existing configurations.
Frequently Asked Questions
Where do I put the robots.txt file?
Upload it to the root directory of your website so it's accessible at https://yourdomain.com/robots.txt. Search engines check this specific URL before crawling your site.
The file must be named robots.txt (lowercase) and placed in the root directory — not in a subdirectory or with a different name.
Does robots.txt block pages from appearing in search results?
Not exactly. Robots.txt prevents crawlers from accessing a page, but the URL can still appear in search results if other pages link to it. To fully block a page from search results, use a noindex meta tag or X-Robots-Tag HTTP header instead.
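For example, a page can opt out of indexing with a single tag in its HTML head:

```
<meta name="robots" content="noindex">
```

The equivalent for non-HTML resources is the X-Robots-Tag: noindex HTTP response header. Note that crawlers must be able to fetch the page to see either directive, so don't also disallow it in robots.txt.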
If you want to block a URL from Google Search results, use noindex. Don't use robots.txt for this purpose, as it may still appear in search results without a description.
— Google Search Central Documentation
What does "Disallow: /" mean?
It tells the specified bot not to crawl any page on your site. Use this carefully — it effectively hides your entire site from that crawler.
User-agent: *
Disallow: /
Warning: This configuration blocks all search engines from crawling your entire website. Only use during development or for private sites.
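You can confirm the effect with Python's standard-library urllib.robotparser:

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("User-agent: *\nDisallow: /".splitlines())

# Every URL is off-limits to every crawler under this policy.
print(parser.can_fetch("Googlebot", "https://example.com/any-page"))  # False
print(parser.can_fetch("Bingbot", "https://example.com/"))            # False
```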
What is Crawl-delay?
Crawl-delay tells a bot to wait a specified number of seconds between requests, which can reduce server load from aggressive crawlers. Support varies: Googlebot ignores this directive, while some other crawlers such as Bingbot honor it.
User-agent: Bingbot
Crawl-delay: 10
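Although Crawl-delay is not part of the original robots.txt standard, Python's urllib.robotparser can read it, which is handy for checking a configuration:

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse("User-agent: Bingbot\nCrawl-delay: 10\nDisallow: /private/".splitlines())

print(parser.crawl_delay("Bingbot"))    # 10
print(parser.crawl_delay("Googlebot"))  # None (no matching group)
```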
How do I block AI bots from scraping my content?
Use the Block AI Bots preset, which creates Disallow rules for GPTBot, ChatGPT-User, Google-Extended, CCBot, anthropic-ai, Claude-Web, and Bytespider while still allowing regular search engines to crawl your site.
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: Bytespider
Disallow: /
This configuration protects your content from being used for AI training while maintaining visibility in search engines.
Can I use wildcards in robots.txt paths?
Yes, most modern crawlers support * (matches any sequence) and $ (matches end of URL). These wildcards enable powerful pattern matching for complex rules.
| Pattern | Meaning | Example |
|---|---|---|
| * | Matches any sequence of characters | /search?* blocks all search queries |
| $ | Matches the end of the URL | /*.pdf$ blocks all PDF files |
| *$ | Combined pattern | /*?sort=*$ blocks URLs with sort parameters |
Is my data safe?
100% Private: This tool runs entirely in your browser using client-side JavaScript. No data is sent to any server — your robots.txt content stays on your device.
- No server uploads or data transmission
- No tracking or analytics on your content
- No storage of your configuration
- Complete privacy and security