What is a robots.txt file?

A robots.txt file is a text file placed at the root of your website that tells search engine crawlers which pages or sections they can and cannot access.

Where should I place my robots.txt file?

The robots.txt file must be placed in the root directory of your domain - for example, https://www.example.com/robots.txt.

Does robots.txt prevent pages from being indexed?

Not directly. Robots.txt tells crawlers not to crawl certain pages, but if other pages link to a blocked URL, Google may still index it. To fully prevent indexing, use a noindex meta tag and leave the page crawlable so the tag can be read.

What is the crawl-delay directive?

Crawl-delay tells search engine crawlers how many seconds to wait between requests. Google does not support crawl-delay, but Bing and some other crawlers do.

Should I include my sitemap URL in robots.txt?

Yes, it is a best practice to include your sitemap URL using the Sitemap: directive. This helps search engines find your sitemap even before you submit it in Search Console.

Robots.txt Generator

Generate robots.txt with crawl rules, sitemap pointers, and AI bot blocking presets.

Inputs: User Agent (use commas for multiple crawler groups, e.g. GPTBot, CCBot, Bytespider), Sitemap URL, Crawl Delay, and Rules.

Example robots.txt output:

User-agent: *
Allow: /
Disallow: /account/
Disallow: /checkout/

Sitemap: https://www.example.com/sitemap.xml

Mastering Robots.txt for Technical SEO

The robots.txt file is the gatekeeper of your website. It implements the Robots Exclusion Protocol and is the first file a well-behaved web crawler (such as Googlebot, Bingbot, or AhrefsBot) requests when it arrives at your domain. A misconfiguration can be costly: a single Disallow: / under User-agent: * blocks compliant crawlers from your entire site, which can keep your content out of Google's search results.

When to Use Allow vs. Disallow

Disallow: Use this to keep crawlers out of private or low-value directories. Common examples include /admin/, /checkout/, /cart/, internal search result pages (/?s=), and staging environments. This saves your crawl budget for important content.
Allow: Use this when you have disallowed a parent directory but want a specific subdirectory to stay crawlable. For example, you might set Disallow: /assets/ but Allow: /assets/public-images/. Google and other crawlers that follow RFC 9309 apply the most specific (longest) matching rule, so the Allow wins for that subdirectory.
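The Allow/Disallow interaction above can be sketched in a few lines of Python. This is a minimal illustration of longest-match precedence as described in RFC 9309, not a full parser: the function name is_allowed and the (directive, prefix) rule format are assumptions made for this example, and wildcards (* and $) are deliberately not handled.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) tuples, where directive is
    "allow" or "disallow". The longest matching prefix wins; on a tie,
    "allow" wins. A path that matches no rule is allowed.
    (Hypothetical helper for illustration; no wildcard support.)
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules are ignored
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and directive == "allow"
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = directive == "allow"
    return allowed


rules = [
    ("disallow", "/assets/"),
    ("allow", "/assets/public-images/"),
]

print(is_allowed(rules, "/assets/public-images/logo.png"))
print(is_allowed(rules, "/assets/private/app.css"))
print(is_allowed(rules, "/blog/"))
```

The first and third paths are crawlable and the second is not, because the longer Allow prefix outranks the shorter Disallow only inside /assets/public-images/.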

The Danger of robots.txt for Hiding Content

A common SEO mistake is using robots.txt to hide private pages (like a PDF or a secret landing page). Robots.txt is public: anyone can view it by appending /robots.txt to your domain, so listing a path there advertises it. Furthermore, if an external site links to your disallowed page, Google may still index the URL itself. To keep a page out of search results, use a <meta name="robots" content="noindex"> tag (or the X-Robots-Tag: noindex HTTP header for non-HTML files such as PDFs) and leave the page crawlable so the directive can be read. To keep a page private, use password protection instead.
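As a rough way to confirm that a page really carries the noindex tag mentioned above, the following sketch scans an HTML string for a robots meta tag using only the Python standard library. The class name RobotsMetaParser and the helper has_noindex are hypothetical names chosen for this example, and the check ignores crawler-specific tags (such as name="googlebot") and HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # The content attribute is a comma-separated list of directives.
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```

A crawler can only see this tag if robots.txt lets it fetch the page, which is why combining Disallow with noindex on the same URL defeats the purpose.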

How to Use the Robots.txt Generator

Step-by-step guide

1. Set User Agent: Use * for all crawlers, or specify a particular bot like Googlebot, Bingbot, or GPTBot.
2. Add Allow/Disallow Rules: Add rules for the paths you want to allow or block. For example, disallow /account/ or /checkout/ to keep crawlers out of private sections.
3. Configure Sitemap & Delay: Enter your sitemap URL and optionally set a crawl delay to control how fast bots crawl your site.
4. Copy or Download: Copy the generated robots.txt or download the file, then upload it to the root of your website domain.
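The four steps above map onto a small amount of string assembly. The sketch below shows one plausible way to produce the file; build_robots_txt and its parameters are hypothetical names for illustration, and how the generator itself groups multiple user agents is an assumption (here they share one rule group, which the Robots Exclusion Protocol permits).

```python
def build_robots_txt(user_agents, rules, sitemap_url=None, crawl_delay=None):
    """Assemble robots.txt text from the inputs described in the steps.

    user_agents: list of crawler names, e.g. ["*"] or ["GPTBot", "CCBot"]
    rules:       list of (directive, path) tuples, e.g. ("Disallow", "/cart/")
    sitemap_url: optional absolute URL of the XML sitemap
    crawl_delay: optional number of seconds between requests
    """
    lines = [f"User-agent: {agent}" for agent in user_agents]
    lines += [f"{directive}: {path}" for directive, path in rules]
    if crawl_delay is not None:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap_url:
        # Sitemap is independent of any user-agent group, so it is
        # written after a blank line.
        lines.append("")
        lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(
    user_agents=["*"],
    rules=[("Allow", "/"), ("Disallow", "/account/"), ("Disallow", "/checkout/")],
    sitemap_url="https://www.example.com/sitemap.xml",
)
print(robots_txt)
```

With these inputs the output matches the sample file shown near the top of the page. Save the result as robots.txt and upload it to the site root.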

Frequently Asked Questions

About the Robots.txt Generator

What is a robots.txt file?

A robots.txt file is a simple text file placed in your website's root directory that tells search engine crawlers (like Googlebot) which pages or files they can or cannot request from your site. It is the first thing a crawler checks before accessing your content.

Does robots.txt stop my pages from being indexed?

No. A robots.txt file prevents crawling, but it does not guarantee a page won't be indexed. If other sites link to your disallowed page, Google might still index the URL (though it won't know the content). To prevent indexing, use a "noindex" meta tag instead, and leave the page crawlable so the tag can be read.

What is the "User-agent" directive?

The User-agent directive specifies which crawler the rules apply to. An asterisk (*) means the rules apply to all web crawlers. You can specify "Googlebot" to target only Google, or "Bingbot" for Bing.

Why should I add my Sitemap URL?

Including the absolute URL of your XML sitemap in your robots.txt file is a best practice. It acts as a beacon, immediately showing any visiting crawler where to find the sitemap that lists your important pages.

What is a Crawl Delay?

The crawl delay directive tells search engines to wait a certain number of seconds between requests. This is useful for large sites on slow servers to prevent the crawler from overloading the server. Note: Googlebot ignores the crawl-delay directive and sets its own crawl rate based on how your server responds, but other bots like Bingbot respect it.

Where do I put my robots.txt file?

It must be placed in the top-level root directory of your website domain. For example, it must be accessible at https://www.yourdomain.com/robots.txt. If you put it in a subdirectory, crawlers will not find it.
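To see how the User-agent, crawl delay, and sitemap answers fit together, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser module and queries it. The file contents and example.com URLs are made up for illustration. Note that Python's parser applies rules in file order rather than by longest match, so treat it as an approximation of how any particular search engine behaves; site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a Bingbot-specific group and a catch-all group.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 5
Disallow: /checkout/

User-agent: *
Disallow: /account/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

base = "https://www.example.com"

# Bingbot matches its own group, so only that group's rules apply to it.
print(parser.can_fetch("Bingbot", base + "/checkout/"))    # False
print(parser.can_fetch("Bingbot", base + "/account/"))     # True

# Any other crawler falls back to the "*" group.
print(parser.can_fetch("Googlebot", base + "/account/"))   # False
print(parser.can_fetch("Googlebot", base + "/checkout/"))  # True

# Crawl-delay is per group; the "*" group sets none.
print(parser.crawl_delay("Bingbot"))    # 5
print(parser.crawl_delay("Googlebot"))  # None

# Sitemap lines are global and returned as a list of URLs.
print(parser.site_maps())
```

One detail worth noticing: a crawler that has its own group does not also inherit the rules under User-agent: *, so any shared rules need to be repeated in each specific group.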

Related Workflows

Guides, tools, and template pages to continue the workflow

Crawl control workflow

Robots.txt vs noindex guide: Understand why Disallow is not a security or deindexing control.
Sitemap generator: Generate a sitemap URL before adding it to robots.txt.
Technical SEO checklist: Review crawl, index, canonical, and internal-link basics.

More Free SEO Tools

Schema Markup Generator

Meta Tags Preview

Sitemap Generator

SEO Slug Generator

Typing Speed Test

HTML to Markdown Editor

JSON to TOON Converter

Word Counter

Password Generator
