SEO

robots.txt

A file at the site root that tells crawlers which paths they may and may not crawl.

Definition

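robots.txt is a plain-text file at a site's root (served at /robots.txt) that gives crawlers rules about which paths to crawl or avoid, and can point to the sitemap. The rules are advisory: well-behaved crawlers follow them, but nothing enforces them. It is the oldest crawler-control convention on the web, dating to 1994 and standardized as RFC 9309 in 2022.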

Why it matters

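robots.txt keeps crawlers away from pointless or owner-only paths and, increasingly, decides whether AI crawlers are welcome. It governs crawling, not indexing or access: a disallowed URL can still appear in search results if other pages link to it, and the file itself is public, so it is no substitute for authentication. Many sites block AI crawlers by default; explicitly allowing them is a choice that can affect how often a business is cited by AI assistants.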

How we think about it

We explicitly invite the major AI crawlers in robots.txt, because being citable by AI assistants is a channel, not a threat. Owner-only surfaces are disallowed, and the sitemap is always referenced so discovery is complete.
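As a rough illustration of that policy, the sketch below embeds a sample robots.txt and checks it with Python's standard-library urllib.robotparser. Everything specific in it is an assumption for the example, not our actual file: the crawler tokens (GPTBot, ClaudeBot, PerplexityBot), the /admin/ and /account/ paths, and the example.com sitemap URL. Crawler tokens change, so confirm them against each vendor's documentation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: crawler names, paths, and sitemap URL are
# illustrative assumptions, not a real site's file.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
Disallow: /admin/
Disallow: /account/
Allow: /

User-agent: *
Disallow: /admin/
Disallow: /account/

Sitemap: https://example.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(ROBOTS_TXT)

# A named AI crawler may fetch public pages but not owner-only paths.
print(parser.can_fetch("GPTBot", "https://example.com/pricing"))
print(parser.can_fetch("GPTBot", "https://example.com/admin/settings"))

# Crawlers without their own group fall back to the "*" group.
print(parser.can_fetch("SomeOtherBot", "https://example.com/account/"))

# Sitemap URLs declared in the file (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

A crawler that matches a named group follows only that group, not the * group, so the owner-only Disallow lines are repeated inside the AI-crawler group. They are also listed before Allow: / because Python's parser applies the first matching rule, whereas RFC 9309 crawlers apply the most specific match; this ordering gives the same result under both.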

