
Webflow Sitemap and Robots.txt: What You Need to Know

How Webflow generates your sitemap and robots.txt, what the defaults get wrong, and how to fix them for better crawl efficiency.

LIVV Studio
February 5, 2026 · 8 min read
webflow · sitemap · robots.txt · crawlability · google search console · technical seo

How Webflow Handles Sitemaps Automatically

Webflow generates a sitemap.xml file automatically at yourdomain.com/sitemap.xml. It updates every time you publish. The sitemap includes all published pages and CMS collection items. This is convenient, but it also means utility pages, password-protected pages, and 'draft' pages that are technically published can end up in your sitemap—telling Google to crawl and index pages you never intended to rank.
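For reference, each page in the generated file is a standard sitemap-protocol <url> entry; a minimal excerpt looks roughly like this (domain and paths are placeholders, and the exact fields Webflow emits may vary):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/</loc>
  </url>
  <url>
    <loc>https://yourdomain.com/blog/webflow-seo</loc>
  </url>
</urlset>
```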

Auditing Your Webflow Sitemap

  1. Open yourdomain.com/sitemap.xml in your browser.
  2. Check every URL. Are there style guide pages, 404 pages, or test pages listed?
  3. For unwanted pages, go to Page Settings in Webflow and toggle 'Exclude this page from sitemap'.
  4. For CMS items you do not want indexed, set the item to 'Draft' status (drafts are dropped from the generated sitemap), or use a toggle field plus conditional custom code in the collection template's head to output a per-item noindex meta tag.
  5. After cleanup, republish and verify the sitemap no longer contains excluded URLs.
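The URL check in step 2 is easy to script once your site has more than a handful of pages. A minimal sketch in Node.js—the suspicious-path patterns here are assumptions, so tailor them to your own utility pages:

```javascript
// Pull every <loc> URL out of a sitemap.xml string and flag
// likely-unwanted pages. The patterns are examples only.
const SUSPICIOUS = [/style-guide/, /404/, /password/, /licens/];

function auditSitemap(xml) {
  const urls = [...xml.matchAll(/<loc>(.*?)<\/loc>/g)].map(m => m[1]);
  const flagged = urls.filter(u => SUSPICIOUS.some(re => re.test(u)));
  return { urls, flagged };
}
```

Fetch yourdomain.com/sitemap.xml, pass the response body to auditSitemap, and treat anything in flagged as a candidate for the 'Exclude this page from sitemap' toggle.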

Understanding Webflow’s Default Robots.txt

Webflow generates a default robots.txt that allows all crawlers access to all pages. On the staging subdomain (yoursite.webflow.io), Webflow adds a 'Disallow: /' directive to keep crawlers out—though note that disallowing crawling is not the same thing as preventing indexing. Once you connect a custom domain, the robots.txt for that domain is fully permissive. You can edit the file's contents in Site settings under the SEO tab's Indexing section; changes take effect on the next publish. For rules a static file cannot express, you can also override it at the edge—via a reverse proxy, a Cloudflare Worker, or by redirecting /robots.txt to a CMS-hosted text page.

If your Webflow staging site (yoursite.webflow.io) is already indexed in Google, keep in mind that 'Disallow: /' only blocks crawling—it does not remove URLs Google has already discovered, and a blocked page can still appear in results if other sites link to it. Submit a removal request in Search Console for the staging domain.

Custom Robots.txt with Cloudflare Workers

If you need to block specific crawlers or directories with logic a static file cannot express—different directives per environment, for example—a clean approach is a Cloudflare Worker that intercepts requests to /robots.txt and returns your custom content. It takes about ten minutes to set up: create a Worker that checks whether the request path is /robots.txt and, if so, returns a new Response with your desired directives; otherwise, pass the request through to Webflow's origin.

```javascript
// Cloudflare Worker (service-worker syntax): intercept /robots.txt,
// serve custom directives, and proxy every other request to Webflow.
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const url = new URL(request.url)
  if (url.pathname === '/robots.txt') {
    // The directives below are examples; edit them for your own site.
    const robotsTxt = `User-agent: *
Allow: /
Disallow: /style-guide
Disallow: /utility-pages/
Sitemap: https://yoursite.com/sitemap.xml`
    return new Response(robotsTxt, {
      headers: { 'Content-Type': 'text/plain' }
    })
  }
  // All other paths fall through to the Webflow origin.
  return fetch(request)
}
```
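Before deploying, it is worth sanity-checking that your directives block what you expect. A simplified matcher can approximate this—prefix rules only, with no wildcards, no Allow precedence, and no user-agent grouping, all of which real crawlers do handle:

```javascript
// Test a path against the Disallow rules in a robots.txt string.
// Simplified sketch: prefix matching only.
function isDisallowed(robotsTxt, path) {
  const rules = robotsTxt
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.toLowerCase().startsWith('disallow:'))
    .map(line => line.slice('disallow:'.length).trim())
    .filter(rule => rule.length > 0);
  return rules.some(rule => path.startsWith(rule));
}
```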

Submitting Your Sitemap to Google Search Console

After connecting your custom domain and publishing, go to Google Search Console, select your property, navigate to Sitemaps in the left menu, and submit yourdomain.com/sitemap.xml. Google will crawl it and report any errors. Check back after 48 hours to verify all URLs are discovered. If you see 'Couldn’t fetch' errors, verify your custom domain DNS is properly configured and SSL is active.

Sitemap Best Practices for Webflow CMS

  • Keep your sitemap under 50,000 URLs (Webflow will not hit this limit, but it is the Google maximum).
  • Use descriptive slugs on CMS items—these become the URL paths in the sitemap.
  • Update content regularly so Google sees fresh lastmod dates when recrawling.
  • If using multiple CMS collections, verify all collections appear in the sitemap.
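On the second bullet: descriptive slugs are easiest to keep consistent when they are generated the same way every time. A minimal slugify sketch—useful, for instance, when importing items through the CMS API and you want predictable paths:

```javascript
// Turn a CMS item title into a clean, hyphenated URL slug.
function slugify(title) {
  return title
    .toLowerCase()
    .normalize('NFD')                 // split accented characters
    .replace(/[\u0300-\u036f]/g, '')  // drop the diacritic marks
    .replace(/[^a-z0-9]+/g, '-')      // collapse everything else to hyphens
    .replace(/^-+|-+$/g, '');         // trim leading/trailing hyphens
}
```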

Need help with crawl management on your Webflow site? We handle sitemaps, robots.txt, and Search Console setup.

Talk to Our SEO Team →

