An XML sitemap is a file that lists every important URL on your website in a structured, machine-readable format. It acts as a roadmap for search engine crawlers like Googlebot, Bingbot, and others, telling them which pages exist, when they were last updated, and how important they are relative to other pages on your site.
The XML sitemap protocol was jointly developed by Google, Yahoo, and Microsoft in 2006 and is defined at sitemaps.org. It uses standard XML syntax, which makes it both human-readable and easy for crawlers to parse.
A basic sitemap is typically named sitemap.xml and placed in the root directory of your website, for example: https://example.com/sitemap.xml.
Search engines can discover pages through links, but relying on link-based crawling alone has limitations. XML sitemaps solve several critical problems:
When you publish a new page or post, it may take days or weeks for search engines to discover it through link crawling. A sitemap signals the existence of new URLs immediately, especially when submitted through Google Search Console or Bing Webmaster Tools. (Google retired its sitemap "ping" endpoint in 2023, so console submission is now the primary channel.)
Pages that have few or no internal links pointing to them (orphan pages) are difficult for crawlers to find. Your sitemap ensures these pages are still discoverable. This is particularly important for large e-commerce sites where product pages may not be well-linked from category structures.
Search engines allocate a finite crawl budget to each website. Your sitemap helps them prioritise which pages to crawl, so they spend time on your most valuable content rather than wandering through low-value or duplicate pages.
The <lastmod> tag tells crawlers when a page was last updated. If you refresh an old article with new information, updating the lastmod date in your sitemap prompts search engines to re-crawl and re-index that content.
Sites with thousands of pages, deep navigation structures, or content behind JavaScript rendering benefit enormously from sitemaps. Without one, search engines may only discover a fraction of your content.
An XML sitemap follows a strict structure defined by the sitemap protocol. If you're new to XML, our XML tutorial covers the fundamentals of elements, attributes, and document structure. Here is a complete example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2026-03-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <lastmod>2026-02-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/blog/xml-guide</loc>
    <lastmod>2026-03-06</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.9</priority>
  </url>
</urlset>

Let's break down each element:
<urlset> — The Root Element
Wraps the entire sitemap and declares the sitemap namespace. The xmlns attribute is required and must reference the sitemaps.org schema.
<url> — URL Entry
Each <url> block represents a single page on your site. A sitemap can contain up to 50,000 URL entries.
<loc> — Page URL (Required)
The full, canonical URL of the page. This is the only required child element. URLs must include the protocol (https://) and should match the canonical version of the URL you want indexed.
<lastmod> — Last Modified Date
The date the page was last meaningfully updated, in W3C Datetime format. Use YYYY-MM-DD for daily precision or the full ISO 8601 format for more granularity. Google has confirmed that accurate lastmod values are one of the most useful sitemap signals.
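Both precision levels are valid; the timestamps below are illustrative:

<!-- Date-only precision (YYYY-MM-DD) -->
<lastmod>2026-03-06</lastmod>

<!-- Full W3C Datetime with time and timezone offset -->
<lastmod>2026-03-06T14:30:00+00:00</lastmod>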
<changefreq> — Change Frequency
A hint to crawlers about how often the page content changes. Valid values are: always, hourly, daily, weekly, monthly, yearly, and never. Note that Google has stated it largely ignores this value, but other search engines may still use it.
<priority> — Relative Priority
A value from 0.0 to 1.0 indicating the importance of a URL relative to other URLs on your site. The default is 0.5. This is a hint only and does not affect how your pages rank against other websites. Like changefreq, Google largely ignores this, but it can help you organise your own sitemap priorities.
While the standard URL sitemap is the most common, the sitemap protocol supports several specialised types:
Standard XML Sitemap
Lists web page URLs. This is what most people mean when they say “sitemap” and is suitable for the majority of websites.
Image Sitemap
Extends the standard sitemap with image-specific tags; the <urlset> must also declare the image namespace (xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"). It helps Google Image Search discover images that might not be found through standard crawling, such as images loaded via JavaScript or CSS.
<url>
  <loc>https://example.com/gallery</loc>
  <image:image>
    <image:loc>https://example.com/images/photo1.jpg</image:loc>
    <image:caption>Mountain landscape at sunset</image:caption>
  </image:image>
</url>

Video Sitemap
Provides metadata about video content including title, description, duration, and thumbnail URL. Essential for sites that host video content and want it to appear in Google Video search results.
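A video entry might look like the following sketch. The URLs and values are illustrative, and the <urlset> must declare the video namespace (xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"):

<url>
  <loc>https://example.com/videos/intro</loc>
  <video:video>
    <video:thumbnail_loc>https://example.com/thumbs/intro.jpg</video:thumbnail_loc>
    <video:title>Introduction to XML Sitemaps</video:title>
    <video:description>A short overview of the sitemap protocol.</video:description>
    <video:content_loc>https://example.com/media/intro.mp4</video:content_loc>
    <video:duration>120</video:duration>
  </video:video>
</url>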
News Sitemap
Designed for news publishers, this format includes publication name, language, and publication date. Google News uses these to discover and rank recent news articles. Articles should only appear in a news sitemap for 48 hours after publication.
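A news entry follows this shape. The publication details are illustrative, and the <urlset> must declare the news namespace (xmlns:news="http://www.google.com/schemas/sitemap-news/0.9"):

<url>
  <loc>https://example.com/news/budget-announcement</loc>
  <news:news>
    <news:publication>
      <news:name>Example Times</news:name>
      <news:language>en</news:language>
    </news:publication>
    <news:publication_date>2026-03-06</news:publication_date>
    <news:title>Budget Announcement Sparks Debate</news:title>
  </news:news>
</url>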
Multilingual (hreflang) Sitemap
For multilingual sites, you can use xhtml:link elements within your sitemap to declare language and regional variants of each page, replacing or supplementing hreflang tags in your HTML.
There are several approaches to creating sitemaps, depending on your platform and needs:
Most modern content management systems offer sitemap generation out of the box or through plugins. WordPress users can use Yoast SEO, Rank Math, or the built-in WordPress sitemap (available since WordPress 5.5). Shopify, Squarespace, and Wix generate sitemaps automatically.
Frameworks like Next.js, Gatsby, Hugo, and Astro all provide sitemap generation plugins or built-in support. For Next.js, you can use the app/sitemap.ts convention to dynamically generate sitemaps at build time.
If you need to generate a sitemap quickly, you can use our free XML Sitemap Generator to build and download a valid sitemap.xml file. We also offer a Sitemap Validator to check your sitemap for errors.
For small sites, you can write a sitemap by hand. It's just XML. Use our XML Formatter to check that your hand-crafted sitemap is properly structured.
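If you prefer to generate rather than hand-type the file, a small script can assemble a valid sitemap from a list of pages. This TypeScript sketch is illustrative (the helper names and sample URLs are invented), but it shows the two details hand-writers most often miss: fully qualified URLs and XML escaping:

```typescript
// Minimal sitemap builder: escapes XML special characters and emits
// one <url> entry per page. All names and URLs here are illustrative.
interface PageEntry {
  loc: string;       // absolute, canonical URL
  lastmod?: string;  // W3C Datetime, e.g. "2026-03-06"
}

function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildSitemap(pages: PageEntry[]): string {
  const entries = pages
    .map((p) => {
      const lastmod = p.lastmod ? `\n    <lastmod>${p.lastmod}</lastmod>` : "";
      return `  <url>\n    <loc>${escapeXml(p.loc)}</loc>${lastmod}\n  </url>`;
    })
    .join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${entries}\n</urlset>`;
}

const xml = buildSitemap([
  { loc: "https://example.com/", lastmod: "2026-03-01" },
  { loc: "https://example.com/search?a=1&b=2" }, // '&' must become '&amp;'
]);
```

Note the `&` in the query string: an unescaped ampersand is one of the most common causes of an invalid sitemap.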
A single sitemap file can contain a maximum of 50,000 URLs and must not exceed 50 MB (uncompressed). For larger sites, you need to split your URLs across multiple sitemaps and reference them from a sitemap index file.
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2026-03-06</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2026-03-05</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-products.xml</loc>
    <lastmod>2026-03-04</lastmod>
  </sitemap>
</sitemapindex>

A sitemap index can reference up to 50,000 individual sitemaps, giving you a theoretical maximum of 2.5 billion URLs. Organise your child sitemaps logically by content type (pages, posts, products, categories) for easier management.
The most reliable method is through Google Search Console:

1. Open Google Search Console and select your property.
2. Navigate to the Sitemaps report in the left-hand menu.
3. Enter the path to your sitemap (for example, sitemap.xml) and click Submit.

After submission, Search Console will show you how many URLs were discovered, how many are indexed, and flag any issues with individual URLs.
Bing offers a similar interface through Bing Webmaster Tools. Submit your sitemap under the “Sitemaps” section. Bing also supports the IndexNow protocol for instant URL submission.
Add a Sitemap directive to your robots.txt file so that any crawler can find your sitemap automatically:
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml

This is a simple, universal method that works for all search engines that respect the robots.txt standard.
Follow these best practices when maintaining your sitemap:

- Exclude blocked or noindexed pages: don't list pages carrying noindex meta tags or pages blocked by robots.txt. Sending conflicting signals confuses search engines.
- Keep lastmod honest: only update the lastmod date when the page content actually changes in a meaningful way. Artificially inflating dates erodes trust with search engines.
- Use absolute URLs: <loc> values must be fully qualified URLs, not relative paths.
- Compress large files: serve .xml.gz files to reduce bandwidth. Search engines handle gzipped sitemaps natively.
- List only final destinations: if https://example.com/page redirects to https://example.com/page/, only include the final destination URL. Listing redirect URLs wastes crawl budget and generates warnings in Search Console.
Setting every page to today's date or never updating the dates at all makes the lastmod signal useless. Google may stop trusting your sitemap's dates entirely.
Sitemaps over 50,000 URLs or 50 MB will be rejected. Use a sitemap index to split large sitemaps. Also consider that while 50 MB is the protocol limit, keeping individual sitemaps under 10 MB improves parsing speed.
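The splitting logic itself is simple chunking. This TypeScript sketch assumes a sitemap-pages-N.xml naming scheme, which is an arbitrary choice for illustration, not part of the protocol:

```typescript
// Split a large URL list into child sitemaps of at most 50,000
// entries and build the matching sitemap index file.
const MAX_URLS_PER_SITEMAP = 50_000;

function chunkUrls(urls: string[], size = MAX_URLS_PER_SITEMAP): string[][] {
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

function buildSitemapIndex(chunkCount: number, baseUrl: string): string {
  // One <sitemap> reference per child file (naming scheme is assumed).
  const refs = Array.from({ length: chunkCount }, (_, i) =>
    `  <sitemap>\n    <loc>${baseUrl}/sitemap-pages-${i + 1}.xml</loc>\n  </sitemap>`
  ).join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${refs}\n</sitemapindex>`;
}

// 120,000 URLs split into chunks of 50,000 → 3 child sitemaps
const urls = Array.from({ length: 120_000 }, (_, i) => `https://example.com/p/${i}`);
const chunks = chunkUrls(urls);
const index = buildSitemapIndex(chunks.length, "https://example.com");
```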
If a page has a noindex directive, listing it in your sitemap sends contradictory signals. Search Console will flag these as issues. Remove noindex pages from your sitemap.
URLs in your sitemap that return 404, 500, or other error codes waste crawl budget and reduce the overall trust signal of your sitemap. Regularly audit your sitemap to remove dead URLs. Use our Sitemap to CSV Converter to export your URLs for a spreadsheet-based audit.
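One way to automate such an audit is to extract every <loc> value and probe each URL's status code. This TypeScript sketch assumes a runtime with global fetch (Node 18+); the regex-based extraction is a simplification that works for well-formed sitemaps:

```typescript
// Pull <loc> values out of a sitemap's XML source.
function extractLocs(sitemapXml: string): string[] {
  const matches = sitemapXml.matchAll(/<loc>(.*?)<\/loc>/g);
  return Array.from(matches, (m) => m[1]);
}

// Probe each URL with a HEAD request and collect anything that
// does not answer with a plain 200 (redirects and errors alike
// deserve a closer look during an audit).
async function findDeadUrls(urls: string[]): Promise<string[]> {
  const dead: string[] = [];
  for (const url of urls) {
    try {
      const res = await fetch(url, { method: "HEAD", redirect: "manual" });
      if (res.status !== 200) dead.push(url);
    } catch {
      dead.push(url); // network failure counts as dead
    }
  }
  return dead;
}

const locs = extractLocs(
  "<url><loc>https://example.com/a</loc></url>\n" +
  "<url><loc>https://example.com/b</loc></url>"
);
```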
Redesigns, URL migrations, and content restructuring often leave sitemaps pointing to old URLs. Always regenerate your sitemap after major site changes.
For sites with frequently changing content, generate your sitemap dynamically rather than serving a static file. In Next.js, for example, you can create a sitemap.ts file that queries your database or CMS and returns fresh sitemap data on each request:
// app/sitemap.ts (Next.js example)
import type { MetadataRoute } from "next";

export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
  const posts = await fetch("https://api.example.com/posts")
    .then((res) => res.json());

  return [
    {
      url: "https://example.com",
      lastModified: new Date(),
      changeFrequency: "daily",
      priority: 1,
    },
    ...posts.map((post: { slug: string; modified: string }) => ({
      url: `https://example.com/posts/${post.slug}`,
      lastModified: new Date(post.modified),
      changeFrequency: "weekly" as const,
      priority: 0.7,
    })),
  ];
}

Splitting sitemaps by content type (blog posts, product pages, category pages) lets you track indexation rates per content type in Google Search Console. This is invaluable for diagnosing indexing issues on large sites.
The IndexNow protocol (supported by Bing, Yandex, and others) lets you notify search engines instantly when content changes. Using IndexNow alongside sitemaps gives you the best of both worlds: real-time notifications plus a complete URL inventory.
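A batch submission is a single POST to the IndexNow endpoint (see indexnow.org for the full protocol). In this sketch the host, key, and URLs are placeholders; the key file must actually be served at keyLocation for search engines to accept the call:

```typescript
// Shape of an IndexNow batch submission body.
interface IndexNowPayload {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
}

// Build the JSON payload; assumes the key file is hosted at
// https://<host>/<key>.txt (a common convention, not a requirement).
function buildIndexNowPayload(host: string, key: string, urls: string[]): IndexNowPayload {
  return {
    host,
    key,
    keyLocation: `https://${host}/${key}.txt`,
    urlList: urls,
  };
}

// Submit the batch; requires a runtime with global fetch (Node 18+).
async function submitToIndexNow(payload: IndexNowPayload): Promise<number> {
  const res = await fetch("https://api.indexnow.org/indexnow", {
    method: "POST",
    headers: { "Content-Type": "application/json; charset=utf-8" },
    body: JSON.stringify(payload),
  });
  return res.status; // 200 or 202 indicates the batch was accepted
}

const payload = buildIndexNowPayload("example.com", "abc123", [
  "https://example.com/new-page",
]);
```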
For multilingual sites, declaring hreflang relationships in your sitemap is often cleaner than adding link elements to every page's HTML head. The <urlset> must declare the xhtml namespace (xmlns:xhtml="http://www.w3.org/1999/xhtml") for these elements:
<url>
  <loc>https://example.com/about</loc>
  <xhtml:link rel="alternate" hreflang="en"
              href="https://example.com/about" />
  <xhtml:link rel="alternate" hreflang="es"
              href="https://example.com/es/about" />
  <xhtml:link rel="alternate" hreflang="fr"
              href="https://example.com/fr/about" />
</url>

Creating a sitemap is not a one-time task. Regular maintenance ensures search engines continue to trust and use your sitemap effectively.
After submitting your sitemap, check the Sitemaps report regularly. Look for:

- the sitemap's status ("Success" versus "Couldn't fetch" or "Has errors");
- the number of discovered URLs compared with the number you expect to be indexable;
- warnings about individual URLs, such as blocked, redirecting, or noindex pages.
Add sitemap validation to your CI/CD pipeline or use our XML Validator to confirm well-formed markup. After each deployment, validate that your sitemap is well-formed XML, doesn't exceed size limits, and only contains URLs that return 200 status codes. You can use our Sitemap Validator to check sitemaps manually, or build validation into your automated testing.
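As a starting point for such automation, a check like the following enforces the protocol's hard limits and catches relative URLs. It is a sketch, not a full validator; it does not verify XML well-formedness or live status codes:

```typescript
// Pre-deploy sitemap checks: size limit, URL-count limit, and
// absolute-URL form. Thresholds come from the sitemap protocol.
const MAX_BYTES = 50 * 1024 * 1024; // 50 MB uncompressed
const MAX_URLS = 50_000;

interface SitemapIssue {
  message: string;
}

function checkSitemap(xml: string): SitemapIssue[] {
  const issues: SitemapIssue[] = [];

  // Protocol limit on uncompressed file size.
  const bytes = new TextEncoder().encode(xml).length;
  if (bytes > MAX_BYTES) issues.push({ message: "sitemap exceeds 50 MB" });

  // Protocol limit on URL count, plus absolute-URL form per entry.
  const locs = Array.from(xml.matchAll(/<loc>(.*?)<\/loc>/g), (m) => m[1]);
  if (locs.length > MAX_URLS) issues.push({ message: "more than 50,000 URLs" });
  for (const loc of locs) {
    if (!/^https?:\/\//.test(loc)) {
      issues.push({ message: `relative or malformed URL: ${loc}` });
    }
  }
  return issues;
}
```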
At minimum, audit your sitemap quarterly. Check for:

- URLs that return 404, 500, or redirect status codes;
- pages that have since received noindex directives;
- important new pages that are missing from the sitemap;
- lastmod dates that no longer reflect real content updates.
Do I really need an XML sitemap?
Technically, no. Search engines can discover pages through links. But practically, every website benefits from having one. Google's own documentation recommends sitemaps for sites that are large, new (with few backlinks), use rich media, or have pages that aren't well cross-linked.

Does a sitemap improve my rankings?
A sitemap does not directly improve rankings. It improves crawling and indexing, which are prerequisites for ranking. If search engines can't find your pages, they certainly can't rank them.

How often should I update my sitemap?
Ideally, your sitemap should update automatically whenever content changes. If you manage it manually, update it every time you add, remove, or significantly change a page.

Which pages should I include?
Include every page you want search engines to index. Exclude pages with noindex tags, duplicate content, thin pages, utility pages (login, cart, search results), and any pages you don't want appearing in search results.

Can I have more than one sitemap?
Yes. Use a sitemap index file to reference multiple sitemaps. This is the recommended approach for large sites and makes it easier to organise and monitor your URLs by content type.

What's the difference between an XML sitemap and an HTML sitemap?
An XML sitemap is for search engines — it's a structured data file that crawlers parse. An HTML sitemap is a human-readable page that lists links to help visitors navigate your site. Both serve different purposes and you can (and should) have both.

Does the order of URLs in a sitemap matter?
No. Search engines don't interpret the order of URLs in a sitemap as a ranking signal. However, organising URLs logically makes the file easier for humans to review and maintain.
Use our free tools to generate, validate, and convert XML sitemaps.