
Splitting a Large Sitemap into Multiple XML Files

A practical Evolution CMS pattern for generating a sitemap index and multiple sitemap files when one XML sitemap is no longer enough.

This article adapts an older community example into a cleaner English reference.

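Once a site becomes large enough, a single XML sitemap can become slow to generate and awkward to maintain, and the sitemap protocol caps each file at 50,000 URLs and 50 MB uncompressed in any case. A better approach is to output a sitemap index and split the actual URLs across multiple sitemap files.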

The idea

The pattern has three pieces:

  1. a snippet that either returns a sitemap index or a single sitemap page
  2. a document with the alias sitemap.xml and the content type text/xml
  3. an OnPageNotFound plugin that resolves requests like sitemap1.xml, sitemap2.xml, and so on

Snippet example

The original note used a snippet that switches between index mode and page mode depending on the sitemapIndex placeholder:

<?php
// SitemapIndex
// URLs per sitemap file; bail out early so a missing value cannot divide by zero.
$display = isset($params['display']) ? (int)$params['display'] : 0;
if ($display < 1) {
    return '';
}

$index = (int)$modx->getPlaceholder('sitemapIndex');
if ($index > 0) {
    // Page mode: render one numbered sitemap file, starting at the right offset.
    $params['offset'] = ($index - 1) * $display;

    return $modx->runSnippet('DLSitemap', $params);
}

// Index mode: count the documents and emit one <sitemap> entry per file.
$params['config'] = 'sitemap:core';
$params['returnDLObject'] = 1;
$dl = $modx->runSnippet('DocLister', $params);
$count = (int)ceil((int)$dl->getChildrenCount() / $display);
$out = '';
for ($i = 1; $i <= $count; $i++) {
    $out .= '<sitemap><loc>' . $modx->getConfig('site_url') . 'sitemap' . $i . '.xml</loc></sitemap>';
}
if ($out === '') {
    return ''; // nothing to list, so return an empty string instead of null
}

return '<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . $out . '</sitemapindex>';
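
The paging arithmetic in the snippet is easy to check in isolation. The following is a minimal sketch in plain PHP; the helper names sitemapPageCount and sitemapOffset are hypothetical and are not part of Evolution CMS or DocLister.

```php
<?php
// Hypothetical helpers for illustration only; not part of Evolution CMS or DocLister.

/** Number of sitemap files needed for $total URLs at $perPage URLs per file. */
function sitemapPageCount(int $total, int $perPage): int
{
    if ($perPage < 1) {
        throw new InvalidArgumentException('perPage must be at least 1');
    }
    return (int) ceil($total / $perPage);
}

/** Zero-based row offset for the 1-based sitemap file number $index. */
function sitemapOffset(int $index, int $perPage): int
{
    // Clamp to 0 so an index below 1 never produces a negative offset.
    return max(0, ($index - 1) * $perPage);
}

// 250 documents at 100 per file need 3 files; sitemap3.xml starts at row 200.
echo sitemapPageCount(250, 100), "\n"; // 3
echo sitemapOffset(3, 100), "\n";      // 200
```

With display set to 100, sitemap1.xml covers rows 0 to 99, sitemap2.xml rows 100 to 199, and so on.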

Document setup

Create a document with the alias sitemap.xml, set its content type to text/xml, and use a minimal blank template. The document content can be as simple as:

[[SitemapIndex? &display=`100`]]

Opening that document should return a sitemap index whose <loc> entries look like:

https://example.com/sitemap1.xml
https://example.com/sitemap2.xml
https://example.com/sitemap3.xml
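
For reference, those links sit inside a sitemapindex element. The sketch below builds the same structure in plain PHP and escapes each URL for XML; the function name buildSitemapIndex and the sample site URL are assumptions made for illustration.

```php
<?php
// Hypothetical helper; the function name and the sample URL are assumptions.
function buildSitemapIndex(string $siteUrl, int $count): string
{
    // Normalise the base so it always ends with exactly one slash.
    $base = rtrim($siteUrl, '/') . '/';
    $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
    $xml .= '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    for ($i = 1; $i <= $count; $i++) {
        // Escape &, <, > and quotes so the URL is valid inside XML.
        $loc  = htmlspecialchars($base . 'sitemap' . $i . '.xml', ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <sitemap><loc>' . $loc . '</loc></sitemap>' . "\n";
    }
    return $xml . '</sitemapindex>' . "\n";
}

echo buildSitemapIndex('https://example.com/', 3);
```

Running it prints an XML declaration, the opening sitemapindex tag, three sitemap entries, and the closing tag.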

Plugin for child sitemap files

To make those numbered sitemap files resolve correctly, the legacy example used an OnPageNotFound plugin:

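if ($modx->event->name == 'OnPageNotFound') {
    // Match against the path only, so a query string does not break the pattern.
    $path = (string)parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
    // Accept sitemap1.xml, sitemap2.xml, ... but not sitemap0.xml.
    if (preg_match('/\/sitemap([1-9]\d*)\.xml$/', $path, $matches)) {
        $index = (int)$matches[1];
        $modx->systemCacheKey = 'sitemap' . $index; // separate cache key per numbered file
        $modx->setPlaceholder('sitemapIndex', $index);
        $modx->sendForward(53); // replace 53 with the id of your sitemap.xml document
    }
}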
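
The URL matching can be tried outside the CMS. This is a small sketch in plain PHP, assuming the file number should be a positive integer and that any query string should be ignored; the function name extractSitemapIndex is hypothetical.

```php
<?php
// Hypothetical helper: returns the sitemap number from a request URI, or null.
function extractSitemapIndex(string $requestUri): ?int
{
    // Drop any query string so "/sitemap2.xml?utm=x" still matches.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    // Require a number starting at 1; "sitemap.xml" and "sitemap0.xml" do not match.
    if (preg_match('~/sitemap([1-9]\d*)\.xml$~', $path, $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}

var_dump(extractSitemapIndex('/sitemap2.xml'));        // int(2)
var_dump(extractSitemapIndex('/sitemap12.xml?utm=x')); // int(12)
var_dump(extractSitemapIndex('/sitemap0.xml'));        // NULL
var_dump(extractSitemapIndex('/sitemap.xml'));         // NULL
```

Returning null for anything that is not a numbered sitemap leaves every other 404 request for the normal not-found handling.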

Why this works well

  • the public entry point stays predictable at sitemap.xml
  • the sitemap index remains small and easy for crawlers to process
  • each numbered file can hold a limited batch of URLs
  • the pattern scales without changing the public sitemap structure later

What to review before using it

  • choose a sensible display size for the project
  • make sure your sitemap snippet excludes unpublished, hidden, or non-indexable resources where appropriate
  • verify that the generated URLs are fully canonical
  • test both the index file and numbered files in a crawler and in Search Console
  • if the site already has sitemap routing, merge carefully instead of stacking duplicate logic
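
The canonical-URL check from the list above can be partly automated. The sketch below is one possible approach in plain PHP; the function name findNonCanonicalUrls and the rules it applies (https only, one expected host, no query string or fragment) are assumptions that should be adjusted to the project's own canonical policy.

```php
<?php
// Hypothetical check; the function name and the "canonical" rules below
// (https, expected host, no query string or fragment) are assumptions.
function findNonCanonicalUrls(array $urls, string $expectedHost): array
{
    $bad = [];
    foreach ($urls as $url) {
        $parts = parse_url($url);
        $ok = is_array($parts)
            && ($parts['scheme'] ?? '') === 'https'
            && ($parts['host'] ?? '') === $expectedHost
            && !isset($parts['query'])
            && !isset($parts['fragment']);
        if (!$ok) {
            $bad[] = $url;
        }
    }
    return $bad;
}

$urls = [
    'https://example.com/about.html',
    'http://example.com/about.html',        // wrong scheme
    'https://www.example.com/about.html',   // wrong host
    'https://example.com/list.html?page=2', // query string
];
print_r(findNonCanonicalUrls($urls, 'example.com'));
```

Here the last three sample URLs are reported and the first passes. The URLs to check would come from the generated sitemap files themselves.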

Final note

A split sitemap is not only a technical trick. It is a safer long-term structure for large sites. Once the project grows beyond a small document tree, moving from one file to an indexed sitemap layout helps keep crawling predictable and maintenance simpler.
