Splitting a Large Sitemap into Multiple XML Files
A practical Evolution CMS pattern for generating a sitemap index and multiple sitemap files when one XML sitemap is no longer enough.
This article adapts an older community example into a cleaner English reference.
Once a site becomes large enough, a single XML sitemap becomes slow to generate and can run into the sitemap protocol's limit of 50,000 URLs (or 50 MB uncompressed) per file. A better approach is to output a sitemap index and split the actual URLs across multiple sitemap files.
The idea
The pattern has three pieces:
- a snippet that either returns a sitemap index or a single sitemap page
- a document with the alias sitemap.xml and the content type text/xml
- an OnPageNotFound plugin that resolves requests like sitemap1.xml, sitemap2.xml, and so on
Snippet example
The original note used a snippet that switches between index mode and page mode depending on the sitemapIndex placeholder:
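<?php
// SitemapIndex: returns one sitemap page when the sitemapIndex placeholder is set,
// otherwise returns the sitemap index that links to every numbered page.
$display = isset($params['display']) ? (int)$params['display'] : 0;
if ($display < 1) {
    return ''; // &display is required; without it the page size is undefined
}
$index = (int)$modx->getPlaceholder('sitemapIndex');

if ($index > 0) {
    // Page mode: skip the URLs that belong to earlier sitemap files.
    $params['offset'] = ($index - 1) * $display;
    return $modx->runSnippet('DLSitemap', $params);
}

// Index mode: request the DocLister object so the total document count can be read.
$params['config'] = 'sitemap:core';
$params['returnDLObject'] = 1;
$dl = $modx->runSnippet('DocLister', $params);
$count = (int)ceil((int)$dl->getChildrenCount() / $display);

$out = '';
for ($i = 1; $i <= $count; $i++) {
    $out .= '<sitemap><loc>' . $modx->getConfig('site_url') . 'sitemap' . $i . '.xml</loc></sitemap>';
}

// Return an empty string rather than nothing when there are no pages to list.
if ($out === '') {
    return '';
}
return '<?xml version="1.0" encoding="UTF-8"?><sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . $out . '</sitemapindex>';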
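The paging arithmetic in the snippet can be checked in isolation. The sketch below is a minimal standalone version and is not part of the original note; the function names sitemapPageCount and sitemapOffset are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical helpers that mirror the arithmetic used by the snippet.
// They are illustrative only and are not part of DocLister or Evolution CMS.

/** Number of sitemap files needed for $total URLs at $perPage URLs per file. */
function sitemapPageCount(int $total, int $perPage): int
{
    if ($perPage < 1) {
        throw new InvalidArgumentException('perPage must be at least 1');
    }
    return (int)ceil(max(0, $total) / $perPage);
}

/** Zero-based offset of the first URL in sitemap file number $index (1-based). */
function sitemapOffset(int $index, int $perPage): int
{
    return max(0, ($index - 1) * $perPage);
}

// 250 documents at 100 per file need three files: 1-100, 101-200, 201-250.
echo sitemapPageCount(250, 100), "\n"; // 3
echo sitemapOffset(3, 100), "\n";      // 200
```

With display set to 100, sitemap3.xml therefore starts at offset 200, which matches the offset the snippet passes on in page mode.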
Document setup
Create a document with the alias sitemap.xml, set its content type to text/xml, and use a minimal blank template. The document content can be as simple as:
[[SitemapIndex? &display=`100`]]
Opening that document should produce a sitemap index whose <loc> entries look like:
https://example.com/sitemap1.xml
https://example.com/sitemap2.xml
https://example.com/sitemap3.xml
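To see the full XML around those entries, the index can be built by a small standalone function. This is only a sketch under assumptions: buildSitemapIndex is a hypothetical name, and the XML escaping step is an addition that the original snippet does not perform.

```php
<?php
// Hypothetical helper: builds the sitemap index as a string.
// The function name and signature are assumptions made for this example.
function buildSitemapIndex(string $siteUrl, int $pageCount): string
{
    // Normalize the base URL so it always ends with exactly one slash.
    $base = rtrim($siteUrl, '/') . '/';
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
    for ($i = 1; $i <= $pageCount; $i++) {
        // Escape the URL so characters such as & stay valid inside XML.
        $loc = htmlspecialchars($base . 'sitemap' . $i . '.xml', ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml .= '  <sitemap><loc>' . $loc . '</loc></sitemap>' . "\n";
    }
    return $xml . '</sitemapindex>' . "\n";
}

echo buildSitemapIndex('https://example.com/', 3);
```

Keeping the builder free of $modx calls makes it easy to run from the command line and compare against what the live document returns.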
Plugin for child sitemap files
To make those numbered sitemap files resolve correctly, the legacy example used an OnPageNotFound plugin:
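// Plugin on the OnPageNotFound event: routes /sitemapN.xml to the sitemap.xml document.
if ($modx->event->name == 'OnPageNotFound') {
    // Capture N from a request URI ending in /sitemapN.xml.
    if (preg_match('/\/sitemap(\d+)\.xml$/', $_SERVER['REQUEST_URI'], $matches)) {
        $index = (int)$matches[1];
        // Give each numbered file its own cache key so pages do not share one cache entry.
        $modx->systemCacheKey = 'sitemap' . $index;
        $modx->setPlaceholder('sitemapIndex', $index);
        $modx->sendForward(53); // id of the sitemap.xml document; replace with your own
    }
}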
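The pattern in the plugin is anchored to the end of REQUEST_URI, so a request with a query string would not match. A minimal sketch of a stricter parser follows; parseSitemapIndex is a hypothetical helper, not part of the legacy example, and rejecting sitemap0.xml is an assumption about the desired behavior.

```php
<?php
// Hypothetical helper: extracts N from a request for /sitemapN.xml.
// Returns null when the request is not for a numbered sitemap file.
function parseSitemapIndex(string $requestUri): ?int
{
    // Keep only the path so a query string does not defeat the end anchor.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }
    if (!preg_match('~/sitemap(\d+)\.xml$~', $path, $m)) {
        return null;
    }
    $index = (int)$m[1];
    // Pages are numbered from 1, so sitemap0.xml is treated as not found.
    return $index >= 1 ? $index : null;
}

var_dump(parseSitemapIndex('/sitemap2.xml?utm_source=test')); // int(2)
var_dump(parseSitemapIndex('/sitemap.xml'));                  // NULL
```

Inside the plugin, a null result would simply fall through to the normal 404 handling.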
Why this works well
- the public entry point stays predictable at sitemap.xml
- the sitemap index remains small and easy for crawlers to process
- each numbered file can hold a limited batch of URLs
- the pattern scales without changing the public sitemap structure later
What to review before using it
- choose a sensible display size for the project
- make sure your sitemap snippet excludes unpublished, hidden, or non-indexable resources where appropriate
- verify that the generated URLs are fully canonical
- test both the index file and numbered files in a crawler and in Search Console
- if the site already has sitemap routing, merge carefully instead of stacking duplicate logic
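For the display size, one simple guard is to keep the value within the sitemap protocol's limit of 50,000 URLs per file. The sketch below is an assumption about how such a check could look; clampDisplay and the constant name are hypothetical.

```php
<?php
// Limit from the sitemaps.org protocol: at most 50,000 URLs per sitemap file.
const SITEMAP_MAX_URLS = 50000;

// Hypothetical helper: clamps a requested page size into the valid range.
function clampDisplay(int $display): int
{
    return max(1, min($display, SITEMAP_MAX_URLS));
}

echo clampDisplay(100), "\n";   // 100
echo clampDisplay(80000), "\n"; // 50000
echo clampDisplay(0), "\n";     // 1
```

In practice a much smaller value, such as the 100 used in the document example, keeps each file quick to generate.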
Final note
A split sitemap is not only a technical trick. It is a safer long-term structure for large sites. Once the project grows beyond a small document tree, moving from one file to an indexed sitemap layout helps keep crawling predictable and maintenance simpler.