-->
Published on 19 Mar 2023 / Technical
Astro is amazing. It’s easy to install, use, and deploy. I love it! It’s also easy to extend. For example, you may need a sitemap generator. Just head to the library of integrations and install the official sitemap integration!
The sitemap integration has one major issue you should know about, though. It does not check the draft status, and happily includes drafts in the final sitemap.xml.
That makes the integration close to useless: if drafts get included, they will be crawled by Google and other search engines, which can hurt your website’s reputation. Not to mention that your competitors or other interested parties may use them for “competitive intelligence” purposes.
If you look at the documentation page for Astro Sitemap, you can of course manually filter out items, but manual solutions equal bad solutions.
So here is a tutorial on how to programmatically exclude drafts from the Astro sitemap integration.
First, we need to install gray-matter, a JavaScript library for parsing frontmatter. We’ll use this library to read the frontmatter of each MDX file in our /blog directory. There is definitely a better way to do this, because Astro already parses MDX itself, but for the purpose of showing how to exclude drafts from the sitemap, this is a good enough solution.
pnpm install gray-matter
# or
yarn add gray-matter
# or
npm install gray-matter
Next, we need to create a script that scans your /blog directory, reads the frontmatter of each MDX file, and identifies posts with draft: true. Save it as listDraftBlogPosts.js next to your astro.config.mjs.
import fs from "fs";
import path from "path";
import matter from "gray-matter";

// Recursively collect the paths of all files below `directory`.
async function readBlogDirectory(directory) {
  const entries = await fs.promises.readdir(directory, { withFileTypes: true });
  const files = await Promise.all(
    entries.map(async (entry) => {
      const entryPath = path.join(directory, entry.name);
      return entry.isDirectory() ? readBlogDirectory(entryPath) : entryPath;
    })
  );
  // Each item is either a single path or an array of paths from a subdirectory.
  return files.flat();
}

// A post counts as a draft only if its frontmatter contains `draft: true`.
async function isDraftPage(pagePath) {
  const content = await fs.promises.readFile(pagePath, "utf8");
  const { data } = matter(content);
  return data.draft === true;
}

// Resolve to the file paths of all draft MDX posts in the blog directory.
export async function listDraftBlogPosts() {
  const drafts = [];
  const blogDirectory = "./src/content/blog";
  const blogFiles = await readBlogDirectory(blogDirectory);
  const mdxFiles = blogFiles.filter((file) => file.endsWith(".mdx"));
  for (const mdxFile of mdxFiles) {
    if (await isDraftPage(mdxFile)) {
      drafts.push(mdxFile);
    }
  }
  return drafts;
}
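If frontmatter is new to you, the sketch below shows what the isDraftPage check boils down to. It is not how gray-matter works internally: hasDraftFlag is a hypothetical, dependency-free stand-in that only understands a leading `---` block with a top-level `draft: true` line, and the two sample posts are made up for illustration.

```javascript
// Hypothetical helper, for illustration only: a simplified stand-in for gray-matter.
// It only recognises a leading `---` block and a top-level `draft: true` line.
function hasDraftFlag(source) {
  // Capture everything between the opening and closing `---` lines.
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(source);
  if (!match) return false; // no frontmatter block at all
  return match[1]
    .split(/\r?\n/)
    .some((line) => /^draft:\s*true\s*$/.test(line));
}

// Made-up sample posts.
const draftPost = "---\ntitle: Work in progress\ndraft: true\n---\n\n# Hello";
const publishedPost = "---\ntitle: Shipped\ndraft: false\n---\n\n# Hello";

console.log(hasDraftFlag(draftPost));
console.log(hasDraftFlag(publishedPost));
```

For real content, stick with gray-matter: it parses the frontmatter as proper YAML, which this regex does not attempt.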
Next, import listDraftBlogPosts from listDraftBlogPosts.js into your astro.config.mjs file and call it once, so the list of draft files is available to the sitemap configuration. Add the following lines to your astro.config.mjs:
import { listDraftBlogPosts } from "./listDraftBlogPosts.js";
const blogDrafts = await listDraftBlogPosts();
Finally, we’ll give the sitemap integration a filter function that checks each page against the draft list returned by listDraftBlogPosts. This function will help us exclude draft blog posts from the sitemap. Update your sitemap configuration in astro.config.mjs as shown below, replacing https://yoursite.com with your own domain:
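sitemap({
  filter: (page) => {
    // `page` is the full URL; strip the site origin to get the path, e.g. "/blog/my-post".
    // endsWith() needs an exact suffix, so this assumes page URLs have no trailing slash.
    const pagePath = page.split('https://yoursite.com').pop()
    const isDraftBlogPostPage = blogDrafts.some(fileName =>
      fileName.split('.mdx')[0].endsWith(pagePath)
    )
    if (isDraftBlogPostPage) {
      console.log(`⛔️ "${page}" has been excluded from the sitemap`)
    }
    return !isDraftBlogPostPage
  },
}),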
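The string matching in the filter above is sensitive to details such as a trailing slash on the page URL, or the bare site URL with no path at all (an empty path is a suffix of every file name). Here is a slightly more defensive sketch of the same idea. The names pageToPath and isDraftUrl are hypothetical, and like the original it assumes your page URLs mirror the file layout under src/content.

```javascript
// Hypothetical helper: "https://yoursite.com/blog/my-post/" -> "/blog/my-post".
// Uses the built-in URL class, so the site origin does not need to be hard-coded.
function pageToPath(page) {
  return new URL(page).pathname.replace(/\/+$/, "");
}

// Hypothetical helper: true if `page` corresponds to one of the draft files.
function isDraftUrl(page, draftFiles) {
  const pagePath = pageToPath(page);
  // The homepage has an empty path, which would be a suffix of everything.
  if (pagePath === "") return false;
  return draftFiles.some((file) =>
    file
      .replace(/\\/g, "/") // normalise Windows path separators
      .replace(/\.mdx$/, "") // drop the extension
      .endsWith(pagePath)
  );
}

// Made-up draft list in the shape listDraftBlogPosts returns.
const sampleDrafts = ["src/content/blog/secret-post.mdx"];

console.log(isDraftUrl("https://yoursite.com/blog/secret-post/", sampleDrafts));
console.log(isDraftUrl("https://yoursite.com/blog/public-post/", sampleDrafts));
console.log(isDraftUrl("https://yoursite.com/", sampleDrafts));
```

With these helpers in astro.config.mjs, the filter could shrink to `filter: (page) => !isDraftUrl(page, blogDrafts)`.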
By following these three steps, you can effectively exclude drafts from your sitemap, ensuring that your sitemap integration remains useful and doesn’t harm your website’s SEO. With this improvement in place, you can be confident that your sitemap only includes the content you want search engines to index.
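If you want to double-check the result after a build, a small script can scan the generated sitemap for draft paths. The sketch below is only a starting point: extractSitemapUrls and findLeakedDrafts are hypothetical names, the XML is a made-up sample, and in a real project you would read the generated file instead (I am assuming something like dist/sitemap-0.xml, so check your own build output).

```javascript
// Hypothetical helper: pull every <loc> URL out of a sitemap XML string.
function extractSitemapUrls(xml) {
  return [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map((m) => m[1]);
}

// Hypothetical helper: return the sitemap URLs whose path is in `draftPaths`.
// `draftPaths` are URL paths without a trailing slash, e.g. "/blog/secret-post".
function findLeakedDrafts(xml, draftPaths) {
  return extractSitemapUrls(xml).filter((url) => {
    const urlPath = new URL(url).pathname.replace(/\/+$/, "");
    return draftPaths.includes(urlPath);
  });
}

// Made-up sample; a real check would load the sitemap from the build directory.
const sampleSitemap = [
  '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
  "<url><loc>https://yoursite.com/</loc></url>",
  "<url><loc>https://yoursite.com/blog/public-post/</loc></url>",
  "</urlset>",
].join("\n");

const leaked = findLeakedDrafts(sampleSitemap, ["/blog/secret-post"]);
console.log(leaked.length === 0 ? "No drafts in sitemap" : leaked);
```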