Creating a sitemap is an important yet often overlooked part of a website’s SEO strategy. For search engines like Google, a sitemap serves as a roadmap leading them through the intricacies of your site. The more easily they can index your site, the easier it is for you to be found through search.
An Extensible Markup Language (XML) sitemap is essentially a list of the pages on a site, expressed as URLs. Sitemaps can also provide additional metadata, such as when each page was last updated and how important it is within your site hierarchy. This information helps Google crawl the contents of your site in order to provide the best result. The more thorough the crawl, the more likely you are to receive a bump in search rankings.
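To make the format concrete, here is a minimal sketch of generating such a sitemap with Python’s standard library. The example.com URLs, dates, and priority values are placeholders, and build_sitemap is a hypothetical helper name, not part of any particular tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for a list of (url, lastmod, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod, priority in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url          # the page URL
        ET.SubElement(entry, "lastmod").text = lastmod  # last update date
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages, for illustration only.
example_pages = [
    ("https://www.example.com/", "2024-01-15", 1.0),
    ("https://www.example.com/blog/", "2024-01-10", 0.8),
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

Each url entry carries the page address plus the optional metadata described above.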
Along with SEO benefits, sitemaps help site owners organize their content logically and in a way that supports their business goals. This includes eliminating redundant content and anticipating how users will navigate the site. You can learn all about the additional benefits of a sitemap here.
However, a sitemap can only benefit a website if it’s done correctly. While a number of sites offer sitemap creation tools, these sitemaps may contain errors that confuse Google and can ruin your SEO strategy. Rule number one: Never confuse Google.
There are some pitfalls to avoid in order to get the most out of your sitemap. Here are just a few of the most common sitemap mistakes and some tips for making the search engines love you.
1. Not Submitting Sitemap
It might sound obvious, but in order for search engines to crawl your site, you need to submit your sitemap. After all, you can’t win if you don’t play.
While submitting a sitemap doesn’t provide a guaranteed SEO boost, it serves to cover your bases in case search engines aren’t able to crawl your pages properly. Google tells us that,
“If your site’s pages are properly linked, our web crawlers can usually discover most of your site. Even so, a sitemap can improve the crawling of your site.”
Furthermore, Google advises that a sitemap is especially valuable for:
- Large sites
- Sites not featuring lots of internal links
- Sites that lack external sources linking to the site
To submit your sitemap to Google, you’ll first need to sign in or register with Google Search Console. Next:
- Either select your website in the sidebar, or enter your website as a New Property
- Click on “Sitemaps” under the “Index” section
- Remove any outdated sitemaps in the “Submitted Sitemaps” section, if necessary
- Enter “sitemap_index.xml” in the “Add a New Sitemap” field
2. Crawl Issues
One of the most common errors facing sitemap submissions is the crawl issue, a frustratingly vague term indicating some type of unspecified error. Google will tell you if something is wrong, but it won’t tell you exactly what the problem is.
Crawl issues require you to reanalyze your sitemap for any undetected errors. Typically crawl issues include, but are not limited to:
- Too many 301 redirects
- Pages loading too slowly
- JavaScript or CSS blocked from search engines
- Error pages other than 404, such as a 403 “Forbidden”
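As a rough sketch of what a do-it-yourself crawl audit might check, the snippet below flags three of the issues listed above from data you have already collected for each URL. The CrawlResult fields and both thresholds are assumptions made for illustration; they are not limits published by Google.

```python
from dataclasses import dataclass

# Illustrative thresholds only (assumptions, not official limits).
MAX_REDIRECT_HOPS = 1
MAX_LOAD_SECONDS = 3.0


@dataclass
class CrawlResult:
    url: str
    status: int          # final HTTP status code
    redirect_hops: int   # number of redirects followed to reach the page
    load_seconds: float  # measured load time


def find_issues(result):
    """Return a list of likely crawl issues for one audited URL."""
    issues = []
    if result.redirect_hops > MAX_REDIRECT_HOPS:
        issues.append("too many redirects")
    if result.load_seconds > MAX_LOAD_SECONDS:
        issues.append("slow page load")
    # Error pages other than 404, such as a 403 "Forbidden".
    if result.status >= 400 and result.status != 404:
        issues.append(f"error status {result.status}")
    return issues


problem_page = CrawlResult("https://www.example.com/old", 403, 2, 4.2)
healthy_page = CrawlResult("https://www.example.com/", 200, 0, 0.8)
print(find_issues(problem_page))
print(find_issues(healthy_page))
```

Blocked JavaScript or CSS is not covered here, since detecting it requires checking your robots.txt rules rather than per-page response data.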
To address crawl issues, you can either perform a crawl audit yourself or use a service like Screaming Frog or Botify. Once you’ve solved the issues, check to see if your page loads correctly before resubmitting to Google Search Console.
You can resubmit your URL through Google’s Sitemap Manager or by selecting “Inspect URL” then “Request Indexing” in Google Search Console.
After you’ve submitted your URL, check back regularly to make sure the error message no longer appears.
3. Not Updating Sitemap
Site migration is a heavy lift, so it’s easy to forget small details, like updating your sitemap.
If a website is only making minor changes that won’t impact desired search results, then a sitemap update may not be necessary. But if the site makes substantial changes, such as a full redesign, an update is a must.
For example, if you regularly post a blog once per week, you want to make sure search engines can find these new pages. While Google may be able to automatically crawl your page, you can ensure the page is crawled properly by updating through Google Search Console.
Keep in mind that Google does prioritize sites with dynamic content, so you should be updating your site regularly for good practice. Just make sure these changes are crawled.
The good news is that many platforms can automatically keep your sitemaps updated. For example, Shopify automatically creates a sitemap for you, while WordPress and WooCommerce have plenty of plugins to make sitemap management a breeze.
4. Duplicate Content
Google says it best: Avoid creating duplicate content.
Repeat: Avoid creating duplicate content.
Duplicate content includes pages that are identical or near-identical to other pages on your site. This can include:
- Printer-only versions of a page
- Discussion forums that generate both regular and stripped-down pages
- Items in an online store that use multiple distinct URLs
Google’s web crawlers often interpret duplicate content as a method to manipulate search engines, which they don’t like for obvious reasons.
If your sitemap contains duplicate content, there are a few steps you can take. First off, you can use a canonical page, which consolidates duplicate content into a single URL and tells the search engine which page to index.
For example, if you have a printer-only page, you would canonicalize the main page for search engines to index. You can do this by adding a link element with rel=“canonical” to the head of the duplicate page, pointing at the main page’s URL. This tells search engines which version you prefer, so the less useful pages are consolidated under it rather than indexed separately. The key is to keep your URL listing consistent.
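A minimal sketch of producing that canonical link element is shown below. The helper name canonical_tag and the example URL are hypothetical; the resulting tag would go in the head of each duplicate page, such as the printer-only version.

```python
from html import escape


def canonical_tag(preferred_url):
    """Build a canonical link element pointing at the preferred URL."""
    # Escape the URL so characters such as & are valid inside the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


tag = canonical_tag("https://www.example.com/guide?ref=a&lang=en")
print(tag)
```

Generating the tag from one function makes it easier to keep the same preferred URL across every duplicate of a page.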
5. Indexing Utility Pages
Speaking of consistency, you’ll also want to ensure that only high-quality content pages are included in your sitemap. Every website contains two types of pages:
- Utility Pages – Pages useful to visitors but not intended to be a search engine landing page, including header and footer content, privacy policies, wishlists, and other navigational elements
- Content Pages – Pages intended to be a search engine landing page
To ensure search engines crawl your site effectively, add a noindex tag to every utility page that isn’t integral to search, and leave those pages out of your sitemap. By decluttering your sitemap, you’ll make it easier for search engines to crawl your site and understand how it can help their users.
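The split between utility and content pages can be sketched as follows. The page list and the is_utility flag are assumptions for illustration; in practice that flag would come from your CMS or your own page inventory.

```python
# Robots meta tag that asks search engines not to index a page.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


def sitemap_urls(pages):
    """Keep only content pages; pages is a list of (url, is_utility)."""
    return [url for url, is_utility in pages if not is_utility]


def head_tag_for(is_utility):
    """Return the noindex tag for utility pages, nothing for content pages."""
    return NOINDEX_TAG if is_utility else ""


# Hypothetical page inventory.
site_pages = [
    ("https://www.example.com/products/widget", False),
    ("https://www.example.com/privacy-policy", True),
    ("https://www.example.com/wishlist", True),
]
print(sitemap_urls(site_pages))
print(head_tag_for(True))
```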
6. Sitemap Size
For relatively simple sites, sitemap size restrictions may not matter, but as you grow you should be mindful of how big is too big. According to Google, your sitemap must:
- Be no larger than 50MB (uncompressed)
- Contain no more than 50,000 URLs
- Contain no more than 1,000 images per URL
While this might look like more space than you need, it’s surprising how large and complex a site can grow.
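For sites exceeding Google’s limits, separate sitemaps can be created and listed in a sitemap index file. Google allows up to 500 sitemap index files per site, each index can reference up to 50,000 sitemaps, and every sitemap keeps the same 50MB, 50,000 URL limits. That works out to 500 × 50,000 × 50,000 = 1,250,000,000,000 URLs in total.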
Break sitemaps up as necessary, but keeping each sitemap to around 10,000 URLs usually better ensures a thorough crawl. As opposed to a larger sitemap, fewer URLs usually means fewer opportunities for error.
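Here is one way the splitting could look, as a sketch rather than a definitive implementation. The 10,000 chunk size follows the suggestion above, while the sitemap-N.xml file names, the example.com URLs, and the helper names are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-sitemap limit cited above


def chunk_urls(urls, size=10_000):
    """Split a URL list into consecutive chunks of at most `size` URLs."""
    if not 0 < size <= MAX_URLS_PER_SITEMAP:
        raise ValueError("size must be between 1 and 50,000")
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_locations):
    """Return sitemap index XML listing each child sitemap's URL."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for location in sitemap_locations:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = location
    return ET.tostring(index, encoding="unicode")


# 25,000 placeholder URLs split into sitemaps of 10,000 or fewer.
urls = [f"https://www.example.com/product/{n}" for n in range(25_000)]
chunks = chunk_urls(urls)
index_xml = build_sitemap_index(
    [f"https://www.example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))]
)
print([len(chunk) for chunk in chunks])
```

Each chunk would be written out as its own sitemap file, and only the index file needs to be submitted.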
The Direction of Trust
Keeping sitemaps optimized for search engines can help make your business more visible by potentially ranking higher in search results. While it’s not a guaranteed method, it certainly puts you on Google’s good side, which is where you want to be.
Submitting clean, high-quality sitemaps also goes toward the ultimate goal of gaining more trust from your users. People generally place more trust in businesses that rank higher in search results. In turn, search engines want to promote sites that adopt best practices and follow their guidelines. Don’t make it difficult for them.
Sitemaps are merely one way to increase your site’s digital trust with your users. More than ever, users are concerned with a site’s reputation and rely on search engines like Google to vet websites for legitimacy and usefulness. By allowing Google and other search engines to easily crawl your site, you’re allowing them to match up their users’ needs with your services. It pays to make yourself easy to find.