XML Sitemap: What It Is & How to Generate One

   Date: 2024-12-26    Author: dgxingtian    Mobile: http://mip.riyuangf.com/mobile/quote/39086.html

An XML sitemap is a file that tells search engines like Google which URLs on your website should be indexed (added to its database of possible search results).

It may also provide additional information about each URL, including:

  • When the page was last modified
  • How often the page is updated
  • The relative importance of the page

This information can help search engines crawl (explore) your site more effectively and efficiently, and better match your pages with relevant search queries.

That’s why XML sitemaps are important in search engine optimization (SEO).

An XML sitemap (or sitemap.xml file) is built from a small set of nested tags. If you’re interested in the details, the main tags used are:

  • <urlset>: Encloses all the tags for each sitemap
    • <url>: Encloses all the tags for each URL
      • <loc>: Specifies the page’s complete URL
      • <lastmod>: Specifies when the page was last updated (optional) 
      • <changefreq>: Specifies how frequently the page is likely to change (optional) 
      • <priority>: Specifies the relative importance of the page from 0.0 to 1.0 (optional) 
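To make the tag structure concrete, here is a minimal sketch that assembles a small sitemap with Python's standard library. The `build_sitemap` helper, the page list, and the example.com URLs are placeholders made up for illustration, not part of any particular platform or tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as UTF-8 bytes) for a list of page dicts.

    Each dict needs a "loc" key; "lastmod", "changefreq", and "priority"
    are optional, mirroring the tags listed above.
    """
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for page in pages:
        url = ET.SubElement(urlset, "url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages on a placeholder domain.
pages = [
    {
        "loc": "https://www.example.com/",
        "lastmod": "2024-12-26",
        "changefreq": "weekly",
        "priority": 1.0,
    },
    {"loc": "https://www.example.com/blog/?a=1&b=2"},
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml.decode("utf-8"))
```

Printing the result shows one `<url>` block per page inside a single `<urlset>`. The serializer also escapes the "&" in the second URL automatically.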

Webmasters can also create dedicated image, video, and news sitemaps to help search engines understand these specific types of content.

If you need to create more than one sitemap, you need a sitemap index, which essentially acts as a sitemap for your sitemaps.
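A sitemap index follows the same pattern, using `<sitemapindex>` and `<sitemap>` in place of `<urlset>` and `<url>`. The sketch below is illustrative only: the `build_sitemap_index` helper and the child sitemap URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (as UTF-8 bytes) listing each child sitemap."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical child sitemaps on a placeholder domain.
index_xml = build_sitemap_index(
    [
        "https://www.example.com/sitemap-pages.xml",
        "https://www.example.com/sitemap-posts.xml",
    ]
)
print(index_xml.decode("utf-8"))
```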

An XML sitemap is highly recommended if you want your pages to show in search engine results.

If you don’t provide an XML sitemap, search engines have to rely on hyperlinks (on your own site or elsewhere) to discover pages on your site. This is inefficient and it can lead to pages being missed.

Now, let’s learn how to create an XML sitemap.

It’s likely that the platform you use to manage your website’s content automatically generates and updates your XML sitemap. 

You may be able to find yours by going to yourdomain.com/sitemap.xml in your browser.

Otherwise, refer to the help center for your website builder or content management system (CMS). Or contact your platform’s support team.
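If you'd rather script the check of the default location, something like the following could work. It's a rough sketch: the helper names are made up, yourdomain.com is a placeholder, and some servers reject HEAD requests, so a negative result doesn't prove the sitemap is missing.

```python
import urllib.request
from urllib.parse import urljoin


def default_sitemap_url(base_url):
    """Return the conventional sitemap location for a site."""
    return urljoin(base_url, "/sitemap.xml")


def sitemap_exists(base_url, timeout=10):
    """Return True if the default sitemap URL answers with a 200 status.

    Makes a network request, so it is only defined here, not called.
    """
    request = urllib.request.Request(default_sitemap_url(base_url), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


print(default_sitemap_url("https://yourdomain.com/blog/some-post"))
```

Calling `sitemap_exists("https://yourdomain.com")` would then run the actual request.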

If your platform doesn’t provide an XML sitemap, you can use a sitemap generator tool. 

These tools can also prove helpful if you want more control over your sitemap. For example, you can customize your WordPress sitemap with the Yoast SEO plugin.

If you use a tool outside of your platform to create a sitemap, make sure to publish it to your site to make it live.

It’s best practice to submit your sitemap to Google. (Rather than waiting for Google’s website crawlers to discover the file on their own.)

But first, make sure there are no issues with your XML sitemap.

With Semrush’s Site Audit tool, you can check whether your sitemap.xml file:

  • Can’t be found
  • Has formatting errors
  • Contains non-canonical or non-200 URLs
  • Isn’t specified in robots.txt
  • Is too large
  • Contains HTTP rather than HTTPS URLs

The tool also checks whether your SEO sitemap contains orphaned pages—URLs that aren’t linked to from anywhere on your site. (It’s best practice to add internal links to pages that should be indexed.)

Simply go to the “Issues” report after setting up your audit and enter “sitemap” into the search bar.

Rerun the audit after implementing any fixes so you can check they’re working correctly.
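For a quick local sanity check before a full audit, a few of the checks listed above can be approximated in a short script. This is not how Site Audit works internally; it is only a sketch. The `check_sitemap` function and the sample data are invented for illustration, and it covers only formatting, size, and HTTP-vs-HTTPS issues.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_BYTES = 50 * 1024 * 1024  # 50MB, uncompressed
MAX_URLS = 50_000


def check_sitemap(xml_bytes):
    """Return a list of human-readable issues found in a sitemap file."""
    issues = []
    if len(xml_bytes) > MAX_BYTES:
        issues.append("file is larger than 50MB")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        # Malformed XML: nothing else can be checked reliably.
        issues.append(f"formatting error: {err}")
        return issues
    locs = [(el.text or "").strip() for el in root.findall("sm:url/sm:loc", NS)]
    if len(locs) > MAX_URLS:
        issues.append("more than 50,000 URLs")
    for loc in locs:
        if loc.startswith("http://"):
            issues.append(f"HTTP rather than HTTPS URL: {loc}")
    return issues


# Hypothetical sitemap with one insecure URL.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old-page</loc></url>
</urlset>"""

for issue in check_sitemap(sample):
    print(issue)
```

Checks such as non-200 responses, non-canonical URLs, and orphaned pages need a crawl of the live site, so they are left out here.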

To submit your sitemap, open Google Search Console and go to “Indexing” > “Sitemaps.”

Enter your sitemap URL and click “Submit” when you’re done.

When Google has crawled your sitemap, you’ll see a “Success” notice in the “Status” column.

If you make major changes that you want to be discovered quickly, you can re-submit your sitemap with a new request.

If you’re using a sitemap.xml file generated by your website platform or a specialized tool, it’ll probably meet XML sitemap best practices.

But if you want to make sure, read and understand these guidelines.

First, your sitemap should only reference URLs that:

  • You want to be indexed. For example, you shouldn’t include pages from your staging environment. Or the URL for an order confirmation page.
  • Return a 200 status code. You shouldn’t list pages that return other HTTP status codes, such as 301 redirects (which indicate permanent redirects) or 404 errors (which indicate a page can’t be found).
  • Are fully qualified and absolute. In other words, make sure to specify the entire URL with the scheme, authority, and path (e.g., “https://www.semrush.com/blog/”).
  • Are canonicals. Canonical URLs represent the sole version of a page or the primary version of a duplicated page. 
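The "fully qualified and absolute" rule is easy to test offline. The sketch below uses Python's `urllib.parse`; the `is_fully_qualified` helper and the candidate list are illustrative assumptions. Checking status codes and canonicals requires requests to the live site, so those rules are not covered.

```python
from urllib.parse import urlparse


def is_fully_qualified(url):
    """Return True if the URL has an http(s) scheme and an authority (host)."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical candidates: only the first is suitable for a sitemap.
candidates = [
    "https://www.semrush.com/blog/",
    "/blog/",                   # relative path, no scheme or host
    "www.example.com/page",     # missing scheme
    "//www.example.com/page",   # protocol-relative, missing scheme
]

valid = [url for url in candidates if is_fully_qualified(url)]
print(valid)
```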

And your sitemap file should:

  • Be UTF-8 encoded, with special characters escaped. UTF-8 is a character encoding that ensures search engines can read all the characters you’re using. Certain characters must also be entity-escaped. For example, you’ll need to use “&amp;” in place of an “&” symbol.
  • Be no larger than 50MB (uncompressed) and contain no more than 50,000 URLs. If necessary, you can create multiple sitemaps and a sitemap index file.
  • Specify the correct namespace. A namespace is like a label that tells the search engine what kinds of rules the sitemap follows. Most sitemaps use the “http://www.sitemaps.org/schemas/sitemap/0.9” namespace to show that the file conforms to standards set by sitemaps.org.
  • Include language and region variants for each URL (where applicable). You can learn more in this resource from Google.
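Two of the file-level rules above lend themselves to a short illustration: escaping special characters and splitting a long URL list so no file exceeds 50,000 entries. The `chunk_urls` helper and the generated URLs are hypothetical. If you build the XML with a proper XML library, escaping usually happens automatically.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-file limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive groups of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


# Escaping is needed when writing XML by hand: "&" must become "&amp;".
escaped = escape("https://www.example.com/search?q=shoes&page=2")
print(escaped)

# Hypothetical site with 120,000 pages: this needs three sitemap files.
urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
print([len(chunk) for chunk in chunks])
```

Each chunk would then become its own sitemap file, all listed in one sitemap index.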

Lastly, make sure to link to your sitemap from your robots.txt file. This is a website file that tells search engines which pages they should and shouldn’t crawl.
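In robots.txt, the link is a single `Sitemap:` line containing the full sitemap URL. As a sketch, Python's built-in `urllib.robotparser` (3.8 or later) can confirm the line is being picked up. The robots.txt content below is a made-up example on a placeholder domain.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for a placeholder domain.
robots_txt = """\
User-agent: *
Disallow: /checkout/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns None when no Sitemap line is present.
sitemaps = parser.site_maps() or []
print(sitemaps)
```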

With Semrush’s Site Audit, you can easily check for issues related to your XML sitemap.

The tool also checks for dozens of other issues that can harm your SEO results.

