Sitemaps allow a webmaster to inform search engines about URLs on a website that are available for crawling. A Sitemap is an XML file that lists the URLs for a website. It allows webmasters to include additional information about each URL such as when it was last updated, how often the content changes, and the priority in relation to other URLs on the site.
This allows search engines to crawl the site more intelligently. Sitemaps are particularly beneficial on websites where some areas are not available through the browsable interface, that is, pages that cannot be reached by following links from other pages. Since search engines such as Google, Bing, and Yahoo make use of sitemaps, this helps get your web pages indexed in their systems. Sitemaps are a supplement and do not replace the existing crawl-based mechanisms that search engines already use to discover URLs.
Keep in mind that using sitemaps does not guarantee that web pages will be included in search indexes, nor does it influence the way that pages are ranked in search results. If you submit your sitemaps to multiple search engines, you’ll notice that your pages are not indexed equally. Again, while sitemaps are helpful, they do not guarantee any specific results. They generally improve your chances of getting your pages indexed.
Sitemap XML File Format
Here is an example of how a sitemap XML file should be formatted.
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.itgeared.com/articles/1229-how-to-create-a-sitemap-xml-file-for-your-website</loc>
    <lastmod>2012-01-26</lastmod>
    <changefreq>monthly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
Sitemap Elements
Element | Description |
---|---|
urlset | The document-level element for the Sitemap. The rest of the document must be contained in this section. |
url | Parent element for each entry. The remaining elements are children of this. |
loc | URL of the page, including the protocol (e.g. HTTP). This value must be less than 2,048 characters. |
lastmod | Date the page was last modified. This can be the full date and time or, if desired, simply the date in the format YYYY-MM-DD. |
changefreq | How frequently the page may change; values are: always, hourly, daily, weekly, monthly, yearly, never. Always is used to denote documents that change each time that they are accessed. Never is used to denote files that will not be changed again. This is used only as a guide for crawlers and is not used to determine how frequently pages are indexed. |
priority | Priority of that URL relative to other URLs on the site. This allows webmasters to suggest to crawlers which pages are considered more important. The valid range is from 0.0 to 1.0, with 1.0 being the most important. |
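For illustration, the two lastmod forms described in the table might look like the fragment below. The timestamp is an arbitrary example, and the second form assumes the W3C Datetime notation with an explicit UTC offset.

```xml
<!-- Date only -->
<lastmod>2012-01-26</lastmod>

<!-- Full date and time with a UTC offset -->
<lastmod>2012-01-26T18:30:00+00:00</lastmod>
```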
Each sitemap file is limited to 50,000 URLs and 50 MB uncompressed (the protocol originally set the size limit at 10 MB). Sitemaps can be compressed using gzip, but the size limit applies to the uncompressed file. If you exceed either limit, split the URLs across multiple sitemap files and submit each of them.
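When a site needs several sitemap files, the sitemap protocol also defines a sitemap index file that lists them, so that a single file can be submitted. The following is a minimal sketch. The host name and file names are placeholders, not files that exist on any real site.

```xml
<?xml version="1.0" encoding="utf-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <sitemap> entry per sitemap file; lastmod is optional -->
  <sitemap>
    <loc>https://www.example.com/sitemap-articles-1.xml.gz</loc>
    <lastmod>2012-01-26</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-articles-2.xml.gz</loc>
  </sitemap>
</sitemapindex>
```

Each file listed in the index is still subject to the per-file limits on its own.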
As with all XML files, any data values, including URLs, must use entity escape codes for the characters ampersand (&amp;), single quote (&apos;), double quote (&quot;), less than (&lt;), and greater than (&gt;).
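As a short illustration of the escaping rule, suppose a page lives at the hypothetical address https://www.example.com/search?category=books&sort=price. The ampersand in the query string has to be written as an entity inside the loc element:

```xml
<url>
  <!-- The raw ampersand between the two query parameters is escaped -->
  <loc>https://www.example.com/search?category=books&amp;sort=price</loc>
</url>
```

Most XML libraries apply this escaping automatically when they write text values, so it mainly matters when a sitemap is assembled by hand or by string concatenation.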
Generate a Sitemap File
You have a few options with regard to generating your own sitemap file. The first is to purchase software or use software that your hosting provider has available. For instance, some hosting providers provide you with the ability to generate a sitemap file that you can submit to search engines at no additional cost.
Another method is to use a free or paid online service that generates sitemaps. A quick search will turn up many websites to choose from. A third option is to create one manually. However, if you have many pages, this is not a task that I would recommend.
Finally, if you have some programming skills, you can generate a sitemap file using ASP.NET and VB.NET, building the sitemap from information stored in a database table. If you have an ASP.NET web application, create a new ASP.NET page and add the code listed below to the code-behind page.
When you open the page in a browser, the sitemap file will be created. Of course, you do not have to use an ASP.NET web application. You can modify this code and use it in a stand-alone VB.NET application.
Imports System.Configuration
Imports System.Data.SqlClient
Imports System.Xml

Partial Class Threads_generateSiteMap
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(sender As Object, e As System.EventArgs) Handles Me.Load
        ' Initialize to Nothing so the Finally block can tell whether the file was opened.
        Dim writer As XmlTextWriter = Nothing
        Try
            ' Create (or overwrite) sitemap.xml in the same folder as this page.
            writer = New XmlTextWriter(Server.MapPath("sitemap.xml"), System.Text.Encoding.UTF8)
            writer.WriteStartDocument(True)
            writer.Formatting = Formatting.Indented
            writer.Indentation = 4
            writer.WriteStartElement("urlset", "http://www.sitemaps.org/schemas/sitemap/0.9")

            ' Using blocks close the connection and the reader even if an exception is thrown.
            Using genSiteMap As New SqlConnection( _
                    ConfigurationManager.ConnectionStrings("DBName").ConnectionString)
                genSiteMap.Open()
                Using CommandgenSiteMap As New SqlCommand( _
                        "select pubdate, URL from ARTICLES order by threadID desc", genSiteMap)
                    Using ReadergenSiteMap As SqlDataReader = CommandgenSiteMap.ExecuteReader()
                        ' Write one <url> entry per article; changefreq values must be lowercase.
                        While ReadergenSiteMap.Read()
                            createNode("http://www.MYSITE.com/articles/" & ReadergenSiteMap.GetString(1), _
                                       ReadergenSiteMap.GetDateTime(0).ToString("yyyy-MM-dd"), _
                                       "monthly", "1.0", writer)
                        End While
                    End Using
                End Using
            End Using

            writer.WriteEndElement()
            writer.WriteEndDocument()
            Response.Write("Finished processing")
        Catch ex As Exception
            Response.Write("Error generating sitemap file")
        Finally
            ' Close the file only if it was actually opened.
            If writer IsNot Nothing Then writer.Close()
        End Try
    End Sub

    ' Writes a single <url> entry. The writer escapes special characters such as & in the values.
    Private Sub createNode(ByVal loc As String, ByVal lastmod As String, ByVal changefreq As String, _
                           ByVal priority As String, ByVal writer As XmlTextWriter)
        writer.WriteStartElement("url")
        writer.WriteElementString("loc", loc)
        writer.WriteElementString("lastmod", lastmod)
        writer.WriteElementString("changefreq", changefreq)
        writer.WriteElementString("priority", priority)
        writer.WriteEndElement()
    End Sub
End Class
Sitemaps are a great way to help move the indexing process along. As long as you have great content and backlinks from well-ranked sites, the search engines will eventually find your website and index your pages whether or not you have submitted a sitemap to them. Having a sitemap just helps their indexing bots locate your pages faster.