Showing posts with label sitemaps. Show all posts

What’s new with Sitemaps

Webmaster level: All

Sitemaps are a way to tell Google about pages on your site. Webmaster Tools’ Sitemaps feature gives you feedback on your submitted Sitemaps, such as how many Sitemap URLs have been indexed, or whether your Sitemaps have any errors. Recently, we’ve added even more information! Let’s check it out:


The Sitemaps page displays details based on content-type. Now statistics from Web, Videos, Images and News are featured prominently. This lets you see how many items of each type were submitted (if any), and for some content types, we also show how many items have been indexed. With these enhancements, the new Sitemaps page replaces the Video Sitemaps Labs feature, which will be retired.

Another improvement is the ability to test a Sitemap. Unlike an actual submission, testing does not submit your Sitemap to Google; it only checks it for errors. Testing requires a live fetch by Googlebot and usually takes a few seconds to complete. Note that the initial testing is not exhaustive and may not detect all issues; for example, errors that can only be identified once the URLs are downloaded are not caught by the test.
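Before running the test in Webmaster Tools, it can help to catch basic problems locally. The following is only a minimal sketch of such a pre-check, not the test that Webmaster Tools performs: the function name precheck_sitemap is made up for illustration, and it only verifies that the file is well-formed XML, that the root element is a standard urlset, and that every url entry has a loc.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard Sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def precheck_sitemap(xml_text):
    """Return a list of problems found in a Sitemap string (empty if none).

    Hypothetical helper: a rough local check, far less thorough than a
    real test by a search engine.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    # Every <url> entry needs a non-empty <loc>.
    for i, url in enumerate(root.findall(f"{{{SITEMAP_NS}}}url"), start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {i} has no <loc>")
    return problems
```

Like the initial test described above, a check of this kind cannot see errors that only show up once the listed URLs are actually downloaded.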

In addition to on-the-spot testing, we’ve got a new way of displaying errors which better exposes what types of issues a Sitemap contains. Instead of repeating the same kind of error many times for one Sitemap, errors and warnings are now grouped, and a few examples are given. Likewise, for Sitemap index files, we’ve aggregated errors and warnings from the child Sitemaps that the Sitemap index encloses. No longer will you need to click through each child Sitemap one by one.

Finally, we’ve changed the way the “Delete” button works. Now, it removes the Sitemap from Webmaster Tools, both from your account and the accounts of the other owners of the site. Be aware that a Sitemap may still be read or processed by Google even if you delete it from Webmaster Tools. For example, if you reference a Sitemap in your robots.txt file, search engines may still attempt to process the Sitemap. To truly prevent a Sitemap from being processed, remove the file from your server or block it via robots.txt.

For more information on Sitemaps in Webmaster Tools and how Sitemaps work, visit our Help Center. If you have any questions, go to the Webmaster Help Forum.

Tag Your TV Shows!

Webmaster Level: All

If your website is the authoritative source for the video of a particular TV show, make sure we know about it! Hopefully, you already submit Video Sitemaps or mRSS feeds to inform us about video content on your website. We now support additional fields in both video Sitemaps and mRSS feeds where you can specify metadata specific to television or episodic content. This includes the series’ title, the season and episode numbers for the video in question, the premiere date, as well as other additional information. The metadata from your video feed helps us provide more detailed, relevant results to users wanting to view your show.

Here’s an example Video Sitemap entry that includes all the required and some optional TV metadata in the <video:tvshow> element:

<video:video>
  <video:title>The Sample Show, Season 1, Episode 2</video:title>
  <!-- other required root level video tags omitted -->
  <video:tvshow>
    <video:show_title>The Sample Show</video:show_title>
    <video:video_type>full</video:video_type>
    <video:episode_title>A Sample Episode Title</video:episode_title>
    <video:season_number>1</video:season_number>
    <video:episode_number>2</video:episode_number>
  </video:tvshow>
</video:video>
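If you generate your Video Sitemap programmatically, the TV metadata can be produced the same way as the rest of the entry. Here is a rough sketch in Python that emits the fields shown in the example above. The helper name tv_video_element is hypothetical, and the namespace URI is an assumption taken from the Video Sitemap examples elsewhere on this blog; other required video tags are omitted here just as they are above.

```python
import xml.etree.ElementTree as ET

# Assumed Video Sitemap namespace (as used in other examples on this blog).
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"
ET.register_namespace("video", VIDEO_NS)


def tv_video_element(title, show_title, video_type, season, episode,
                     episode_title=None):
    """Build a <video:video> element carrying <video:tvshow> metadata.

    Hypothetical helper; it only covers the fields from the example entry.
    """
    def v(tag):
        return f"{{{VIDEO_NS}}}{tag}"

    video = ET.Element(v("video"))
    ET.SubElement(video, v("title")).text = title

    tvshow = ET.SubElement(video, v("tvshow"))
    ET.SubElement(tvshow, v("show_title")).text = show_title
    ET.SubElement(tvshow, v("video_type")).text = video_type
    if episode_title:
        ET.SubElement(tvshow, v("episode_title")).text = episode_title
    ET.SubElement(tvshow, v("season_number")).text = str(season)
    ET.SubElement(tvshow, v("episode_number")).text = str(episode)
    return video


entry = tv_video_element(
    "The Sample Show, Season 1, Episode 2",
    "The Sample Show", "full", 1, 2,
    episode_title="A Sample Episode Title",
)
print(ET.tostring(entry, encoding="unicode"))
```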


The full documentation for the tags for both mRSS and Video Sitemaps can be found in our Webmaster Tools Help Center. As always, if you have any questions about Video Sitemaps or mRSS feeds, feel free to reach out to us in the Sitemaps section of the Webmaster Help Forum.

Making Websites Mobile Friendly

Webmaster level: Intermediate

We’ve noticed a rise in the number of questions from webmasters about how best to structure a website for mobile phones and how websites can best interact with Googlebot-Mobile. In this post we’ll explain the current situation and give you specific recommendations you can implement now.

Some Background

Let’s start with a simple question: what do we mean by “mobile phone” when talking about mobile-friendly websites?

A good way to answer this question is to think about the capabilities of the mobile phone’s web browser, especially in relation to the capabilities of modern desktop browsers. To simplify matters, we can break mobile phones into a few classifications:

  1. Traditional mobile phones: Phones with browsers that cannot render normal desktop webpages. This includes browsers for cHTML (iMode), WML, WAP, and the like.
  2. Smartphones: Phones with browsers that render normal desktop pages, at least to some extent. This category includes a diversity of devices, such as Windows Phone 7, Blackberry devices, iPhones, and Android phones, and also tablets and eBook readers.

    We can further break down this category by support for HTML5:

    • Devices with browsers that do not support HTML5
    • Devices with browsers that support HTML5

Once upon a time, mobile phones connected to the Internet using browsers with limited rendering capabilities; but this is clearly a changing situation with the fast rise of smartphones which have browsers that rival the full desktop experience. As such, it’s important to note that the distinction we are making here is based on the current situation as we see it and might change in the future.

Googlebot and Mobile Content

Google has two crawlers relevant to this topic: Googlebot and Googlebot-Mobile. Googlebot crawls desktop-browser type of webpages and content embedded in them and Googlebot-Mobile crawls mobile content. The questions we’re seeing more of can be summed up as follows:

Given the diversity of capabilities of mobile web browsers, what kind of content should I serve to Googlebot-Mobile?

The answer lies in the User-agent that Googlebot-Mobile supplies when crawling. There are several User-agent strings in use by Googlebot-Mobile, all of which use this format:

[Phone name(s)] (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)

To decide which content to serve, assess which content your website has that best serves the phone(s) in the User-agent string. A full list of Googlebot-Mobile User-agents can be found here.

Notice that we currently do not crawl with Googlebot-Mobile using a smartphone User-agent string. Thus at the current time, a correctly-configured content serving system will serve Googlebot-Mobile content only for the traditional phones described above, because that’s what the User-agent strings in use today dictate. This may change in the future, and if so, it may mean there would be a new Googlebot-Mobile User-agent string.
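To make the User-agent format above concrete, here is a small sketch that pulls the phone name portion out of a Googlebot-Mobile User-agent string. The regular expression simply mirrors the format quoted above; the function name and the "ExamplePhone/1.0" value are invented for illustration and are not real Googlebot-Mobile User-agents.

```python
import re

# Mirrors: [Phone name(s)] (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)
GOOGLEBOT_MOBILE_RE = re.compile(
    r"^(?P<phone>.*?)\s*"
    r"\(compatible; Googlebot-Mobile/(?P<version>[\d.]+); "
    r"\+http://www\.google\.com/bot\.html\)$"
)


def googlebot_mobile_phone(user_agent):
    """Return the phone name part of a Googlebot-Mobile User-agent.

    Returns None when the string does not follow the Googlebot-Mobile format.
    Hypothetical helper for illustration only.
    """
    match = GOOGLEBOT_MOBILE_RE.match(user_agent.strip())
    if match is None:
        return None
    return match.group("phone")


# "ExamplePhone/1.0" is a made-up phone name.
ua = ("ExamplePhone/1.0 "
      "(compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)")
print(googlebot_mobile_phone(ua))
```

The extracted phone name is what your content serving system would use to decide which version of the content fits best.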

For now, we expect smartphones to handle desktop experience content so there is no real need for mobile-specific effort from webmasters. However, for many websites it may still make sense for the content to be formatted differently for smartphones, and the decision to do so should be based on how you can best serve your users.

URL Structure for Mobile Content

The next set of questions asks about the URLs mobile content should be served from. Let’s look in detail at some common use cases.

Websites with only Desktop Experience Content

Most websites currently have only one version of their content, namely in HTML that is designed for desktop web browsers. This means all browsers access the content from the same URL.

These websites may not be serving traditional mobile phone users. The quality experienced by their smartphone users depends on the mobile browser they are using and it could be as good as browsing from the desktop.

If you serve only desktop experience content for all User Agents, you should do so for Googlebot-Mobile too; that is, treat Googlebot-Mobile as you treat all other or unknown User Agents. In these cases, Google may modify your webpages for an improved mobile experience.

Websites with Dedicated Mobile Content

Many websites have content specifically optimized for mobile users. The content could be simply reformatted for the typically smaller mobile displays, or it could be in a different format (e.g., served using WAP, etc.).

A very common question we see is: Does it matter if the different types of content are served from the same URL or from different URLs? For example, some websites have www.example.com as the URL desktop browsers are meant to access and have m.example.com or wap.example.com for the different mobile devices. Other websites serve all types of content from just one URL structure like www.example.com.

For Googlebot and Googlebot-Mobile, it does not matter what the URL structure is as long as it returns exactly what a user sees too. For example, if you redirect mobile users from www.example.com to m.example.com, that will be recognized by Googlebot-Mobile and both websites will be crawled and added to the correct index. In this case, use a 301 redirect for both users and Googlebot-Mobile.

If you serve all types of content from www.example.com, i.e. serving desktop-optimized content or mobile-optimized content from the same URL depending on the User-agent, this will also lead to correct crawling by Googlebot and Googlebot-Mobile. This is not considered cloaking by Google.

It is worth repeating that regardless of URL structure, you must correctly detect the User-agent as given by your users and Googlebot-Mobile, and serve both the same content. Don’t forget to keep the default content, the desktop-optimized content, for when an unknown User-agent requests it.
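As a rough sketch of this rule, the function below picks a content variant purely from the User-agent string, with desktop content as the default. It deliberately contains no special case for Googlebot-Mobile: because the crawler's User-agent includes the phone name, it receives whatever a user of that phone would receive. The function name, the marker list, and "ExamplePhone" are assumptions made up for this illustration; a real site would use its own device detection.

```python
def choose_variant(user_agent, traditional_phone_markers):
    """Return "mobile" or "desktop" for a request's User-agent.

    traditional_phone_markers is a hypothetical list of substrings that
    identify the traditional phones a site has mobile content for.
    Unknown or missing User-agents fall back to the desktop content.
    """
    ua = (user_agent or "").lower()
    for marker in traditional_phone_markers:
        if marker.lower() in ua:
            return "mobile"
    return "desktop"


markers = ["ExamplePhone"]  # made-up phone name
crawler_ua = ("ExamplePhone/1.0 "
              "(compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)")

print(choose_variant("ExamplePhone/1.0", markers))    # a user on that phone
print(choose_variant(crawler_ua, markers))            # Googlebot-Mobile for that phone
print(choose_variant("SomeUnknownBrowser", markers))  # unknown User-agent
```

The first two calls return the same variant, which is the point: users and Googlebot-Mobile are treated identically.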

Mobile Sitemaps in Webmaster Tools

Finally, we receive many questions about what URLs to put in Mobile Sitemaps. As explained in our Mobile Sitemaps Help Center articles, you should include only mobile content URLs in Mobile Sitemaps, even if these URLs also return non-mobile content when accessed by a non-mobile User-agent.

More Questions?

A good place to start is our Mobile Sites Help Center articles and the relevant sections in our Search Engine Optimization Starter Guide. We also created a thread in our forums for you to ask questions about this post.

Sending Video Sitemaps Q&A holiday cheer

Webmaster Level: Intermediate to Advanced

To the fabulous, savvy audience that attended our Video Sitemap webinar several months ago, please accept our re-gift: a summary of your questions from the Video Sitemaps Q&A!

To those who were unable to attend the webinar, please enjoy our gift of the summarized Q&A -- it’s like new!

Either way, happy holidays from all of us on the Webmaster Central Team. :)


Our entire webinar covers the basics of Video Sitemaps and best practices -- nearly everything you’d need to know when submitting a video feed.

  1. Can the source/content of the video (perhaps a third-party vendor) be hosted on another site? For example, can I host my videos on YouTube and still be eligible for Video Search traffic?

    Yes, you can use a third party to host videos. Only the play page--the URL within the <loc> tag--needs to be on your site. <video:content_loc> and <video:player_loc> can list URLs on a different site or subdomain.

    For example, here’s a snippet from a valid Video Sitemap that shows content hosted on a different subdomain from the play page:

    <url>
      <loc>http://www.example.com/videos/some_video_landing_page.html</loc>
        <video:video>
          <video:thumbnail_loc>http://www.example.com/thumbs/123.jpg</video:thumbnail_loc>
          <video:title>Grilling steaks for summer</video:title>
          <video:description>Alkis shows you how to get perfectly done steaks every time</video:description>
          <video:content_loc>http://video-hoster.example.com/video123.flv</video:content_loc>
          <video:player_loc allow_embed="yes" autoplay="ap=1">http://www.example.com/videoplayer.swf?video=123</video:player_loc>
        </video:video>
    </url>


  2. If I’m using YouTube to host my videos, can Google verify that I’m the legitimate owner?

    Currently, there doesn’t exist functionality that allows you, as the uploader, to verify that you’re the owner of a video. The issue of authorship is a hard problem on the web, not just for videos, but nearly all types of content.

  3. Because Google owns YouTube, should users who embed YouTube videos still submit Video Sitemaps or is it unnecessary?

    Google treats YouTube as just another source for video content -- though you don’t need to submit a Video Sitemap if you only want your YouTube-hosted videos indexed. If, however, you’re using YouTube as an online video platform (i.e., with play pages on your own site), then we do recommend Sitemap submission.

  4. How long does it take for Google to accept and verify a Video Sitemap?

    Video Sitemap submission is a two-part process:

    1. We fetch the Sitemap and parse it for syntax errors. This happens within minutes.

    2. We fetch the assets referenced in the Sitemap, perform checks, validate metadata, do more cool stuff, and last, index the video. This step can require varied amounts of time depending on your site and our system load.

  5. What tags and categories are most important in Video Sitemaps or mRSS? Should I create my own categories or is there a list that I should conform to?

    Currently, the most important metadata to include is title and description -- both are required. The category tag is optional, and there isn’t a list from which to select.

  6. Do I have to use HTML5 to use Video Sitemaps?
    Does HTML5 help with discovery?
    Or, if my site is HTML5 compliant, do I still need to submit a Video Sitemap?


    None of the Video Search principles change with HTML5. We still recommend using a Video Sitemap regardless of the markup on your site. HTML5 can be helpful, though, because tags like <video> make it easier for our systems to verify that video exists on the page.

  7. If I use an iframe rather than embedding my videos, can Google still find it?

    We do not recommend using iframes to embed video content on your pages.

  8. Can I have multiple videos on one URL?

    You can. We’ve found, however, that users may not consider it the best experience. When users click on a video search result, they most often don’t like being forced to locate the correct video among multiple videos on the resulting page.

  9. Do I need to specifically create a robots.txt file that allows Googlebot, or do I just need to make sure Googlebot isn’t blocked?

    Just make sure that Googlebot isn’t blocked.

  10. I provided a thumbnail, but it’s not being used. Does Google create their own thumbnails from my videos?

    We try to use the thumbnail you provide if it’s valid. If not, we’ll try to generate a thumbnail ourselves. We recommend that you provide thumbnails that are at least 120x90 pixels. We also accept many thumbnail formats, such as PNG and JPEG.

  11. Any video filesize limitations?

    At this time, there aren’t video filesize limitations on content submitted through Video Sitemaps.

  12. Is there any way to indicate a transcript or closed captioning for a video?

    Currently there isn’t, but perhaps down the road.

  13. What if I’m using Lightbox or a popup to display a video; can it still be indexed?

    Depends on the use case and how it’s rendered, but if indexing by search engines is important to you, it’s not the safest method. In the Webmaster Help Center, we explain that “When designing your site, it's important to configure your video pages without any overly complex JavaScript or Flash setup.” Most often, for bots, simpler is safer.
Have a safe and happy holiday!

Video Sitemaps & mRSS vs. Facebook Share & RDFa

Webmaster Level: Intermediate to Advanced

What are the benefits of submitting feeds like Video Sitemaps and mRSS vs. the benefits of Facebook Share and RDFa? Is one better than the other? Let’s start the discussion.

Functionality of feeds vs. on-page markup

Google accepts information from both video feeds, such as Video Sitemaps and mRSS, as well as on-page markup, such as Facebook Share and RDFa. We recommend that you use both!

If you have limited resources, however, here’s a chart explaining the pros and cons of each method. The key differentiators include:
  • While both feeds and on-page markup give search engines metadata, Video Sitemaps/mRSS also help with crawl discovery. We may find a new URL through your feed that we wouldn’t have easily discovered otherwise.

  • Using Video Sitemaps/mRSS requires that the search engine support these formats and not all engines do. Because on-page markup is just that -- on the page -- crawlers can gather the metadata through organic means as they index the URL. No feed support is required.

Chart columns: Feeds (Video Sitemaps & mRSS) | On-page markup (Facebook Share & RDFa)
Chart rows:
  • Accepted by Google
  • Helps search engines discover new URLs with videos (improves discovery and coverage)
  • Provides structured metadata (e.g. video title and description)
  • Allows search engines without sitemap/mRSS support to still obtain metadata information (allows organic gathering of metadata)
  • Incorporates additional metadata like “duration”


If you’re further wondering about the benefits of specific feeds (Video Sitemaps vs. mRSS), we can help with clarification there, too. First of all, you can use either. We’re agnostic. :) One benefit of Video Sitemaps is that, because it’s a format we’re actively enhancing, we can quickly extend it to allow for more specifications.

All this said, if you’re going to start from scratch, Video Sitemaps is our recommended start.

Chart columns: Video Sitemaps | mRSS
Chart rows:
  • Accepted by Google
  • Been around for a long, long time and pretty widely accepted
  • Extremely quick for Google Video Search team to extend


“Starving” to start conversation about feeds or on-page markup? Join us in the Sitemaps section of the Webmaster discussion forum.

Video Sitemaps: Is your video part of a gallery?

Webmaster Level: All

Often a website which hosts videos will have a common top-level page that groups conceptually related videos together. Such a page may be of interest to a user searching on that subject. Sites with many videos about a single subject can group these videos together on a top-level page, often known as a gallery. This can make it easier for users to find exactly what they're looking for. In this case, you can use a Sitemap to tell Google the URL of the gallery page on which each video appears.


You can specify the URL of the gallery level page using the optional tag <video:gallery_loc> on a per-video basis. Note that only one gallery_loc is allowed per video.
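Since only one gallery_loc is allowed per video, a quick check over your Sitemap can flag entries that list more than one. The sketch below is just an illustration under stated assumptions: the function name is hypothetical, and the namespace URI is the one used in Video Sitemap examples elsewhere on this blog.

```python
import xml.etree.ElementTree as ET

# Assumed Video Sitemap namespace (as used in other examples on this blog).
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def videos_with_multiple_galleries(sitemap_xml):
    """Return the titles of videos that list more than one <video:gallery_loc>.

    Hypothetical helper; videos without a title are reported as "(untitled)".
    """
    root = ET.fromstring(sitemap_xml)
    offenders = []
    for video in root.iter(f"{{{VIDEO_NS}}}video"):
        galleries = video.findall(f"{{{VIDEO_NS}}}gallery_loc")
        if len(galleries) > 1:
            title = video.findtext(f"{{{VIDEO_NS}}}title", default="(untitled)")
            offenders.append(title)
    return offenders
```

An empty result means every video in the file has at most one gallery page.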

For more information on Google Videos, including Sitemap specifications, please visit our Help Center. To post questions and search for answers, check out our Help Forum.

To err is human, Video Sitemap feedback is divine!

Webmaster Level: All

You can now check your Video Sitemap for even more errors right in Webmaster Tools! It’s a new Labs feature to signal issues in your Video Sitemap such as:
  • URLs disallowed by robots.txt
  • Thumbnail size errors (160x120px is ideal. Anything smaller than 90x50 will be rejected.)



Video Sitemaps help us to better crawl and extract information about your videos, so we can appropriately feature them in search results.

Totally new to Video Sitemaps? Check out the Video Sitemaps center for more information. Otherwise, take a look at this new Labs feature in Webmaster Tools.

Video Sitemaps: Understanding location tags

Webmaster Level: All

If you want to add video information to a Sitemap or mRSS feed you must specify the location of the video. This means you must include one of two tags, either video:player_loc or video:content_loc. In the case of an mRSS feed, the equivalent tags are media:player or media:content, respectively. We need this information to verify that there is actually a live video on your landing page and to extract metadata and signals from the video bytes for ranking. If one of these tags is not included we will not be able to verify the video and your Sitemap/mRSS feed will not be crawled. To reduce confusion, here is some more detail about these elements.

Video Locations Defined

Player Location/URL: the player (e.g., .swf) URL with corresponding arguments that load and play the actual video.

Content Location/URL: the actual raw video bytes (e.g., .flv, .avi) containing the video content.

The Requirements

One of either the player video:player_loc or content video:content_loc location is required. However, we strongly suggest you provide both, as they each serve distinct purposes: player location is primarily used to help verify that a video exists on the page, and content location helps us extract more signals and metadata to accurately rank your videos.
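The requirement is easy to check mechanically before you submit. As a minimal sketch (the function name is hypothetical, and the namespace URIs are assumed from the Sitemap examples elsewhere on this blog), the following lists the play pages whose video entry has neither location tag:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumed Video Sitemap namespace (as used in other examples on this blog).
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def pages_missing_video_location(sitemap_xml):
    """Return <loc> URLs whose video has no player_loc and no content_loc.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(sitemap_xml)
    missing = []
    for url in root.iter(f"{{{SITEMAP_NS}}}url"):
        page = url.findtext(f"{{{SITEMAP_NS}}}loc", default="")
        for video in url.iter(f"{{{VIDEO_NS}}}video"):
            has_player = video.find(f"{{{VIDEO_NS}}}player_loc") is not None
            has_content = video.find(f"{{{VIDEO_NS}}}content_loc") is not None
            # At least one of the two location tags is required.
            if not (has_player or has_content):
                missing.append(page)
    return missing
```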

URL extensions at a glance:

Sitemap:              mRSS:                             Contents:
<loc>                 <link>                            The playpage URL
<video:player_loc>    <media:player> (url attribute)    The SWF URL
<video:content_loc>   <media:content> (url attribute)   The FLV or other raw video URL

NOTE: All URLs should be unique (every URL in your entire Video Sitemap and mRSS feed should be unique)

If you would like to better ensure that only Googlebot accesses your content, you can perform a reverse DNS lookup.
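Here is one way such a check might look in Python. Treat it as a sketch under stated assumptions rather than an official recipe: it does a reverse lookup on the requesting IP, checks that the host name ends in googlebot.com or google.com, and then does a forward lookup to confirm the name maps back to the same IP. The forward-confirmation step and the domain list go beyond what this post states, the function name is made up, and the default forward lookup only covers IPv4. The lookups are injectable so the logic can be exercised without network access.

```python
import socket

# Assumption: host names of Google's crawlers end in one of these domains.
GOOGLE_CRAWLER_DOMAINS = (".googlebot.com", ".google.com")


def looks_like_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip passes a reverse DNS check plus forward confirmation.

    reverse_lookup(ip) -> host name; forward_lookup(host) -> list of IPs.
    Both default to real DNS queries via the socket module.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip).lower().rstrip(".")
        if not host.endswith(GOOGLE_CRAWLER_DOMAINS):
            return False
        # The name must resolve back to the IP we started from.
        return ip in forward_lookup(host)
    except OSError:
        # DNS failures (socket.herror, socket.gaierror) count as "not verified".
        return False
```

Checking the User-agent string alone is not enough for this purpose, since anyone can send a request that claims to be Googlebot.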

For more information on Google Videos please visit our Help Center, and to post questions and search for answers check out our Help Forum.

Video Sitemaps 101: Making your videos searchable

Webmaster Level: All

We know that some of you, or your clients or colleagues, may be new to online video publishing. To make it easier for everyone to understand video indexing and Video Sitemaps, we’ve created a video -- narrated by Nelson Lee, Video Search Product Manager -- that explains everything in basic terms:



Also, last month we wrote about some best practices for getting video content indexed on Google. Today, to help beginners better understand the whys and hows of implementing a Video Sitemap, we added a starting page to the information on Video Sitemaps in the Webmaster Help Center. Please take a look and share your thoughts.

Sitemaps: One file, many content types

Webmaster Level: All

Have you ever wanted to submit your various content types (video, images, etc.) in one Sitemap? Now you can! If your site contains videos, images, mobile URLs, code or geo information, you can now create—and submit—a Sitemap with all the information.

Site owners have been leveraging Sitemaps to let Google know about their sites’ content since Sitemaps were first introduced in 2005. Since that time additional specialized Sitemap formats have been introduced to better accommodate video, images, mobile, code or geographic content. With the increasing number of specialized formats, we’d like to make it easier for you by supporting Sitemaps that can include multiple content types in the same file.

The structure of a Sitemap with multiple content types is similar to a standard Sitemap, with the additional ability to contain URLs referencing different content types. Here's an example of a Sitemap that contains a reference to a standard web page for Web search, image content for Image search and a video reference to be included in Video search:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://example.com/foo.html</loc>
    <image:image>
      <image:loc>http://example.com/image.jpg</image:loc>
    </image:image>
    <video:video>
      <video:content_loc>http://example.com/videoABC.flv</video:content_loc>
      <video:title>Grilling tofu for summer</video:title>
    </video:video>
  </url>
</urlset>

Here's an example of what you'll see in Webmaster Tools when a Sitemap containing multiple content types is submitted:



We hope the capability to include multiple content types in one Sitemap simplifies your Sitemap submission. The rest of the Sitemap rules, like 50,000 max URLs in one file and the 10MB uncompressed file size limit, still apply. If you have questions or other feedback, please visit the Webmaster Help Forum.
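If your site has more URLs than fit in one file, you will need to split them across several Sitemaps. A minimal sketch of that split is shown below; the function name is hypothetical, and it only enforces the 50,000 URL limit. The 10MB uncompressed size limit still has to be checked separately once the files are written.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit mentioned above


def split_into_sitemaps(urls, max_urls=MAX_URLS_PER_SITEMAP):
    """Split a sequence of URLs into chunks that each fit in one Sitemap.

    Hypothetical helper; it does not check the uncompressed file size.
    """
    urls = list(urls)
    return [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]


# 120,000 URLs end up in three files: 50,000 + 50,000 + 20,000.
all_urls = [f"http://example.com/page{i}.html" for i in range(120_000)]
chunks = split_into_sitemaps(all_urls)
print([len(chunk) for chunk in chunks])
```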

Help Google index your videos

Webmaster Level: All

The single best way to make Google aware of all your videos on your website is to create and maintain a Video Sitemap. Video Sitemaps provide Google with essential information about your videos, including the URLs for the pages where the videos can be found, the titles of the videos, keywords, thumbnail images, durations, and other information. The Sitemap also allows you to define the period of time for which each video will be available. This is particularly useful for content that has explicit viewing windows, so that we can remove the content from our index when it expires.

Once your Sitemap is created, you can submit the URL of the Sitemap file in Google Webmaster Tools or through your robots.txt file.
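For the robots.txt route, the Sitemaps protocol uses a line of the form "Sitemap: " followed by the full URL of the Sitemap file. To double-check which Sitemaps your robots.txt currently advertises, something like the following sketch can help. The function name is hypothetical, and the comment handling is simplified: everything after a "#" on a line is dropped.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the Sitemap URLs declared in the text of a robots.txt file.

    Hypothetical helper; the directive name is matched case-insensitively.
    """
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (simplified)
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


robots = """\
User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/video-sitemap.xml
"""
print(sitemap_urls_from_robots(robots))
```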

Once we have indexed a video, it may appear in our web search results in what we call a Video Onebox (a cluster of videos related to the queried topic) and in our video search property, Google Videos. A video result is immediately recognizable by its thumbnail, duration, and a description.

As an example, this is what a video result from CNN.com looks like on Google:


We encourage those of you with videos to submit Video Sitemaps and to keep them updated with your new content. Please also visit our recently updated Video Sitemap Help Center, and utilize our Sitemap Help Forum. If you've submitted a Video Sitemap file via Webmaster Tools and want to share your experiences or problems, you can do so here.

Adding Images to your Sitemaps

Webmaster Level: All

Sitemaps are an invaluable resource for search engines. They can highlight the important content on a site and allow crawlers to quickly discover it. Images are an important element of many sites and search engines could equally benefit from knowing which images you consider important. This is particularly true for images that are only accessible via JavaScript forms, or for pages that contain many images, only some of which are integral to the page content.

Now you can use a Sitemaps extension to provide Google with exactly this information. For each URL you list in your Sitemap, you can add additional information about important images that exist on that page. You don’t need to create a new Sitemap; you can just add information on images to the Sitemap you already use.

Adding images to your Sitemaps is easy. Simply follow the instructions in the Webmaster Tools Help Center or refer to the example below:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://example.com/sample.html</loc>
    <image:image>
      <image:loc>http://example.com/image.jpg</image:loc>
    </image:image>
  </url>
</urlset>


We index billions of images and see hundreds of millions of image-related queries each day. To take advantage of that traffic most effectively, take a moment to update your Sitemap file with information on the images from your site. Let us know in the Sitemaps forum if you have any questions.

Tips for News Search

Webmaster Level: All

During my stint on the "How Google Works Tour: Seattle", I heard plenty of questions regarding News Search from esteemed members of the press, such as The Stranger, The Seattle Times and Seattle Weekly. After careful note-taking throughout our conversations, the News team and I compiled this presentation to provide background and FAQs for all publishers interested in Google News Search.



Along with the FAQs about News Sitemaps and PageRank in the video above, here's additional Q&A to get you started:

Would adding a city name to my paper—for example, changing our name from "The Times" to "The San Francisco Bay Area Times"—help me target my local audience in News Search?

No, this won't help News rankings. We extract geography and location information from the article itself (see video). Changing your name to include relevant keywords or adding a local address in your footer won't help you target a specific audience in our News rankings.

What happens if I accidentally include URLs in my News Sitemap that are older than 72 hours?

We want only the most recently added URLs in your News Sitemap, as it directs Googlebot to your breaking information. If you include older URLs, no worries (there's no penalty unless you're perceived as maliciously spamming -- this case would be rare, so again, no worries); we just won't include those URLs in our next News crawl.

To get the full scoop, check out the video!
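If you build your News Sitemap with a script, it is simple to leave out articles older than 72 hours. The snippet below is only a sketch of that filter: the function name and the (URL, publication time) input format are assumptions for illustration, and the times are expected to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

NEWS_WINDOW = timedelta(hours=72)  # the 72-hour window mentioned above


def recent_article_urls(articles, now=None):
    """Return URLs of articles published within the last 72 hours.

    articles is a hypothetical list of (url, published) pairs, where
    published is a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    return [url for url, published in articles
            if now - published <= NEWS_WINDOW]


now = datetime(2010, 1, 4, 12, 0, tzinfo=timezone.utc)
articles = [
    ("http://example.com/news/fresh.html", now - timedelta(hours=5)),
    ("http://example.com/news/stale.html", now - timedelta(hours=80)),
]
print(recent_article_urls(articles, now=now))
```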

An Update on Sitemaps at Google

Did you know that the number of website hosts that have been submitting Sitemap files has almost tripled over the last year? It's no wonder: the secret is out - as a recent research study showed, Sitemaps help search engines find new and changed content faster. Using Sitemaps doesn't guarantee that your site will be crawled and indexed completely, but it certainly helps us understand your website better.

Together with the Webmaster Tools design update, we've been working on Sitemaps as well:
  • Google and the other search engines which are a part of Sitemaps.org now support up to 50,000 child Sitemaps for Sitemap index files (instead of the previous 1,000). This allows large sites to submit a theoretical maximum of 2.5 billion URLs with a single Sitemap Index URL (oh, and if you need more, you can always submit multiple Sitemap index files). 
  • The Webmaster Tools design update now shows you all Sitemap files that were submitted for your verified website. This is particularly useful if you have multiple owners verified in Webmaster Tools or if you are submitting some Sitemap files via HTTP ping or through your robots.txt file.
  • The indexed URL count in Webmaster Tools for your Sitemap files is now even more precise.
  • For the XML developers out there, we've updated the XSD schemas to allow Sitemap extensions. The new schema helps webmasters to create better Sitemaps by verifying more features. By validating Sitemap files with the new schema, you can be more confident that the Sitemap files are correct.
  • Do I need to mention that Sitemap file processing is much faster than ever before? We've drastically reduced the average time from submitting a Sitemap file to processing it and showing some initial data in Webmaster Tools. 
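The 2.5 billion figure above follows from 50,000 child Sitemaps per index file times 50,000 URLs per Sitemap. As a quick worked example (the function name is hypothetical), this sketch computes how many Sitemap files and Sitemap index files a given number of URLs would need at those limits:

```python
MAX_URLS_PER_SITEMAP = 50_000    # URLs per Sitemap file
MAX_SITEMAPS_PER_INDEX = 50_000  # child Sitemaps per Sitemap index file


def files_needed(total_urls):
    """Return (sitemap_files, index_files) needed for total_urls URLs.

    Uses integer ceiling division; ignores the per-file size limit.
    """
    sitemaps = -(-total_urls // MAX_URLS_PER_SITEMAP)
    indexes = -(-sitemaps // MAX_SITEMAPS_PER_INDEX)
    return sitemaps, indexes


# Theoretical maximum for a single Sitemap index file.
print(MAX_URLS_PER_SITEMAP * MAX_SITEMAPS_PER_INDEX)
print(files_needed(120_000))
print(files_needed(2_500_000_001))
```

One URL past the maximum is enough to require a second Sitemap index file.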


For more information about using Sitemaps, make sure to check out our blog post about frequently asked questions on Sitemaps and our Help Center. If you have any questions that aren't covered here, don't forget to search our Help Forum and start a thread in the Sitemaps section for more help.

Research study of Sitemaps

We've been tracking the growth of Sitemaps on the web. It's been just 2 years since Google, Yahoo and Microsoft co-announced the Sitemaps directive in robots.txt, and it is already supported in many millions of websites including educational and government websites! At the WWW'09 conference in Madrid, Uri Schonfeld presented his summer internship work studying Sitemaps from a coverage and freshness perspective. If you're interested in how some popular websites are using Sitemaps, and how Sitemaps complement "classic" webcrawling, take a look:


At Google, we care deeply about getting increased coverage and freshness of the content we index. We are excited about open standards that help webmasters open up their content automatically to search engines, so users can find relevant content for their searches.

Using stats from site: and Sitemap details

Webmaster Level: Beginner to Intermediate

Every now and then in the webmaster blogosphere and forums, this issue comes up: when a webmaster performs a [site:example.com] query on their website, the number of indexed results differs from what is displayed in their Sitemaps report in Webmaster Tools. Such a discrepancy may smell like a bug, but it's actually by design. Your Sitemap report only reflects the URLs you've submitted in your Sitemap file. The site operator, on the other hand, takes into account whatever Google has crawled, which may include URLs not included in your Sitemap, such as newly added URLs or other URLs discovered via links.
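
To make the difference concrete, here is a small sketch with made-up data. The URL sets are hypothetical, and real site: counts are only rough estimates, so treat this purely as an illustration of why the two numbers need not match.

```python
# Hypothetical URLs listed in the submitted Sitemap file.
submitted = {
    "http://www.example.com/",
    "http://www.example.com/about",
    "http://www.example.com/contact",
}

# Hypothetical URLs that are in the index, however they were discovered.
indexed = {
    "http://www.example.com/",
    "http://www.example.com/about",
    "http://www.example.com/blog/new-post",   # found via links
    "http://www.example.com/blog/older-post", # found via links
}

# The Sitemaps report only looks at URLs from the Sitemap file.
reported_indexed = submitted & indexed

# A site: query reflects everything indexed for the site.
site_operator_estimate = len(indexed)

print(f"Sitemap report: {len(reported_indexed)} of {len(submitted)} submitted URLs indexed")
print(f"site: query:    about {site_operator_estimate} results")
```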

Think of the site operator as a quick diagnosis of the general health of your site in Google's index. Site operator results can show you:
  • a rough estimate of how many pages have been indexed
  • an indication of whether your site has been hacked
  • whether you have duplicate titles or snippets
Here is an example query using the site operator:



Your Sitemap report provides more granular statistics about the URLs you submitted, such as the number of indexed URLs vs. the number submitted for crawling, and Sitemap-specific warnings or errors that may have occurred when Google tried to access your URLs.

Sitemap report

Feel free to check out our Help Center for more on the site: operator and Sitemaps. If you have further questions or issues, please post to our Webmaster Help Forum, where experienced webmasters and Googlers are happy to help.

Posted by Charlene Perez

A new Google Sitemap Generator for your website

It's been well over three years since we initially announced the Python Sitemap generator in June 2005. In this time, we've seen lots of people create great third-party Sitemap generators to help webmasters create better Sitemap files. While most Sitemap generators either crawl websites or list the files on a server, we have created a different kind of Sitemap generator that uses several ways to find URLs on your website and then allows you to automatically create and maintain different kinds of Sitemap files.

Google Sitemap Generator screenshot of the admin console

About Google Sitemap Generator


Our new open-source Google Sitemap Generator finds new and modified URLs based on your webserver's traffic, its log files, or the files found on the server. By combining these methods, Google Sitemap Generator can be very fast in finding these URLs and calculating relevant metadata, thereby making your Sitemap files as effective as possible. Once Google Sitemap Generator has collected the URLs, it can create the following Sitemap files for you:

In addition, Google Sitemap Generator can send a ping to Google Blog Search for all of your new or modified URLs. You can optionally include the URLs of the Sitemap files in your robots.txt file as well as "ping" the other search engines that support the sitemaps.org standard.
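
For readers who have not seen Sitemap references in robots.txt, the sketch below shows what such lines look like and how they can be read back with Python's standard `urllib.robotparser` module (its `site_maps()` method requires Python 3.8 or later). The robots.txt content and URLs are invented for the example; this is not how Google Sitemap Generator itself works internally.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content with two Sitemap references.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: http://www.example.com/sitemap.xml
Sitemap: http://www.example.com/sitemap-news.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed Sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
for url in sitemaps or []:
    print(url)
```

The Sitemap lines are independent of the User-agent sections, so they can appear anywhere in the file.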

Sending the URLs to the right Sitemap files is simple thanks to the web-based administration console. This console gives you access to various features that make administration a piece of cake while maintaining a high level of security by default.

Getting started


Google Sitemap Generator is a server plug-in that can be installed on both Linux/Apache and Microsoft IIS Windows-based servers. As with other server-side plug-ins, you will need to have administrative access to the server to install it. You can find detailed information for the installation in the Google Sitemap Generator documentation.

We're excited to release Google Sitemap Generator with the source code and hope that this will encourage more web hosters to include this or similar tools in their hosting packages!

Do you have any questions? Feel free to drop by our Help Group for Google Sitemap Generator or ask general Sitemaps questions in our Webmaster Help Forum.

On-Demand Sitemaps for Custom Search

Since we launched enhanced indexing with the Custom Search platform earlier this year, webmasters who submit Sitemaps to Webmaster Tools get special treatment: Custom Search recognizes the submitted Sitemaps and indexes URLs from these Sitemaps into a separate index for higher quality Custom Search results. We analyze your Custom Search Engines (CSEs), pick up the appropriate Sitemaps, and figure out which URLs are relevant for your engines for enhanced indexing. You get the dual benefit of better discovery for Google.com and more comprehensive coverage in your own CSEs.

Today, we're taking another step towards improving your experience with Google webmaster services with the launch of On-Demand Indexing in Custom Search. With On-Demand Indexing, you can now tell us about the pages on your websites that are new, or that are important and have changed. Custom Search will instantly schedule them for crawling, then index and serve them in your CSEs, usually within 24 hours and often much faster.

How do you tell us about these URLs? You guessed it... provide a Sitemap to Webmaster Tools, like you always do, and tell Custom Search about it. Just go to the CSE control panel, click on the Indexing tab, select your On-Demand Sitemap, and hit the "Index Now" button. You can tell us which of these URLs are most important to you via the priority and lastmod attributes that you provide in your Sitemap. Each CSE has a number of pages allocated within the On-Demand Index, and with these attributes, you can tell us which are most important for indexing. If you need greater allocation in the On-Demand index, as well as more customization controls, Google Site Search provides a range of options.
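
As a sketch of what such a Sitemap might look like, the snippet below builds a small one with Python's standard library. In the sitemaps.org protocol, priority and lastmod are written as child elements of each `url` entry. The helper name `build_sitemap`, the URLs, the dates and the priority values are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a Sitemap XML string from (loc, lastmod, priority) tuples."""
    # Serialize the sitemaps.org namespace as the default namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod, priority in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
        ET.SubElement(url, f"{{{SITEMAP_NS}}}priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: a freshly changed, important page and an older one.
PAGES = [
    ("http://www.example.com/products/new-widget", "2008-12-01", 1.0),
    ("http://www.example.com/archive/2007", "2007-06-15", 0.3),
]

xml_out = build_sitemap(PAGES)
print(xml_out)
```

Priority values range from 0.0 to 1.0 and are relative to other URLs on the same site, so giving every page 1.0 tells a search engine nothing.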


Some important points to remember:
  1. You only need to submit your Sitemaps once in Webmaster Tools. Custom Search will automatically list the Sitemaps submitted via Webmaster Tools and you can decide which Sitemap to select for On-Demand Indexing.
  2. Your Sitemap needs to be for a website verified in Webmaster Tools, so that we can verify ownership of the right URLs.
  3. In order for us to index these additional pages, our crawlers must be able to crawl them. You can use "Webmaster Tools > Crawl Errors > URLs restricted by robots.txt" or check your robots.txt file to ensure that you're not blocking us from crawling these pages.
  4. Submitting pages for On-Demand Indexing will not make them appear any faster in the main Google index, or impact ranking on Google.com.
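
For point 3, a quick local check of your robots.txt can be sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the URLs and the helper name `blocked_urls` are assumptions for this example, and the standard library's reading of robots.txt may differ in details from how Google's crawlers interpret it, so the report in Webmaster Tools remains the authoritative source.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical URLs taken from the Sitemap selected for On-Demand Indexing.
URLS = [
    "http://www.example.com/products/new-widget",
    "http://www.example.com/drafts/unfinished-page",
]

blocked = blocked_urls(ROBOTS_TXT, URLS)
print(blocked)
```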
We hope you'll use this feature to inform us regularly of the most important changes on your sites, so we can respond quickly and get those pages indexed in your CSE. As always, we're listening for your feedback on Custom Search.

Video Tutorial: Google for Webmasters

We're always looking for new ways to help educate our fellow webmasters. While you may already be familiar with Webmaster Tools, the Webmaster Help Discussion Groups, this blog, and our Help Center, we've added another tutorial to help you understand how Google works: a video of an upcoming presentation titled "Google for Webmasters." This video introduces how Google discovers, crawls, and indexes your site's pages, and how Google displays them in search results. It also touches lightly upon challenges that webmasters and search engines face, such as duplicate content and the effective indexing of Flash and AJAX content. Lastly, it talks about the benefits of the offerings in Webmaster Central and other useful Google products.


Take a look for yourself.

Discoverability:



Accessibility - Crawling and Indexing:


Ranking:


Webmaster Central Overview:


Other Resources:



Google Presentations Version:
http://docs.google.com/Presentation?id=dc5x7mrn_245gf8kjwfx

Important links from this presentation as they chronologically appear in the video:
Add your URL to Google
Help Center: Sitemaps
Sitemaps.org
Robots.txt
Meta tags
Best uses of Flash
Best uses of Ajax
Duplicate content
Google's Technology
Google's History
PigeonRank
Help Center: Link Schemes
Help Center: Cloaking
Webmaster Guidelines
Webmaster Central
Google Analytics
Google Website Optimizer
Google Trends
Google Reader
Google Alerts
More Google Products


Special thanks to Wysz, Chark, and Alissa for the voices.

Sitemaps offer better coverage for your Custom Search Engine



If you're a webmaster or site owner, you realize the importance of providing high quality search on your site so that users easily find the right information.

We just announced today that AdSense for Search is now powered by Custom Search. Custom Search (a Google-powered search box that you can install on your website in minutes) helps your users quickly find what they're looking for. As a webmaster, Custom Search gives you advanced customization options to improve the accuracy of your site's search results. You can also choose to monetize your traffic with ads tuned to the topic of your site. If you don't want ads, you can use Custom Search Business Edition.



Now, we're also looking to index more of your site's content for inclusion in your Custom Search Engine (CSE) used for search on your site. We figure out what sites and URLs are included in your CSE, and -- if you've provided Sitemaps for the relevant sites -- we use that information to create a more comprehensive experience for your site's visitors. You don't have to do anything specific, besides submitting a Sitemap (via Webmaster Tools) for your site if you haven't already done so. Note that this change will not result in more pages indexed on Google.com and your search rankings on Google.com won't change. However, you will be able to get much better results coverage in your CSE.

Custom Search is built on top of the Google index. This means that all pages that are available on Google.com are also available to your search engine. We're now maintaining a CSE-specific index in addition to the Google.com index for enhancing the performance of search on your site. If you submit a Sitemap, it's likely that we will crawl those pages and include them in the additional index we build.

In order for us to index these additional pages, our crawlers must be able to crawl them. Your Sitemap will also help us identify the URLs that are important. Please ensure you are not blocking us from crawling any pages you want indexed. Improved index coverage is not instantaneous, as it takes some time for the pages to be crawled and indexed.

So what are you waiting for? Submit your Sitemap!