News Blog
The official blog from the team at Google News
Eight Ways to Help Google News Better Crawl Your Site
Wednesday, February 11, 2009
Posted by Abe Epton, News Online Operations Team
From time to time, publishers ask us what they can do to improve their listings in Google News. The following are eight of the most frequent, and useful, pieces of advice we give out. Why eight? Because at Google, we love powers of 2.
* Keep the article body clean
For various reasons, when crawling an article, Google News checks to make sure it can find the article body. If your article body is broken up by tags, ads, sidebars or other non-article content, we may not be able to detect the actual article body, and we may reject your article as a result. In addition, if you place the beginning of your article's body near the title in the HTML, we'll be more likely to extract the correct title and snippet.
* Make sure article URLs are permanent and unique
If you reuse article URLs, our system may have difficulty crawling and categorizing your stories. In addition, make sure your article URLs have at least three digits that don't resemble a year (for example, 5232 is OK, but 2008 is not). You can get around this requirement by submitting your articles in News Sitemaps. Also, please note that session IDs can confuse our crawler, because we may not realize that two distinct URLs actually point to the same page. You can learn more about some of these requirements here.
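To illustrate the digit rule above, here is a minimal sketch of a pre-publication check in Python. It is not how our crawler actually decides; the function name `has_article_id`, the choice to look for a single run of consecutive digits in the URL path, and the definition of "resembles a year" (a four-digit number from 1900 to 2099) are all assumptions made for this example.

```python
import re
from urllib.parse import urlparse

# Assumption: a standalone four-digit number from 1900 to 2099 "resembles a year".
YEAR_LIKE = re.compile(r"^(19|20)\d{2}$")


def has_article_id(url: str) -> bool:
    """Return True if the URL path has a run of 3+ digits that is not year-like."""
    path = urlparse(url).path
    for run in re.findall(r"\d+", path):
        if len(run) >= 3 and not YEAR_LIKE.match(run):
            return True
    return False
```

Under these assumptions, a path such as `/news/story-5232.html` passes, while `/2008/12/story.html` does not, because its only long digit run looks like a year.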
* Take advantage of stock tickers in Sitemaps
Google News Sitemaps allow publishers to specify stock ticker symbols for companies mentioned in individual articles. Using these symbols helps us better identify the subjects of your articles. You can read more about the format we use for this data here.
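As a rough sketch of what a Sitemap entry with ticker symbols might look like, the Python snippet below builds a single `url` element. The element name `news:stock_tickers`, the namespace URIs, and the `EXCHANGE:SYMBOL` style of the sample values are assumptions drawn from the publicly documented News Sitemap schema, and the helper `news_url_entry` is hypothetical. The fragment also omits other elements a real News Sitemap entry needs, so check the format documentation before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for the Sitemap and News Sitemap schemas.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def news_url_entry(loc: str, tickers: list) -> str:
    """Build a partial <url> entry carrying a comma-separated ticker list."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("news", NEWS_NS)
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
    news = ET.SubElement(url, f"{{{NEWS_NS}}}news")
    # Assumed element name; other required news elements are left out here.
    ET.SubElement(news, f"{{{NEWS_NS}}}stock_tickers").text = ", ".join(tickers)
    return ET.tostring(url, encoding="unicode")


print(news_url_entry("http://example.com/business/article-5232.html",
                     ["NASDAQ:GOOG", "NYSE:IBM"]))
```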
* Check your encoding
We occasionally see articles that declare themselves to be encoded in one format (say, UTF-8) and are actually encoded in another (say, ISO 8859-1). Don't do this. It hurts us.
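One simple way to catch the mismatch described above is to try decoding the page bytes with the encoding the page declares. The sketch below is only an illustration; the function name `matches_declared_encoding` is made up for this example. Note its limits: ISO 8859-1 accepts any byte sequence, so this check can catch a page that declares UTF-8 but is really ISO 8859-1, but not the reverse.

```python
def matches_declared_encoding(raw: bytes, declared: str) -> bool:
    """Return True if the raw bytes decode cleanly under the declared encoding."""
    try:
        raw.decode(declared)
        return True
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: bytes are invalid in the declared encoding.
        # LookupError: Python does not recognize the declared encoding name.
        return False
```

For instance, the word "café" saved as ISO 8859-1 contains the single byte `0xE9`, which is not valid UTF-8 on its own, so a page declaring UTF-8 with those bytes would fail the check.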
* Make your article publication dates explicit
In order to help our crawler determine the correct date, please make the actual publication date of your articles explicit. You can do this by placing the article date and time in the HTML, between the title and the body. You can also remove other dates from the HTML of the article page, and add the required publication date tag to articles in your News Sitemap. Dates on article pages can be in most common formats, but for Sitemaps, we ask that you use the W3C format, e.g. 2008-12-29T06:30:00Z.
Note that the article times and dates displayed on Google News reflect the time at which we originally crawled the articles, and may not be the same as the publication date.
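For Sitemap dates, a timestamp in the style of the example 2008-12-29T06:30:00Z can be produced from a timezone-aware datetime. The following is a minimal sketch, assuming you want the time expressed in UTC with a trailing "Z"; the helper name `to_w3c` is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def to_w3c(dt: datetime) -> str:
    """Format a timezone-aware datetime as a W3C date-time string in UTC."""
    if dt.tzinfo is None:
        # A naive datetime has no offset, so the UTC time would be a guess.
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# An article published at 01:30 local time in a UTC-5 timezone.
local = datetime(2008, 12, 29, 1, 30, tzinfo=timezone(timedelta(hours=-5)))
print(to_w3c(local))  # prints 2008-12-29T06:30:00Z
```

Rejecting naive datetimes is a deliberate choice here: silently assuming a timezone is an easy way to publish dates that are off by several hours.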
* Keep original content separate from press releases
If your site produces original content and distributes press releases that you'd like us to crawl, make sure to separate your original news content from your press releases by creating two different sections on your site. As you may know, Google News labels press releases distinctly in order to alert our users that the article they're about to read is a press release. If your original news sections have links to press releases, adding the rel="nofollow" attribute to all links that point to your press release articles will ensure that they're labeled correctly. You can learn more about this attribute here.
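If you want to audit a news page for press release links that are missing rel="nofollow", something like the sketch below could help. It assumes your press releases live under a single URL path prefix; the prefix `/press-releases/`, the class `PressReleaseLinkChecker`, and the function `links_missing_nofollow` are all hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class PressReleaseLinkChecker(HTMLParser):
    """Collect links into the press release section that lack rel="nofollow"."""

    def __init__(self, press_prefix: str):
        super().__init__()
        self.press_prefix = press_prefix
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if href.startswith(self.press_prefix) and "nofollow" not in rel_tokens:
            self.missing.append(href)


def links_missing_nofollow(html: str, press_prefix: str = "/press-releases/") -> list:
    """Return hrefs under press_prefix whose links lack rel="nofollow"."""
    checker = PressReleaseLinkChecker(press_prefix)
    checker.feed(html)
    checker.close()
    return checker.missing
```

The check matches only hrefs that begin with the prefix, so absolute URLs or a different site layout would need a different matching rule.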
* Format your images properly
To help Google News identify your images and crawl them along with your articles, use fairly large images with reasonable aspect ratios and descriptive captions. Make sure to place them near their respective article titles on the page and make the images inline and non-clickable. Images in the JPEG format are more likely to be crawled correctly.
* Article Titles in Google News
In order for Google News to crawl the correct titles for your articles, make sure the title you want appears in both the title tag and as the headline on the article page. In addition, don't hyperlink the headline on the article page - after all, your reader is already there! And it's always a good idea to have links that point to your articles use the article title as anchor text.
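The title advice above can be spot-checked with a short script. This sketch assumes the on-page headline is the first `h1` element, which may not be true of your templates; the class `TitleHeadlineParser` and the function `check_title` are illustrative names, and the comparison is a plain whitespace-normalized string match rather than anything our crawler does.

```python
from html.parser import HTMLParser


class TitleHeadlineParser(HTMLParser):
    """Capture the <title> text and the first <h1> text of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headline = ""
        self.headline_linked = False
        self._inside = None      # "title" or "h1" while inside that element
        self._h1_done = False    # only the first <h1> counts as the headline

    def handle_starttag(self, tag, attrs):
        if self._inside is None and tag == "title":
            self._inside = "title"
        elif self._inside is None and tag == "h1" and not self._h1_done:
            self._inside = "h1"
        elif tag == "a" and self._inside == "h1":
            self.headline_linked = True

    def handle_endtag(self, tag):
        if tag == self._inside:
            if tag == "h1":
                self._h1_done = True
            self._inside = None

    def handle_data(self, data):
        if self._inside == "title":
            self.title += data
        elif self._inside == "h1":
            self.headline += data


def check_title(html: str) -> dict:
    """Report whether title and headline match and whether the headline is linked."""
    parser = TitleHeadlineParser()
    parser.feed(html)
    parser.close()
    title = " ".join(parser.title.split())
    headline = " ".join(parser.headline.split())
    return {
        "titles_match": bool(title) and title == headline,
        "headline_linked": parser.headline_linked,
    }
```

A page passes this check when `titles_match` is true and `headline_linked` is false.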
If you found these suggestions helpful, you might also want to check out our more general Webmaster Guidelines. The Webmaster Guidelines aren't necessarily specific to Google News, but much of the wisdom you'll find there can help make your site Google News-friendly. Our Publisher Help Center contains lots more information about many of these topics. And you can always check out the Google News Help Forum to give us feedback on these suggestions, and share other tips and advice with webmasters and News users.