Publishing to Google News with WordPress


Getting listed in Google News is a coveted goal. For aggregation, however, Google requires a special format: the News Sitemap.

With WordPress, you can create this format in basically two ways, and both are presented here. I will cover the second one in more detail because I think it shows very nicely how to use WordPress content outside your blog.

  1. The first way is to create the sitemap like a feed in WordPress. This has several advantages for administration within WordPress.
    I showed how to create a feed in the tutorial "WordPress Feed for Drafts". This solution is available for download as a plugin, so you can simply use Google News-Sitemap.
  2. The second way is to create a PHP file in the root directory and write the latest posts out in the appropriate format.

Include WordPress

To get data from WordPress, you need access to wp-load.php. I therefore include it, which makes the global variables of WordPress available, for example the database object $wpdb.
This means you can now retrieve all the data from the database that is relevant to the XML format of the Google News Sitemap.

The format

Google specifies the following XML structure. I build this structure in the file and fill it with only the latest 20 news items.
Background information and tips from Google are available on their documentation site.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
	<url>
		<loc>http://www.domain.de/news/news1.html</loc>
		<news:news>
			<news:publication_date>2008-01-22T00:29:19+01:00</news:publication_date>
			<news:keywords>key1, key2, key3</news:keywords>
		</news:news>
	</url>
</urlset>
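
As a rough illustration of how this structure can be produced programmatically, here is a minimal sketch using PHP's XMLWriter extension. The function name build_news_sitemap() and the layout of the $entries array are assumptions made for this example only; they are not part of WordPress or of Google's specification.

```php
<?php
// Sketch only: build_news_sitemap() and the $entries layout are hypothetical.
function build_news_sitemap(array $entries): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'utf-8');

    // Root element with the two namespaces shown in the structure above
    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');
    $xml->writeAttribute('xmlns:news', 'http://www.google.com/schemas/sitemap-news/0.9');

    foreach ($entries as $entry) {
        $xml->startElement('url');
        // writeElement() escapes special characters such as & in the text
        $xml->writeElement('loc', $entry['loc']);
        $xml->startElement('news:news');
        $xml->writeElement('news:publication_date', $entry['date']);
        $xml->writeElement('news:keywords', implode(', ', $entry['keywords']));
        $xml->endElement(); // news:news
        $xml->endElement(); // url
    }

    $xml->endElement(); // urlset
    $xml->endDocument();

    return $xml->outputMemory();
}

// Usage with one hand-written entry
echo build_news_sitemap([
    [
        'loc'      => 'http://www.domain.de/news/news1.html',
        'date'     => '2008-01-22T00:29:19+01:00',
        'keywords' => ['key1', 'key2', 'key3'],
    ],
]);
```

Compared with echoing strings, a writer object takes care of escaping and of closing the elements in the right order, at the cost of requiring the XMLWriter extension.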

The file

Below you'll find a simple solution that you can certainly extend. The SQL query contains an example restriction to a single category, made by comparing the category ID (the line ending in term_id = 7). If all posts should be included, it suffices to delete this line.

<?php
require('wp-load.php');

// Send the XML content type, then the XML header
header('Content-Type: application/xml; charset=utf-8');
echo '<?xml version="1.0" encoding="utf-8"?>' . "\n";

// urlset
echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
				xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">' . "\n";

// Select posts; set limit 20
// Table names come from $wpdb, so the query also works with a table prefix other than wp_
$rows = $wpdb->get_results("SELECT DISTINCT {$wpdb->posts}.ID, {$wpdb->posts}.post_date_gmt
                            FROM {$wpdb->posts}, {$wpdb->term_relationships}, {$wpdb->term_taxonomy}
                            WHERE {$wpdb->term_relationships}.object_id = {$wpdb->posts}.ID
                            AND {$wpdb->posts}.post_status = 'publish'
                            AND {$wpdb->posts}.post_type = 'post'
                            AND {$wpdb->term_taxonomy}.term_taxonomy_id = {$wpdb->term_relationships}.term_taxonomy_id
                            AND {$wpdb->term_taxonomy}.taxonomy = 'category'
                            AND {$wpdb->term_taxonomy}.term_id = 7
                            ORDER BY {$wpdb->posts}.post_date_gmt DESC
                            LIMIT 0, 20");

// sitemap data
// set keywords !

foreach ($rows as $row) {
	echo "\t" . '<url>' . "\n";


	echo "\t\t" . '<loc>';
	echo esc_url( get_permalink($row->ID) ); // escape & and similar characters for XML
	echo '</loc>' . "\n";
	echo "\t\t" . '<news:news>' . "\n";
	echo "\t\t" . '<news:publication_date>';
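	// post_date_gmt looks like "YYYY-MM-DD HH:MM:SS"; split it into date and time
	$thedate = substr($row->post_date_gmt, 0, 10);
	$thetime = substr($row->post_date_gmt, 11, 8);
	echo $thedate . 'T' . $thetime . 'Z'; // "Z" marks the time as UTC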
	echo '</news:publication_date>' . "\n";
	echo "\t\t" . '<news:keywords>online, news</news:keywords>' . "\n"; // change keywords
	echo "\t\t" . '</news:news>' . "\n";
	echo "\t" . '</url>' . "\n";
}

// End urlset
echo '</urlset>';
?>
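
The date handling in the loop can also be done with PHP's date classes instead of cutting the string apart. The following is a minimal sketch, assuming the value has the usual MySQL form "YYYY-MM-DD HH:MM:SS" and is already in GMT, as post_date_gmt is. The helper name gmt_to_w3c() is made up for this example.

```php
<?php
// Sketch only: gmt_to_w3c() is a hypothetical helper, not a WordPress function.
function gmt_to_w3c(string $mysqlGmt): string
{
    $utc  = new DateTimeZone('UTC');
    $date = DateTimeImmutable::createFromFormat('Y-m-d H:i:s', $mysqlGmt, $utc);

    // createFromFormat() returns false when the string does not match the pattern
    if ($date === false) {
        throw new InvalidArgumentException("Unexpected datetime: $mysqlGmt");
    }

    // \T and \Z are literal characters; "Z" marks the time as UTC
    return $date->format('Y-m-d\TH:i:s\Z');
}

echo gmt_to_w3c('2008-12-22 23:29:19'), "\n"; // prints 2008-12-22T23:29:19Z
```

Inside the loop this would be called as gmt_to_w3c($row->post_date_gmt).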

In the code above, the keywords are assigned statically. If you assign tags to your blog posts, it is advisable to use them as keywords instead. The following addition will help.

	// Collect the names of all tags of the post, starting with the first one (index 0)
	$tags    = wp_get_post_tags( $row->ID, array('fields' => 'all') );
	$taglist = array();
	foreach ($tags as $tag) {
		// Strip quotes and escape the rest for XML
		$taglist[] = esc_html( str_replace( array('"', "'"), '', urldecode($tag->name) ) );
	}
	echo "\t\t" . '<news:keywords>' . implode(', ', $taglist) . '</news:keywords>' . "\n";

Use it in place of this line

echo "\t\t" . '<news:keywords>online, news</news:keywords>' . "\n"; // change keywords

and the tags will be assigned automatically, separated by commas.
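
To try the keyword logic without a WordPress installation, the same idea can be written as a plain function that takes a list of tag names. This is only a sketch: build_keywords() is a hypothetical helper, and htmlspecialchars() stands in for whatever escaping you prefer.

```php
<?php
// Sketch only: build_keywords() is a hypothetical helper, not a WordPress function.
function build_keywords(array $tagNames): string
{
    $clean = [];
    foreach ($tagNames as $name) {
        // Remove single and double quotes, as in the snippet above
        $name = trim(str_replace(['"', "'"], '', urldecode($name)));
        if ($name !== '') {
            // Escape &, < and > so the result is safe inside an XML element
            $clean[] = htmlspecialchars($name, ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
        }
    }

    // implode() places the separator only between items, so no trailing comma
    return implode(', ', $clean);
}

echo build_keywords(['WordPress', 'Google "News"', 'R&D']), "\n";
// prints WordPress, Google News, R&amp;D
```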

Inclusion in Google

Once you have placed the code above as a file in the root of your installation and tested it successfully, you only have to request inclusion in Google News. There is a form available for this. Then just wait for an answer from Google.
To see whether you are indexed, you can simply search Google News for site:domain.com.


18 comments

  1. DD32

    It's best to include wp-load.php instead of wp-config.php for a few reasons, one of which is that wp-config.php can exist one level up from WP.

    See the ticket here: http://trac.wordpress.org/ticket/6933

    (Note: wp-load.php is WP 2.6+)

  2. Michael

    Thanks DD32, you are right. I use wp-load.php for the help files of my upcoming theme too. Frank should fix that after the holidays.

  3. Blog Expert

    This was an awesome post. I definitely agree with you and I am looking forward to reading more of your blog.

  4. Alex

    Hey Blog Expert, thanks for the compliment and happy new year!

  5. Greg

    Great post! Does anyone know how I can get the unique 3 digits into my URLs?

  6. Alex

    Hey Greg, I see you already have 3 digits in your URLs; looks like you worked it out.

  7. Greg

    Hi Alex, yes thanks a friend on DP told me how to do this :) Imagine yesterday I got an email from Google... My site has been approved :)

    Do you know how I can submit the sitemap to them?

  8. nemoprincess

    Hello Alex, great post for me too.
    I follow it but running my php file I have only three old posts of my blog.
    What's wrong?
    Thank you very much

  9. SiriusBuzz.com

    @Greg
    How did you get the 3-digit unique code into your URLs without updating your permalink structure? Or did you?

    @Alex,
    From what I can see in the webmaster tools, you can simply submit an RSS feed instead of using a sitemap.xml. Why go through the trouble of creating this if their own feed from FeedBurner would work?

  10. Greg

    Hi Siriusbuzz, to get the 3-digit ID I updated my permalink structure. I couldn't find any other way to do this.

  11. SiriusBuzz.com

    Is there any reason we would want to limit the sitemap to 20 items? I have looked around but I can't find a suggested/optimal number to feed Google with.

  12. Greg

    According to Google:

    A News sitemap can contain no more than 1,000 URLs. If you want to include more, you can either break these into multiple sitemaps or create a Sitemap index file to manage them. Use the XML format provided in the Sitemap protocol. Your sitemap index file shouldn't list more than 1,000 sitemaps. These limits help ensure that your web server isn't overloaded by serving large files to Google News.

    http://www.google.com/support/webmasters/bin/answer.py?answer=74288&topic=10078

    Hope that answers your question...

  13. SiriusBuzz.com

    I also noticed that they said you shouldn't have any news items older than 72 hours. Actually, they said 3 days, but you get the idea. I don't know who the heck has the ability to post 1k items in 3 days.

  14. Greg

    Where did you see this?

  15. Simon

    Once you create the php file, how do you then access it as an XML file?

  16. GrantG

    Hi folks.

    In case anyone has missed the obvious - and had to go away searching like I did - to set the 'Unique 3 Digit Code' to WordPress blogs, you will have to set your Permalink structure to:

    /%postname%-%post_id%/

    By default this will produce URLs such as yoursite.com/my-favourite-news-article-1/

    This is where the trick comes in handy. IF, like me, you have not yet added any news articles to your blog, edit your MySQL database and set the autoincrement start value to 100 for wp_posts - thus, your first post will have ID 100, then 101, 102, 103 and so on. Now your URLs will be like yoursite.com/my-favourite-news-article-100/

    Remember you can change the permalink structure to suit not only Google News, but also for SEO purposes.

    Thanks for posting the code above, I'm now well on my way to integrating WordPress with an ecommerce solution to build an online store/news resource.

    Cheers
