XML - Managing Data Exchange/RSS

Learning Objectives
Upon completion of this chapter, you will


 * Understand the basics of RSS
 * Understand the history of RSS
 * Be able to construct an RSS 2.0 document using XML
 * Be able to subscribe to an RSS feed using an aggregator/reader

Introduction
RSS is a simple XML format used to syndicate headlines. It is widely used by websites that publish new content regularly and provide a list of headlines with links to their latest content. News feeds, event listings, project updates, blogs and, more recently, podcasts, video and images can all be distributed by RSS. Major Internet portals such as Google, Yahoo and AOL also use RSS feeds to let people personalize their pages and have the information they care about delivered to them, e.g. My Yahoo.

What does RSS mean?
RSS is a name used to refer to three different standards. The three separate branches are the RSS 0.9 branch, the RSS 1.0 branch (which is based on RDF) and the RSS 2.0 branch, and the initials have been expanded into three different names: "Really Simple Syndication" (RSS 2.0), "Rich Site Summary" (RSS 0.91) and "RDF Site Summary" (RSS 0.9 and 1.0).

Several versions have been developed by different developers under different names. According to XML.com, seven versions of RSS have been developed (see What is RSS?). Because RSS is used as an umbrella term for several syndication formats, the various versions have sometimes been accused of being "incompatible" with each other (see The myth of RSS compatibility). This is an important issue for developers of RSS readers and aggregators.

History
The original version of RSS (version 0.90) was released by Netscape in 1999. Netscape's developers were designing a format for building portals of headlines from news sites. After releasing a simplified version of RSS, Netscape lost interest in developing it further. Another company, UserLand Software, took over, intending to use RSS with its weblogging products and web-based writing software. While UserLand Software continued development with version 0.91, a third, non-commercial group split off and designed a new format based on version 0.90, the non-simplified version. The format developed by this group became known as version 1.0. Meanwhile UserLand Software, angered by the new version 1.0, kept developing its own branch and released version 2.0. Version 2.0 has become the leading and most widely adopted version of RSS. The 2.0 specification was donated to a non-commercial third party, Harvard Law School, which is now responsible for its future development. Below is a table that describes each version, its owner, pros and cons, as well as its current status and recommendation for use.

RSS structure
An RSS document is often called an RSS feed and can have one of three file extensions: .rss, .xml or .rdf. Every RSS document must conform to the XML specification and begin with the XML declaration. The top-level element of an RSS document is the <rss> element, which carries a mandatory version attribute specifying the RSS version. The <rss> element has a single child element, <channel>, which contains information describing the channel and its contents. Below is a sample of RSS (2.0) from the New York Times.
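Apart from the New York Times sample, the outer structure described above can also be checked programmatically. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module; the feed text, its titles and its URLs are hypothetical and are not taken from the Times feed.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal feed; the title and URL are invented for illustration.
SAMPLE_FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Headlines from an example site</description>
  </channel>
</rss>"""


def describe_feed(xml_text):
    """Return (root tag, version attribute, number of channel children)."""
    # Encode to bytes so the parser honours the encoding named in the declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    channels = root.findall("channel")
    return root.tag, root.get("version"), len(channels)


print(describe_feed(SAMPLE_FEED))
```

For a well-formed RSS 2.0 document the function should report the rss root, the version string and exactly one channel.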

Exhibit 1: Data model for RSS 

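Figure 1-1: New York Times - HomePage.xml - RSS version 2

The <channel> element has three mandatory elements and several optional elements.

Mandatory elements:
 * title - the name of the channel
 * link - the URL of the website corresponding to the channel
 * description - a phrase or sentence describing the channel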

Optional elements include: language, copyright, managingEditor, webMaster, pubDate, lastBuildDate, category, generator, docs, cloud, ttl, image, rating, textInput, skipHours and skipDays. For the requirements and sub-elements of each element, refer to the RSS specification (see the RSS 2.0 Specification at Harvard Law). The image element, for example, has three required sub-elements (url, title and link) and three optional ones (width, height and description).

<item> elements: A channel may contain a number of <item> elements. An item may represent a "story", much like a story in a newspaper or magazine; if so, its description is a synopsis of the story and the link points to the full story. An item may also be complete in itself; if so, the description contains the text (entity-encoded HTML is allowed; see examples), and the link and title may be omitted. RSS 0.91 limited a channel to 15 items; later versions, including RSS 2.0, allow any number. All elements of an item are optional; however, an <item> must contain at least one <title> or <description> element.

Besides title, link and description, an <item> may contain other elements, including source, enclosure, category and comments (see the RSS 2.0 Specification at Harvard Law). Depending on the RSS version, an item is either a child of the channel (as in RSS 2.0) or a sibling of it (as in RSS 1.0).
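The rules above can be sketched as a small checker. This is only an illustration, assuming Python's standard xml.etree.ElementTree module and assuming that the three mandatory channel elements are title, link and description, as the RSS 2.0 specification lists them. The function name check_channel and the sample channel are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the mandatory channel elements named in the RSS 2.0 specification.
REQUIRED_CHANNEL_ELEMENTS = ("title", "link", "description")


def check_channel(channel):
    """Return a list of problems found in a channel element (empty if none)."""
    problems = []
    for name in REQUIRED_CHANNEL_ELEMENTS:
        if channel.find(name) is None:
            problems.append("channel is missing <%s>" % name)
    # Every item needs at least a title or a description.
    for position, item in enumerate(channel.findall("item"), start=1):
        if item.find("title") is None and item.find("description") is None:
            problems.append(
                "item %d has neither <title> nor <description>" % position
            )
    return problems


# Hypothetical channel: the names and URLs are made up for illustration.
channel = ET.fromstring("""
<channel>
  <title>Example News</title>
  <link>http://www.example.com/</link>
  <description>Headlines from an example site</description>
  <item>
    <title>First story</title>
    <link>http://www.example.com/first</link>
    <description>A synopsis of the first story.</description>
  </item>
  <item>
    <link>http://www.example.com/second</link>
  </item>
</channel>
""")

print(check_channel(channel))
```

Here the second item carries only a link, so it is the one entry the checker reports.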

For more optional elements, see the RSS 2.0 Specification.

How does it work?
RSS can be divided into two parts: the reader/aggregator and the feed. The reader is the program that reads the RSS feed and presents it in an understandable format. The feed is the RSS file that a website publishes. RSS feeds are typically identified on webpages by an orange rectangular icon, or an orange icon with the letters RSS written on it. To view the XML code, you simply click on the icon.

Creating an RSS feed
A website author can set up an RSS feed in different ways: manually, with software or through online services. Most large websites use content management software to produce their RSS feed. Every time a change is made on the website, the content management software produces an RSS file reflecting the changes, with new items added and old items removed.
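What such software does can be sketched roughly as follows. This is a minimal illustration, assuming Python's standard xml.etree.ElementTree module; the function build_feed, its max_items parameter, the article dictionaries and all URLs are hypothetical and do not describe any particular content management system.

```python
import xml.etree.ElementTree as ET


def build_feed(site_title, site_link, site_description, articles, max_items=10):
    """Build an RSS 2.0 document from articles listed newest first."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_link
    ET.SubElement(channel, "description").text = site_description
    # Keep only the newest entries so that old items drop out of the feed.
    for article in articles[:max_items]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        ET.SubElement(item, "link").text = article["link"]
        ET.SubElement(item, "description").text = article["summary"]
    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical articles, newest first.
articles = [
    {"title": "Third story", "link": "http://www.example.com/3", "summary": "Newest."},
    {"title": "Second story", "link": "http://www.example.com/2", "summary": "Older."},
    {"title": "First story", "link": "http://www.example.com/1", "summary": "Oldest."},
]

feed = build_feed("Example News", "http://www.example.com/",
                  "Headlines from an example site", articles, max_items=2)
print(feed)
```

Regenerating the file after each change, with a cap on the number of items, is one simple way to get the "new items added, old items removed" behaviour; real systems may instead prune by date.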

Subscribing to an RSS feed
As an RSS subscriber you need an RSS aggregator. Once you give it the link to an RSS feed, the aggregator fetches the information you subscribed to and displays it. Say that you subscribe to the sports section of the New York Times; each time the NY Times publishes a new sports article, the article’s headline, description and URL will be displayed on your computer. Whenever you are online, the aggregator checks the feeds on your list, sorts the new items and displays them.
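The core of that behaviour can be sketched as below: a minimal illustration, assuming Python's standard xml.etree.ElementTree module, of how an aggregator might pick out only the items it has not shown before. The function new_items and the two sample feeds are hypothetical, and identifying an item by its link is a simplifying assumption; a real aggregator would also download the feed over the network.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen):
    """Return (title, description, link) for unseen items and record them in seen."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        link = item.findtext("link", default="")
        # Simplifying assumption: an item is identified by its link,
        # falling back to its title when no link is present.
        key = link or title
        if key in seen:
            continue
        seen.add(key)
        fresh.append((title, description, link))
    return fresh


# Hypothetical feed contents at two successive polls.
FIRST_POLL = """<rss version="2.0"><channel>
<title>Example Sports</title>
<link>http://www.example.com/sports</link>
<description>Hypothetical sports headlines</description>
<item><title>Match report</title><link>http://www.example.com/sports/1</link>
<description>A synopsis of the match.</description></item>
</channel></rss>"""

SECOND_POLL = FIRST_POLL.replace(
    "</channel>",
    "<item><title>Transfer news</title>"
    "<link>http://www.example.com/sports/2</link>"
    "<description>A synopsis of the transfer.</description></item></channel>",
)

seen = set()
for feed_xml in (FIRST_POLL, SECOND_POLL):
    for title, description, link in new_items(feed_xml, seen):
        print(title, "-", description, "-", link)
```

Each article is printed once: the first poll shows the match report, and the second poll shows only the newly added transfer story.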

RSS Aggregators
An RSS aggregator (also known as an RSS reader) is an application that collects, updates and displays RSS feeds. Below is a list of some RSS aggregators and the platforms they run on.

Some others include:
 * FeedReader - Windows
 * SharpReader - Windows (.NET)
 * NetNewsWire - Macintosh
 * Straw - Linux
 * Bloglines - Server-based
 * NewsHutch - Server-based
 * AmphetaDesk - Windows, Macintosh, Linux
 * FeedDemon - Windows
 * NewsGator - Windows (.NET)
 * RSS NewsWatcher - Windows
 * Radio Userland - Windows, Macintosh
 * SlashDock - Macintosh
 * PocketFeed - PocketPC

Future of RSS
The future of RSS seems very promising: version 2.0 has become extremely popular in the Internet industry and is something of a de facto standard among the RSS versions. Yahoo recently released a new version of Yahoo Maps whose API is based on GeoRSS version 2.0. This version of Yahoo Maps allows users to edit the information on the maps, which makes the Maps and Local Search products more effective. RSS version 2.0 is also very popular for distributing podcasts to subscribers and for distributing content from Google’s Blogger product. Furthermore, search engine marketers are using RSS in an innovative way to submit time-sensitive content to the engines. The Mozilla Firefox browser already contains a built-in RSS aggregator that lets users view RSS news and blog headlines in the bookmark toolbar or bookmark menu, through a feature named “Live Bookmarks”. RSS has become a mainstream technology in a relatively short period and a major player in the Internet space.