Monday, November 08, 2010 at 12:07 AM.
system.verbs.builtins.xml.rss.readService
on readService (url, adrservices, referer="", adrStoryArrivedCallback=nil, flSaveData=false) {
<<Changes
<<10/29/03; 1:58:10 AM by JES
<<Fix a bug in username/password support: decode url-encoded characters in the username and password, as present in the service URL, before passing them to tcp.httpClient.
<<11/8/02; 11:10:52 AM by DW
<<If the service has a skipHoursList as part of its compilation table, and the current hour is one of the hours we've been told to skip, skip the read.
<<10/31/02; 4:52:43 PM by DW
<<Add call to xml.rss.serviceDidntChange to control flow through the changesUrl. If it says the service didn't change, we don't read the feed.
<<10/23/02; 11:50:18 AM by DW
<<Track updates in a new sub-table of the service table called hourlyUpdateCounts.
<<10/22/02; 6:50:00 AM by DW
<<Allow for callbacks.
<<1. They receive the same parameter list that this routine receives.
<<2. They return the address of the service table.
<<3. This routine returns after calling the first callback that does not scriptError.
<<4. The scripts table is at aggregatorData.callbacks.readService.
<<10/21/02; 8:11:31 AM by DW
<<Store the HTTP response headers in the (new) httpResponseHeaders sub-table of each service table.
<<And we're now ETag-aware, per Simon Fell's excellent Busy Developer's Guide.
<<http://www.pocketsoap.com/weblog/stories/2002/05/19/bdgToEtags.html
<<Thanks Simon!
<<10/17/02; 9:31:32 AM by DW
<<When we do the redirection, mark the subscriptions as dirty to force the user's mySubscriptions.opml to be saved. Some people's subs files contain the non-redirected urls, which makes the (in development) RSS Explorer tool behave very strangely when you subscribe to and unsubscribe from the pre-redirected urls.
<<10/13/02; 6:47:17 PM by DW
<<Changes to make 301-type redirection permanent. This script changed, as did tcp.httpClient.
<<Also had to change xml.aggregator.subscribeService. It watches for an address change for the service table.
<<Also had to change xml.aggregator.readService to percolate the redirect up the call stack.
<<Summary, changed parts:
<<tcp.httpClient
<<xml.rss.readService
<<xml.aggregator.subscribeService
<<xml.aggregator.readService
<<3/24/02; 8:49:35 AM by DW
<<New optional parameter, flSaveData, passed to xml.rss.compileService. If true, each item has a data sub-table, containing the non-mashed elements from the items in the XML feed.
<<12/17/01; 4:47:26 PM by DW
<<Support usernames and passwords encoded in the URL, per request by Doug Kaye.
<<12/3/01; 5:43:04 PM by DW
<<Manage ctErrors, ctConsecutiveErrors, timeLastError.
<<Thursday, December 28, 2000 at 7:26:03 AM by DW
<<Added adrStoryArrivedCallback, it's called when a new story has arrived. You can store it in a database, or whatever else you might want to do.
<<Sunday, December 17, 2000 at 10:16:37 AM by DW
<<Created. Reads a URL into a service table.
<<Referer is passed as a header if non-empty, which allows the server to track referers. It shows up in the /stats/referers page on Manila sites.
bundle { //callbacks, 10/22/02 by DW
local (adrdata = xml.aggregator.init ());
local (adrtable = @adrdata^.callbacks.readService);
if not defined (adrtable^) {
new (tabletype, adrtable)};
local (adrscript);
for adrscript in adrtable {
try {
while typeof (adrscript^) == addresstype {
adrscript = adrscript^};
return (adrscript^ (url, adrservices, referer, adrStoryArrivedCallback, flSaveData))}}};
local (adrservice = xml.rss.initService (url, adrservices)); //make sure all fields are init'd
bundle { //11/8/02 by DW, respect skipHoursList, if present
if defined (adrservice^.compilation.skipHoursList) {
local (day, month, year, hour, minute, second);
date.get (clock.now (), @day, @month, @year, @hour, @minute, @second);
if adrservice^.compilation.skipHoursList contains hour {
return (adrservice)}}};
<<bundle //10/31/02 by DW, if it uses a changes.xml ping, and it hasn't updated, skip the read, optimize
<<This feature isn't ready for general use yet.
<<11/8/02; 11:12:20 AM by DW
<<if xml.rss.serviceDidntChange (adrservice)
<<return (adrservice)
try {
local (headers, adrheaders = nil, flHaveEtag = false);
if sizeof (referer) > 0 {
new (tabletype, @headers);
headers.referer = referer;
adrheaders = @headers};
bundle { //if we have an ETag, add an If-None-Match header
<<See http://www.pocketsoap.com/weblog/stories/2002/05/19/bdgToEtags.html
try {
local (etag = adrservice^.httpResponseHeaders.ETag);
if adrheaders == nil {
new (tabletype, @headers);
adrheaders = @headers};
headers.["If-None-Match"] = etag;
flHaveEtag = true}};
local (urllist = string.urlSplit (url), server = urllist [2], path = urllist [3], username = "", password = "");
bundle { //set username, password, if they were present in the URL
<<Example: http://doug:guacamole@www.infrastrat.com/ht/rss.xml
if server contains "@" {
local (s = string.nthfield (server, "@", 1));
username = string.urlDecode (string.nthfield (s, ":", 1));
password = string.urlDecode (string.nthfield (s, ":", 2));
server = string.nthfield (server, "@", 2)}};
local (redirectInfo); new (tabletype, @redirectInfo);
local (s = tcp.httpClient (server:server, path:path, timeOutTicks:60*30, flmessages:false, ctFollowRedirects:5, adrHdrTable:adrheaders, username:username, password:password, adrRedirectInfo:@redirectInfo));
local (statusCode = string.nthField (s, ' ', 2));
s = string.httpResultSplit (s, @adrservice^.httpResponseHeaders);
local (flchanged = true);
if flHaveEtag {
if statusCode == "304" {
flchanged = false}};
if flchanged {
if sizeof (redirectInfo) > 0 { //at least one level of redirection, 10/13/02 by DW
local (i, adr);
for i = sizeof (redirectInfo) downto 1 {
adr = @redirectinfo [i];
if adr^.code == "301" { //permanent redirect
local (newurl = "http://" + adr^.server);
if adr^.port != 80 {
newurl = newurl + ":" + adr^.port};
newurl = newurl + "/" + adr^.path;
table.rename (adrservice, newurl);
adrservice = parentof (adrservice^);
adrservice = @adrservice^.[newurl];
bundle { //10/17/02 by DW, mark subscriptions as needing a refresh
try {
local (adraggregatordata = xml.aggregator.init ());
adraggregatordata^.settings.flSubscriptionsChanged = true}};
break}}};
local (now = clock.now ());
adrservice^.timeLastRead = now;
adrservice^.ctReads++;
if defined (adrservice^.xmltext) {
if s == adrservice^.xmltext {
flchanged = false}};
if flchanged {
bundle { //track hourlyUpdateCounts per service
local (adrtable = @adrservice^.hourlyUpdateCounts);
if not defined (adrtable^) {
local (i);
new (tabletype, adrtable);
for i = 0 to 23 {
adrtable^.[string.padwithzeros (i, 2)] = 0}};
local (day, month, year, hour, minute, second);
date.get (now, @day, @month, @year, @hour, @minute, @second);
adrtable^.[string.padwithzeros (hour, 2)]++};
adrservice^.ctChanges++;
adrservice^.xmltext = s;
adrservice^.timeLastChange = now;
local (errorstring = "");
semaphore.lock (this, 3600); //keep stories from each channel in a group
try {
xml.rss.compileService (adrservice, flSaveData, adrStoryArrivedCallback)}
else {
errorstring = tryerror};
semaphore.unlock (this);
if errorstring != "" {
scripterror (errorstring)}};
try {delete (@adrservice^.error)};
adrservice^.ctConsecutiveErrors = 0}}
else {
adrservice^.error = tryError;
adrservice^.ctErrors++;
adrservice^.ctConsecutiveErrors++;
adrservice^.timeLastError = clock.now ()};
return (adrservice)}
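The change-detection logic in the script above can be sketched outside the OPML Editor. What follows is a rough Python illustration, not part of the listing: the names status_code_of and feed_changed are hypothetical, and the sketch assumes the raw response begins with a status line such as "HTTP/1.1 304 Not Modified". It mirrors the two checks the script makes: a 304 counts as "unchanged" only when an ETag was actually sent, and a body identical to the stored xmltext also counts as "unchanged".

```python
def status_code_of(raw_response):
    """Return the second space-delimited field of the response,
    like string.nthField (s, ' ', 2) in the listing."""
    fields = raw_response.split(" ")
    return fields[1] if len(fields) > 1 else ""


def feed_changed(status_code, sent_etag, body, previous_body=None):
    """Decide whether the feed should be recompiled.

    sent_etag     -- True if an If-None-Match header went out (flHaveEtag)
    previous_body -- the stored xmltext, or None if nothing is stored yet
    """
    # A 304 only means "not modified" if we made a conditional request.
    if sent_etag and status_code == "304":
        return False
    # Fall back to comparing the new text against the stored text.
    if previous_body is not None and body == previous_body:
        return False
    return True


if __name__ == "__main__":
    raw = 'HTTP/1.1 304 Not Modified\r\nETag: "abc"\r\n\r\n'
    print(status_code_of(raw))
    print(feed_changed(status_code_of(raw), True, "", "<rss/>"))
```

The string comparison is kept as a second check because not every server sends an ETag; the script does the same with adrservice^.xmltext.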
<<bundle //test code
<<readService ("http://www.scripting.com/rss.xml", @aggregatordata.services, "jerzy")
<<readService ("http://inessential.com/xml/rss.xml", @aggregatordata.services, "jerzy")
<<readService ("http://boingboing.net/rss.xml", @aggregatordata.services, "jerzy")
<<readService ("http://www.infrastrat.com/ht/rss.xml", @scratchpad) //should get an error
<<readService ("http://doug:guacamole@www.infrastrat.com/ht/rss.xml", @scratchpad)
<<readService ("http://radio.weblogs.com/0001015/categories/radio7.1/rss.xml", @scratchpad)
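The username/password test URL above exercises the credential handling in the script. Here is a minimal Python sketch of the same idea, offered as an illustration only and not as code from the listing; split_credentials is a hypothetical name. It splits the server part on "@" and ":" and url-decodes the two fields, roughly as the script does with string.nthField and string.urlDecode. One difference to be aware of: this sketch keeps everything after the first ":" as the password, while the script takes only the second ":"-delimited field.

```python
from urllib.parse import unquote


def split_credentials(server):
    """Split 'user:pass@host' into (host, username, password).

    Returns empty strings for username and password when the
    server part carries no credentials.
    """
    username, password = "", ""
    if "@" in server:
        userinfo, server = server.split("@", 1)
        name, _, secret = userinfo.partition(":")
        # Decode url-encoded characters before handing them to the HTTP client.
        username, password = unquote(name), unquote(secret)
    return server, username, password


if __name__ == "__main__":
    print(split_credentials("doug:guacamole@www.infrastrat.com"))
```

Decoding happens after the split, so an encoded "@" or ":" inside the username or password (%40, %3A) does not confuse the parsing.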
This listing is for code that runs in the OPML Editor environment. I created these listings because I wanted the search engines to index them, so that when I want to look up something in my codebase I don't have to use the much slower search functionality in my object database. Dave Winer.