Scraping Up RSS News

A few weeks ago, people at a Thursday-night bloggers’ meeting were
talking about “scraping” news from Harvard websites so that their
contents could be read in RSS aggregators as well as Web browsers. I’ve
just discovered that O’Reilly has put a related chapter online as a
sample from a new book, Spidering Hacks. If the book was mentioned that
Thursday, I didn’t catch the title. In any case, the sample chapter may
not have been posted online at that point. Technically inclined
Thursday-nighters and online-news researchers may find it useful.
Authors Kevin Hemenway and Tara Calishain devote several chapters to
blogging and RSS feeds, and say the book is for “developers,
researchers, technical assistants, librarians, and power users.”

Hack #24: Painless RSS with Template::Extract

Wouldn’t it be nice if you could simply
visualize what data on a page looks like, explain it in template form
to Perl, and not bother with the need for parsers, regular expressions,
and other programmatic logic? That’s exactly what Template::Extract
helps you do….
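
To give a rough sense of the idea in that excerpt, here is a minimal
Python sketch. It is not the Template::Extract module and not code from
the book. The [% name %] placeholder syntax, the helper names
template_to_regex() and extract(), and the sample page are all made up
for illustration. The sketch turns a template into a regular expression
behind the scenes and returns the named fields from each match.

```python
import re

# Hypothetical placeholder syntax for this sketch: [% name %]
PLACEHOLDER = re.compile(r"\[%\s*(\w+)\s*%\]")


def template_to_regex(template):
    """Compile a template with [% name %] placeholders into a regex.

    Literal text is escaped; each placeholder becomes a non-greedy
    named group. Each placeholder name may be used only once.
    """
    parts = []
    pos = 0
    for m in PLACEHOLDER.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))
        parts.append("(?P<%s>.*?)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts), re.DOTALL)


def extract(template, document):
    """Return one dict of captured fields per place the template matches."""
    pattern = template_to_regex(template)
    return [m.groupdict() for m in pattern.finditer(document)]


# Invented sample data: a template for one headline and a tiny page.
TEMPLATE = '<li><a href="[% link %]">[% title %]</a></li>'
PAGE = """
<ul>
<li><a href="/news/1">First story</a></li>
<li><a href="/news/2">Second story</a></li>
</ul>
"""

ITEMS = extract(TEMPLATE, PAGE)
for item in ITEMS:
    print(item["title"], item["link"])
```

Each dictionary in ITEMS holds the link and title of one headline,
which is the raw material an RSS feed needs. A page whose markup
differs from the template even slightly will simply not match, so a
real scraper would need a more forgiving approach.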
