Scraping Up RSS News
A few weeks ago, people at a Thursday-night blogger meeting were talking
about “scraping” news from Harvard websites so that their contents could
be read in RSS aggregators as well as Web browsers. I’ve just discovered
that O’Reilly has put a related chapter online as a sample from a new
book, Spidering Hacks.
If the book was mentioned that Thursday, I didn’t catch the title. In
any case, the excerpted chapter may not have been posted online
at that point. Thursday-nighters with some technical background and
online-news researchers may find it useful. Authors Kevin Hemenway and
Tara Calishain devote several chapters to blogging and RSS feeds, and
say the book is for “developers, researchers, technical assistants,
librarians, and power users.”
Hack #24: Painless RSS with Template::Extract
“…visualize what data on a page looks like, explain it in template form
to Perl, and not bother with the need for parsers, regular expressions,
and other programmatic logic? That’s exactly what Template::Extract
helps you do….
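
For readers who want a feel for what “explaining a page in template form” means, here is a rough sketch of the idea in Python. It is not Template::Extract and not the book’s code: the functions compile_template and extract_all are hypothetical names of my own, the [% name %] placeholder style is only loosely modeled on the Template Toolkit syntax that Template::Extract uses, and the example.edu page is made up. Under the hood this toy version simply builds a regular expression from the template, so it only shows the outside view: you write what an item looks like, and you get named fields back.

```python
import re

def compile_template(template):
    """Turn a template containing [% name %] placeholders into a regex.

    Hypothetical helper for illustration only; not part of any library.
    """
    # re.split with one capturing group alternates literal text and names.
    parts = re.split(r"\[%\s*(\w+)\s*%\]", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)            # literal markup, matched exactly
        else:
            pattern += "(?P<%s>.*?)" % part       # placeholder, captured by name
    return re.compile(pattern, re.DOTALL)

def extract_all(template, document):
    """Return one dict of captured fields per place the template matches."""
    return [m.groupdict() for m in compile_template(template).finditer(document)]

# A made-up news page (assumption: real pages will be messier than this).
page = """
<ul>
<li><a href="http://example.edu/a">First story</a></li>
<li><a href="http://example.edu/b">Second story</a></li>
</ul>
"""

# The "template" is just the markup of one item with the data blanked out.
item_template = '<li><a href="[% link %]">[% title %]</a></li>'

items = extract_all(item_template, page)
for item in items:
    print(item["title"], item["link"])
```

Each dictionary in items holds a title and a link, which is most of what an RSS item needs. A real page would probably call for a more forgiving matcher than this exact-text version.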

