How to parse RSS feeds / XML in a shell script

I'd like to parse RSS feeds and download podcasts on my ReadyNAS, which is running 24/7 anyway.

So I'm thinking about having a shell script check the feeds periodically and spawn wget to download the files.

What is the best way to do the parsing?

Thanks!

Answers


Sometimes a simple one-liner using standard shell commands is enough for this:

 wget -q -O- "http://www.rss-specifications.com/rss-podcast.xml" | grep -o '<enclosure url="[^"]*' | grep -o '[^"]*$' | xargs wget -c

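This will not work in every case (for example, it fails if url is not the first attribute of the enclosure element, or if the attribute uses single quotes), but it's often good enough.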
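For periodic use, the same idea could be wrapped in two small functions. This is only a sketch: the names extract_enclosure_urls and fetch_feed are made up for illustration, and it assumes double-quoted url attributes and a grep that supports -o (as the one-liner above does). It joins the input lines first so that an enclosure tag split across lines is still matched, and it uses wget -nc so that files already downloaded are skipped on later runs.

```shell
#!/bin/sh
# Hypothetical helper: print the URL of every <enclosure> element read
# from stdin, one per line. Assumes double-quoted url attributes.
extract_enclosure_urls() {
    # Join all lines so a tag that spans several lines is matched too.
    tr '\n' ' ' |
        grep -o '<enclosure[^>]*>' |
        grep -o 'url="[^"]*"' |
        sed -e 's/^url="//' -e 's/"$//' -e 's/&amp;/\&/g'
}

# Hypothetical helper: fetch one feed and download its enclosures into
# a directory. wget -nc leaves existing files alone, so repeated runs
# only fetch new episodes.
fetch_feed() {
    feed_url=$1
    target_dir=$2
    mkdir -p "$target_dir" || return 1
    wget -q -O- "$feed_url" | extract_enclosure_urls |
        while IFS= read -r url; do
            wget -q -nc -P "$target_dir" "$url"
        done
}
```

A call such as fetch_feed "http://www.rss-specifications.com/rss-podcast.xml" ./podcasts could then be placed in a script that cron runs at whatever interval suits the feed.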


Do you have access to awk? Maybe you could use XMLGawk


I've read about XMLStarlet here and there.

But is there a port available for the ReadyNAS NV+?
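If an XMLStarlet build is available on the device, the extraction step might look like the sketch below. It assumes the binary is installed under the name xmlstarlet (some packages install it as xml), and the function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the url attribute of every <enclosure>
# element in the RSS document read from stdin, one per line.
# Assumes the xmlstarlet binary is on PATH.
list_enclosures_xmlstarlet() {
    xmlstarlet sel -t -m '//enclosure' -v '@url' -n
}
```

Because this uses a real XML parser, attribute order, quoting style and entities such as &amp; are handled without extra work. The output can be piped to xargs wget -c just like in the one-liner answer.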


I wrote the following simple script for downloading all files listed in an Amazon S3 XML feed; the same approach can be useful for parsing other kinds of XML files:

#!/bin/bash
#
# Download all files from the Amazon feed
#
# Usage:
#  ./dl_amazon_feed_files.sh http://example.s3.amazonaws.com/
# Note: Don't forget the trailing slash at the end of the URL
#

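# -I% already runs wget once per input line; combining it with -L is
# not portable (GNU xargs treats the two options as mutually exclusive).
wget -qO- "$1" | grep -o '<Key>[^<]*' | grep -o "[^>]*$" | xargs -I% wget -c "$1%"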

This is a similar approach to @leo's answer.


You can use xsltproc (part of libxslt, which is built on libxml2) and write a simple XSLT stylesheet that parses the RSS and outputs a list of links.
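A minimal sketch of that idea, assuming xsltproc and mktemp are installed; the function name is made up for illustration. The stylesheet is written to a temporary file and the feed is read from stdin.

```shell
#!/bin/sh
# Hypothetical helper: print every enclosure URL from the RSS document
# on stdin, one per line, using a small XSLT 1.0 stylesheet.
list_enclosures_xslt() {
    stylesheet=$(mktemp) || return 1
    cat > "$stylesheet" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Emit the url attribute of each enclosure, one per line. -->
  <xsl:template match="/">
    <xsl:for-each select="//enclosure">
      <xsl:value-of select="@url"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF
    # "-" tells xsltproc to read the XML document from stdin.
    xsltproc "$stylesheet" -
    status=$?
    rm -f "$stylesheet"
    return "$status"
}
```

It could be used as wget -q -O- "$feed_url" | list_enclosures_xslt | xargs wget -c, where feed_url is whatever feed you want to check.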

