How to get the RSS source using IE automation?

I have the following PowerShell script to fetch an RSS feed. However, the script returns the HTML of the formatted RSS content instead of the raw RSS source, which can be viewed by right-clicking the IE window and choosing "View source".

Question:

How can I get the raw RSS (XML) source?

$url = "http://www.osnews.com/files/recent.xml"
$ie = New-Object -com "InternetExplorer.Application"
$ie.Navigate($url)

while ($ie.busy) { start-sleep -milliseconds 1000; }

$ie.Document.documentElement.OuterHTML 

Update: I didn't use WebClient because I need to log in to my site first (I just use osnews.com as an example here). It doesn't seem easy to log in to my site with WebClient from PowerShell (cookies, credentials, etc.).

My original example:

$ie = New-Object -com "InternetExplorer.Application"
$ie.Navigate("http://mysite.com/login")
$ie.visible = $true

while ($ie.busy) { start-sleep -milliseconds 1000; }

$ie.Document.getElementById("username").value = "myusername";
$ie.Document.getElementById("password").value = "mypassword";
$ie.Document.getElementById("login").click();

while ($ie.busy) { start-sleep -milliseconds 1000; }

$url = "http://mysite.com/rss/..."
$ie.Navigate($url)

while ($ie.busy) { start-sleep -milliseconds 1000; }

[xml]$rss = $ie.Document.documentElement.OuterHTML

Answers


Try using WebClient instead:

$url = "http://www.osnews.com/files/recent.xml"
$client = new-object System.Net.WebClient
$htmlsource = $client.DownloadString($url)
$xml = [xml]($htmlsource)

Once you get to this point, then you can do anything. For example, you can print everything, like this:

$xml.rss.channel.item

Or, just the first 10 titles, like this:

$xml.rss.channel.item | select title -f 10
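
To show what the [xml] cast produces without hitting the network, here is a minimal sketch that parses an inline feed. The two-item sample feed and the example.com links are made up for illustration; only the [xml] cast and the rss.channel.item path come from the answer above.

```powershell
# Hypothetical two-item feed, used only to illustrate the object shape
$sample = @"
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>
"@

# The cast parses the string into a System.Xml.XmlDocument
$xml = [xml]$sample

# Each <item> element is exposed as an object; its child elements are properties
$titles = @($xml.rss.channel.item | ForEach-Object { $_.title })
$titles
```

The same property path should work on the downloaded feed, as long as it is a standard RSS 2.0 document with rss, channel and item elements.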

Try something like this:

$feed = [xml](New-Object System.Net.WebClient).DownloadString("http://www.osnews.com/files/recent.xml")
$feed.rss.channel.item | Select-Object title, description | ConvertTo-Html | Out-File C:\rss.htm

Invoke-Item C:\rss.htm

Don't use Internet Explorer. You can do it with code like this, for example (PowerShell V2):

$w = New-Object Net.WebClient
$xml = [xml]$w.DownloadString('http://www.osnews.com/files/recent.xml')

Update:

Getting the RSS source through IE is much trickier, because Internet Explorer formats the feed automatically. If I uncheck the option under Tools -> Content -> Settings (for feeds) -> something like "turn on info channel..." (I'm guessing at the name, because my Windows is localized to Czech), IE shows the RSS itself, formatted as XML rather than as a feed. However, $ie.Document.body.innerHTML is still HTML :(
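
If the only reason for using IE is the login step, one possible alternative is to carry the login cookies yourself with a System.Net.CookieContainer shared between two HttpWebRequest calls. The sketch below is untested against the real site and rests on several assumptions: that the site uses cookie-based form authentication, that the form is POSTed to the login URL, and that the field names match the element ids "username" and "password" from the question. The feed URL is the elided placeholder from the question and has to be replaced with the real one.

```powershell
# Assumed values: login URL, form field names and credentials are taken from
# the question's example and may not match the real login form.
$loginUrl = "http://mysite.com/login"
$feedUrl  = "http://mysite.com/rss/..."   # placeholder from the question
$formData = "username=myusername&password=mypassword"

# One cookie container shared by both requests keeps the session alive
$cookies = New-Object System.Net.CookieContainer

# Step 1: POST the login form so the server can set its session cookie
$login = [System.Net.WebRequest]::Create($loginUrl)
$login.Method = "POST"
$login.ContentType = "application/x-www-form-urlencoded"
$login.CookieContainer = $cookies
$bytes = [System.Text.Encoding]::UTF8.GetBytes($formData)
$login.ContentLength = $bytes.Length
$stream = $login.GetRequestStream()
$stream.Write($bytes, 0, $bytes.Length)
$stream.Close()
$login.GetResponse().Close()

# Step 2: GET the feed, sending back the cookies collected in step 1
$request = [System.Net.WebRequest]::Create($feedUrl)
$request.CookieContainer = $cookies
$response = $request.GetResponse()
$reader = New-Object System.IO.StreamReader($response.GetResponseStream())
[xml]$rss = $reader.ReadToEnd()
$reader.Close()
$response.Close()

$rss.rss.channel.item | Select-Object title -First 10
```

Because this reads the response body directly, it gets the raw XML the server sent rather than the formatted page IE builds from it. Sites that rely on hidden form tokens or JavaScript during login would need more than this.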

