XML Problem - reading Newsfeeds
SonG0han
09-25-2005, 02:03 PM
Hi! I wrote some lines to get the title and link of each element in an RSS newsfeed.
At first I used "rdf:RDF" as the root path for the XML operations, but some feeds have a different root element, such as <rss version="2.0">.
I tried "*" as the root path, and it works for the new file. I also tried "rss" instead of "*" with GetElementXML, and it showed me the whole file in a dialog box, so that works too. The XML data is in memory whether I use "*" or "rss". BUT with "*", the data of the other files that have "rdf:RDF" as root is not shown; I get a blank string.
And when I try to access the title and link of the ITEMS, I get blank strings for the file that has the <rss version="2.0"> header.
Is there an error in the XML actions, or what?
I hope you can understand what I mean.
SonG0han
09-25-2005, 03:04 PM
Here is the current code
I have modified many different things to try to fix it, so it's a bit messed up now (the part with RSSadd = ..., for example). I tried many different ways to get the data, but it does not work for the feed with <rss version="2.0">.
The .dat file contains one URL per line.
Can any of you help me, please?
function UpdateNews()
    tblRSSFeeds = TextFile.ReadToTable(_SourceFolder.."\\newsfeeds.dat");
    if tblRSSFeeds ~= nil then
        StatusDlg.SetTitle("Generating News");
        StatusDlg.ShowCancelButton(false, "Cancel");
        StatusDlg.Show(MB_ICONNONE, false);
        File.Delete(_SourceFolder.."\\newsfeed.htm", false, false, false, nil);
        txtstring = "<html><head><style type=\"text/css\"><!-- body,td,th {font-family: Verdana, Arial, Helvetica, sans-serif;font-size: 10px;color: #000000;} body {background-color: #FFFFFF;} --></style></head><body>";
        TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
        for index, value in tblRSSFeeds do
            if value ~= "" then
                File.Delete(_SourceFolder.."\\newsfeed.tmp", false, false, false, nil);
                -- Download the file
                HTTP.Download(value, _SourceFolder.."\\newsfeed.tmp", MODE_TEXT, 20, 80, nil, nil, nil);
                LastError = Application.GetLastError();
                if LastError == 0 then
                    -- Parse the file and build the HTML
                    XML.Delimiter = "|";
                    XML.Load(_SourceFolder.."\\newsfeed.tmp");
                    RSSPfad = "rdf:RDF";
                    foundElements = XML.Count(RSSPfad, "item");
                    --foundElements = XML.Count("rdf:RDF", "item");###
                    if foundElements == -1 then
                        RSSadd = "/channel";
                        foundElements = XML.Count(RSSPfad..RSSadd, "item");
                    end
                    if foundElements == -1 then
                        RSSPfad = "rss";
                        foundElements = XML.Count(RSSPfad, "item");
                    end
                    if foundElements == -1 then
                        RSSadd = "/channel";
                        foundElements = XML.Count(RSSPfad..RSSadd, "item");
                    end
                    --result = XML.GetElementXML(RSSPfad);
                    --Dialog.Message("Notice", result, MB_OK, MB_ICONINFORMATION, MB_DEFBUTTON1);
                    --Dialog.Message("Pfad", RSSPfad, MB_OK, MB_ICONINFORMATION, MB_DEFBUTTON1);
                    channel = XML.GetValue(RSSPfad.."/channel/title");
                    -- channel = XML.GetValue("rdf:RDF/channel/title");###
                    if channel == "" then
                        channel = value;
                    end
                    channellink = XML.GetValue(RSSPfad.."/channel/link");
                    if channellink == "" then
                        channellink = value;
                    end
                    txtstring = "<b><font style=\"font-size: 12px\"><a href=\""..channellink.."\" target=\"_blank\">"..channel.."</a></font></b><br>";
                    TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
                    if foundElements ~= -1 then
                        for x = 1, foundElements do
                            title = XML.GetValue(RSSPfad.."/item|"..x.."/title");
                            if title == "" then
                                title = XML.GetValue(RSSPfad.."/channel/item|"..x.."/title");
                            end
                            link = XML.GetValue(RSSPfad.."/item|"..x.."/link");
                            if link == "" then
                                link = XML.GetValue(RSSPfad.."/channel/item|"..x.."/link");
                            end
                            -- link = XML.GetValue("rdf:RDF/item|"..x.."/link");###
                            txtstring = "- <a href=\""..link.."\" target=\"_blank\">"..title.."</a><br>";
                            TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
                            if x == foundElements then
                                TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", "<p>", true);
                            end
                        end
                    else
                        txtstring = "<p>";
                        TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
                    end
                end
            end
        end
        txtstring = "</body></html>";
        TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
        StatusDlg.Hide();
        showNews();
    end -- end of the check that the feed list exists
end
With this code, the rdf:RDF feeds are shown, but the one with <rss version="2.0"> is not: only the channel name appears, without any items. (Some files contain an additional line at the top, such as <?xml version="1.0" encoding="ISO-8859-1" ?>.)
TJ_Tigger
09-25-2005, 03:35 PM
SonG0han wrote:
> I tried "*" as the root path, and it works for the new file. I also tried "rss" instead of "*" with GetElementXML, and it showed me the whole file in a dialog box, so that works too. The XML data is in memory whether I use "*" or "rss". BUT with "*", the data of the other files that have "rdf:RDF" as root is not shown; I get a blank string.
You can use "/" to return the root elements. I would then check whether the root is rdf or rss; if it is rss, you could check the version and retrieve the data you want based on it.
SonG0han
09-25-2005, 03:38 PM
Hmm, the "best" would be to support all versions, but I don't know every difference between them.
Most files (rdf:RDF) have this:
- channel
- item
but RSS 2.0 has this:
- channel
- channel > item
The items are inside the channel.
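The two layouts described above could be captured in a small lookup table, so the item path is built in one place. This is only a minimal sketch in plain Lua: ITEM_PARENT and itemPath are hypothetical names, not AutoPlay Media Studio actions, and only the two root elements mentioned in this thread are covered.

```lua
-- Where each feed family keeps its <item> elements, relative to the root.
-- Assumption: only the two layouts described in this thread exist here.
ITEM_PARENT = {
    ["rdf:RDF"] = "rdf:RDF",      -- items are siblings of <channel>
    ["rss"]     = "rss/channel",  -- items are children of <channel>
}

-- Build the path of one field of the n-th item,
-- for example "rss/channel/item|3/title".
-- delimiter is the index separator (the code in this thread uses "|").
-- Returns nil for a root element the table does not know.
function itemPath(root, n, field, delimiter)
    local parent = ITEM_PARENT[root]
    if parent == nil then
        return nil
    end
    return parent .. "/item" .. (delimiter or "|") .. n .. "/" .. field
end
```

The resulting string is the kind of path the code in this thread passes to XML.GetValue; an unknown root gives nil, so the caller can skip that feed instead of reading blank strings.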
TJ_Tigger
09-25-2005, 06:24 PM
SonG0han wrote:
> Hmm, the "best" would be to support all versions, but I don't know every difference between them.
> Most files (rdf:RDF) have this:
> - channel
> - item
> but RSS 2.0 has this:
> - channel
> - channel > item
> The items are inside the channel.
What I started to do was create one function for RSS feeds and one for ATOM feeds. One problem with coding for XML is that its structure can change from place to place. That is the benefit of XML, but when you put namespaces on top of the flexibility of XML, you open a whole can of worms. I would suggest you look into the standards you are most interested in, grab a couple of XML feeds that follow them, and use those to code your different functions.
Tigg
TJ_Tigger
09-25-2005, 07:31 PM
It looks like the reason the rdf feeds work and the rss feeds do not is the variable RSSPfad. It is always set to "rdf:RDF" and does not change. I would suggest that when you load your XML file, you get the child elements of the root "/"; if the result is rdf:RDF, do one thing, and if it is rss, do another.
Tigg
TJ_Tigger
09-25-2005, 09:08 PM
Try this
function UpdateNews()
    tblRSSFeeds = TextFile.ReadToTable(_SourceFolder.."\\newsfeeds.dat");
    if tblRSSFeeds ~= nil then
        StatusDlg.SetTitle("Generating News");
        StatusDlg.ShowCancelButton(false, "Cancel");
        StatusDlg.Show(MB_ICONNONE, false);
        File.Delete(_SourceFolder.."\\newsfeed.htm", false, false, false, nil);
        txtstring = "<html><head><style type=\"text/css\"><!-- body,td,th {font-family: Verdana, Arial, Helvetica, sans-serif;font-size: 10px;color: #000000;} body {background-color: #FFFFFF;} --></style></head><body>";
        TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
        for index, value in tblRSSFeeds do
            if value ~= "" then
                File.Delete(_SourceFolder.."\\newsfeed.tmp", false, false, false, nil);
                -- Download the file
                HTTP.Download(value, _SourceFolder.."\\newsfeed.tmp", MODE_TEXT, 20, 80, nil, nil, nil);
                LastError = Application.GetLastError();
                if LastError == 0 then
                    -- Parse the file and build the HTML
                    XML.Delimiter = "|";
                    XML.Load(_SourceFolder.."\\newsfeed.tmp");
                    tbElements = XML.GetElementNames("/", true, false);
                    if tbElements[1] == "rdf:RDF" then
                        RSSPfad = "rdf:RDF";
                        foundElements = XML.Count(RSSPfad, "item");
                        channel = XML.GetValue(RSSPfad.."/channel/title");
                        channellink = XML.GetValue(RSSPfad.."/channel/link");
                        --foundElements = XML.Count("rdf:RDF", "item");###
                        --[[if foundElements == -1 then RSSadd = "/channel"; foundElements = XML.Count(RSSPfad..RSSadd, "item"); end
                        if foundElements == -1 then RSSPfad = "rss"; foundElements = XML.Count(RSSPfad, "item"); end
                        if foundElements == -1 then RSSadd = "/channel"; foundElements = XML.Count(RSSPfad..RSSadd, "item"); end]]
                    elseif tbElements[1] == "rss" then
                        RSSPfad = "rss";
                        channel = XML.GetValue(RSSPfad.."/channel/title");
                        channellink = XML.GetValue(RSSPfad.."/channel/link");
                        foundElements = XML.Count(RSSPfad.."/channel", "item");
                    end
                    --result = XML.GetElementXML(RSSPfad);
                    --Dialog.Message("Notice", result, MB_OK, MB_ICONINFORMATION, MB_DEFBUTTON1);
                    --Dialog.Message("Pfad", RSSPfad, MB_OK, MB_ICONINFORMATION, MB_DEFBUTTON1);
                    txtstring = "<b><font style=\"font-size: 12px\"><a href=\""..channellink.."\" target=\"_blank\">"..channel.."</a></font></b><br>";
                    TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
                    if foundElements ~= -1 then
                        for x = 1, foundElements do
                            if tbElements[1] == "rdf:RDF" then
                                title = XML.GetValue(RSSPfad.."/item|"..x.."/title");
                                link = XML.GetValue(RSSPfad.."/item|"..x.."/link");
                            elseif tbElements[1] == "rss" then
                                title = XML.GetValue(RSSPfad.."/channel/item|"..x.."/title");
                                link = XML.GetValue(RSSPfad.."/channel/item|"..x.."/link");
                            end
                            txtstring = "- <a href=\""..link.."\" target=\"_blank\">"..title.."</a><br>";
                            TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
                            if x == foundElements then
                                TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", "<p>", true);
                            end
                        end
                    else
                        txtstring = "<p>";
                        TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
                    end
                end
            end
        end
        txtstring = "</body></html>";
        TextFile.WriteFromString(_SourceFolder.."\\newsfeed.htm", txtstring, true);
        StatusDlg.Hide();
        showNews();
    end -- end of the check that the feed list exists
end
SonG0han
09-26-2005, 04:48 AM
GREAT! It is working now!
I just had to add a "/" in front of rss and rdf:RDF (if tbElements[1] == /xxx), and I added a check that uses value (the feed URL) when channel or channellink is "", so the URL is shown if the file contains no channel.
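The two adjustments described here might look roughly like the following in plain Lua. This is a hedged sketch, not the code actually used: normalizeRoot and valueOr are hypothetical helper names. Note that each value has to be tested on its own, because in Lua an expression written literally as channel or channellink == "" parses as channel or (channellink == ""), which is truthy for any non-nil channel.

```lua
-- Strip one leading "/" so that "/rss" and "rss" compare equal.
function normalizeRoot(name)
    if string.sub(name, 1, 1) == "/" then
        return string.sub(name, 2)
    end
    return name
end

-- Return text unless it is nil or empty; in that case return fallback.
function valueOr(text, fallback)
    if text == nil or text == "" then
        return fallback
    end
    return text
end
```

With these, the root comparison could be written as normalizeRoot(tbElements[1]) == "rss", and the channel title as valueOr(channel, value), where value is the feed URL read from newsfeeds.dat.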
I tried too hard to find the error and could not spot the "RSSPfad" problem after editing the code so often.
Thank you very much!
I hope this will work for most files now, because I have not found much info about the RSS standards and the different file structures yet.
/edit: One thing I noticed is that sometimes the RSS data appears twice or more after switching pages or editing the RSS sources on the preferences page. It looks like this:
channelname1
item1
item2
channelname2
item1
item2
item3
----- Then I switch pages and go back. (The preferences page has a multiline input field where you add one URL per line, and the feed URLs are saved in newsfeeds.dat, a simple text file.) On the front page I call the function that reads the RSS data and generates the .htm file, and I delete that file first, so why can this happen? ->
channelname1
item1
item2
channelname2
item1
item2
item3
channelname1
item1
item2
channelname2
item1
item2
item3
I have to call the function again to clear the page and recreate it, but that is already what I do "On Show".
SonG0han
09-26-2005, 05:11 AM
/2nd Edit:
It seems to be fixed now. I had called the function On Preload; I added it to the other On Show actions, and now it seems to update properly.
But I don't really know why yet. The web.loadurl seemed to work On Preload.
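The thread does not establish why the page was duplicated. One way to rule out leftovers from repeated appends, sketched here in plain Lua, is to build the whole page in memory and write the file once. buildNewsPage and the feed table layout (title, link, items) are hypothetical and only for illustration; the markup is reduced to the essentials.

```lua
-- Collect the page in memory; nothing touches the disk in this function.
-- feeds is a list of tables: { title = ..., link = ..., items = { {title=..., link=...}, ... } }
function buildNewsPage(feeds)
    local parts = { "<html><body>" }
    for _, feed in ipairs(feeds) do
        table.insert(parts, "<b><a href=\"" .. feed.link .. "\">" .. feed.title .. "</a></b><br>")
        for _, item in ipairs(feed.items) do
            table.insert(parts, "- <a href=\"" .. item.link .. "\">" .. item.title .. "</a><br>")
        end
        table.insert(parts, "<p>")
    end
    table.insert(parts, "</body></html>")
    return table.concat(parts)
end
```

The returned string could then be written with a single TextFile.WriteFromString call. Assuming the third argument controls appending, as the calls with true in this thread suggest, passing false there should replace the old file rather than add to it, so calling the function twice would not double the content.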
TJ_Tigger
09-26-2005, 08:21 AM
Here is the RSS 2.0 spec.
http://blogs.law.harvard.edu/tech/rss
If you google for rss and rdf, you will find a lot of pages on the subject. My suggestion would be to look at the feeds you are most interested in, or those that seem to be most popular, and build your project around them. If you follow what they do, you should be able to read most feeds.
Tigg
Here is the aggregator and XML parser I have been playing with. The aggregator doesn't have everything in it that I want to put in it, but it is fun to play with.
SonG0han
09-26-2005, 09:25 AM
Thanks!
I will take a look at your apps.