I just updated the Pro RSS feed. It is very likely that I broke something in the process. *sheepish grin* (Now links in the feed work just like the "link to this post" buttons on the board, instead of just /messages/#/#.html . This way, even if a post has been moved to a new page, the feed's link will still be correct.)
Do report any problems, please! I can't fix it if I'm not aware of it.
I wondered what was breaking RSS Ticker. It seems to have caught up with itself now, though. No problems currently.
Ree, I'm seeing &amp; instead of the correct ampersand in the feed links.
Oops! Should be fixed now. Thanks for the heads up!
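For anyone curious about this kind of bug: an ampersand in a feed link has to be escaped exactly once when it is written into the XML. Escaping it a second time is a common way for the escape sequence itself to show up in a reader. Here is a rough Python sketch of the difference. It is not the feed's actual PHP, and the URL is made up for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical post link with a query string; not a real board URL.
link = "http://example.com/show.cgi?tpc=1&post=2"

once = escape(link)    # escaped once: what belongs inside <link> in the XML
twice = escape(once)   # escaped twice: a reader shows the entity as literal text

print(once)
print(twice)
```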
Not really feed-related, but close. I set up this page on LiveJournal, publicly viewable, to track all the Profusionite character journals I know of on there.
It's a stopgap, really; I plan to set up something better if and when we switch to cool new software (read: Drupal).
Say, Ree, do you think you'd be able to tinker with the Pro board feed so it displays the newest items first?
I'm not sure how much leeway your scraping method gives you - for example, ideally I'd like the post excerpts to be the item description/summary rather than part of the title, but if that'd be too tricky, don't worry about it.
Here goes.
Sorting by post order: Mayyyybe, but it probably won't be soon. The dates and times I scrape from Last Week aren't in any format that I could just order PHP to sort for me, so right now the feed items are in the same order as on Last Week. I can try to change that, but it will probably take several regular expressions and a lot of patience. I know post order would make the feed a lot more sensible, though, so I hope I can get something going there.
Leeway given by scraping method: It's only a customised version of RSSify, so not much.
Post excerpts in feed item description/summary: Sure! I actually have test code working that separates the date/time stamp, author name, and post excerpt. The reason it's not in the live feed yet is that I'm not sure how to arrange those elements. Should it go date/time and author in title? Just date/time, with author name in the summary? Should author name go before or after the post excerpt? Basically I win at second-guessing myself. If somebody wants to tell me "Do it this way," I'll get right on it.
Topic name in feed, somewhere: Nobody mentioned this but it's been on my mind. I would love to get the board name in there. Unfortunately it's more complex than I thought it would be. If I get a chance to sit down with several consecutive hours of free time and a browser tab locked on PHP.net, I might get somewhere. Or I might not. It also might make the feed sloooooow, and it's not terribly fast as-is.
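The "regular expressions and a lot of patience" approach to sorting could look roughly like the sketch below. It is in Python rather than the scraper's PHP. The timestamp shape, the `|` separators, and the sample items are all assumptions, since the real Last Week format isn't shown in this thread. It also assumes English month names.

```python
import re
from datetime import datetime

# Assumed stamp shape, e.g. "March 05, 2007 - 3:42 pm". The real Last Week
# format is not shown in the thread, so the pattern would need adjusting.
STAMP = re.compile(r"([A-Z][a-z]+) (\d{1,2}), (\d{4}) - (\d{1,2}):(\d{2}) ([ap]m)")

def parse_stamp(text):
    """Return a datetime for the first stamp found in text, or None."""
    m = STAMP.search(text)
    if m is None:
        return None
    month, day, year, hour, minute, ampm = m.groups()
    return datetime.strptime(
        f"{month} {day} {year} {hour}:{minute} {ampm}", "%B %d %Y %I:%M %p"
    )

# Made-up feed items in page order (stamp | author | excerpt).
items = [
    "March 05, 2007 - 3:42 pm | author A | first excerpt",
    "March 06, 2007 - 9:10 am | author B | second excerpt",
    "March 05, 2007 - 11:58 pm | author C | third excerpt",
]

# Newest first; items with no recognisable stamp sort last instead of crashing.
newest_first = sorted(
    items, key=lambda s: parse_stamp(s) or datetime.min, reverse=True
)
for item in newest_first:
    print(item)
```

Once each stamp is a real date object, sorting is a one-liner; the regex is the only part that depends on the page's format.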
Don't spend too much time, srsly. Long-term plans, after all, negate Discus's less-than-optimal setup altogether.
Just a thought - would it help at all if the pages were in clean HTML instead of tables?
FWIW, I like the format of the Profusion TEST feed with the text excerpt better than the one with the excerpt in the title.
Having "in [topicname]" in the title would be nice, but it's not important and not worth much work.
Is there anything that can be done in the way of changing the date format on the Last Week page to something more useful for the RSS feed?
I don't consider time spent working on the Pro feed lost. If we got a magic self-updating feed next week, I'd still be glad I spent last night learning about PHP's date() function and discovering that my "wait, would this actually work?" method for getting real datestamps? DOES. Regex + date() = WIN.
Tables aren't really a problem for the scraper. Clean HTML might load a bit faster for the scraper and thus make the feed load faster too, but it's not something I'm worried about.
Anke, thanks for the feedback. I'm planning to bring the post excerpt out of the title in the real feed, too, but I want to try adding the topic name into the feed first. If I can get the topic/story name, that will affect the way I arrange the titles on post items.
The only change to Last Week/Day that I'm really pulling for is longer post excerpts. If it would be possible to add the topic name in an HTML comment or somesuch on the same (rendered) line as the post excerpt, that would be a big help to me, too.
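If the topic name did end up in an HTML comment on the same line as the excerpt, pulling it out could be as simple as the Python sketch below. The `<!-- topic: ... -->` shape is purely hypothetical; whatever the board actually emits would need its own pattern.

```python
import re

# Hypothetical comment shape: <!-- topic: Story Name -->
TOPIC = re.compile(r"<!--\s*topic:\s*(.*?)\s*-->")

def topic_of(line):
    """Return the topic name from a commented line, or None if absent."""
    m = TOPIC.search(line)
    return m.group(1) if m else None

# Made-up sample line in the spirit of the Last Week markup.
sample = "<font size=1>Some excerpt text...</font><!-- topic: Story Name -->"
print(topic_of(sample))
print(topic_of("<font size=1>No comment on this line</font>"))
```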
...Let me have a look at that last point.
The thing is, there's three versions of the tree view generated: IE, Mozilla and non-JS. For the commented topic name, I've tried putting what I think is the right thing into the non-JS version, assuming that's what you're capturing.
I've changed a few character lengths but, again, don't know if they're the right ones.
Will try to check in again later today, may not be able to.
Oh geez, three versions?! The Dept. of Redundancy Dept. would like to recommend Discusware for their department headquarters dept. message board. Department. (Sorry, done now.)
Well, the scraper looks for "font size=1>" to indicate the start of a new feed item, and "</font>" for the end. If there's three versions, they probably choose which to show based on User-Agent, and the scraper doesn't pretend to be IE or Mozilla so it gets the non-JS. Since I use Opera, I also see the non-JS (I'd wondered why I saw a JavaScript function with nothing inside it!) - here is a sample of non-JS Last Day if it helps to View Source on that.
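The marker-based splitting described above can be sketched in a few lines of Python. This is an illustration of the idea, not the RSSify code itself; the function name and the sample markup are made up, and only the two marker strings come from the post.

```python
def feed_items(html, start="font size=1>", end="</font>"):
    """Collect the text between each start marker and the next end marker."""
    items = []
    pos = 0
    while True:
        begin = html.find(start, pos)
        if begin == -1:
            break                      # no more items on the page
        begin += len(start)
        stop = html.find(end, begin)
        if stop == -1:
            break                      # unterminated item; ignore the tail
        items.append(html[begin:stop].strip())
        pos = stop + len(end)
    return items

# Made-up snippet of non-JS page markup.
page = (
    "<td><font size=1>First post excerpt</font></td>"
    "<td><font size=1> Second post excerpt </font></td>"
)
print(feed_items(page))
```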
But is the commented topic name visible on it after I made the tweak above?
Doesn't seem to be, though it's entirely possible I'm missing the obvious. :(
I've been fixing up the Pro feed with the help of Yahoo Pipes. If you noticed a hiccup in the feed last week, it was probably me, fiddling with the permalinks - sorry about that. It shouldn't happen again.
Here's the gussied-up feed.
It now puts the newest entries on top, adds the thread name to the beginning of each post excerpt, and contains the full text of each post.
As always, let me know if you have any problems. I've noticed that sometimes Pipes likes to just crap out on me. Usually it refuses to add the full text to the feed but otherwise works, which is tolerable; occasionally it decides to return a completely empty feed, even when there have been recent posts, and that obviously is not OK. Unfortunately I haven't figured out why it does that.