26 comments so far
they work well enough for me... although a bit slow.
2 years ago by BUGabundo
very slow.
2 years ago by bmc
How fast would you expect it to be?
2 years ago by malach
@yoggel - ok, how fast would you realistically expect it to be? :)
2 years ago by malach
erm, fast.
2 years ago by malach
How fast? Less than half an hour, so the content of the feeds is still at least slightly relevant. Especially the Twitter feeds :P Not that I use Twitter much myself, but others have the feeds coming in and it takes a long time for them to appear.
2 years ago by edythemighty
It needs to be faster, I fully agree. I haven't figured out how to set up and claim processing power and bandwidth on Google servers yet, though.
2 years ago by termie
Like edythemighty said, 30 would be the longest it should take. 15 would be perfect.
2 years ago by FyreFiend
minutes that is
2 years ago by FyreFiend
Wouldn't want to wait an entire day after all ;)
2 years ago by edythemighty
@termie: yes, why not tie it into the google reader servers? They've got some serious stuff going on there in terms of feed updating rate :)
2 years ago by lemonad
OK - now think about how many twitter users there are with their feeds on Jaiku (I haven't looked, but for the sake of discussion, let's say 10,000). Polling each of those every half hour is 20,000 hits an hour from our servers to twitter's... which would look very much like a denial of service attack from their end.
The same applies to any service that a lot of our users have feeds from - last.fm, tumblr... whatever.
And that's just considering the impact on the other end of the equation - on our end, that would mean hundreds or thousands of outbound requests per minute.
I know termie's working on a variety of ways to speed things up - hopefully now that Jaiku's part of the Google family, we'll be able to pull on some of those resources.
2 years ago by malach
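To put rough numbers on malach's estimate above: the 20,000 figure works out if each of the 10,000 feeds is polled every half hour. Here is a minimal sketch of that arithmetic. The function name and the list of intervals are purely illustrative, and the feed count is the for-the-sake-of-discussion guess from the comment, not a measured value.

```python
def polls_per_hour(feeds: int, interval_minutes: float) -> float:
    """Outbound requests per hour if every feed is polled once per interval."""
    return feeds * (60 / interval_minutes)


feeds = 10_000  # assumed figure from the comment above, not real data

for interval in (15, 30, 60):
    per_hour = polls_per_hour(feeds, interval)
    # Requests per minute is simply the hourly figure spread over 60 minutes.
    print(f"every {interval} min: {per_hour:,.0f} requests/hour, "
          f"about {per_hour / 60:,.0f} requests/minute")
```

Even the once-an-hour setting is still well over a hundred requests a minute to a single remote service, which is the trade-off being argued about here.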
@malach I can see your point, but at the same time the speed at which a feed updates really affects its usefulness.
2 years ago by FyreFiend
"That's 20,000 hits an hour from our servers to twitter's... which would look very much like a denial of service attack from their end." I say - onward, brave little http requests! (CHAAARGE! :)
2 years ago by lemonad
:D Yes! We must feed our internet addiction! (bad pun, I know, but felt I had to say it)
2 years ago by edythemighty
@yoggel - I think once per feed per hour would be a good target to aim for, at least as a first pass.
@fyrefiend - oh, for sure. I agree completely. But there's a line to walk between 'active and legitimate use' and 'causing servers to melt down'. We very much want to stay on the right side of that :)
2 years ago by malach
@lemonad - I think Jaiku has even more of an obligation to be 'not evil' these days :)
2 years ago by malach
@malach: I'm just kidding of course :) Once an hour seems like a good target. If one wants twitter messages to reach jaiku faster, one could always use a client that posts to them both.
2 years ago by lemonad
@lemonad - Happy medium. Twitku to the rescue! or
2 years ago by edythemighty
I'd suggest that RSS (and Atom etc) were a good 'first step' to syndication. The next thing we need (the greater internet, not just Jaiku) is a push mechanism or a ping mechanism tied to it.
This mechanism needs to be standardized at least as much as RSS or Atom are. I'd say the biggest 'offenders' in the constant need for up-to-date information are going to be Jaiku, Plaxo, Google Reader and maybe a couple of others. Get those together with a couple of the larger content providers, like Last.FM and Facebook, gather in a quiet room somewhere and agree on a protocol. Once the behemoths have worked out what they all agree on, everyone else will follow along.
If you have such a system in place, Jaiku doesn't have to ask Twitter 20,000 times an hour about stuff that hasn't changed. Twitter tells Jaiku 300 times an hour to either re-request a document (ping) or it actually sends the updated content (push). This reduces the connections and traffic on both sides of the system.
Some quick thoughts:
The service that wants the information becomes a subscriber to that information.
The RSS/Atom document would have a link in it for a (common API) subscription service.
Subscriptions would need to be renewed periodically (define the period in the document, as some might want a weekly renewal and others just annually).
Renewal means that the service isn't pinging or pushing to something that no longer wants the information but forgot to unsubscribe.
You'd also want to consider a failure limit so that if 5/10/100 consecutive pushes/pings fail, the subscription is terminated automatically.
2 years ago by RickMeasham
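The renewal and failure-limit ideas in RickMeasham's list above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, time is passed in explicitly as seconds so the behaviour is easy to follow, and "failure limit" is read as consecutive failures, which the comment does not actually specify.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Subscription:
    callback_url: str
    expires_at: float   # renewal deadline, in seconds
    failures: int = 0   # consecutive failed pings/pushes


class SubscriptionRegistry:
    """Hypothetical publisher-side bookkeeping for one feed."""

    def __init__(self, renewal_period: float, failure_limit: int) -> None:
        self.renewal_period = renewal_period
        self.failure_limit = failure_limit
        self._subs: Dict[str, Subscription] = {}

    def subscribe(self, callback_url: str, now: float) -> None:
        """Create or renew a subscription; renewing also clears the failure count."""
        self._subs[callback_url] = Subscription(callback_url, now + self.renewal_period)

    def active(self, now: float) -> List[str]:
        """Drop subscriptions that were not renewed in time and return the rest."""
        self._subs = {u: s for u, s in self._subs.items() if s.expires_at > now}
        return list(self._subs)

    def record_delivery(self, callback_url: str, ok: bool) -> None:
        """Track delivery results and terminate after too many consecutive failures."""
        sub = self._subs.get(callback_url)
        if sub is None:
            return
        if ok:
            sub.failures = 0
            return
        sub.failures += 1
        if sub.failures >= self.failure_limit:
            del self._subs[callback_url]


# Weekly renewal and a limit of 5 failures, both picked arbitrarily.
registry = SubscriptionRegistry(renewal_period=7 * 24 * 3600, failure_limit=5)
registry.subscribe("http://jaiku.com/ping?276248", now=0)
for _ in range(5):
    registry.record_delivery("http://jaiku.com/ping?276248", ok=False)
print(registry.active(now=1))  # the subscription has been terminated
```

A subscriber that forgets to unsubscribe simply stops renewing and falls out of `active()` once its deadline passes, so the publisher never needs an explicit cleanup job.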
@RickMeasham - Sounds great. Now to...."coerce" the big players into a room and lock it!
2 years ago by edythemighty
@termie: maybe you could run it on EC2? ;-)
2 years ago by kshep
@RickMeasham - It sounds like you're looking for something like this: http://updates.sixapart.com/
2 years ago by adewale
For sure there's a pile of implementations, but that example isn't of the sort I envision. That requires a stream, i.e. a permanent connection. My vision is of regular HTTP connections being used to push the data to subscribers.
2 years ago by RickMeasham
So does this mean you want something like this:
1- The client issues a POST with a URL as the payload to a notification server (this amounts to registering a subscription).
2- The notification server starts sending POST requests to that URL to inform the client that content has changed. The changed content and its URL are in the body of the POST request.
2 years ago by adewale
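The two steps adewale describes above might look roughly like this as plain HTTP requests. Everything concrete here is an assumption for illustration: the `example` hostnames, the function names, the plain-text body for the subscription and the JSON body for the notification are not part of any agreed protocol. The requests are only built, not sent.

```python
import json
import urllib.request


def subscription_request(notification_server: str, callback_url: str) -> urllib.request.Request:
    """Step 1: the client registers by POSTing its callback URL as the payload."""
    return urllib.request.Request(
        notification_server,
        data=callback_url.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def notification_request(callback_url: str, changed_url: str, content: str) -> urllib.request.Request:
    """Step 2: the server POSTs the changed content and its URL to the callback."""
    body = json.dumps({"url": changed_url, "content": content}).encode("utf-8")
    return urllib.request.Request(
        callback_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Building a Request does not touch the network; actually sending it would be
# urllib.request.urlopen(req).
sub = subscription_request("http://notify.example/subscribe", "http://client.example/callback")
note = notification_request("http://client.example/callback", "http://feeds.example/rss", "<rss>...</rss>")
print(sub.get_method(), sub.full_url)
print(note.get_method(), note.full_url)
```

Because both directions are ordinary short-lived POSTs, neither side has to hold a connection open, which is the difference from the streaming approach mentioned earlier in the thread.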
(Repost to fix formatting)
Here's a typical outline of how I see the concept working:
For ping type subscriptions:
RSS file says <subscribe:ping>http://last.fm/ping/subscribe/92837463</subscribe:ping>
Jaiku sends a request to that address that says: <subscribe:callback method="ping">http://jaiku.com/ping?276248</subscribe:callback>
Last.FM stores this in their list of pings for feed 92837463
Next time feed 92837463 gets updated, last.fm sends a ping to http://jaiku.com/ping?276248 along the lines of <subscribe:ping>http://url.of.rss/</subscribe:ping>
Because it's a ping, Jaiku.com then goes and grabs the RSS, knowing that something has changed.
If we were using push then it would be slightly different:
RSS file says <subscribe:push>http://last.fm/push/subscribe/92837463</subscribe:push>
Jaiku sends a request to that address that says: <subscribe:callback method="push">http://jaiku.com/push?276248</subscribe:callback>
Last.FM stores this in their list of pushes for feed 92837463
Next time feed 92837463 gets updated, last.fm sends the whole RSS file to http://jaiku.com/push?276248
Because it's a push, Jaiku.com doesn't need to grab the RSS
Note that all XML suggestions are poorly thought through, though I believe the concept is sound
2 years ago by RickMeasham
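The difference between the ping and push outlines above comes down to what the publisher puts in the body it sends to each callback. Here is a rough sketch of that publisher-side step, assuming a hypothetical `deliveries` function and a subscriber list of (callback URL, method) pairs. The `subscribe:` element is the thread's own placeholder XML, not a real namespace, and nothing is sent over the network.

```python
from typing import List, Tuple
from xml.sax.saxutils import escape


def deliveries(feed_url: str, feed_xml: str,
               subscribers: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (callback_url, body) pairs to POST when the feed is updated."""
    out = []
    for callback_url, method in subscribers:
        if method == "ping":
            # Ping: only say which feed changed; the subscriber fetches it itself.
            body = f"<subscribe:ping>{escape(feed_url)}</subscribe:ping>"
        elif method == "push":
            # Push: send the whole updated feed, so no follow-up fetch is needed.
            body = feed_xml
        else:
            raise ValueError(f"unknown subscription method: {method!r}")
        out.append((callback_url, body))
    return out


subscribers = [
    ("http://jaiku.com/ping?276248", "ping"),
    ("http://jaiku.com/push?276248", "push"),
]
for url, body in deliveries("http://url.of.rss/", "<rss>...</rss>", subscribers):
    print(url, "<-", body)
```

Ping keeps each notification tiny at the cost of one extra request per update from the subscriber, while push avoids that round trip but makes the publisher send the full document to everyone.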