Is this why Twitter-IM doesn't scale?
twitter jabber scale fail
It's been said that Twitter's IM gateway has been one of the causes of their scaling issues. I just logged into one of my older Gmail accounts, which I only use for IM (and twitter-im in particular):

What I found greeting me was 3-4 messages containing "offline IM messages" from Twitter: 795 lines in one, 1675 in another, and a few others.
Could it be that a simple scaling technique Twitter could use is to follow a user's Jabber presence and NOT send them thousands of offline messages? I had actually thought they already did, but I guess not. Surely only sending IM messages when the user is online would lower their resource consumption tremendously.
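To make the idea concrete, here is a minimal sketch of presence-gated delivery. It is not how Twitter's gateway works; the `PresenceGate` class, the `send` callback and the example JID are all made up for illustration, and a real bot would hook these methods up to its XMPP library's presence and message handlers.

```python
class PresenceGate:
    """Forward updates only to users currently marked as available.

    Hypothetical sketch: `send` stands in for whatever actually pushes
    a message out over XMPP.
    """

    def __init__(self, send):
        self._send = send        # callable(jid, text), assumed transport hook
        self._online = set()     # JIDs we last saw as available
        self.dropped = 0         # count of updates skipped for offline users

    def on_presence(self, jid, available):
        """Record an incoming presence update for a subscribed user."""
        if available:
            self._online.add(jid)
        else:
            self._online.discard(jid)

    def deliver(self, jid, text):
        """Send the update if the user is online; otherwise drop it."""
        if jid not in self._online:
            self.dropped += 1    # dropped, not queued for later
            return False
        self._send(jid, text)
        return True


sent = []
gate = PresenceGate(lambda jid, text: sent.append((jid, text)))

gate.on_presence("alice@example.com", True)
gate.deliver("alice@example.com", "update 1")   # forwarded

gate.on_presence("alice@example.com", False)
gate.deliver("alice@example.com", "update 2")   # dropped while offline
```

The point of dropping rather than queueing is that nothing piles up on either side: the gateway does no work for offline users, and the user's own server never has to store hundreds of lines for later.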
Comments (8)
great post
mibus - my assumption wouldn't be to check the database for my last presence and flood me with messages, it would be to simply DROP any message received if they don't have presence. Obviously I'm not around, so I don't care about any missed messages. Direct Messages still get emailed to me so....
Wouldn't it be *more* work to then wait for your 'available' presence, check the database for all posts since your last presence update (which could be a lot), then flood you with messages?
Or, they "deliver" them to your server, and consider their job done. Seems the easier choice (and lower-workload) to me! :)
Weird, I also never got anything offline from Twitter. Actually, it even stops sending me stuff whenever I go offline!
I need to explicitly tell it to send me stuff again or it won't.
Weird, I never had that problem here. I definitely never got offline messages from Twitter.
Yannooo, they already receive your presence now. At least when I added twitter as a buddy, I received the subscription request from twitter. Looking at my roster, twitter@twitter.com is flagged as both.
So no change to their current load. Sure, it is not free in terms of scalability, but given the ephemeral nature of presence, you could store it only in memcached. I think it's very much worth the effort. Best regards,
If the twitter bot keeps track of users who are online, it also implies receiving a huge amount of incoming presence that will need to be processed. In addition, you need some memory to store who is online/offline. Not sure what's the best approach here, but keeping track of who is online is definitely not free in terms of scalability.
Well, I am not sure; I never got any offliners from Twitter yet. Also, I am sure they will be getting loads of presence per second, coz a presence is sent for every busy, online, unavailable, available action. One possible solution is caching, as discussed here: http://tinyurl.com/829d53
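The caching idea raised in the comments (keep presence only in something like memcached, since it is ephemeral anyway) could look roughly like the sketch below. It uses a plain in-memory dict with expiry times as a stand-in for memcached; the `PresenceCache` name, its methods and the 300-second TTL are assumptions for illustration only.

```python
import time


class PresenceCache:
    """Ephemeral presence store with a time-to-live per entry.

    Hypothetical stand-in for a memcached-style cache: entries simply
    expire, so a lost or stale presence never needs cleaning up.
    """

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._expires = {}       # jid -> time at which the entry expires

    def mark_available(self, jid):
        """Store or refresh the user's presence for another TTL period."""
        self._expires[jid] = self._clock() + self._ttl

    def mark_unavailable(self, jid):
        """Forget the user immediately on an unavailable presence."""
        self._expires.pop(jid, None)

    def is_online(self, jid):
        """Return True only if a non-expired presence entry exists."""
        deadline = self._expires.get(jid)
        if deadline is None:
            return False
        if self._clock() >= deadline:
            del self._expires[jid]   # expired: treat as offline
            return False
        return True
```

Because nothing here has to survive a restart, losing the cache only means a few users are treated as offline until their next presence update arrives.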