December 2008 Archives

Every time I'm here in Sweden, my sister-in-law has some new tidbit of info about the Cardigans that she learned in some Swedish newspaper or magazine.  On the other side of the world, I don't hear as much about this band as I would if I lived in their backyard like she does.  Wanting to keep up on the latest, I figured I would subscribe to their blog.  To my surprise, the band's site wasn't syndicated!  It seems as if Sweden hasn't been hit by the whole Web 2.0/social networking phenomenon yet (at least not to the degree that America has).  No one I know over here has heard of Twitter, no one I know is blogging, and famous Swedish bands aren't syndicating their sites.

Not willing to fall behind again, I found a great service called Feed43 that allows you to parse the HTML of a site that doesn't have a feed in order to create one.  In just a few minutes, I used this service to turn the Cardigans' site into a news feed, and was reading all the latest news in Google Reader.  In case you haven't been keeping up with the Swedish rockers' current events, feel free to use the feed I made.

There's a lot happening related to cloud computing these days.  To help stay on top of it all, check out Alltop's virtual cloud computing magazine rack.  What other sources of current events are people finding useful?

[Updated Jan. 27, 2009] Since posting this entry, I don't think that I've looked at Alltop's cloud computing "magazine" once.  It's just too far beyond my peripheral vision.  However, I was struck by the human-compiled list of top bloggers and knew that there was a lot of good info in those blogs.  What to do?  I could copy and paste them all into Google Reader, but that would take a lot of time, and then I'd be swamped by all the new info.  Instead, I exported Alltop's list of blogs as an OPML document by appending /opml to the URL of the cloud computing section of their site.  Then, I imported that XML document into PostRank and configured it to include only the greatest entries from them all.  This reduced the noise a lot.  The final result is one feed that contains the greatest entries on all the blogs that a group of folks over in Hawaii have deemed the very best in cloud computing.  If you would like to subscribe to it, point your feed reader over here.
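
If you want to script this step yourself, here's a minimal sketch in Python of pulling the feed URLs out of an OPML export.  The section URL below is an assumption, so substitute the one for the Alltop section you care about:

import urllib2
import xml.etree.ElementTree as ET

# Hypothetical section URL; append /opml to the Alltop section's address.
OPML_URL = 'http://cloud-computing.alltop.com/opml'

opml = urllib2.urlopen(OPML_URL).read()
root = ET.fromstring(opml)

# Each feed in an OPML document is an outline element with an xmlUrl attribute.
for outline in root.findall('.//outline'):
    url = outline.get('xmlUrl')
    if url:
        print url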

Now, I just need to find or build a service that will automatically keep Alltop and PostRank in sync.  Anyone know of such a service?

I have been really reluctant to start using Twitter because of what happened when I discovered IRC years ago: I wasted hours and hours chatting with strangers about absolutely nothing.  I was at a Web Innovators meeting the other day, and the speaker, Rick Turoczy, asked who in the room was not using Twitter yet.  Of the fifty or so attendees, only one other soul besides me raised their hand.  So, despite the potential waste, I felt compelled to contribute some noise of my own.

A few weeks after signing up, it was starting to seem worthless.  But then an entrepreneur friend sent me a link to various Twitter tools, and she got me thinking about the potential utility of this social networking phenomenon.  Inspired by these tools and Kevin Kelly's talk on Ted.com about what is in store for the Web over the next 5,000 days, I thought I would create a Twitter robot that connected a few different Web services to do something interesting.  The result demonstrates the marvel of cloud and utility computing.

One system that I wanted to use was a new service called Twilio that allows you to make phone calls automatically using a text-to-speech interface.  This service runs in Amazon's cloud, and it lets you initiate phone calls using a RESTful API.  I decided to use this service from the bot as follows: any time one of the bot's followers sent it a direct message with their phone number, the bot would call them, ask them to record a greeting, and then update its status with a URL to the recorded greeting.

Overview

Before I talk about all the technical details of how I implemented this, let me describe the big picture:

  • Followers of the bot send it a direct message of the form "callme NNN-NNN-NNNN", the N's being the Twitter user's American phone number. (I don't know if Twilio works with non-American numbers.)
  • Twitter sends an email to the address of the bot with some special headers and the direct message in the body of the mail.
  • Procmail is configured to pipe all emails with these Twitter-specific headers to a script.
  • This script initiates a call with Twilio.
  • Twilio makes a phone call and records the message that the recipient leaves.
  • Twilio invokes a callback script and provides a URL to the recording.
  • This callback handler updates the bot's status with this URL, so that followers can click on it and hear the greeting.
This sequence of actions is illustrated in the following diagram:

[Figure: Twitter_Bot.gif, a diagram of the sequence of actions described above]

This figure isn't accurate in a few different respects, but it gives the general idea.

Technical Details

To begin with, I needed to create a new Twitter account for the bot. I chose @tweetybot.  (While creating this new account, I found that Twitter doesn't allow multiple accounts to reuse the same email address.  You can work around this using sub-addressing: mail sent to an address like user+tweetybot@example.com is delivered to user's normal mailbox, so one mailbox can serve many accounts.)  I configured the account such that an email would be sent to [email protected] every time someone sent a direct message.  On the cs.pdx.edu email server, I added the following procmail recipe:

# Pipe Twitter's direct-message notification emails into the bot script.
:0:
* X-Twitterrecipientname: tweetybot
* X-Twitteremailtype: direct_message
|~/bin/tweetybot.py

This results in an email being piped into the script tweetybot.py with the direct message in the body.  What about the sub-address?  It isn't being used.  I included it so that I could reuse my PSU email address.  However, if I create another bot, I could update the recipe like this (if I'm not mistaken):

# $1 is the sub-address (the text after the "+"), passed to procmail as its first argument.
ARG = $1

:0:
* X-Twitterrecipientname: tweetybot
* X-Twitteremailtype: direct_message
* ARG ?? ^^tweetybot^^
|~/bin/tweetybot.py

:0:
* X-Twitterrecipientname: bot2
* X-Twitteremailtype: direct_message
* ARG ?? ^^bot2^^
|~/bin/bot2.py

By doing this, the X-Twitterrecipientname conditions become redundant and could be omitted, or they could be left in and the sub-addressing-related parts (the ARG assignment and the ARG ?? conditions) could be removed instead.  Either way would work (I think).

Initiating the Call

Once the email is piped into the Python script, tweetybot.py, its contents are parsed for the command, callme, and the phone number to call.  If the command isn't found, the email is dropped on the floor.  (As I've written it, the bot only handles one command; however, typical bots support multiple commands, so the parsing would have to be beefed up in most scenarios.)  If the command is found, a new call is initiated with Twilio using their REST API.  The twilorest library that's imported can be found on the Twilio Web site.  For posterity, you can download the complete script from my site.  The part that initiates the call is this:

# Parameters for the outbound call: the caller ID to present, the number to
# dial, and the TwiML document that tells Twilio what to do once the call connects.
d = {
    'Caller' : CALLER_ID,
    'Called' : phoneNumber,
    'Url' : 'http://web.cecs.pdx.edu/~tspencer/twiliotest.xml',
}
# POST to Twilio's REST API to initiate the call.
account.request('/%s/Accounts/%s/Calls' % (API_VERSION, ACCOUNT_SID), 'POST', d)

Note that no additional data can be provided when initiating calls.  If the service's interface allowed for this and returned it later when invoking the callback (a common idiom in asynchronous APIs), the user name of the follower who sent the direct message could be included in the eventual status update.
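
Backing up a step: before that call is made, tweetybot.py has to dig the command and phone number out of the email that procmail pipes in.  The real parsing is in the downloadable script; the following is just a rough sketch, with the regular expression and variable names being my own assumptions:

import email
import re
import sys

# Procmail pipes the raw email into stdin; the direct message is in the body.
msg = email.message_from_file(sys.stdin)
body = msg.get_payload()

# Look for a command of the form "callme NNN-NNN-NNNN".
match = re.search(r'callme\s+(\d{3}-\d{3}-\d{4})', body)
if match is None:
    sys.exit(0)  # no command found: drop the email on the floor
phoneNumber = match.group(1)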

The TwiML Document

As you can see from the snippet above, one of the arguments passed to Twilio is a URL (the Url entry in the dictionary).  This refers to an XML document in a markup language called TwiML, which contains instructions directing Twilio to record the call that tweetybot.py initiated with the account.request call above, using a Record element like this:

<Record action="http://web.cecs.pdx.edu/~tspencer/playback.cgi" maxLength="55"/>

This element contains an action attribute which informs Twilio of the URL to send the notification to once the phone call has been made, recorded, and transcoded.
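
For context, the Record element sits inside a complete TwiML document.  I haven't reproduced the exact document I used, but a minimal one might look like the following (the Say prompt is my own illustrative addition):

<Response>
    <Say>Please record a greeting after the beep.</Say>
    <Record action="http://web.cecs.pdx.edu/~tspencer/playback.cgi" maxLength="55"/>
</Response>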

The need for this document and the way that the URL for it is provided when initiating a call makes for an awkward API (IMHO).  I say this because Twilio will immediately pull down the XML document after initiating the phone call.  Accepting this data directly when initiating the call would make interacting with the service less complex and more performant (by avoiding a round-trip).  Apparently, this extra request/response might not be necessary in future versions of the API.
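
To illustrate what I mean, an inline alternative might look something like this.  Note that the Twiml parameter below is purely hypothetical and not part of Twilio's actual API:

d = {
    'Caller' : CALLER_ID,
    'Called' : phoneNumber,
    # Hypothetical parameter: instructions passed inline instead of a URL for Twilio to fetch.
    'Twiml' : '<Response><Record action="http://web.cecs.pdx.edu/~tspencer/playback.cgi" maxLength="55"/></Response>',
}
account.request('/%s/Accounts/%s/Calls' % (API_VERSION, ACCOUNT_SID), 'POST', d)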

The Callback Script

After Twilio does its work, it sends an HTTP POST request to the callback handler, playback.cgi, provided in the TwiML.  This message will contain a parameter called RecordingUrl, which points to the transcoded MP3 version of the phone call.  Given this, the script then uses the Twitter JSON API to update the bot's status to let the world know that someone recorded a greeting and where they can go to listen to it:

use CGI qw(param);
use LWP::UserAgent;
use HTTP::Request;
use URI::Escape;

my $ua = LWP::UserAgent->new();
# Twilio POSTs the location of the transcoded MP3 as the RecordingUrl parameter.
my $recording_url = uri_escape(param('RecordingUrl'));
my $request = HTTP::Request->new("POST", "http://twitter.com/statuses/update.json", undef, "status=Someone has recorded a greeting. Click here to play it back: $recording_url");

$request->authorization_basic('tweetybot', 'PASSWORD');
$ua->request($request);

One thing to note about this call and the one to Twilio is that the respective credentials are being sent in the clear!  Neither service, from what I could find, supports HTTPS or a more secure method of authentication.  This is really bad, and it limits their applicability and usage.

Also note that the name of the follower who recorded the message can't be included in the status update, as mentioned above, because Twilio doesn't allow opaque data to be passed in and out with the current API.

Conclusion

In just two hours with no prior understanding of Twitter's API or Twilio's, I was able to create a bot that uses these innovative Web services to respond to a direct message, call an arbitrary phone number, record the user's message, transcode the resulting audio clip, and update the bot's status with a URL pointing to the follower's actual voice.  Isn't cloud computing incredible?!  Kevin Kelly was right: We have created just one machine, the Internet, and our phones, laptops, servers, and other devices are just ways to interface with it.  Considering that we can do this today, it's mind-boggling to imagine what we'll be able to do in the next 5,000 days.  I can't wait!

(Note: You can get all of the scripts and artifacts from my stash; they are licensed under the GNU GPL v. 2.)