Dump Your Twitter Friends with Tweet-SQL

I’ve been using Twitter for a while, and I decided to prune the number of people I’m following.

But before I start going through my list of friends by hand, why not strip out everybody who hasn’t been active in a certain amount of days?

To help me do that, I’ll be using Tweet-SQL.  It’s a set of stored procedures you can use to call the Twitter API from Microsoft SQL Server using plain T-SQL code.  In today’s examples, I’ve got T-SQL code that will:

  • Call the Twitter API to get the list of people I’m following (my “friends” in Twitter terms)
  • For each friend, fetch their last updates
  • Generate a web page to help me unfollow people

Buckle your seatbelts!

Our Twitter Cache Database Schema

If you read my award-winning (not really) blog post on how to use Tweet-SQL to import a Twitter feed, this schema is going to look pretty familiar.  I’ve started with the same two tables for Users and Statuses, but I’m adding a new table called UserFollows to hold the list of users that a user is following.

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Statuses](
	[id] [INT] NOT NULL,
	[created_at] [datetime] NOT NULL,
	[text] [nvarchar](200) NOT NULL,
	[SOURCE] [nvarchar](160) NOT NULL,
	[truncated] [bit] NOT NULL DEFAULT ((0)),
	[in_reply_to_status_id] [INT] NULL,
	[in_reply_to_user_id] [INT] NULL,
	[favorited] [bit] NOT NULL DEFAULT ((0)),
	[user_id] [INT] NOT NULL,
	[cached_date] [datetime] NOT NULL CONSTRAINT [DF_Statuses_cached_date]
        DEFAULT (getutcdate()),
 CONSTRAINT [PK_Statuses] PRIMARY KEY CLUSTERED
(
	[id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON,
  ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
 
CREATE TABLE [dbo].[Users](
	[id] [INT] NOT NULL,
	[name] [nvarchar](50) NULL,
	[password] [nvarchar](50) NULL,
	[screen_name] [nvarchar](50) NULL,
	[location] [nvarchar](200) NULL,
	[description] [nvarchar](160) NULL,
	[profile_image_url] [nvarchar](200) NULL,
	[url] [nvarchar](200) NULL,
	[protected] [bit] NULL,
	[followers_count] [INT] NULL,
	[cached_date] [datetime] NOT NULL CONSTRAINT [DF_Users_cached_date]  DEFAULT (getutcdate()),
	[cached_friends_status_id] [INT] NULL,
	[cached_status_id] [INT] NULL,  --New For This Demo!
	[cached_status_date] [datetime] NULL,  --New For This Demo!
 CONSTRAINT [PK_Users] PRIMARY KEY CLUSTERED
(
	[id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON,
  ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
 
CREATE TABLE [dbo].[UsersFriends](
	[id] [INT] NOT NULL,
	[user_id] [INT] NOT NULL,
	[friend_user_id] [INT] NOT NULL,
	[cached_date] [datetime] NOT NULL CONSTRAINT [DF_UsersFriends_cached_date]  DEFAULT (getutcdate()),
 CONSTRAINT [PK_UsersFriends] PRIMARY KEY CLUSTERED
(
	[id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF,
  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON,
  ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

Our code is going to be driven by our Twitter ID# – a number Twitter assigns each user – and it will rely on having our information already in the Users table.  The easiest way to do this is to follow the steps in my last Tweet-SQL blog post about fetching your timeline, but if you’re too lazy, here’s a quick script to do it:

EXEC dbo.tweet_cfg_resultset_send 1;
EXEC dbo.tweet_usr_show 'brento', NULL, NULL;

Those two stored procedures will configure Tweet-SQL to return recordsets, and then go fetch your user information.  The first ID# in the first recordset is your Twitter ID#.

Get Our Friends from the Twitter API

Now that we’ve got our schema set up, we can call the Twitter API to get our list of friends and populate it into the UsersFriends table.  Here’s the code to do it, a stored procedure I like to call usp_ImportFriends:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_ImportFriends]') AND TYPE IN (N'P', N'PC'))
DROP PROCEDURE [dbo].[usp_ImportFriends]
GO
CREATE PROCEDURE [dbo].usp_ImportFriends @import_users_id INT
AS
  SET NOCOUNT ON
 
  DECLARE  @xml                                XML
           ,@handle                            INT
           ,@page                              INT
           ,@optional                          NVARCHAR(50)
           ,@RowsReturned                      INT
           ,@new_cached_friends_status_id INT
           ,@cached_friends_status_id     INT
 
  /* Table to join statuses and users */
  DECLARE  @imported_statuses  TABLE(
                                     id                      INT
                                     ,created_at             DATETIME
                                     ,text                   NVARCHAR(200)
                                     ,SOURCE                 NVARCHAR(160)
                                     ,truncated              BIT
                                     ,in_reply_to_status_id  INT
                                     ,in_reply_to_user_id    INT
                                     ,favorited              BIT
                                     ,user_id                INT
                                     ,user_name              NVARCHAR(50)
                                     ,user_screen_name       NVARCHAR(50)
                                     ,user_location          NVARCHAR(200)
                                     ,user_description       NVARCHAR(160)
                                     ,user_profile_image_url NVARCHAR(200)
                                     ,user_url               NVARCHAR(200)
                                     ,user_protected         BIT
                                     ,user_followers_count   INT
                                     )
 
  SET @page = 1;
  SET @optional = '?page=' + CAST(@page AS NVARCHAR(4));
 
  /* Turn off resultsets from Tweet-SQL because we'll be using XML data */
  EXEC dbo.tweet_cfg_resultset_send
    0;
 
  /* Get the first set of statuses */
  WHILE (@page <= 20)
    BEGIN
      EXEC dbo.tweet_usr_friends
        @import_users_id ,
        @optional ,
        @xml OUTPUT;
 
      EXEC sp_xml_preparedocument
        @handle OUTPUT ,
        @xml;
 
      /* Copy the XML data into a single table to make handling easier */
      INSERT INTO @imported_statuses
                 (id
                  ,created_at
                  ,text
                  ,SOURCE
                  ,truncated
                  ,in_reply_to_status_id
                  ,in_reply_to_user_id
                  ,favorited
                  ,user_id
                  ,user_name
                  ,user_screen_name
                  ,user_location
                  ,user_description
                  ,user_profile_image_url
                  ,user_url
                  ,user_protected
                  ,user_followers_count)
      SELECT DISTINCT id
                      ,dbo.tweet_fnc_dateconvert(created_at)
                      ,text
                      ,SOURCE
                      ,truncated
                      ,in_reply_to_status_id
                      ,in_reply_to_user_id
                      ,favorited
                      ,user_id
                      ,user_name
                      ,user_screen_name
                      ,user_location
                      ,user_description
                      ,user_profile_image_url
                      ,user_url
                      ,user_protected
                      ,user_followers_count
      FROM   OPENXML (@handle, '/users/user', 2)
                WITH (created_at             NVARCHAR(50)  'status/created_at',
                      id                     INT           'status/id',
                      text                   NVARCHAR(200) 'status/text',
                      SOURCE                 NVARCHAR(200) 'status/source',
                      truncated              NVARCHAR(5)   'status/truncated',
                      in_reply_to_status_id  INT           'status/in_reply_to_status_id',
                      in_reply_to_user_id    INT           'status/in_reply_to_user_id',
                      favorited              NVARCHAR(5)   'status/favorited',
                      user_id                INT           'id',
                      user_screen_name       NVARCHAR(200) 'screen_name',
                      user_name              NVARCHAR(200) 'name',
                      user_location          NVARCHAR(200) 'location',
                      user_description       NVARCHAR(200) 'description',
                      user_profile_image_url NVARCHAR(200) 'profile_image_url',
                      user_url               NVARCHAR(200) 'url',
                      user_protected         NVARCHAR(5)   'protected',
                      user_followers_count   INT           'followers_count')
 
      EXEC sp_xml_removedocument
        @handle
 
      /* Insert new members that haven't been cached yet.
         The not-exists clause at the end is to avoid adding duplicate users. */
      INSERT INTO dbo.Users
                 (id
                  ,name
                  ,screen_name
                  ,location
                  ,description
                  ,profile_image_url
                  ,url
                  ,protected
                  ,followers_count)
      SELECT DISTINCT imp.user_id
                      ,user_name
                      ,user_screen_name
                      ,user_location
                      ,user_description
                      ,user_profile_image_url
                      ,user_url
                      ,user_protected
                      ,user_followers_count
      FROM   @imported_statuses imp
             LEFT OUTER JOIN dbo.Users u
               ON imp.user_id = u.id
      WHERE  u.id IS NULL
             AND NOT EXISTS (SELECT *
                             FROM   @imported_statuses impNewer
                             WHERE  imp.user_id = impNewer.user_id
                                    AND imp.id < impNewer.id)
 
      /* Import statuses that haven't been cached yet */
      INSERT INTO dbo.Statuses
                 (id
                  ,created_at
                  ,text
                  ,SOURCE
                  ,truncated
                  ,in_reply_to_status_id
                  ,in_reply_to_user_id
                  ,favorited
                  ,user_id)
      SELECT DISTINCT imp.id
                      ,imp.created_at
                      ,imp.text
                      ,imp.SOURCE
                      ,imp.truncated
                      ,imp.in_reply_to_status_id
                      ,imp.in_reply_to_user_id
                      ,imp.favorited
                      ,imp.user_id
      FROM   @imported_statuses imp
             LEFT OUTER JOIN dbo.Statuses s
               ON imp.id = s.id
      WHERE  s.id IS NULL
             AND imp.id IS NOT NULL
 
      SET @RowsReturned = (SELECT COUNT(* )
                           FROM   @imported_statuses)
      SET @page = @page + 1
      SET @optional = '?page=' + CAST(@page AS NVARCHAR(4));
 
      /* If we retrieved less than 100 rows, then we hit the last page. */
      IF @RowsReturned < 100
        BEGIN
          SET @page = 999
        END
 
      /* Empty out the holding table for the next pass */
      DELETE @imported_statuses
    END
 
  /* Update the Users table with the most recently cached record. */
  UPDATE dbo.Users
    SET cached_friends_status_id = @new_cached_friends_status_id
    WHERE id = @import_users_id
 
  SET NOCOUNT OFF

When you execute this stored procedure, call it with your ID# that we fetched earlier.  Here’s how I do it, since my ID# is 495643:

EXEC dbo.usp_ImportFriends 495643;

This fetches all of your friends, but be aware that it will burn one Twitter API request for each 100 of your friends.  If you’ve got 500 friends, that’s 5 API calls.  There’s a limit to how many API calls you can make per hour.  To get around that, you can request to be whitelisted for the Twitter API, meaning you can call it as often as you want. (My first request got denied, putting in my second one now with more info.)

Twitter has implemented a limit of 2,000 friends.  Some users have been grandfathered in with higher friend limits: if you’re one of those folks, you probably don’t care how many friends you have, but be aware that this stored proc will only fetch up to 2,000 friends, and only call the API up to 20 times.

Problem: Private User Updates Aren’t Included

After running that usp_ImportFriends stored procedure, run this query to show what users haven’t been active:

SELECT u.name, u.screen_name, u.protected,
  (SELECT TOP 1 created_at FROM dbo.Statuses s
   WHERE u.id = s.user_id ORDER BY s.created_at DESC) AS last_status_update
  FROM dbo.Users u
  ORDER BY 4

You’ll find that users who have protected their Twitter stream won’t show a recent status update. Dang! Well, for the purpose of this demo, we’re going to assume that protected users are worth following, and we won’t include them in our analysis.

Finding The Dead Wood

Let’s query the database to fetch user information, and we’ll order it by the time the user sent their last update:

SELECT u.name, u.screen_name, u.location, u.description, u.profile_image_url,
    u.url, u.followers_count, ('http://twitter.com/' + screen_name) AS twitter_url,
	(SELECT TOP 1 created_at FROM dbo.Statuses s
     WHERE u.id = s.user_id ORDER BY s.created_at DESC) AS last_status_update
  FROM dbo.Users u
  WHERE u.protected = 0
  ORDER BY 9
Twitter People To Unfollow

Twitter People To Unfollow, SQL Style

My results are shown at right, and just by glancing at them, I can see that I could easily dump a bunch of people.  For example, @HurricaneIke doesn’t really matter to me anymore, and I can see why that account hasn’t been updating lately.

I could write a T-SQL script to simply unfollow people who haven’t updated in X days, but there are some people in the list that I want to keep no matter what.  If @cwgabriel of Penny Arcade starts tweeting again, I wanna know.

Instead, what I really need is a web page that I can click on when I want to unfollow them.

I know Classic ASP, and I could write a page that would query SQL Server to build the result set, but we’re knee-deep in T-SQL here and it’s time to show off a really ugly T-SQL hack.  Let’s build the whole thing entirely with T-SQL.

Building the Web Page

This last query takes the results from our database and formats them HTML style.  This is horrendous code, and you should never ever do anything like this in production.  Don’t generate HTML inside the database.  It’s a bad idea for a million reasons.

Having said that, let’s do it!  Run this query:

SELECT ‘<p><img src=”‘ + u.profile_image_url + ‘” width=”40″ height=”40″ align=”left”>’
+ ‘<a href=”http://twitter.com/’ + u.screen_name + ‘” target=”_blank”>’ + u.screen_name + ‘</a><br />’
+ u.location + ‘<br />’
+ u.description + ‘<br />’
+ ‘URL: <a href=”‘ + u.url + ‘” target=”_blank”>’ + u.url + ‘</a><br />’
+ ‘Followers: ‘ + CAST(u.followers_count AS VARCHAR(20)) + ‘<br />’
+ ‘Last Update: ‘ + COALESCE((SELECT TOP 1 CAST(created_at AS VARCHAR(20)) FROM dbo.Statuses s
WHERE u.id = s.user_id ORDER BY s.created_at DESC),’Never’) + ‘</p><hr>’
FROM dbo.Users u
WHERE u.protected = 0
ORDER BY (SELECT TOP 1 created_at FROM dbo.Statuses s
WHERE u.id = s.user_id ORDER BY s.created_at DESC)

People to Unfollow

People to Unfollow, HTML Style

You’ll notice that this is not cleanly formatted like the rest of my SQL code examples.  That’s because WordPress, my blog software, starts choking when I mix T-SQL code with HTML in my code examples.  Heck, so does my SQL Server!

When you run that query, you’ll get a list of records that don’t mean much to you.

Open your favorite text editor and start a new file called MyFriends.html.  Put <html><body> at the top on a new line, then copy/paste your results from SQL Server into the text file.  Then add </body></html> at the end of the file, save it, and preview it with your web browser by double-clicking on the file in Windows Explorer.  If you’re lucky, it’ll look like my example at right.

Their name links to their Twitter page, and it opens in a new window so that you can race through the list unfollowing people.

Happy pruning!

By remixing the “ORDER BY” clause of the query, you can also sort by u.followers_count descending.  I find that early in my Twittering, I followed some people who seemed cool just because they were huge, like @scobleizer, but as I get more used to Twitter, I find those people less interesting.

Wondering Who’s Unfollowing You?

If you’re curious about who unfollows you, there’s a service called Qwitter that will email you whenever someone unfollows you.  I’ve had mixed luck with it – it hasn’t sent me an email in months, and I refuse to believe I’m that magnetic.

When Qwitter emails you, it includes the tweet that caused them to unfollow you – the straw that broke the camel’s back, if you will.  Don’t put too much faith into that, because it has to do with the frequency that Qwitter runs.  They might have unfollowed you several hours prior to that particular tweet.

Brent Ozar

Brent specializes in performance tuning for SQL Server, VMware, and storage. He's one of the very few Microsoft Certified Masters of SQL Server, a published author, and a Microsoft MVP. He likes travel, Jeeps, Apple gear, jokes, and writing about himself in the third person. Read more and contact Brent.

Website - Twitter - Facebook - More Posts

3 Responses to Dump Your Twitter Friends with Tweet-SQL
  1. [...] Brent Ozar has posted a detailed article on Tweet-SQL titled Dump Your Twitter Friends with Tweet-SQL. [...]

  2. Jason Strate
    March 10, 2009 | 10:52 AM

    I like this. I’m going to have to get Tweet-SQL up and running now.

  3. SDC
    September 26, 2009 | 9:07 AM

    Pretty sure you are right about qwitter. I haven’t gotten an email in months and occasionally I say stuff that I’m sure causes the less committed to drop me.

    Here’s an exercise for the reader(s): a pattern-matching (Reg Exp?) routine to find and block spammers.

Leave a Reply

Wanting to leave an <em>phasis on your comment?

Notify me of followup comments via e-mail. You can also subscribe without commenting.