Twitter Bug Makes Tweet Archives Unreliable For eDiscovery

Tweets from 2010 and earlier suffer from a URL redirection problem.


Old Tweets: Now You See Them, Now You Don’t

I’ve been on Twitter continuously since 2008-10-30. Here’s my first Tweet:

At first, I played Twitter’s game: followed lots of people, had lots of people follow me, and posted lots of Tweets. I then gained “authority” based on sites that claim to measure such things (screenshot from 2009-06-09):

[Screenshot: LexTweet home page, ErikJHeels listed 5th]

In early 2014, I changed my thinking about Twitter and other social networks. I adopted document retention policies that included deleting old stuff (including email and social networking stuff) and keeping only the good stuff. Turns out that most of what I posted on Twitter was not worth the paper it was printed on, so to speak. So I deleted most of my old Tweets (and other stuff).

At some point, however, I noticed that Twitter was pretending that my first Tweet was from 2010-09-05, nearly two years after I joined Twitter:

In other words, Twitter was preventing me (blocking me?) from accessing about two years’ worth of Tweets. I tried finding my old Tweets on the Twitter website, via third-party apps that use Twitter’s API (such as AllMyTweets.net), and via Twitter’s own downloadable archive of my Tweets. Same results: my Tweets from 2008 and 2009 were gone.

Why A Buggy (But Free) Twitter Is Problematic

This is a huge issue for several reasons.

First, it speaks to how bad Twitter’s software and customer service are. Numerous requests, both private and public (including case no. 03195672 and support@twitter.com requests dated 2014-06-25, 2014-07-11, and 2014-11-10) to fix this problem were ignored.

Second, it means that Twitter is saying one thing (i.e., you can download all of your Tweets) but doing another (i.e., except for those that you cannot).

Third, anything you say can, and will, be used against you in a court of law. So if you are involved in eDiscovery and are trying either to delete or to discover old Tweets, then you will run headfirst into this bug.

Needless to say, I think that Twitter should fix this issue, explain why it happened, apologize, and explain how it will not happen again. I am doubtful, however, that this will actually happen, since those of us who use the Twitter service for free are not the customers – we are the product. So we’re getting all of the customer support that we’ve paid for.

All of this reminds me of the Jul/Aug 2002 MIT Technology Review cover story entitled “Why Software Is So Bad” (http://www.technologyreview.com/articles/mann0702.asp). In short, software is bad because we, as users, put up with bad software. I have complained about bad software and sloppy programming in the past (see “related posts” below). And in some cases, I’ve received a free t-shirt for my efforts. But this Twitter bug, IMHO, takes the cake.

My Own eDiscovery Discovers Twitter’s Reproducible Bug

Since Twitter chose to ignore my support requests, I set out to solve the problem myself. Here’s what I discovered.

On 2010-10-13, Twitter announced that 100% of its users had access to the “new Twitter,” including a makeover of Twitter’s web UI (https://blog.twitter.com/2010/100).

In approximately the fall of 2010, during the rollout of the “new Twitter,” Twitter changed the format of its status URLs (Tweets) so that the sequential number at the end of each URL (the Tweet ID) changed length. From 2008-10-30 (when I joined Twitter) to 2014-11-17 (today), the length of the Tweet ID doubled from nine digits (which supports up to one billion (1,000,000,000) unique Tweets) to 18 digits (which supports up to one quintillion, or a billion billion (1,000,000,000,000,000,000), unique Tweets). More on this below.
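The digit arithmetic is easy to check. Here is a minimal sketch in Python; the helper name id_capacity is hypothetical, and it assumes that Tweet IDs are plain decimal counters, which Twitter has not confirmed.

```python
def id_capacity(digits: int) -> int:
    """Count the distinct non-negative integers that fit in `digits` decimal digits."""
    return 10 ** digits


nine_digit = id_capacity(9)
eighteen_digit = id_capacity(18)

print(f"{nine_digit:,}")       # 1,000,000,000 (one billion)
print(f"{eighteen_digit:,}")   # 1,000,000,000,000,000,000 (one quintillion)

# Doubling the digit count squares the capacity: a billion billion.
print(eighteen_digit == nine_digit * nine_digit)  # True
```

Doubling the number of digits does not double the number of possible Tweets; it squares it.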

On 2012-12-19, Twitter announced that users could export archives of their Tweets (https://blog.twitter.com/2012/your-twitter-archive). The tweets.csv file that is included with your Twitter archive contains the following nine fields:

  1. tweet_id
  2. in_reply_to_status_id
  3. in_reply_to_user_id
  4. timestamp
  5. source
  6. text
  7. retweeted_status_id
  8. retweeted_status_user_id
  9. retweeted_status_timestamp

Of these, tweet_id is the most interesting, as it contains the (presumably sequential) number needed to recreate your status URL (AKA Tweet).
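To make that concrete, here is a rough sketch of how status URLs could be rebuilt from tweets.csv. It assumes the nine-column header listed above; the function name status_urls and the sample rows are hypothetical (the timestamps and text are placeholders, not real archive data).

```python
import csv
import io

# Hypothetical two-row sample using the nine columns listed above.
SAMPLE_CSV = (
    "tweet_id,in_reply_to_status_id,in_reply_to_user_id,timestamp,source,"
    "text,retweeted_status_id,retweeted_status_user_id,"
    "retweeted_status_timestamp\n"
    "6898295347609600,,,2010-11-22 00:00:00 +0000,web,example one,,,\n"
    "7798832997859330,,,2010-11-25 00:00:00 +0000,web,example two,,,\n"
)


def status_urls(csv_file, screen_name):
    """Yield one status URL per archive row, keeping tweet_id as text."""
    for row in csv.DictReader(csv_file):
        yield f"https://twitter.com/{screen_name}/status/{row['tweet_id']}"


for url in status_urls(io.StringIO(SAMPLE_CSV), "ErikJHeels"):
    print(url)
```

The sketch deliberately never converts tweet_id to a number. IDs this long can be rounded by tools that store numbers as floating point, so treating the ID as a string keeps every digit exactly as it appears in the file.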

I first requested my archived Tweets on 2013-09-16, and it is my archive from this date that provided the information needed to crack the code on this bug. Archives requested since this one exclude Tweets from 2008 and 2009.

Of course, my old Tweets are not really gone. If you have the URL, you can still find them. Right? Or wrong?

Right and wrong.

For many of my old Tweets, the old URLs still worked. But for a few, the URL for my Tweet redirected to somebody else’s account with the same Tweet ID! Same Tweet ID, different Twitter account. Here is the proof: video, screenshots, and URLs. In all three cases, my URL redirects to somebody else’s Twitter account.

* 2014-11-17 Twitter eDiscovery Redirect Bug (60 sec)
https://www.youtube.com/watch?v=pk10SDn0Ij8

Compare one bogus URL, which (correctly) goes to Twitter’s 404 page:

https://twitter.com/ErikJHeels/status/1234567890123456 (16 digits)

to three valid URLs, which (incorrectly) get redirected to accounts other than the original:

Redirected Tweet #1 from 2010-11-22

[Screenshot: redirected Tweet 6898295347609600, 2010-11-22]

my Tweet: https://twitter.com/ErikJHeels/status/6898295347609600 (16 digits)
not mine: https://twitter.com/ayessadelapena/status/6898295347609600

Redirected Tweet #2 from 2010-11-25

[Screenshot: redirected Tweet 7798832997859330, 2010-11-25]

my Tweet: https://twitter.com/ErikJHeels/status/7798832997859330 (16 digits)
not mine: https://twitter.com/cyfraley/status/7798832997859330

Redirected Tweet #3 from 2010-12-14

[Screenshot: redirected Tweet 14667971616051200, 2010-12-14]

my Tweet: https://twitter.com/ErikJHeels/status/14667971616051200 (17 digits)
not mine: https://twitter.com/BaddAzzAng814/status/14667971616051200
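The comparison above can be automated. What follows is only a sketch: parse_status_url and is_misdirected are hypothetical helper names, and the code does not contact Twitter. You would need to capture the final URL yourself (from a browser or an HTTP client) after following the redirect.

```python
from urllib.parse import urlparse


def parse_status_url(url):
    """Split a status URL into (screen_name, tweet_id), both as strings."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[1] != "status" or not parts[2].isdigit():
        raise ValueError(f"not a status URL: {url}")
    return parts[0], parts[2]


def is_misdirected(requested_url, final_url):
    """True when the Tweet ID is unchanged but the account is different."""
    requested_name, requested_id = parse_status_url(requested_url)
    final_name, final_id = parse_status_url(final_url)
    # Screen names are compared case-insensitively.
    return requested_id == final_id and requested_name.lower() != final_name.lower()


# The three (requested, final) URL pairs listed above.
PAIRS = [
    ("https://twitter.com/ErikJHeels/status/6898295347609600",
     "https://twitter.com/ayessadelapena/status/6898295347609600"),
    ("https://twitter.com/ErikJHeels/status/7798832997859330",
     "https://twitter.com/cyfraley/status/7798832997859330"),
    ("https://twitter.com/ErikJHeels/status/14667971616051200",
     "https://twitter.com/BaddAzzAng814/status/14667971616051200"),
]

for requested, final in PAIRS:
    _, tweet_id = parse_status_url(requested)
    print(tweet_id, len(tweet_id), "digits, misdirected:", is_misdirected(requested, final))
```

A URL that lands back on the same account is reported as not misdirected, so the same check can be run over every URL rebuilt from an archive.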

Why Users Should Demand A Less Buggy (And More Responsive) Twitter

Here is my tweets.csv file from 2013-09-16, showing three valid Tweets (highlighted in green) and three redirected Tweets (highlighted in yellow):

[Screenshot: tweets.csv with valid Tweets in green and missing Tweets in yellow]

So what happened to the redirected Tweets from my account? Are Tweets from other Twitter accounts redirecting to my account? What if one of those hidden/redirected Tweets is the key piece of evidence needed in a civil or criminal trial? Litigators and litigants who think that they can rely on Twitter’s Tweet archives to make or break their case will be disappointed at the news that this Twitter bug makes Tweet archives unreliable for eDiscovery. Among other things.

This is, admittedly, a small sample size. But consider that I deleted all but eight of my Tweets from 2010. Now it’s a big problem, since three of my remaining eight Tweets (37.5%) suffer from this bug.

How many of your Tweets are being misdirected to somebody else’s Twitter account?

How many of others’ Tweets are being misdirected to your Twitter account?

How many of your Tweets are missing and inaccessible?

When was the last time you downloaded and validated your Twitter archive?
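One simple validation is to compare the earliest Tweet in your archive with the date you joined Twitter. The sketch below is a starting point, not a complete audit. It assumes the timestamp column looks like "2010-09-05 00:00:00 +0000"; the helper names and the two-column sample rows are hypothetical (a real tweets.csv has the nine columns described earlier).

```python
import csv
import io
from datetime import date, datetime

# Assumed layout of the timestamp column; adjust if your archive differs.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S %z"

# Hypothetical sample: placeholder IDs and times, trimmed to two columns.
SAMPLE_CSV = (
    "tweet_id,timestamp\n"
    "111,2010-11-22 00:00:00 +0000\n"
    "222,2010-09-05 00:00:00 +0000\n"
)


def earliest_tweet(csv_file):
    """Return the earliest timestamp in the archive, or None if it is empty."""
    stamps = [
        datetime.strptime(row["timestamp"], TIMESTAMP_FORMAT)
        for row in csv.DictReader(csv_file)
    ]
    return min(stamps) if stamps else None


def gap_in_days(csv_file, joined_on):
    """Days between joining Twitter and the first Tweet the archive contains."""
    first = earliest_tweet(csv_file)
    if first is None:
        return None
    return (first.date() - joined_on).days


# Joined 2008-10-30; the archive's first Tweet is dated 2010-09-05.
print(gap_in_days(io.StringIO(SAMPLE_CSV), date(2008, 10, 30)), "days unaccounted for")
```

A large gap does not prove that Tweets are missing (you may simply not have Tweeted), but it tells you where to start looking.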

In the end, Twitter itself doesn’t really matter. Unless you really need it. In which case it matters immensely. So my advice is this: don’t use Twitter unless and until Twitter can prove that it has fixed this fundamental flaw. Just say no to bad software.

Oh and Twitter, if you’re reading this, I wear an XL t-shirt.


Erik J. Heels is a patent and trademark lawyer for Boston startups, Red Sox fan, MIT engineer, and musician. He blogs about technology, law, baseball, and rock ‘n’ roll at ErikJHeels.com.

Related Posts

  1. Top 10 Questions To Ask Before Your Company Spends Money On Social Media (2010-09-12)
    Social networking is not magic.
  2. Google’s Buzz Tweaks Are Lipstick On A Pig, And Why Google 2010 Is Like Microsoft 1998 (2010-02-17)
    Just because something can be done doesn’t mean it should be done.
  3. Drawing That Explains Google Buzz Privacy Problems (2010-02-14)
    Think visually before launching technology products.
  4. #FeedBurnerFail Problem Fixed: The Perfect Storm Of Miscommunication (2009-05-22)
    When feeds are out of sight, feeds are out of mind.
  5. Major #Twitter Security Bug (2009-02-25)
  6. Gmail Fail: Account Lockdown: Unusual Activity Detected (2008-10-10)
    Gmail locks me out daily for READING my mail too quickly.
  7. Google Checkout Is Broken (2008-01-01)
    An ominous start to 2008 for Google Apps Premier Edition.
  8. Gmail Problems (2007-12-18)
    I want to trust Gmail, but it needs to be reliable.
  9. Google Analytics Chokes On Long WordPress URLs (2007-07-17)
    So I’m turning off Google Analytics for now.
  10. How To Debug Computer Problems (2007-07-15)
    Let’s work the problem people!
  11. Google Reader And Generalissimo Francisco Franco Are Still Dead (2007-05-25)
    Google Reader death watch day 3.
  12. Uncool: USPTO Breaks Millions Of Patent URLs Without Public Notice (2007-03-26)
    Static URLs? We don’t need no stinkin’ static URLs!
  13. Joel On (Perfect) Software (2004-04-26)
  14. Anatomy Of A Web Site Crash (1999-11-01)
    When our columnist’s Web site experienced a denial-of-service attack, the resulting devastation might be called the ‘loop condition from hell.’

Top 10 Reasons Shane Victorino is Worth 22 Seconds of Music

Don’t worry, about a thing, ’cause every little thing, gonna be all right!

Remember when the NHL changed rules to require helmets? And it grandfathered the “old school” players? Well, MLB has reduced the time allotted for pre-at-bat music from 22 seconds to 15 seconds.

Here are the top 10 reasons Shane Victorino should be grandfathered from this (silly) rule:

10. 08/13/13: Victorino lifts Sox over Blue Jays in 11th

9. 08/27/13: Victorino records seven RBIs in Red Sox rout

8. 09/04/13: Victorino makes catch, falls into stands

7. 09/05/13: Victorino comes up clutch

6. 09/06/13: Victorino hits two-run shot for Boston lead

5. 09/13/13: Victorino makes great running catch on liner

4. 10/08/13: ALDS Game 4: Victorino plates Ellsbury for 2-1 lead

3. 10/19/13: ALCS Game 6: Victorino’s clutch slam sends Sox to WS

2. 10/30/13: World Series Game 6: Victorino collects four RBIs in clincher

1. 10/30/13: ANY Shane Victorino Interview (You Know?)


