INTERNET | Noah Brier

The Many Skins of Web Data

Some thoughts on APIs and the different ways to view the same data.

November 26, 2008 | 9 COMMENTS

As I've said in the past, I really love making stuff on the internet, as much for the thing that's created as for watching and learning from the reactions to it. This was most certainly the case with My First Tweet (which is still alive and well, by the way, with 5,370 first tweets in the DB so far). There's one response in particular I want to highlight today, though, because I think it's especially interesting.

A few days after launching I got an email from someone telling me I must take down their first tweet. It wasn't offensive or anything like that; rather, they just didn't like the idea that it was on the site without their having said it was okay. While I didn't really understand it, I figured it was a reasonable request and would only take a minute of my time. So I took it down. When they went back to check that I had done what I said, they found their first tweet again. Once again, I took it down.

Then I realized what the problem was. You see, the site is built so that if the user's first tweet isn't already in the database, it queries Twitter's API and grabs it. That means that every time they went back to check whether I had been honest, their own lookup pulled the tweet from Twitter again and put it right back in the database.
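The behavior described above can be sketched in a few lines. This is a minimal illustration, not the site's actual code: `make_lookup`, the `db` dictionary, and the `remote` dictionary standing in for Twitter's API are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, Optional


def make_lookup(db: Dict[str, str],
                fetch_from_api: Callable[[str], Optional[str]]):
    """Build a lookup that fills the database whenever an entry is missing."""
    def lookup(username: str) -> Optional[str]:
        if username in db:
            return db[username]           # already stored: serve it from the DB
        tweet = fetch_from_api(username)  # miss: ask the remote service
        if tweet is not None:
            db[username] = tweet          # ...and store what came back
        return tweet
    return lookup


# Hypothetical stand-in for the remote API: a plain dictionary.
remote = {"xyz": "my very first tweet"}
db: Dict[str, str] = {}
lookup = make_lookup(db, remote.get)

lookup("xyz")       # first visit: the tweet is fetched and stored
del db["xyz"]       # the takedown: the row is deleted
lookup("xyz")       # the user checks the site: the tweet is fetched again
print("xyz" in db)  # True
```

Deleting the row only lasts until the next lookup. One way around that, under these assumptions, would be a separate opt-out list that is checked before the fetch.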

That, I thought, is a really interesting problem. I went over to read Twitter's terms of service and indeed you, the user, own everything you create. In addition, they "encourage users to contribute their creations to the public domain or consider progressive licensing terms." However, from a technology perspective there are only two states for Twitter: public and private.

Let me step back for one second and explain the act of querying Twitter's API. Basically, when someone puts their username into the site, I send a message to Twitter saying, "hey, can I have the information for the user XYZ?" Twitter then sends me back one of two different messages. Most often they say, "sure, here's the info you requested," but sometimes they say, "sorry, we can't give you that info because the user you requested has made themselves private." (When you try to look at the tweets of a user that is private on twitter.com you get a little lock icon and a message that says you can only see this person's tweets if they give you permission.)

So basically Twitter is a binary system: you are either public or you are private. If you're private I can't grab your first tweet. However, if you're public, I can, whether you want me to or not.
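To make the two possible replies concrete, here is a small sketch of the service's side of that exchange. It does not use Twitter's real API or response format; `ApiReply`, `ACCOUNTS`, and `request_first_tweet` are made-up names, and the accounts are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiReply:
    ok: bool                     # True: "sure, here's the info you requested"
    tweet: Optional[str] = None  # only filled in when ok is True


# Invented accounts; the only per-user switch is public vs. private.
ACCOUNTS = {
    "xyz":   {"private": False, "first_tweet": "hello world"},
    "quiet": {"private": True,  "first_tweet": "for friends only"},
}


def request_first_tweet(username: str) -> ApiReply:
    """Answer a request the way a binary public/private service would."""
    account = ACCOUNTS[username]
    if account["private"]:
        return ApiReply(ok=False)  # "sorry, that user is private"
    return ApiReply(ok=True, tweet=account["first_tweet"])


print(request_first_tweet("xyz"))
print(request_first_tweet("quiet"))
```

There is no third branch for "public, but please don't republish this," which is the gap the takedown request fell into.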

This is particularly interesting to me for a few reasons. First, it's a good way to explain how outdated the idea of webpages really is. Most people think of them as these hard-coded things, like pages in a magazine or something. However, many of the webpages you look at are not created until the moment you look at the site. Brand Tags, for instance, really only consists of about a dozen files. Even though there are 800 brands in the system, all the tag clouds are generated by the same few lines of code, which query the database and return the formatted results. When I got the request to take down the first tweet, I complied, but it didn't really matter because the page never existed as anything but a database entry in the first place.
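As a rough illustration of "the same few lines of code" serving every page, here is a sketch using Python's built-in `sqlite3` module. The table layout, brand names, and font-size rule are assumptions made up for the example, not how Brand Tags is actually built.

```python
import sqlite3
from html import escape

# An in-memory database with invented data: one row per submitted tag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (brand TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [("acme", "fast"), ("acme", "fast"), ("acme", "cheap"), ("globex", "evil")],
)


def render_tag_cloud(conn: sqlite3.Connection, brand: str) -> str:
    """Build a brand's page at request time; nothing is stored as a file."""
    rows = conn.execute(
        "SELECT tag, COUNT(*) FROM tags WHERE brand = ? "
        "GROUP BY tag ORDER BY tag",
        (brand,),
    ).fetchall()
    # More popular tags get a bigger font (the sizing rule is arbitrary).
    spans = [
        f'<span style="font-size:{10 + 4 * count}px">{escape(tag)}</span>'
        for tag, count in rows
    ]
    return f"<h1>{escape(brand)}</h1>\n" + " ".join(spans)


print(render_tag_cloud(conn, "acme"))
print(render_tag_cloud(conn, "globex"))
```

The one function renders any brand in the table, so adding an 801st brand means adding rows, not pages.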

What's so interesting about this is that that's actually how Twitter works as well (I believe). The results that the Twitter API returns are remarkably similar to the way the pages are formatted (down to the fact that you can only get to page 160 on both Twitter.com and from their API). That means that the site isn't so much a site as it is a view for the data (of which My First Tweet is one, search.twitter.com is another and Twitter Grader is a third).

Twitter isn't alone in working this way, either. Most sites these days are just skins for the underlying data, which is increasingly being shared with others who are making new skins for it. This isn't news to those who build things on the web, but I think it is fundamentally different from how the average user understands the web to work. Just something to think about.

The second point I wanted to make is around this public/private thing. In a world where everything is just skins for the underlying data, you have fewer and fewer controls over how that data is displayed when you sign up to use a service. Some services (like Flickr) allow you to specify a license for your work (full copyright, Creative Commons, etc.) and they report that to those people who want to work with the data. But even then, the API user can choose to ignore the licensing entirely and just take the photo, unless the user has specified that it CANNOT be used (either because it's private or because there is no access to the full size).
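The point about licensing can be sketched as follows. The records and license labels below are invented for illustration and are not Flickr's actual API fields or values; the idea is only that the license arrives as a piece of data, and honoring it is up to whoever writes the client.

```python
# Invented photo records, as an API might return them (hypothetical fields).
PHOTOS = [
    {"url": "http://example.com/a.jpg", "license": "all-rights-reserved"},
    {"url": "http://example.com/b.jpg", "license": "cc-by"},
    {"url": "http://example.com/c.jpg", "license": "cc-by-nc"},
]

# Labels this example treats as reusable (an assumption for the sketch).
REUSABLE = {"cc-by", "cc-by-nc"}


def respectful_client(photos):
    """Use only the photos whose reported license allows reuse."""
    return [p["url"] for p in photos if p["license"] in REUSABLE]


def careless_client(photos):
    """Ignore the license field entirely; nothing in the data prevents it."""
    return [p["url"] for p in photos]


print(respectful_client(PHOTOS))
print(careless_client(PHOTOS))
```

Both clients receive exactly the same data. The only hard stop is when the service declines to return a photo at all.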

As someone who develops with APIs, this kind of flexibility is pretty awesome. I can get access to pretty much anything I want (which is rad). But for some users, clearly this is worrying. I don't know that more safeguards need to be put in place, but I do think that this wholesale data access needs to be better explained (there's a tendency to live in a world where we assume people know what an API is [1]).

As usual, no hard answers here, just some stuff to think about.

[1] While I'm no technician, I do think it's worth trying to explain what an API is, since the term is thrown around quite a bit these days. Essentially an API is just wholesale access to the data/functionality from a web service. If you're Google Maps, that can mean letting people send you an address and returning the latitude and longitude; if you're Flickr, that can mean returning the URLs for photos tagged with noah. Developers then can find lots of different ways to use the data/functionality. Essentially, with access to the raw data the sky is the limit. In some ways, RSS feeds are kind of like APIs for websites. They provide people with some access to the underlying data (which is separated from the presentation layer that you see when you visit NoahBrier.com for instance). (I don't know if this definition is helpful at all. If anyone wants I can take another shot, or maybe someone else can try to give a better definition in the comments.)
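To illustrate the comparison between RSS and APIs, here is a small sketch that reads a feed with Python's standard `xml.etree.ElementTree` module and presents the same items two ways. The feed content is invented, and `as_text` and `as_html` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# A tiny, invented RSS 2.0 feed: the data with no presentation attached.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

root = ET.fromstring(FEED)
items = [(i.findtext("title"), i.findtext("link")) for i in root.iter("item")]


def as_text(items):
    """One skin: a plain-text listing."""
    return "\n".join(f"{title}: {link}" for title, link in items)


def as_html(items):
    """Another skin: an HTML list built from the same data."""
    rows = "".join(f'<li><a href="{link}">{title}</a></li>' for title, link in items)
    return f"<ul>{rows}</ul>"


print(as_text(items))
print(as_html(items))
```

Neither output is "the" page. Each is just one way of dressing up the same underlying entries.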


COMMENTS

1. Lee Stacey (@pilchardmusic)

Useful post, Noah. Loving "My First Tweet" by the way. Interesting to see people's first step into the twitterverse.

I wonder how many first tweets say something like "test".

November 26, 2008

2. faris

completely awesome.

November 27, 2008

3. Noah Brier

@Lee: Thanks a bunch. And so far out of the roughly 6,000 first tweets, there are 160 that have the word test (in some form). That's 2.6% ... Happy to run any more queries you're interested in if you'd like.

@Faris: Thanks dude.

November 27, 2008

4. charles gallant

It's really too bad that we call them "APIs". They're suffering in the same way that "RSS" does: the average person doesn't know what the heck they are... and the name doesn't begin to describe them. Instead of API, what if it were called...

- Data Toolkit
- Info (or Data) Feed
- Data Access

November 29, 2008

5. Matt

There should be a class on this stuff--or at least a really good blog--that explains the underlying technology for seasoned marketers that haven't got a clue. I've tried to explain APIs to MBAs, but unless you can create relevance (e.g., APIs create privacy issues), it doesn't seem to get enough attention.

November 30, 2008

6. stephanie gerson

do data and functionality have to be married? or what am I missing?

December 1, 2008

7. Noah Brier

@Stephanie: Um, not sure I completely follow the question, but no, I don't think so. Mind explaining a bit more ...

December 1, 2008

8. stephanie gerson

in your definition of API, you refer to data/functionality as one....er...thing. but to me, data and functionality are different organisms, and it matters if API gives access to data and/OR functionality. so I was wondering if data and functionality are always married from an API's point of view.

December 1, 2008

9. Noah Brier

Oh, gotcha. Sorry. Nope, they're not. And functionality is honestly stretching it a little bit, as at the end of the day that functionality is delivered by way of data. I know I was thinking about something when I wrote functionality, but now I can't seem to remember what it was ...

December 1, 2008