There was no laundry in my tiny apartment building, but I had a special "laundry key" which opened the front door of a totally different house. In the foyer of this house was a washer and dryer, and if you lived there you'd have a different key that opened up the house proper. It was an odd system.
The first time I hauled my laundry over to this house someone else was using the washer, so I had to come back the next night. The second time I made it. After drying my clothes, I tugged on the lint trap to clean it out. The trap practically exploded out of its receptacle as the hundreds of loads worth of lint it contained expanded to fill the space outside.
I peeled the lint off the lint trap. It was two inches thick, a lasagna of lint, striated in colors like the geologic column. There was no trash can in the laundry room, so no one had ever emptied the lint trap.
I didn't want the house to burn down, so I took the lintsagna with me and threw it in my building's dumpster. Sometimes I can still hear it calling me. It says, "I'm a pile of compressed lint and incapable of speech, but nonetheless youuuu are responsible for my deaaaaath!" I generally ignore it.
Another data point I'm not sure what to do with (see Dada Chess weirdness passim) is that both stories I've sold had their origin in weblog entries I posted to NYCB. "Mallory" was the end result of this bizarre entry, and "Awesome Dinosaurs" was the end result of this more-obviously-an-idea entry. I sold both stories to the first market I sent them to, though for both I had to do a revision and resubmit. If a story didn't start in this weblog, I haven't been able to sell it.
There's artistic license stuff like sound in space and stars visible from the lunar surface during the daytime. That stuff doesn't really bother me, and Moon at least gave alternate POVs for most of the sound you heard while the camera was in vacuum. There's stuff that would just be too expensive to get right, like filming all the scenes in lunar gravity. Moon did get the exterior scenes right. And then there's... the whole premise of the movie. Which doesn't make any sense.
And the movie knows it. As in many movies, there's a scene where the characters nibble around the fact that the premise doesn't make any sense, and then defuse it with a joke and move on. I call this the "Gremlins 2" solution. I wasn't even happy about it in Gremlins 2, which played it for laughs. I'm sorry but I can't let it go.
It's a good enough movie that I keep thinking of ways to tell similar stories without doing anything nonsensical. While the movie was going on I coped with the situation by deciding I was watching a horror movie. Horror movies work on the logic of nightmares, where something like what happens in Moon can make sense. But it's not satisfying to me as science fiction.
The other thing I was worried about was that this movie would be so similar to a story I wrote that I'd never be able to sell the story, but despite some shared inspirations the stories are pretty different. Not that I'll ever sell that story!
The numbers are large and steady enough that I'm starting to wonder if there is some significant advantage in Dada Chess to not moving first. I can't think of what it could be.
But for the better part of the decade I've been trying to come up with some fiendish plot involving shipping containers. Wednesday I was reading on the subway, when I looked up and envisioned a shipping container with the logo of an organization from my current writing project. I thought: Why would they make shipping-- and then I knew why. One of this organization's plot points makes one of my old shipping container schemes usable. It took years to create, but it fits together.
The feeling you get when everything fits together is a drug that I'm addicted to. It's why I write and read and play games. Like all drugs it's probably not good for me on balance, but unlike other drugs it produces things of value as a side effect.
Woody Hartzog and I are recruiting subjects for a study of privacy behaviors in online social networks (note: Twitter counts!). If you’d like to participate (and you can remotely – via phone, Skype, etc), please be in touch. The official recruitment script follows:
UNC-Chapel Hill researchers are conducting a study of privacy behaviors in social networking sites (Facebook, Myspace). We seek individuals who maintain multiple profiles, and purposefully keep their profiles separate on social networking sites. For example, people who maintain a “work profile” and a “personal profile” on a social networking site. If you meet this criteria, we are interested in your opinions about privacy, as well as the social implications of maintaining multiple profiles.
To qualify for this research, you must be age 24 or older, have started using social networking sites within the last two years, and maintain multiple profiles (e.g. a “work profile” and a “personal profile”) on social networking sites. Participation is entirely voluntary. Individuals who wish to participate will be interviewed for one hour, and will be compensated $10.00 for their time. Interviews can be in person, or remotely (over the phone, via Skype, etc.). To volunteer for participation, or ask any questions about the project, please email Principal Investigator Fred Stutzman at fred@fredstutzman.com. If you prefer, you may call 919-260-8508.
This research has been approved by the University of North Carolina Institutional Review Board, IRB-09-1078. Gary Marchionini, Ph.D., Cary C. Boshamer Distinguished Professor in the School of Information and Library Science, is faculty supervisor of this study.
Put simply, we’re looking for people who are fairly recent adopters of SNS, that maintain more than one profile on a site or sites (e.g. maintain a “personal” and “professional” identity that is separate), or who attempt to segment their identity in social network sites (create “friend lists” for coworkers, friends, or family, for instance). If you have any questions, please feel free to leave them in the comments or email me directly.
Fred Vogelstein has an interesting article in the new edition of Wired, previewing Facebook’s full-on assault of Google for targeted advertising territory. The article makes news, and includes some great (and painfully ironic) quotes from Mark Zuckerberg in which he accuses Google of contributing to the surveillance society (Pot, Kettle, Black). The article reads like a preview for the Super Bowl, with notoriously tight-lipped executives tossing bombs back and forth. Congrats to Vogelstein for successfully stoking the ire of these monoliths.
The fundamental conflict of the article lies in the comparison of the advertising products offered by the two companies. Google’s product, targeted text ads, is the single most successful product on the Internet. The tiny, unobtrusive ads have fueled Google’s dominance in multiple markets; today, 90% of Google’s revenue comes from AdSense. Facebook’s product is nascent – it is the concept that advertising works better when it is socially mediated. That is, we are more likely to click on ads, content, and links when the content is funneled through our friends. This theory is sensible, but to date, Facebook’s concept remains vaporware, with a majority of their revenue coming through traditional targeted text and banner campaigns.
Framed by Zuckerberg, the contrast between Facebook and Google is personal vs. impersonal. Of Google he states: “You have a bunch of machines and algorithms going out and crawling the Web and bringing information back. That only gets stuff that is publicly available to everyone. And it doesn’t give people the control that they need to be really comfortable.” Vogelstein writes:
Facebook CEO Mark Zuckerberg envisions a more personalized, humanized Web, where our network of friends, colleagues, peers, and family is our primary source of information, just as it is offline. In Zuckerberg’s vision, users will query this “social graph” to find a doctor, the best camera, or someone to hire—rather than tapping the cold mathematics of a Google search. It is a complete rethinking of how we navigate the online world, one that places Facebook right at the center. In other words, right where Google is now.
Personal vs. impersonal. Wouldn’t you rather get a doctor recommendation from ten of your friends than a text link? The value of peer recommendations has driven many communities, including countless bulletin boards and fora, sites like epinions and Yelp, and members-only specialist communities. The fundamental problem with monetization in Facebook’s case lies with norms that govern the exchange of advice, particularly that the advice be truthful and unbiased. If we are to trust advice, we must know that external agents aren’t corrupting or influencing the transmission of advice. We can get advice from Facebook regarding doctors, but we won’t trust the advice if Facebook pays our friends to recommend certain doctors.
Facebook’s grand vision involves a wholly-contained world of social information that is brokered out through the web. With enough critical mass, it is argued, most of our common information needs can be answered by our social networks. As with most technological main effect hypotheses, the formulation is generally suspect. Researchers of social support argue that support is more effectively derived from certain actors, that support is contextual, etc. In a traditional model, where the people around you are the primary producers of information, your personal support network is crucial. With the advent of the Internet, however, most of us no longer exist in a traditional model where the people around us are our only support vector (1).
The reality is that Google, and other search engines, have restructured expectations regarding everyday information seeking. It is no longer good enough to simply get recommendations from a personal network when there is a vast quantity of electronic information available at one’s fingertips. You can certainly get doctor recommendations from your friends, but the online search for information about the doctor is now a natural part of the information seeking process. In this sense, Facebook is complementary, providing an important but not all-encompassing factor in our decision making process. The argument that individuals will move their information seeking to a social network, and away from the mechanistic site Google simply assumes too much. Google has already won by making itself an integral part of our everyday information seeking processes.
If Facebook (a proxy for “socially mediated search”) is a complementary and useful part of everyday information seeking, we must consider the relevance of information we get from the site. We generally assess relevance in information systems through “recall” and “precision.” In Facebook, recall is strictly bound to our known social world – the people who we have connected with. Therefore, precision is a function of how well the various others producing results match our needs. If you have 500 friends, spaced across a variety of age ranges, is it safe to assume that information you get from the network will actually be all that relevant? Our core social networks are generally homophilous, but our core social networks are very small. Expand past a certain network size and it becomes likely the interests and experience of your “friends” will vary significantly from yours.
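To make the recall and precision terms concrete, here is a minimal Python sketch. The names and both sets are invented for illustration, and "relevant" is simply assumed to be knowable; nothing here comes from Facebook data.

```python
def precision(returned, relevant):
    """Share of returned answers that are actually relevant."""
    if not returned:
        return 0.0
    return len(returned & relevant) / len(returned)


def recall(returned, relevant):
    """Share of all relevant answers that were returned."""
    if not relevant:
        return 0.0
    return len(returned & relevant) / len(relevant)


# Hypothetical: people anywhere whose book advice would suit me.
relevant = {"ana", "ben", "chen", "dee"}
# Hypothetical: people in my friend network who answered my question.
returned = {"ana", "ben", "eli", "fay", "gus"}

print(f"precision = {precision(returned, relevant):.2f}")
print(f"recall    = {recall(returned, relevant):.2f}")
```

In this toy case only two of the five friends who answered are a good match, and two people who would have been a good match are outside the network entirely, which is the bound on recall described above.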
Facebook could address this problem with friend lists, the privacy feature that compels individuals to place their friends in groups. Perhaps friend lists could be converted to interest groups (People whose book recommendations I trust), but the mechanics of such a process would require a good bit of intervention on behalf of the user. The participation gap is also problematic – if the people who you really trust for book recommendations are not heavy users of Facebook, then it is unlikely you’ll have your information needs addressed.
Facebook could develop algorithms that look for similarity between question askers and answerers – if I ask for a book recommendation, perhaps Facebook could weight responses from people who share my stated book tastes. This compels participation and broadcast of information, one of Michael Zimmer’s new laws of social networking.
Although the debate framed by Vogelstein and Zuckerberg is Facebook vs. Google, there is actually very little opportunity for Facebook to significantly edge into Google’s core market – targeted text-link ads. Text link ads are served as a by-product of information search, which is an integral part of our everyday information seeking processes. Facebook is likely to emerge as a complement to search, and in some areas it may perform better than search, but search will remain relevant. The challenge to Facebook is to find a way to monetize their value areas without being in contravention of social norms. The challenge to Google is to get access to the wealth of personal data Facebook is collecting (and no, Google Friend Connect and all of their other terrifically lame social products will not solve this problem). For the consumer, the battle between Google and Facebook is a win-win, with the obvious exception of privacy matters.
(1) Those with “impoverished life-worlds” – those with limited access to information and resources, are unlikely to incorporate search engines or social networks into their everyday information search processes.
I made a very, very brief appearance on Morning Edition this AM:
Fred Stutzman, who studies social networks at the University of North Carolina, thinks charging for services will turn out to be the best way for social networks to get profitable.
“People will pay for good technology,” he says. “People will pay for a responsive company.”
He points to the professional networking site LinkedIn. It offers some free services, but users pay for a premium level with more features. With only 40 million users, LinkedIn is significantly smaller than Facebook or MySpace, but it’s making a profit.
Facebook, though, may face a bit of a conundrum. There are two groups on the site called “We Will Not Pay To Use Facebook. If This Happens We Are Gone.” Their combined membership? Nearly 8 million.
Stutzman thinks that ultimately Facebook, MySpace and Twitter are going to be around for a long time. They just might not be the big cash cows that some people expect.
Unfortunately I missed the live broadcast.
The book arrived today and I reread it. The political subtext is only sub-textual if you're a kid, but it did its job. Pretty much everything in the book is part of my adult philosophy, right down to the ham-handed satirical dialogue I write for government employees. Highly recommended assuming you want your kid to turn out like me.
The illustrations are also awesome. My main complaint (also mentioned in the postcard, which will show up sometime in the next 3 years) is that if a chicken gave birth to an evolutionary throwback it would be a theropod, not an ornithischian like Triceratops.
When I mentioned this book to Sumana she immediately countered with Homer Price (the book with the story about the donut machine), which I remember being really good. I was also considering John Fitzgerald's Great Brain books for the "lesser-known but awesome children's books" list, but those books have a pretty good Amazon sales rank (they're outselling RESTful Web Services) so they're not as obscure as I thought.
I got a huge amount of writing done yesterday. Today not so much. I really hope I can show you this soon (i.e. by the end of the year--it's a big project). Most of what you're seeing from me this year is sparks thrown off from this project or things I'm doing to procrastinate or recover from working on it.
Recently an article made the rounds of my syndication feeds, to the effect that you shouldn't even mention things you're working on until they're done, because your brain treats announcing a project as work on the project. If you look at my very early weblog entries they're full of promises I never followed up on. But after about 2000 I generally follow this rule, albeit sometimes to my detriment--I should have announced RESTful Web Services earlier to get more feedback. This time I'm happy to work on a big project in semi-silence because I'm still not convinced I can pull it off.
Michael Zimmer has released a new critique of the “Facebook Dataset” – and it is well worth reading.
Recall that last fall, a group of researchers affiliated with the Berkman Center for Internet & Society at Harvard University released a dataset of Facebook profile information from an entire cohort (the class of 2009) of college students from “an anonymous, northeastern American university.” While the researchers took good faith steps to preserve the anonymity of the source of the data (and, presumably, the privacy of the subjects), I quickly narrowed it down to 7 possible universities, and then with only a little more effort, identified the source (with some confidence) as Harvard College. All this without ever even downloading or looking at the actual data.
Download the draft of Michael’s paper.
The other day, we saw a sign: "Area Rugs On Sale". "That's the worst Onion headline I've ever seen," I said.
The crude distinction between genes as implacable programmers of a Calvinist predestination and the environment as the home of liberal free will is a fallacy.
Via Inside Facebook: comScore: Facebook Passed MySpace in the US for the First Time in May.
It’s been a long time coming, but Facebook has finally passed MySpace in terms of total US uniques, according to comScore. In May, comScore reported 70.28 million US uniques for Facebook up 97% year over year, compared to 70.26 million for MySpace down 5% year over year.
Blogging this for posterity’s sake.
Still needs a little work, but I think it's ready to launch. Roy's Postcards is a new Crummy weblog that will feature a new scanned and transcribed postcard from the 1980s, every day for the next three years. Most of the postcards were written by my father, either as notes to himself or as letters to me and my sisters, sent while he was on one of his many business trips. Some of the postcards are quotidian, some are crazy or silly, some are emotionally charged. A lot of them have beautiful, interesting, or bizarre pictures on the front. I hope you'll give it a look.
This is the largest extant corpus of my father's writing and I've been trying to figure out the best way to present it since I discovered these postcards in 2006. I think the one-a-day format, in a weblog intended to be experienced through the RSS feed, is the best way to keep the presentation interesting. It'll give you a little visual break every day in your feed reader while letting me go into some detail on each postcard, point out funny things, and explain what needs to be explained.
Over the past week or so I've processed enough postcards to have a year's worth of backlog. I estimate my total time investment in this project at about 100 hours. Not bad for three years of daily entertainment.
L: What was your favorite part of the movie?
S: When Star Trek was swimming with the whales.
L: Star Trek?
S: Spock. I meant Spock.
(That's also my favorite part of the movie.) Sumana also pointed out that the reconstituted Spock in ST:IV is pretty much the same as Data at the beginning of ST:TNG, before they rounded out his character.
One of the more understated bits of humor in that movie is that while most of the Enterprise crew can't function in the 20th century, Sulu does fine. He's from San Francisco and he knows about old machines, so he just does his job with no problems.
This movie also puts into perspective one of the more unrealistic parts of the new Star Trek movie; Kirk getting a command assignment straight out of the Academy. There's still no excuse for that, but the first four Trek movies show that Kirk is very good at being a starship captain and very bad at any other job. So it's really the best use of his talents, though you wouldn't know that ahead of time.
I do prefer an honest negative review to an ironical review like "Close to ps1 graphics. 5 stars."
PRIVATE HOME USE ONLY
Public Performance rights are not included; the DVD may not be exhibited publicly, commercially or theatrically. The DVD may not be exhibited in museum or gallery exhibitions without obtaining additional licenses. The DVD or any of its contents may not be broadcast, cablecast or webcast in any manner. The DVD may not be duplicated, distributed or reproduced in whole or in part. The DVD may not be licensed to any institution or individual. The DVD may not be altered or excerpted in any way.
Here's the Oskar Fischinger quote from the DVD cover:
These films have no limitations on when they can be shown. Like a great work of music or a great painting, they will become more valuable with age. Because of its complete originality, this type of film knows no boundaries of time or fashion.
On Monday, the Harvard Business School posted a “conversation starter” study on gender differences in Twitter use. The authors found that “men have 15% more followers than women” and “an average man is almost twice as likely to follow another man than a woman.” The authors suggest, without empirical data, that men find the content produced by women less compelling “because of a lack of photo sharing.” Is everyone else offended by this base characterization?
As it happens, the study has serious flaws. I’d like to point those out, and then suggest an alternative method for addressing these questions. Let’s start by talking methods. This study is a survey; using a random sample of 300,000 Twitter users, the authors attempt to draw population-level inferences about “friending” behavior in Twitter.
When conducting a population survey, researchers collect a sample and attempt to use that sample to draw inferences about a population. The difference between the “sampled” population value and the “true” population value is known as error. Survey error (MSE) has two components: sampling error and non-sampling error. We are most familiar with sampling error; it is the difference between the “sample” value and the “true” value attributable to the sample selection. Non-sampling error comprises all other error not attributable to sampling, such as data entry error, instrument error, etc.
For the purpose of this analysis, we are going to focus primarily on sampling error. At the study sample size of 300,000, there is very little sampling error in an infinite population. While we generally associate a large sample size with better quality data because of this small sampling error, there are two caveats. First, above a certain sample size, say 20,000, there is little marginal gain in the addition of sample. The difference between an n of 500 and an n of 1000 is vast, but the difference between an n of 20,000 and an n of 40,000 is much smaller due to the properties of the normal distribution.
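The diminishing return from extra sample can be sketched in a few lines of Python. This assumes the textbook margin-of-error formula for a proportion from a simple random sample, taken at the worst case p = 0.5 with a 95% confidence level; the function name is mine, not anything from the HBS study.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (500, 1000, 20_000, 40_000, 300_000):
    print(f"n = {n:>7,}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

Doubling n from 500 to 1000 trims the margin by more than a full percentage point, while doubling from 20,000 to 40,000 trims it by about a fifth of a point, because the standard error only shrinks with the square root of n.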
On paper, a larger n is always better; here is the second caveat. When dealing with very large samples, confidence intervals used to determine significance are smaller – meaning even the most minute differences become “significant.” Furthermore, discovery of influential data is more difficult, as those data may be sufficient in number (i.e., a pattern emerges in influential data) to influence the distribution. As any Twitter user with a public profile knows, there are certainly some “patterns” that emerge in follower behavior.
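Here is a rough Python illustration of how a trivial difference becomes “significant” at a large n. It assumes a standard pooled two-proportion z-test with equal group sizes, and the 50% vs. 51% figures are made up for the example; they are not from the study.

```python
import math


def two_proportion_p_value(p1, p2, n_per_group):
    """Two-sided p-value for a pooled two-proportion z-test, equal group sizes."""
    pooled = (p1 + p2) / 2
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    z = abs(p1 - p2) / se
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


# Hypothetical one-point gap between two groups, at two sample sizes.
p_small = two_proportion_p_value(0.50, 0.51, n_per_group=1_000)
p_large = two_proportion_p_value(0.50, 0.51, n_per_group=150_000)

print(f"n = 1,000 per group:   p = {p_small:.3f}")
print(f"n = 150,000 per group: p = {p_large:.2e}")
```

The same one-point gap is nowhere near significant with 1,000 users per group and overwhelmingly significant with 150,000 per group, even though its practical importance has not changed.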
Let us revisit the purpose of the survey, which is to use a sample to draw inferences about a population with as little total error as possible. The goal is not to achieve significant differences on wild hypotheses; it is to collect good data that represents a population. To achieve this goal, survey designers expend a lot of effort understanding their populations, defining their sample, and working to achieve high data quality (while keeping costs under control).
Let’s say that I wanted to know the 2008 income of everyone over 18 born in my city. So I go down to city hall, I ask for the names of everyone who was born in my city before 1991. I then take this very large list, and cross-reference it with my magical 2008 tax records, and produce a wonderful study. Can you spot some problems with the data? At first, you might point out that not everyone over 18 born in my city earns an income. Ok, that’s fine – I want to know that. Now here’s the real problem: my city started keeping records in 1830, meaning well over half of the people in my sample are dead, and they report no income. Now I’ve got some highly influential data that actually looks “normal” due to attrition.
Let’s consider what we know about Twitter. If we believe Nielsen, about 60% of people who create Twitter accounts abandon them within a month. And if we believe the fair and balanced news organization Fox News, Twitter has a spam problem (Ok, anyone who has a public profile knows that). What might these trends tell us about our population? First, there will be a large cluster of inactive (attrition) users. Second, there will likely be a large cluster of users who do not follow anyone, or follow a very small number of people (characteristic of attrited users). Finally, since following is non-reciprocal, these attrited users (and active users) likely have their follower numbers inflated by Twitter spammers.
What do the HBS numbers tell us? The authors find that the mean number of tweets per user is 26, but the median is 1 and the 75th percentile is 4 tweets. This indicates a highly non-normal distribution (it most likely approximates a bimodal distribution) in which a large number of users have 0 or 1 tweets (50% of the sample, with 75% of the sample at 4 tweets or fewer). This suggests that a large portion of the sample is inactive. Of course, a number of these accounts could be “follower” accounts (i.e. people who do not post but follow), but I would argue this would constitute a small portion of the population. This provides good support for my first point.
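To show how a mean of 26 can coexist with a median of 1, here is a small Python sketch. The tweet counts are entirely invented (they are NOT the HBS data); I picked them so the summary statistics land near the reported ones.

```python
import statistics

# Hypothetical tweet counts for 100 accounts (NOT the HBS data),
# chosen so the summary statistics resemble the ones quoted above.
counts = (
    [0] * 20 + [1] * 35 + [2] * 10 + [3] * 8 + [4] * 7  # 80 near-silent accounts
    + [50] * 10 + [150] * 5 + [250] * 5                  # 20 heavy posters
)

mean = statistics.mean(counts)
median = statistics.median(counts)
p75 = statistics.quantiles(counts, n=4)[2]  # third quartile
share_quiet = sum(c <= 1 for c in counts) / len(counts)

print(f"mean = {mean:.1f}, median = {median}, 75th percentile = {p75}")
print(f"share with 0 or 1 tweets = {share_quiet:.0%}")
```

A fifth of the accounts produce nearly all of the tweets here, so the mean describes almost nobody in the sample: most accounts sit at 0 or 1 and a small cluster sits far to the right.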
My second point, non-follower data, is not addressed by the study. They do not present information regarding the percentage of users who do not follow back, instead presenting an odds ratio that would hide the distribution of followers. I would guess that at least 40% of the sample does not follow a user (or follows only “suggested” users). My third point, that more people would be followed, seems to be upheld, as 80% of the sample has at least one follower. There is likely some spam inflation there, and information about the distribution would tell us a lot.
As we can see, all signs point to low data quality, which casts all of the hypotheses and findings in serious doubt. Just because a sample is large, and significance can be easily achieved, it doesn’t mean that data quality is good. Unfortunately, it appears that the Harvard authors have made the error I describe in my income study – yes, they’ve collected a lot of people, but they failed to see who had died. What good is an inference about a population if it is heavily influenced by bad data? Don’t we actually want to know what real users are doing?
Beyond these data quality problems, there is also an issue with the gender classification; the authors rely on a corpus of names to predict gender of users. As each name is a prediction, there is an error component associated with each name classification. This error component must be taken into account as a function of the total variance component – meaning all of the things that looked significant may not actually be significant.
Since this is a “discussion,” I’d like to propose a method to re-run the study with better data quality (but larger standard errors). The two main problems that will be addressed are compensation for attrition and gender classification. To deal with Twitter attrition, let us first define it. If we follow Nielsen’s numbers, a Twitter user that has posted at least once at >30 days and <30 days has a decent chance of being an active user. We may want to make this criterion more lenient – perhaps just requiring one post in the last 30 days. Either way, we must define a criterion to decide who is an active user (and this definition must be informed by data and theory).
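The two candidate activity rules can be written down as a simple filter. This is a sketch under my own reading of the rules: “strict” means at least one post older than 30 days and at least one within the last 30 days, and “lenient” means at least one post within the last 30 days. The function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def is_active(post_times, now, window_days=30, strict=True):
    """Classify an account as active from its list of post timestamps."""
    cutoff = now - timedelta(days=window_days)
    has_recent = any(t >= cutoff for t in post_times)
    has_older = any(t < cutoff for t in post_times)
    # Strict: the account existed before the window and is still posting.
    # Lenient: any post inside the window is enough.
    return (has_recent and has_older) if strict else has_recent


now = datetime(2009, 6, 5)
steady_user = [datetime(2009, 3, 1), datetime(2009, 6, 1)]
new_user = [datetime(2009, 6, 2)]
abandoned = [datetime(2009, 2, 10)]

for name, posts in [("steady", steady_user), ("new", new_user), ("abandoned", abandoned)]:
    print(name, is_active(posts, now), is_active(posts, now, strict=False))
```

The strict rule throws out brand-new accounts that might still be abandoned within the month, while the lenient rule keeps them; which one is appropriate is exactly the data-and-theory question raised above.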
The problem with gender is a little more difficult. I don’t spend a lot of time in the TREC community so I’m not sure how good automated techniques are, so I’m going to propose human classification. The most efficient way to do this is with Mechanical Turk. Turkers could be shown a profile and asked to decide the gender of the profile owner; you’d repeat with a different rater to get an estimate of reliability. Your guess is as good as mine about agreement – I’m generally skeptical of ethnicity ratings by third parties, but I tend to think that gender can be reasonably assessed. Update: @yardi brings up a good point regarding brand/persona/promotional/shared accounts. My (too simple) answer is exclusion. If we’re truly interested in this gender question, then non-gendered accounts fit an a priori exclusion criterion. My gut instinct is that in a population sample, we would see low incidence of these accounts, and they could be analyzed separately to see how they would affect our data.
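One common way to estimate the reliability of two raters is raw agreement plus Cohen's kappa, which corrects agreement for chance. I am assuming kappa here as a reasonable choice, not claiming it is the only one; the ratings below are invented, not Mechanical Turk output.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which the two raters gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(a)
    observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical gender codes from two raters for the same ten profiles.
rater_1 = ["F", "F", "M", "M", "F", "M", "F", "M", "M", "F"]
rater_2 = ["F", "F", "M", "M", "F", "M", "M", "M", "M", "F"]

print(f"agreement = {percent_agreement(rater_1, rater_2):.2f}")
print(f"kappa     = {cohens_kappa(rater_1, rater_2):.2f}")
```

Profiles on which the raters disagree could be sent to a third rater or excluded, which would need to be decided before coding begins.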
So the study would be simple – collect a first-stage sample of profiles and assess if they meet the activity criteria (this can be done automatically). Then run a second stage random sample on the eligibles and send them to Mechanical Turk. You could send 3000 profiles to MTurk and have them assessed, with a goal of ending up with 2400 profiles, giving you +/-2% at p<.05. Of course, all of the “friend lists” would also have to be gender coded, so if you have an average of 10 friends you’re looking at 24,000 extra codings (minus overlap). If we include overlap and say we’ll have 25,000 unique profiles, and each profile has to be rated twice, at $.01 a HIT we’re looking at a total price of $500.00. Of course, if we pull our sample back we can reduce this cost substantially.
There are a couple of questions: First, we can’t really say how much better humans will perform at gender-coding until we run a comparison to the machine-coded results. My gut is that humans will perform at a higher level of accuracy, but there is still a variance component with the classification. We also don’t know what kind of bias we introduce by cutting out “follower” profiles. I don’t know how many of these unique profiles would show up in a population survey, but it is an open question. And what about the findings, how would they change? My gut is that a lot of these “stunning” findings would go away, and we’d see greater gender homogeny in “following” behavior. “Follower” behavior would still be influenced by spam, so it might be useful to assign a spam attribute to profiles to be used as a covariate (you could have MTers code them, run them through spamassassin to get a naive score, or simply use standard techniques to find influential data).
The important takeaways from this discussion are that “bigger” is not always better with social data, that data should be looked at critically before running analysis (using existing information and theory), and data that wildly contravenes existing findings should always be re-run to produce robust estimates.
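The back-of-envelope numbers above can be checked with a short Python sketch. The inputs are the figures from the paragraph; the worst-case p = 0.5 margin-of-error formula is my assumption about how the +/-2% was reached.

```python
import math

sampled_profiles = 2_400      # target number of coded profiles
avg_friends = 10              # assumed average friend-list size
unique_profiles = 25_000      # rounded figure for profiles to code, after overlap
ratings_per_profile = 2       # each profile coded by two raters
price_per_hit_cents = 1       # $.01 per HIT

friend_codings = sampled_profiles * avg_friends
total_hits = unique_profiles * ratings_per_profile
total_cost_dollars = total_hits * price_per_hit_cents / 100

# Worst-case (p = 0.5) margin of error at 95% confidence for the 2,400 profiles.
margin = 1.96 * math.sqrt(0.25 / sampled_profiles)

print(f"friend codings: {friend_codings:,}")
print(f"total HITs:     {total_hits:,}")
print(f"total cost:     ${total_cost_dollars:,.2f}")
print(f"margin:         +/- {margin * 100:.1f} points")
```

Halving the sampled profiles roughly halves the HITs and the cost, while the margin widens only by a factor of about 1.4, which is why pulling the sample back is cheap in precision terms.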
Final note: The authors state that “On a typical online social network, most of the activity is focused around women – men follow content produced by women they do and do not know, and women follow content produced by women they know.” Mike Thelwall’s (2008) large scale analysis of Myspace friending behaviors found that while females tend to friend females, there was not a significant gender effect for males. In Mayer and Puller’s (2008) analysis of Facebook, they found that same gender was a significant predictor of friendship (in a potentially overfitted model). Overall, studies commonly find gender differences regarding SNS/internet use; females are generally found to use communicative tools with greater intensity. (e.g. Joinson, 2008; Lenhart & Madden, 2007; Jones et al., 2009)
References:
Joinson, A. N. (2008). Looking at, looking up or keeping up with people?: motives and use of facebook. In CHI ‘08: Proceeding of the twenty-sixth annual SIGCHI conference on Human factors in computing systems, New York, NY, USA, 2008 (pp. 1027-1036). ACM.
Jones, S., Johnson-Yale, C., Millermaier, S., and Perez, F. S. (2009). U.S. College Students’ Internet Use: Race, Gender and Digital Divides. Journal of Computer-Mediated Communication, 14(2), 244-264.
Lenhart, A. and Madden, M. (April 18, 2007). Teens, Privacy and Online Social Networks: How teens manage their online identities and personal information in the age of MySpace. Pew Internet and American Life Project. Retrieved March 9, 2008 from http://www.pewinternet.org/PPF/r/211/report_display.asp.
Mayer, A. and Puller, S. L. (2008). The old boy (and girl) network: Social network formation on university campuses. Journal of Public Economics, 92(1-2), 329-347.
Thelwall, M. (2008). Social networks, gender and friending: An analysis of MySpace member profiles. Journal of the American Society for Information Science and Technology, 59(8), 1321-1330.
S: My dream was wrong! I dreamed that people left two comments on my most recent weblog entry, and it didn't happen!
L: My dream was also wrong! I dreamed that Neil Gaiman came into my room while I was asleep and scratched an autograph into my nightstand with an Exacto knife!
Over the past few days, I’ve seen a few blog posts referencing various “studies” that claim that young people don’t use Twitter. Apparently, this is a problem.
As reported on CNET, “99 percent of 18- to 24-year-olds have profiles on social networks, only 22 percent use Twitter, according to a new survey from Pace University and the Participatory Media Network.” Never mind that it’s not particularly fair to compare a sector to a single product, what does the study’s methodology look like? More bad news – the question was posed to 200 members of a volunteer panel. A small, convenience sample provides very little inferential power; it is just as likely that this survey’s statistics looked like Pew’s numbers by chance occurrence. However, my main goal here isn’t to rail against small or convenience samples being reported as representative – this is a pervasive problem and there’s not much that Unit Structures can do.
Rather, I’d like to question this problematization of the “fact” that Twitter’s users aren’t young. The inherent bias in media coverage of social software is that social software is for “the young.” If we look at the history of social networking websites, we find mixed evidence to support this theory. For example, danah boyd’s ethnography (and my personal recollection) of Friendster was that it was a place for the late-twenties and thirty-something set. If it weren’t for bonehead moves on the part of Friendster’s staff, we might still be using the service. LinkedIn, a popular and pervasive social network, has existed with an older skew for years. Facebook’s growth after opening up? It has been primarily dominated by older users.
This is not to say that young people aren’t important. They are the lifeblood of a number of popular social networks, including large communities and countless smaller ones you’ll never hear about. But why do we accept youth adoption as social fact ensuring community success? One reason is surely that young people are trendsetters. However, this theory of “trending” is an artifact of a pre-digital age, in which exclusivity and first-mover capitalization were required in the context of a production cycle. What is a trend in the digital age, if I can have a perfect replica of what the kids have, streamed via cable modem? Another reason is that young people are more connected. There is truth here; young people are disproportionately more connected than older people, but this is also changing.
It might help to think of connectivity in two ways. The first is traditional connectivity – the ability to access the internet. If you look at Pew’s numbers[1], you’ll see that older users are less connected. However, if you cut off the tail of the distribution, and consider users 60 and younger – you still find that 71% of those age 60 or younger have connectivity. Users in their 40’s report connectivity rates in the 80’s, about 10% less than teenagers. For a large segment of users, we actually find that teens aren’t that much more connected.
Let’s consider a second notion of connectivity, which is the saturation of your online connections with friends or contacts. Here, teens have old people beat hands down. Teens interact more with their friends online, they manage their lives online – overall, they are more connected to their personal networks through computers. Revisiting our first definition of connectivity, we can see that the explanation for the second definition must be heavily cultural, and not only technical. That is, this high saturation of connectivity is because of norms within younger users, and not just because they’re so much more connected than adults.
So what does this mean for Twitter? If Twitter’s users truly do skew older (and the difference between youngsters ‘18-24’ and oldsters ‘24-35’ was not significant in Pew’s study), then Twitter benefits from what I think of as an identity-participation shift. My basic theory argues that as social norms and personal networks reward non-deceptive identities, people are more likely to share and participate in online communities. Put another way, as it becomes more OK to share (it stops being weird to use your real name on your Facebook profile), and more of your friends do it, you’re more likely to extend this type of participation to other parts of the web. Notably, the driving force of this theory is simple connectivity, which establishes the preconditions for the social shifts. For Twitter, there is a whole new old generation of web users coming online and embracing social software – because it is now socially OK to do so, because they have the connectivity and connections they need to feel worthwhile sharing, etc. And it just so happens that a lot of these people seem to have found Twitter.
The core problem here is that we’re treating older users as second-class citizens on the social web. I think that Twitter and Facebook are going to serve as very useful testbeds to bat down this stereotype. In fact, I think we may see the older user emerge as the truly first-class citizen on the social web. As these users tend to be more settled, and are going through fewer transitions that lead to upheaval of their personal social networks, they may be more long-time users, less prone to “delete and move on” from one social site to the next. Of course, these ideas need to be tested, and I’m right now embarking on a long-term project to explore questions like these. If you are an older user of social software and might like to participate in my research interviews, keep watching this space for announcements.
[1] Jones, S. and Fox, S. (January 28, 2009). Generations Online in 2009. Pew Internet and American Life Project. Retrieved January 28, 2009 from http://www.pewinternet.org/PPF/r/275/source/rss/report_display.asp.
To avoid arbitrarily long games, Dada Chess forces resignation semi-randomly when the number of moves exceeds 500. Pretty much all of those games are destined to be draws. So 4459 draws (77.1%) total. About 16% of games have a winner, and there's no advantage to moving first.
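As a rough illustration of that forced-resignation rule, here is a minimal Python sketch. The per-move probability and both function names are assumptions made up for the example; all the text above says is that resignation is forced "semi-randomly" once a game passes 500 moves.

```python
import random

MOVE_LIMIT = 500       # from the text: games past this length may be cut off
RESIGN_CHANCE = 0.05   # hypothetical per-move probability, not Dada Chess's real value


def should_force_resignation(move_count, rng=random, limit=MOVE_LIMIT, chance=RESIGN_CHANCE):
    """Return True if the side to move should be forced to resign."""
    if move_count <= limit:
        return False
    # Past the limit, each move has an independent chance of ending the game.
    return rng.random() < chance


def moves_until_forced_resignation(rng, limit=MOVE_LIMIT, chance=RESIGN_CHANCE):
    """Length of a game that never ends on its own, until the rule ends it."""
    move_count = 0
    while True:
        move_count += 1
        if should_force_resignation(move_count, rng, limit, chance):
            return move_count
```

With a per-move chance like this, a game that would otherwise wander forever ends a handful of moves past the limit, and which side gets stuck resigning is effectively random.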
[your profile directory]/formhistory.sqlite. I wrote a script (below) to dump the search history and went through it looking for fun. I found a lot of interesting stuff I'd forgotten about and stuff that's funny out of context, doing my part to add to other people's stock of Disturbing Search Requests. I thought I'd present some highlights, in the traditional Internet meme presentation of "one for every letter of the alphabet". Plus one number and one non-alphanumeric character.
Update: in case you were wondering, there were 4834 distinct search strings in my history.
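A minimal sketch of a dump script along these lines, in Python. This is not the original script. It assumes the Firefox form-history layout: a `moz_formhistory` table with `fieldname` and `value` columns, with search-box entries stored under the field name `searchbar-history`. The function name is a placeholder.

```python
import sqlite3


def dump_search_history(db_path, fieldname="searchbar-history"):
    """Return the distinct search strings in a formhistory.sqlite file, sorted case-insensitively."""
    conn = sqlite3.connect(db_path)
    try:
        # Assumed schema: moz_formhistory(fieldname, value, ...).
        rows = conn.execute(
            "SELECT DISTINCT value FROM moz_formhistory WHERE fieldname = ?",
            (fieldname,),
        ).fetchall()
    finally:
        conn.close()
    return sorted((value for (value,) in rows), key=str.lower)
```

Taking `len()` of the returned list gives the number of distinct search strings. It is probably safest to run this against a copy of the file, since the browser may hold a lock on the live database.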
It was not, however, until the 19th century that any great advance [in modern calculating machines] was made. In 1820 Charles Babbage began the construction of a machine for calculating mathematical tables, and in 1823 the Royal Society secured aid from the British government to enable him to continue his work. Babbage's progress not being satisfactory, this aid was soon withdrawn, but the work continued until 1856, when it was abandoned. From the time when Babbage began to the present, however, the modern calculating machine has been constantly improved, first by Thomas de Colmar (1820), and various types are now in extensive use.
[0] But that book is only 75 pages long, which makes me think that the history was greatly expanded, possibly in the post-public domain era. That would explain why no one ever scanned Volume II.
Brendan had a similar restaurant called "We'll Fry It!"
Last week, I appeared on the WUNC radio show “The State of Things.” We talked about Facebook for an hour – it was a great time. WUNC uploaded the MP3 the day of, but I’m only getting around to linking to it now. If you’d like to listen to the show, you can stream it here.
I wanted to make sure that I had my facts straight when I went on the air, so I prepared a little Facebook/SNS dossier. I’m sharing it here (PDF) – it may come in useful if you’re looking for some compiled facts about Facebook.
I also heard back from the original ChessPy author, who for complex reasons invited me to make my bug fixes public by forking the project. So here you go. It's also got unit tests for the stuff I changed. It's really easy to use, and recommended if you have some Dada Chess-like project that needs to run simulated chess games that don't require an AI.
I thought this project would take me days to realize, but thanks to Will McGugan's great Python chess library it only took a few hours. I did find a few bugs w/r/t what the chess library considers to be "check" and "checkmate", but I fixed them and sent in a patch.
While I was at it I fixed Dada Maps and Spurious, whose bits had rotted.
I met a guy, I'm pretty sure it was Mirco Müller, who grew up in East Germany. He'd never heard of the PolyPlay (which I'd forgotten the name of at the time), but he was conversant with the home computers of the time, and he mentioned that radio stations would broadcast programs for kids to record. They'd count down and then send a game or some other program over the air. You'd record it on a cassette tape and then use it in your computer's tape reader.
This is such an awesome idea and I'd never thought of it because it's so damn socialist. At the point on the technology curve where computer cassette drives make sense, you need to have private ownership of computers, but government ownership of radio stations and a government policy encouraging kids to mess around with computers (see previous entry for contrasting policies).
Otherwise the people who run the radio station won't want to make a timeslot to broadcast data, and the people who wrote the software will want to sell it instead of broadcasting it. You could have this scenario in a world with very primitive but very cheap computers, where such a show could be popular, but that brings us into the realm of science fiction--where I intend to milk this idea for all it's worth.
Is mine the first infernokrusher story to be sold? That can't be right. Prove me wrong. Examples predating the invention of infernokrusher grudgingly accepted. By browsing LibraryThing tags I've determined that John Varley's "Steel Beach" may be genuine infernokrusher, but most other things given that tag look like regular slipstream. Hey, if I don't have the expertise to slice music into sub-subgenres, I'll settle for fiction.
PS: I know infernokrusher is just a joke. Jokes are meant to be told.
On Wednesday, May 20th I’ll be appearing on WUNC’s excellent radio show “The State of Things.” As I listen to TSOT almost every day, it is pretty exciting to get a chance to do the show. We’ll be talking about social networking and its recent growth in popularity. If you’re local, tune in at noon tomorrow – or stream the show online at WUNC’s website.
While I was away on vacation, the Chicago Tribune profiled Freedom:
Are your weekdays a jittery mess? Distracted by e-mail? Tempted by Facebook? Too bleary-eyed from rotating through your Internet rounds for human interaction?
… Truth is, you don’t need Fred Stutzman’s Freedom. You already own a version — it’s called free will. But Fred Stutzman’s Freedom is more trustworthy than your free will.
Or as Stutzman’s Web site puts it, “Freedom will free you from the distractions of the Internet, allowing you time to code, write, or create.” Freedom, he says, “enforces freedom.”
But why no link to Freedom! C’mon, show the love!
Considered to be a very holy and venerable man, many drew near to Abba Sisoes while he was on his death bed. In his last moments, he saw choirs of angels and archangels, not to mention prophets, Apostles and saints. Wondering what was going on, those gathered around him asked, “With whom are you speaking, Abba?”
“With the angels,” he replied, and indicated that he was seeking to do penance before he left this life for the next.
Knowing his holiness, one friend said to him, “You have no need for penance, Father.”
Abba Sisoes replied, “I have not yet begun to repent.”
Rather than repudiating the legitimate pleasure taken in eating and in marital relations, fasting assists us in liberating ourselves from greed and lust, so that both these things become not a means of private pleasure but an expression of interpersonal communion.
The second thing I hear is the singling out of this particular sin. As Peterson says: “I thought I couldn't be gay and a Christian.”
Here's what I mean. By normal standards, Khan is a really bad Star Trek villain. Star Trek villains generally have some self-justifying line of BS that makes them the hero. The Borg want to unify all life forms and cultures, the Cardassians are imperialists spreading civilization, the Xindi think they're acting in self-defense. Admiral Leyton is trying to stop the Federation from becoming soft. Harry Mudd is a Willy Loman type who's just trying to shift product. Etc. Their actions make sense given their worldview. Even TOS-era Khan assumes what he does is right by virtue of his innate superiority.
But Khan in Khan is Captain Ahab, a character defined by his obsession and his need for revenge. He no longer cares whether he's doing the right thing. With Khan it made sense, because he's the superman who's been humiliated by Kirk, a normal person. But the people who make Star Trek movies keep making the villain into Captain Ahab as if that'll make the movie good.
Check out the lousy Trek movie villains. Soran: obsessed with the Nexus. Ru'afo: wants revenge on the Ba'ku. (Yeah, I had to look those names up.) Shinzon: wants revenge on Picard, which doesn't even make sense. Nero: wants revenge on Spock (also makes no sense). Obsession. Revenge. They're all trying to be Khan.
Some of the Trek films don't have villains at all, which I always enjoy, but look at the good villains. General Chang: afraid of change. Borg Queen: is a freaking Borg. Even Sybok (religious fanatic) is a decent villain. I don't think you can get a big-budget SF movie made these days with no villain, but maybe if Khan hadn't been such a good movie the other Trek movies would have been better.
PS: Anyone complaining about red matter, rewatch Khan and try to explain how Genesis works.
I took some photos, but Susanna imposed a press embargo so that I wouldn't spoil the present surprise. Well, now the surprise has happened and I've got nothing to post tonight, so check it out. Susanna's entry on the topic has sewing instructions for all the food.
Evil settlement aside, I’m a fan of Google Booksearch. The ability to search within books is tremendously useful, and I look forward to the day that I’ve got a digital copy of all of the books on my shelves.
Until recently, I’ve kept track of interesting books in Google Booksearch by bookmarking them in my browser. This approach isn’t scaling well, so I decided to take advantage of Google’s native features by saving the books to my “Google Library.” I was shocked to find out that saving a book to your library requires that the book be added to your “shared library”, a public listing tied to your Google account.
There is no way to save a book privately in Google Booksearch. As Google writes in their FAQ, “When you add reviews, ratings, notes, or labels to a book—or when you add a book to your my Library page—that information will be publicly displayed on Google Book Search.” They go on to write that “No matter where you use these features, the information you submit will be displayed publicly.”
I couldn’t believe it either. If you want to set up a Google Library, even if it is just for convenience’s sake, you have to show the world what you’ve been reading. As far as I can tell, there’s no good technical or legal reason why one can’t save a book privately, or limit their book-sharing to a group of friends. This decision seems arbitrary and downright scary (or at least terribly ill-advised).
The cognitive dissonance comes from comparisons of Google’s Library policy to traditional libraries. Prominent in the ALA Code of Professional Ethics for Librarians is section 40.2.3: “We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.” The ALA also formally recommends that library administrators “advise all librarians and library employees that such records shall not be made available to any agency of state, federal, or local government except pursuant to such process, order or subpoena as may be authorized under the authority of, and pursuant to, federal, state, or local law relating to civil, criminal, or administrative discovery procedures or legislative investigative power.” (for more on regulation and library records see Minow, 2002)
Therefore, I must wonder why Google is not adhering to ALA policy, and the broader cultural norm of protecting library patron privacy. As Google partners with large institutions and attempts to monetize Booksearch, failing to respect patron privacy seems foolish and potentially dangerous. A patron researching a sensitive topic, or a topic that reveals information about the patron (for example, books about a health condition) will have their information revealed publicly if they add such a book to their library.
Google is clearly wrong on this issue, and must work to fix this dangerous privacy oversight. Have other librarians addressed this issue? Has Google responded? Unfortunately, most of my due diligence for this post turned up articles and blog posts about the Booksearch settlement, but I’d like to hear some other opinions.
Update: The Google Booksearch FAQ states that users may delete their data from public records. However, the link they provide doesn’t work (it is a 404), and it appears you have to delete all of your records (“Delete book search”) to remove book history from the public view.
Number forty-seven said to number three
"Hey, we're both prime!"
I am gonna mention two problems that I think are summer movie problems not Star Trek problems, so if you don't like spoilers or complaining, turn back.
Problem #1 has to do with the relationship between Scotty and his alien co-worker on Hoth. Scotty's always yelling at him, shoving him around, generally treating him like Igor. I don't terribly mind that Scotty was made the comic relief (it's Simon Pegg, after all), but this seemed cruel and even kind of racist of Scotty. There's only room for one racist on the Enterprise, and that's McCoy!
After the film I had writing group, and I told Andrew about this. He mentioned an interview he'd read in which Simon Pegg said Scotty and his co-worker had gone stir crazy being assigned to Hoth for so long. Sure, but stir crazy is a double-edged phaser. The alien should have shoved back.
Problem #2 is reliance on ungodly coincidences. I'm sure I'm overlooking something but I can't think of another Trek movie where major plot points just happened by coincidence. This shows up at its worst on Hoth. Kirk gets marooned there, he wanders around for a while and finds Future-Spock. Then the two of them wander a bit more and run into Scotty! If Kirk had found a guy (Scotty) who went on to play a major role in his life, that would be contingency, not coincidence. But Kirk meeting Future-Spock and then Future-Spock meeting Scotty was too much to bear, somehow.
PS: John Cho is terribly miscast as Sulu, but if someone says "We want you to play Sulu in a Star Trek movie" how can you turn them down?