Threadless Numbers #1: Is it possible to predict how well a design submitted to Threadless will score?

Because I’m a t-shirt nerd, I wrote my final for Technology and Society (it’s Sociology) last semester on Threadless, which involved gathering some data from the Threadless website. Even though my research has some problems, I found some intriguing stuff!

I’m planning on making a small series of ‘Threadless Numbers’ posts, beginning with this one. Some of the other topics I would like to cover are 1) the differences between how professional and amateur designers use Threadless, 2) who actually gets their designs printed, and 3) what losing submissions are like. If you have any requests for topics (however general or specific), please tell me!

The Problem
So, is it possible to predict how well a design submitted to Threadless will score?

The Answer
In short: maybe, but not with the data I collected. My data does suggest, however, that it should be possible to make a pretty good guess based on the number of comments a design has received.

What This Means for You
Because I don’t have a strong background in statistics, and because I’m guessing most of our readers don’t either, I’m going to leave some of that stuff out. Suffice it to say, there is a real correlation between the number of comments a design receives during scoring and its final score. In a perfect world, all you would have to do is replace x in the equation displayed on the graph above with the number of comments the design has received, and you’d get its final score.
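For anyone who wants to try the "replace x" step themselves, here is a minimal sketch of fitting a straight trend line and plugging a comment count into it. The function names and the sample numbers are made up for illustration; they are not my Threadless data, and the equation from my graph is not reproduced here.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict_score(comments, slope, intercept):
    """Plug a comment count into the fitted trend line."""
    return slope * comments + intercept


# Hypothetical sample (NOT real Threadless data): comment counts and final scores.
example_comments = [50, 100, 150, 200]
example_scores = [2.0, 2.5, 3.0, 3.5]

slope, intercept = fit_line(example_comments, example_scores)
predicted = predict_score(120, slope, intercept)
print(f"score = {slope:.4f} * comments + {intercept:.4f}")
print(f"predicted score for 120 comments: {predicted:.2f}")
```

With real data the points won't sit exactly on the line, so the result is an estimate rather than the actual final score.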

With my numbers, then, you can only “predict” the final score of a design in hindsight, once its scoring period is over, which isn’t much of a prediction at all. Because of this correlation, though, it would make sense for the number of comments a submission has received partway through its scoring period to correlate with the final score as well. That correlation is likely to be weaker unless you can control for things like the number of users scoring designs on a given day (or day of the week).

Next Steps
If the number of users varies consistently by the day of the week, surveying a large enough sample of designs and counting the number of comments left on each day of the week should yield enough information to control for this variability. Fortunately, Threadless tells you two useful things about submissions and comments: the day a design was submitted and the day each comment was left. I’d be impressed if somebody goes on to do this, but it would be a logical next step from what I’ve already done. Be sure to let me know what (if anything) you find!
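One way this control might look, as a rough sketch only: count the comments left on each weekday across the whole sample, then weight each comment so that a comment left on a quiet day counts for more than one left on a busy day. The weighting scheme, the function names, and the sample below are all assumptions for illustration, and the sketch assumes the sample covers each weekday about equally often.

```python
from collections import Counter


def weekday_weights(sample_comment_days):
    """Weight per weekday = (average comments per weekday) / (comments on that weekday)."""
    counts = Counter(sample_comment_days)
    average = sum(counts.values()) / len(counts)
    return {day: average / n for day, n in counts.items()}


def adjusted_comment_count(design_comment_days, weights):
    """Sum the weights of one design's comments; unseen weekdays count as 1.0."""
    return sum(weights.get(day, 1.0) for day in design_comment_days)


# Hypothetical sample (NOT real Threadless data): the weekday of every comment
# across all surveyed designs. Mondays are three times as busy as Saturdays.
sample_days = ["Mon"] * 30 + ["Sat"] * 10
weights = weekday_weights(sample_days)

# One hypothetical design that received three Monday comments and one Saturday comment.
adjusted = adjusted_comment_count(["Mon", "Mon", "Mon", "Sat"], weights)
print(weights)
print(adjusted)
```

The adjusted count could then stand in for the raw comment count when looking for a correlation with the final score.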

Discussion of Data
If reading about possible problems with this data doesn’t float your boat, stop right here. If you’re still on the edge of your seat, read on.

One of the things that would most affect the regression is a data point that I consider an outlier. That point is for Disbelief, which has 344 comments and a score of 3.10. Including the outlier makes the correlation look more logarithmic than linear: with it included, the R² is higher for a logarithmic trend line than for a linear one, though still lower than the R² I get when excluding it. Without data on how individuals vote and comment, I can’t think of many conclusions to draw from this fact. It would make sense, however, for the correlation to be logarithmic, because the maximum average score a design can have is 5, and even the most popular designs don’t score anywhere near that.
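To compare the two trend lines yourself, a minimal sketch: compute R² for score against comments (linear) and for score against the natural log of comments (logarithmic, i.e. a trend line of the form y = a·ln(x) + b). The function names and sample numbers are hypothetical, not my data, and comment counts must be greater than zero for the log to work.

```python
import math
from statistics import mean


def r_squared(xs, ys):
    """R^2 of a simple least-squares fit of ys on xs (squared Pearson correlation)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return (sxy ** 2) / (sxx * syy)


def compare_trends(comments, scores):
    """Return (linear R^2, logarithmic R^2) for the same data."""
    linear = r_squared(comments, scores)
    logarithmic = r_squared([math.log(c) for c in comments], scores)
    return linear, logarithmic


# Hypothetical sample (NOT real Threadless data), built to follow a log curve.
demo_comments = [10, 100, 1000, 10000]
demo_scores = [1.0, 2.0, 3.0, 4.0]

linear_r2, log_r2 = compare_trends(demo_comments, demo_scores)
print(f"linear R^2: {linear_r2:.3f}, logarithmic R^2: {log_r2:.3f}")
```

Running it once with a suspected outlier in the lists and once without shows how much a single point moves each R².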

Another potential problem with my data is that I was only working with the 60 most recently printed designs as of May 6, 2007. Because an overwhelming majority of submissions are pulled after 24 hours for having such low scores, it’s difficult to find data on non-winning designs that have finished their entire time in the running. I’m not sure how this would affect the results, but a sample made up only of printed designs may well behave differently from submissions in general.

A final problem is that I used the total number of comments (both negative and positive) left for a submission, because it is an easy metric to record from the Threadless website. Using the total number of positive comments would probably result in a stronger correlation.

So can I predict the final score of a design submitted to Threadless? Not really. But I hope one of you figures it out!

Continue Reading ‘Threadless Numbers’
Threadless Numbers #2: What’s the difference between professional and amateur designers on Threadless?
Threadless Numbers #3: What kind of t-shirts don’t get printed at Threadless?
Threadless Numbers #4: Does being a Threadless ‘alumnus/a’ give your submissions an advantage?


2 Responses to “Threadless Numbers #1: Is it possible to predict how well a design submitted to Threadless will score?”


  1. JoeMonster

    Great work! I’ve been submitting to Threadless for a while, and though I agree with you that the number of comments does suggest whether a design is going to do well or not, I think it might be interesting to also compare that to the number of people who scored it.

    Another interesting thing I noticed is that if there are close to zero comments on a design during the first two days of submission, or there are fewer people scoring it within the same period of time (you can do a quick comparison with other submissions sent in around the same time), there is a great chance that it will be dropped early.

  2. Joe

    I’m glad you like it, Joe! I made this graph just for you: Graph for Joe

    It’s not very good because I made it very quickly, but the x-axis is (number of comments received)/(total number of scores received) and the y-axis is the final average score.

    I don’t completely understand how some designs could receive significantly fewer scores than others within the same period of time (except for people skipping them or for people driving traffic to their own submissions, but I’m skeptical that either of these could have a huge influence…), yet it certainly happens. Any insights?
