Unless you’re the Chinese government, however, it’s not possible to get a list of everyone’s names to randomly draw from. As a result, other methods must be used that produce samples which approximate the population. Oftentimes, these are only quasi-random or not random at all (e.g., quota sampling). This is where knowing our sample’s demographics can help ensure that, in the absence of true random selection, our sample still looks similar to the population.

However, in a social media survey, we tend to know nothing about the respondents, so it’s impossible to assess how well our sample matches the population of interest. Our Twitter survey estimate could be spot on or ridiculously off depending on our sample’s demographics. To make matters worse, the margin of error depends not only on the total number of respondents to the survey, but also on how the sample breaks down demographically: each demographic subgroup is a smaller sample with its own, wider margin of error. This can lead to very large margins of error that we have no method of evaluating on Twitter or Weibo.
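To make the point about subgroup margins of error concrete, here is a minimal sketch using the standard textbook approximation for a proportion at 95% confidence. It assumes a simple random sample, which is exactly what a social media poll is not, so the figures are a best case rather than an estimate for any real poll. The function name `margin_of_error` and the subgroup sizes are hypothetical and chosen purely for illustration.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumes a simple random sample of size n (an assumption that
    social media polls do not satisfy). p=0.5 is the worst case,
    and z=1.96 corresponds to a 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical poll: 1,000 respondents overall, with made-up subgroup sizes.
subgroups = {"all respondents": 1000, "subgroup A": 100, "subgroup B": 50}

for label, n in subgroups.items():
    print(f"{label}: n={n}, margin of error = ±{margin_of_error(n):.1%}")
```

Under these assumptions, the full sample of 1,000 has a margin of roughly ±3 percentage points, while a subgroup of 50 has a margin of roughly ±14 points. On Twitter or Weibo we usually cannot even compute these numbers, because the subgroup sizes are unknown.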

Problem #3: Social media surveys are havens for response bias.

It’s not controversial to say that social media differs from real-world discourse in fundamental ways. Users are limited by character counts, accounts are often anonymous, and algorithms privilege emotion over logic. In many ways, social media does not encourage the care and attention needed to effectively survey how people think. By contrast, social scientists toil away at question wording and survey design to ensure they’re measuring what people actually believe and not what they might be (inadvertently) primed to think in that particular moment.

During this process, social scientists are acutely aware of non-response bias, a phenomenon that occurs when those who do not take a survey differ systematically from those who do. In nondemocratic countries like China, for example, it is well documented that respondents might fear repression for expressing their true opinions about sensitive topics and, therefore, choose not to respond to surveys (or to lie about their opinions, a separate phenomenon called preference falsification). As a result, responses to questions like ‘Do you support the Chinese Communist Party?’ tend to be heavily biased (studies that do not account for these phenomena report figures upwards of 90%, while more careful studies put this figure between 50% and 60%).

Non-response bias can also have innocuous causes, such as an overly long questionnaire. Twitter introduces its own forms of non-response bias. For example, some people engage with Tweets very little, and these lurkers’ opinions might differ systematically from those of users who are very active on Twitter (particularly in the China-watching space) and engage with Tweets frequently. As a result, only people who are passionate and knowledgeable about the subject of a Twitter poll may actually participate, which is an important source of bias.
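The lurker problem can be illustrated with a bit of arithmetic. The sketch below compares the true share of a population holding an opinion with the share a poll would observe when groups respond at different rates. Every number in it is invented for illustration, and the names `Group`, `true_share`, and `observed_share` are hypothetical, not drawn from any real study or dataset.

```python
from dataclasses import dataclass


@dataclass
class Group:
    population_share: float  # fraction of the whole population in this group
    agree_rate: float        # fraction of the group that holds the opinion
    response_rate: float     # fraction of the group that answers the poll


def true_share(groups):
    """Share of the whole population that holds the opinion."""
    return sum(g.population_share * g.agree_rate for g in groups)


def observed_share(groups):
    """Share of poll respondents that holds the opinion."""
    respondents = sum(g.population_share * g.response_rate for g in groups)
    agreeing = sum(
        g.population_share * g.agree_rate * g.response_rate for g in groups
    )
    return agreeing / respondents


# Invented numbers: a small, highly engaged group and a large group of lurkers.
groups = [
    Group(population_share=0.1, agree_rate=0.8, response_rate=0.5),   # active users
    Group(population_share=0.9, agree_rate=0.4, response_rate=0.02),  # lurkers
]

print(f"True share:     {true_share(groups):.0%}")
print(f"Observed share: {observed_share(groups):.0%}")
```

With these made-up inputs, 44% of the population holds the opinion, but the poll reports about 69%, because active users make up most of the respondents despite being a small minority of the population.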

The big picture: polls on social media platforms such as Twitter and Weibo present an inaccurate and unreliable picture of what the general public thinks. They should invariably be ignored.