Tasks, Variables and Participants: Experiments on bbc.co.uk
My team has spent the last 10 months writing software to run mass-participation experiments on bbc.co.uk. It’s been a fun project to work on, not least because it’s the first time we’ve tried to model the domain in which we are working and then write the code, features, tests, APIs etc. around it.
The most useful thing I’ve found about this approach is the common vocabulary that lets developers, non-technical production staff and academics converse in a much more fluent way than if we were each speaking in our own language. It forces the engineering teams to understand the problem in the terms of the client, then write their schemas, their models and their APIs around this language. I feel this leads to a more coherent, explainable system.
I thought it would be interesting to explain the domain that is driving our software. I apologize now for any inexpert curiosities below; I’m not a scientist and our software represents only a generalized model of an experiment. Comments and corrections are welcome.
Academic Sponsors
Each experiment is devised by a qualified academic, someone with an expert understanding of the scientific field under study.
The academic will ensure the scientific integrity of the experiment, oversee the ethical approval process and develop the mathematical models needed to analyze the data collected. If the experiment succeeds in producing new knowledge they may also report their findings through peer-reviewed journals.
Experiments (and Research Questions)
As a starting point for the design of an experiment the sponsor will typically phrase their work as an investigation into a research question.
For example, in our Child of our Time personality test the question formulated was to help further understand the correlations between demographic, life-style and personality traits. Or, in other words:
“Do our personalities shape our lives or do our lives shape our personalities?”
With this question at the forefront of the academic’s mind they can begin to think about what sort of data they need to collect and measure to aid this research.
Variables
Wikipedia defines an experiment as, “a method of investigating causal relationships among variables”. Variables represent the raw unit of data within the experiment, the empirically measurable things.
Here are our variables:

If you run an experiment to test the speed of sound under different environmental conditions, your variables might represent things like altitude, air pressure, temperature and time of day. In a sociological study the variables might represent demographic information about a person: age, ethnicity, salary etc.
Our software doesn’t distinguish between the various types of variables (dependent, independent, background …). The variable type seemed to depend very much on the perspective of the observing scientist, and to matter more to the models they create than to how the data is collected.
It is important to note that our variables aren’t just domain-agnostic key-value pairs; they have a slightly more interesting internal structure.

credit: socialresearchmethods.net
Values are the internal representations of variables, in contrast to the public facing attributes. For example, the public-facing attribute of a response might be ‘yes’ and ‘no’ whereas the internal values for analysis might be 1 or 0. The attributes are legible to humans, the values legible to computers. Because of this it usually makes more sense if values are numeric.
If the prompt ‘Do you like chocolate?’ has four attributes, ‘detest’, ‘dislike’, ‘like’ and ‘love’, then the internal values for each of these might be ‘0’, ‘1’, ‘2’ and ‘3’ (3 = love, 0 = detest). For the purpose of analysis the internal values might often represent a ratio rather than fixed intervals, so ‘detest’ has an internal value of ‘-10’, and ‘dislike’, ‘like’ and ‘love’ have values of ‘-3’, ‘0’ and ‘5’ respectively. Separating values from attributes makes it easier to represent these sorts of relationships in data.
The relationship between values is known as the level of measurement, which determines the type of statistical analysis one can perform on the data once collected. For example, data representing time needs different analysis to data representing a ranked scale.
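The split between attributes and values in the chocolate example can be sketched in a few lines of Python. This is only an illustration, not our actual schema: the `Variable` class and its `encode` method are names invented for this post.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    """A prompt plus its human-readable attributes and their internal values."""
    prompt: str
    values: dict[str, int]  # attribute label -> internal value used for analysis

    def encode(self, attribute: str) -> int:
        """Translate the label a participant chose into its internal value."""
        if attribute not in self.values:
            raise ValueError(f"unknown attribute: {attribute!r}")
        return self.values[attribute]


# Evenly spaced internal values (0 = detest, 3 = love).
chocolate = Variable(
    prompt="Do you like chocolate?",
    values={"detest": 0, "dislike": 1, "like": 2, "love": 3},
)

# The same attributes with unevenly weighted internal values.
chocolate_weighted = Variable(
    prompt="Do you like chocolate?",
    values={"detest": -10, "dislike": -3, "like": 0, "love": 5},
)
```

The participant only ever sees the attribute labels. Swapping one value mapping for another changes the analysis without touching the question as presented.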
Some variables are specialized, especially where we think they will appear again and again in multiple experiments.
Distance, for example, is a specialized variable with two constraining properties, min and max. Where this happens we’ve tended to pay special attention to the UI. In this instance the height of a person (height is a type of distance) is represented as a slider, which conveniently converts between metric and imperial as the handle is dragged horizontally.
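As a rough sketch of what a specialized distance variable involves, the Python below checks a value against min and max and does the metric-to-imperial conversion a slider would need. The class name, the choice of centimetres as the stored unit and the bounds used for height are all assumptions made for illustration.

```python
from dataclasses import dataclass

CM_PER_INCH = 2.54
INCHES_PER_FOOT = 12


@dataclass
class Distance:
    """A specialized variable constrained by a minimum and a maximum (in cm)."""
    min_cm: float
    max_cm: float

    def validate(self, value_cm: float) -> float:
        """Return the value unchanged, or raise if it falls outside the bounds."""
        if not self.min_cm <= value_cm <= self.max_cm:
            raise ValueError(
                f"{value_cm} cm is outside {self.min_cm} to {self.max_cm} cm"
            )
        return value_cm


def cm_to_feet_inches(value_cm: float) -> tuple[int, float]:
    """Convert centimetres to whole feet plus the remaining inches."""
    total_inches = value_cm / CM_PER_INCH
    feet = int(total_inches // INCHES_PER_FOOT)
    inches = total_inches - feet * INCHES_PER_FOOT
    return feet, inches


# Hypothetical bounds for a person's height.
height = Distance(min_cm=50, max_cm=250)
feet, inches = cm_to_feet_inches(height.validate(180))
```

Storing one canonical unit and converting only for display keeps the collected data consistent whichever scale the participant prefers to read.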

We have similar specializations for things like weight and data collected against a Likert scale, as well as several common data types (boolean, enumerated, alphanumeric). Each one enforces a standard interaction pattern across experiments.
Variables can also exist in composite form, share a co-dependence on one another and have validation criteria assigned to them. I expect as the system evolves and projects grow in scope our understanding of variables will grow deeper.
Tasks
A task is a container for a set of variables to be collected.
A task can be presented as a test, a game, a set of questions or, really, anything capable of collecting the variables required by the experiment.
Say an experiment needs to answer a question about wind speeds in the UK. The task is then standing in the middle of a field with an anemometer (if it happens to have an HTTP client built in).
If another experiment, as in Brain Test Britain, wants to determine the effectiveness of Brain Training then the tasks could be a series of puzzles to benchmark a person’s mental agility over a period of time.
So, a task is just a means to generate data for our experiment.
Here are our experiment variables split into three tasks.

Splitting the variables into tasks has lots of benefits.
We can ask half our users to complete task 1 and the other half task 2, as found in classic A/B testing, or designate task 3 as the placebo group.
We can mandate that task 2 can be completed only once whereas tasks 1 and 3 can be completed multiple times by the same person.
We can designate periods of time that must elapse between taking the tasks (e.g. daily, three times a week, every other Tuesday), or we can decide how to assign tasks to (or withhold them from) people based on their given age, gender or some previous variables we hold against them. For example, the ethical considerations might require us to prevent under-18s from taking part.
We can also randomize the order of tasks or serve them up in a fixed sequence.
Tasks, if you haven’t guessed yet, form the structure of the experiment: they control the flow of how we collect variables.
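To make those rules concrete, here is a minimal Python sketch of a task carrying its own constraints. None of it is taken from our codebase: the field names, the `can_take` and `task_order` helpers and the variable labels assigned to each task are hypothetical.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Task:
    """A container for the variables a participant is asked to supply."""
    name: str
    variables: list[str]
    repeatable: bool = True                    # False: may be completed only once
    min_interval: Optional[timedelta] = None   # required gap between attempts
    min_age: Optional[int] = None              # e.g. 18 for ethical reasons


def can_take(task: Task, age: int, completions: list[datetime], now: datetime) -> bool:
    """Decide whether a participant may take the task right now."""
    if task.min_age is not None and age < task.min_age:
        return False
    if completions and not task.repeatable:
        return False
    if completions and task.min_interval is not None:
        if now - max(completions) < task.min_interval:
            return False
    return True


def task_order(tasks: list[Task], randomize: bool, rng: random.Random) -> list[Task]:
    """Return the tasks in their fixed sequence, or shuffled for this participant."""
    ordered = list(tasks)
    if randomize:
        rng.shuffle(ordered)
    return ordered


# Hypothetical tasks: task 1 at most daily, task 2 once only and adults only.
task1 = Task("task 1", ["A", "B", "C"], min_interval=timedelta(days=1))
task2 = Task("task 2", ["D", "E"], repeatable=False, min_age=18)
```

Keeping the rules as data on the task, rather than scattering them through the pages that present it, means the same checks apply whether the task appears as a game, a test or a questionnaire.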
Participants (not Users)
I keep saying ‘users’, but people who go to websites are called users. People who take part in experiments (as the human subjects under study) are more usually called participants, so this is what we refer to them as. Presently a participant is an alias for a person with a BBC iD account.
Modeling the data this way allows us to inherit the benefits of this external service. A BBC iD user has a date-of-birth (i.e. an age at the point of participation), a country of residence and perhaps later things like friends and relations. All these things may be of interest to a scientist during their research.
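Deriving an age at the point of participation from a stored date of birth is simple enough to sketch in Python. The function name is made up for this example, and it assumes nothing about how BBC iD actually exposes the date.

```python
from datetime import date


def age_at(date_of_birth: date, participation_date: date) -> int:
    """Whole years between birth and the day the participant took part."""
    years = participation_date.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in the participation year.
    birthday = (date_of_birth.month, date_of_birth.day)
    if (participation_date.month, participation_date.day) < birthday:
        years -= 1
    return years
```

Storing the date of birth rather than an age means the same participant can be given a correct age for each task they complete, however far apart those tasks are.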
The three experiments so far have all been sociologically based, and lend themselves well to this vocabulary. As our understanding of different experiments grows perhaps there might be other types of user.
Variable Sets and Traits
In the last couple of months we’ve realized that we needed to do something with the variables the participant enters into the site. There’s not a lot of reward in just filling out forms, so people want feedback.
If tasks are sets of variables that control how the data is collected, we need a second kind of set, containing the same variables, that controls how the variables are analysed. We call this second group variable sets.
Here are our variables again:
Let’s say variables D, H, and K belong to a variable set. And let’s say D represents height, H represents weight, and K the participant’s age. We could call this variable set ‘body mass index’, or BMI for short.
Note that these variables don’t need to be collected at the same time; they can span tasks. Age can be collected at the point of registration, upon consenting to the experiment, and the other variables might be collected at some future point in time or conditionally based on the participant’s past activity.
We can think of the variables in this set as inputs to a function. In the case of our BMI, the function might calculate your age/weight/height ratio. The output of this function can be called, in very, very general terms, a trait, something that the participant can be said to exhibit given the measures they have divulged.
As far as is possible the technical system does not interpret the trait as being positive or negative. If a participant has a BMI value of less than 16.5 then we present that figure to the front-end (editorial) system, which can choose to interpret the finding as it wishes and present it to the participant.
This is an important distinction. Our system does not know if a participant is morbidly obese or if they are severely emaciated, nor how to present that in a sensitive way to the participant; all these things are editorial judgements. Our software’s responsibility is to accurately calculate the BMI from a given set of inputs, not to return things like “needs to diet” or “needs to see relationship counsellor”.
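The idea of a variable set as the inputs to a function can be sketched in Python as below. The `VariableSet` class and the variable names are illustrative assumptions. The calculation is the standard BMI formula (weight in kilograms divided by height in metres squared), which does not itself use age, so in this sketch age simply travels with the set for whoever interprets the result.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass
class VariableSet:
    """A named group of variables plus the function that analyses them."""
    name: str
    variables: tuple[str, ...]
    compute: Callable[[Mapping[str, float]], float]

    def trait(self, collected: Mapping[str, float]) -> float:
        """Return the trait value, or raise if an input has not been collected yet."""
        missing = [v for v in self.variables if v not in collected]
        if missing:
            raise KeyError(f"not yet collected: {missing}")
        return self.compute({v: collected[v] for v in self.variables})


def body_mass_index(inputs: Mapping[str, float]) -> float:
    # Standard BMI: weight in kilograms divided by height in metres squared.
    return inputs["weight_kg"] / inputs["height_m"] ** 2


bmi = VariableSet("body mass index", ("height_m", "weight_kg", "age"), body_mass_index)

# The inputs may have been collected by different tasks at different times.
value = bmi.trait({"age": 34, "height_m": 1.80, "weight_kg": 81.0})
```

The function returns a bare number and nothing else. Deciding what that number means, and how to say it, stays with the editorial system.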
In the personality experiment the variable sets were based around the Big Five traits. Once calculated, the participant’s traits were mapped to videos of Robert Winston explaining what they meant.
Fin.
When I find some time I’ll write some more notes on the technical components, the APIs and how the software evolves over the coming months.
