<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>Blog</title>
	<atom:link href="http://matt.chadburn.co.uk/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://matt.chadburn.co.uk</link>
	<description></description>
	<pubDate>Tue, 27 Apr 2010 16:52:38 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Wget &amp; robots.txt</title>
		<link>http://matt.chadburn.co.uk/?p=690</link>
		<comments>http://matt.chadburn.co.uk/?p=690#comments</comments>
		<pubDate>Tue, 27 Apr 2010 16:50:08 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=690</guid>
		<description><![CDATA[Wget obeys robots.txt files. And that&#8217;s why I&#8217;ve spent the last 45 minutes trying to figure out why it didn&#8217;t download an Apache directory of images. It says so in the first paragraph of the man page. One day I will learn.
wget -e robots=off -r http://foo...

Just saying.
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.gnu.org/software/wget/">Wget</a> obeys robots.txt files. And that&#8217;s why I&#8217;ve spent the last 45 minutes trying to figure out why it didn&#8217;t download an Apache directory of images. It says so in the first paragraph of the man page. One day I will learn.</p>
<pre><code>wget -e robots=off -r <em>http://foo...</em>
</code></pre>
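<p>For what it&#8217;s worth, here is a rough Python sketch of the kind of rule Wget was obeying. It uses the standard library&#8217;s <code>urllib.robotparser</code>; the robots.txt contents, the <code>allowed</code> helper and the example.com URLs are all made up for illustration and are not the server I was fetching from.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the real file is not shown in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""

def allowed(url, agent="Wget"):
    """Return True if the rules above let the given agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

# A polite crawler skips the first URL and fetches the second.
print(allowed("http://example.com/images/photo.png"))
print(allowed("http://example.com/index.html"))
```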
<p>Just saying.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=690</wfw:commentRss>
		</item>
		<item>
		<title>Tasks, Variables and Participants: Experiments on bbc.co.uk</title>
		<link>http://matt.chadburn.co.uk/?p=680</link>
		<comments>http://matt.chadburn.co.uk/?p=680#comments</comments>
		<pubDate>Thu, 26 Nov 2009 23:37:55 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=680</guid>
		<description><![CDATA[
My team has spent the last 10 months writing software to run mass-participation experiments on bbc.co.uk. It&#8217;s been a fun project to work on, not least because it&#8217;s the first time we&#8217;ve tried to model the domain in which we are working and then write the code, features, tests, APIs etc. around it.

The most useful thing [...]]]></description>
			<content:encoded><![CDATA[<p>
My team has spent the last 10 months writing software to run mass-participation experiments on bbc.co.uk. It&#8217;s been a fun project to work on, not least because it&#8217;s the first time we&#8217;ve tried to model the domain in which we are working and then write the code, features, tests, APIs etc. around it.</p>
<p><a href="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/domainmodel091.png"><img src="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/domainmodel091-300x251.png" alt="experiments domain" border="0" style="float:right;margin:1em;"/></a></p>
<p>The most useful thing I&#8217;ve found about this approach is the common vocabulary that lets developers, non-technical production staff and academics converse in a much more fluent way than if we were each speaking in our own language. It forces the engineering teams to understand the problem in the terms of the client, then write their schemas, their models and their APIs around this language. I feel this leads to a more coherent, explainable system.</p>
<p>I thought it would be interesting to explain the domain that is driving our software. I make apologies now for any inexpert curiosities below: I&#8217;m not a scientist and our software represents only a generalized model of an experiment. Comments and corrections are welcome.</p>
<h3>Academic Sponsors</h3>
<p>Each <em>experiment</em> is devised by a qualified <em>academic</em>, someone with an expert understanding of the <em>scientific field</em> under study.</p>
<p>The academic will ensure the scientific integrity of the experiment, oversee the ethical approval process and develop the mathematical models needed to analyze the data collected. If the experiment succeeds in producing new <a href="http://en.wikipedia.org/wiki/Knowledge" title="The Truth">knowledge</a> they may also report their findings through peer-reviewed journals.</p>
<h3>Experiments (and Research Questions)</h3>
<p>As a starting point for the design of an experiment the sponsor will typically phrase their work as an investigation into a <a href="http://en.wikipedia.org/wiki/Research_question">research question</a>.</p>
<p>For example, in our Child of our Time <a href="https://www.bbc.co.uk/labuk/experiments/personality/">personality test</a> the question formulated was to help further understand the correlations between demographic, life-style and personality traits. Or, in other words:</p>
<blockquote><p>&#8220;Do our personalities shape our lives or do our lives shape our personalities?&#8221;.</p></blockquote>
<p>With this question at the forefront of the academic&#8217;s mind they can begin to think about what sort of data they need to collect and measure to aid this research.</p>
<h3>Variables</h3>
<p>Wikipedia defines an experiment as, <em>&#8220;a method of investigating causal relationships among variables&#8221;</em>. Variables represent the raw unit of data within the experiment, the empirically measurable things.</p>
<p>Here are our variables:</p>
<p><img src="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/variables1.png" alt="variables" border="0" /></p>
<p>If you run an experiment to test the <a href="http://en.wikipedia.org/wiki/Speed_of_sound">speed of sound</a> under different environmental conditions your variables might represent things like altitude, air pressure, temperature and time of day. In a sociological study the variables might represent demographic information about a person: age, ethnicity, salary etc.</p>
<p>Our software doesn&#8217;t distinguish between the various types of variables (<a href="http://en.wikipedia.org/wiki/Dependent_and_independent_variables">dependent, independent, background</a> &#8230;). The variable type seemed to depend very much on the perspective of the observing scientist, and to matter more to the models they create than to how that data is collected.</p>
<p>It is important to note that our variables aren&#8217;t just domain-agnostic key-value pairs; they have a slightly more interesting internal structure.</p>
<p><img src="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/variables-structure1.png" alt="variables structure (variables, attributes, values, measurements)" border="0"  /></p>
<p><cite>credit: <a href="http://www.socialresearchmethods.net/kb/measlevl.php">socialresearchmethods.net</a><br />
</cite></p>
<p><em>Values</em> are the internal representations of variables, in contrast to the public-facing <em>attributes</em>. For example, the public-facing attributes of a response might be &#8216;yes&#8217; and &#8216;no&#8217; whereas the internal values for analysis might be 1 and 0. The attributes are legible to humans, the values legible to computers. Because of this it usually makes more sense if values are numeric.</p>
<p>If the prompt &#8216;Do you like chocolate?&#8217; has four attributes &#8216;detest&#8217;, &#8216;dislike&#8217;, &#8216;like&#8217;, &#8216;love&#8217; then the internal values for each of these might be &#8216;0&#8217;, &#8216;1&#8217;, &#8216;2&#8217;, &#8216;3&#8217; (3 = love, 0 = detest). For the purpose of analysis the internal values might often represent a ratio rather than fixed intervals, so &#8216;detest&#8217; has an internal value of &#8216;-10&#8217;, and &#8216;dislike&#8217;, &#8216;like&#8217;, &#8216;love&#8217; have values of &#8216;-3&#8217;, &#8216;0&#8217;, and &#8216;5&#8217; respectively. Separating values from attributes makes it easier to represent these sorts of relationships in data.</p>
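<p>To make the attribute/value split concrete, here is a minimal Python sketch of the chocolate example. The <code>Variable</code> class, its <code>coding</code> field and the <code>value_of</code> method are names I&#8217;ve invented for illustration; they are not our actual schema or API.</p>

```python
from dataclasses import dataclass

@dataclass
class Variable:
    """A prompt whose public-facing attributes map to internal values."""
    prompt: str
    coding: dict  # attribute label (for humans) to value (for analysis)

    def value_of(self, attribute):
        # Look up the internal value behind the label a participant chose.
        return self.coding[attribute]

labels = ["detest", "dislike", "like", "love"]

# Evenly spaced coding: 0 = detest ... 3 = love.
evenly_spaced = Variable("Do you like chocolate?", dict(zip(labels, [0, 1, 2, 3])))

# Unevenly weighted coding for the same four attributes.
weighted = Variable("Do you like chocolate?", dict(zip(labels, [-10, -3, 0, 5])))

print(evenly_spaced.value_of("love"))
print(weighted.value_of("detest"))
```

<p>The same four labels are shown to the participant in both cases; only the numbers handed to the analysis change.</p>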
<p>The relationship between values is known as the <a href="http://www.socialresearchmethods.net/kb/measlevl.php">level of measurement</a>, which determines the type of statistical analysis one can perform on the data once collected, i.e. data representing <em>time</em> needs different analysis to data representing a <em>ranked</em> scale.</p>
<p>Some variables are <a href="http://en.wikipedia.org/wiki/Inheritance_%28computer_science%29#Specialization">specialized</a>, especially where we think they will appear again and again in multiple experiments.</p>
<p>Distance for example is a specialized variable with two constraining properties, <em>min</em> and <em>max</em>. Where this happens we&#8217;ve tended to pay special attention to the UI. In this instance the height of a person (height <em>is</em> a type of distance) is represented as a slider, which conveniently converts between metric and imperial as the handle is dragged horizontally.</p>
<p><img src="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/height.png" alt="height.png" border="0" width="494" height="109" /></p>
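<p>As a rough illustration of a specialized variable, here is a small Python sketch of a distance with <em>min</em> and <em>max</em> constraints and a metric-to-imperial conversion. The class name, the method names and the 50&#8211;250cm bounds are assumptions made up for this example, not what our system uses.</p>

```python
from dataclasses import dataclass

CM_PER_INCH = 2.54

@dataclass
class Distance:
    """A distance variable constrained to a range, stored in centimetres."""
    minimum_cm: float
    maximum_cm: float

    def validate(self, value_cm):
        # A value is valid only if clamping it to the range leaves it unchanged.
        clamped = max(self.minimum_cm, min(value_cm, self.maximum_cm))
        if clamped != value_cm:
            raise ValueError("value outside the allowed range")
        return value_cm

    @staticmethod
    def to_feet_and_inches(value_cm):
        # Round to the nearest whole inch, then split into feet and inches.
        total_inches = round(value_cm / CM_PER_INCH)
        return divmod(total_inches, 12)

height = Distance(minimum_cm=50, maximum_cm=250)
print(height.to_feet_and_inches(height.validate(180)))
```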
<p>We have similar specializations for things like weight, data collected against a <a href="http://en.wikipedia.org/wiki/Likert_scale">Likert-scale</a>, as well as several common data-types (boolean, enumerated, alphanumeric); each one enforces a standard interaction pattern across experiments.</p>
<p>Variables can also exist in <em>composite</em> form, share a <em>co-dependence</em> on one another and have validation criteria assigned to them. I expect as the system evolves and projects grow in scope our understanding of variables will grow deeper.</p>
<h3>Tasks</h3>
<p>A <em>task</em> is a container for a set of variables to be collected.</p>
<p>A task can be presented as a test, a game, a set of questions or, really, anything capable of collecting the variables required by the experiment.</p>
<p>Say an experiment needs to answer a question about wind speeds in the UK: the <em>task</em> is then standing in the middle of a field with an anemometer (if it happens to have an <a href="http://goingapps.com/" title="anemometer for iphone">http client built in</a>).</p>
<p>If another experiment, as in <a href="https://www.bbc.co.uk/labuk/experiments/braintestbritain/">Brain Test Britain</a>, wants to determine the effectiveness of Brain Training then the tasks could be a series of puzzles to benchmark a person&#8217;s mental agility over a period of time.</p>
<p>So, a task is just a means to generate data for our experiment.</p>
<p>Here are our experiment variables split into three tasks.</p>
<p><img src="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/tasks.png" alt="tasks.png" border="0" width="520" height="81" /></p>
<p>Splitting the variables into tasks has lots of benefits.</p>
<p>We can ask half of our users to complete task 1 and the other half task 2 as found in classic <a href="http://en.wikipedia.org/wiki/A/B_testing">A/B testing</a>, or designate task 3 as the <a href="http://en.wikipedia.org/wiki/Placebo">placebo</a> group.</p>
<p>We can mandate that task 2 can be completed only once whereas tasks 1 and 3 can be completed multiple times by the same person.</p>
<p>We can designate periods of time that must occur between taking the tasks (i.e. daily, three times a week, every other Tuesday) or we can decide how to assign certain tasks to people (or withhold them) based on their given age, gender or some previous variables we hold against them. For example, the ethical considerations might require us to prevent under-18s from taking part.</p>
<p>We can also randomize the order of tasks or serve them up in a fixed sequence.</p>
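<p>Some of these rules can be sketched in a few lines of Python. This is only an illustration: the <code>Task</code> fields, the <code>available_tasks</code> function and the task names are hypothetical, and it leaves out the rules about time between tasks.</p>

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    name: str
    repeatable: bool = True   # may the same participant take it again?
    minimum_age: int = 0      # e.g. 18 where ethical approval requires it

def available_tasks(tasks, age, completed, seed=None):
    """Return the tasks a participant may take next, optionally shuffled."""
    offered = [
        task for task in tasks
        if age >= task.minimum_age
        and (task.repeatable or task.name not in completed)
    ]
    if seed is not None:
        # Randomized order; leave seed as None for the fixed sequence.
        random.Random(seed).shuffle(offered)
    return offered

tasks = [
    Task("task 1"),
    Task("task 2", repeatable=False),
    Task("task 3", minimum_age=18),
]

# A 16-year-old who has already completed task 2 is only offered task 1.
print([t.name for t in available_tasks(tasks, age=16, completed={"task 2"})])
```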
<p>Tasks, if you haven&#8217;t guessed yet, form the structure of the experiment: they control the flow of how we collect variables.</p>
<h3>Participants (not Users)</h3>
<p>I keep saying &#8216;users&#8217;, but people who go to websites are called users. People who take part in experiments (as the human subjects under study) are more usually called <em>participants</em>, so this is what we refer to them as. Presently a participant is an alias for a person with a <a href="https://id.bbc.co.uk/users/">BBC iD</a> account.</p>
<p>Modeling the data this way allows us to inherit the benefits of this external service. A BBC iD user has a date-of-birth (i.e. an age at the point of participation), a country of residence and perhaps later things like <a href="http://www.bbc.co.uk/blogs/webdeveloper/2009/11/extending-opensocial-and-shind.shtml">friends and relations</a>. All these things may be of interest to a scientist during their research.</p>
<p>The three experiments so far have all been sociologically based, and lend themselves well to this vocabulary. As our understanding of different experiments grows perhaps there might be other types of user.</p>
<h3>Variable Sets and Traits</h3>
<p>In the last couple of months we&#8217;ve realized that we needed to do something with the variables the participant enters into the site. There&#8217;s not a lot of reward in just filling out forms, so people want feedback.</p>
<p>If <em>tasks</em> are sets of variables that control how the data is collected we need a second <a href="http://en.wikipedia.org/wiki/Set_%28mathematics%29">set</a> containing the same variables that controls how the variables are analysed. We call this second group <em>variable sets</em>.</p>
<p>Here are our variables again:</p>
<p><img src="http://matt.chadburn.co.uk/wp-content/uploads/2009/11/variablesets.png" alt="variablesets.png" border="0" width="453" height="40" /></p>
<p>Let&#8217;s say variables <em>D</em>, <em>H</em>, and <em>K</em> belong to a variable set. And let&#8217;s say <em>D</em> represents <em>height</em>, <em>H</em> represents <em>weight</em>, and <em>K</em> the participant&#8217;s <em>age</em>. We could call this variable set &#8216;<a href="http://en.wikipedia.org/wiki/Body_mass_index"><em>body mass index</em></a>&#8217;, or <em>BMI</em> for short.</p>
<p>Note that these variables don&#8217;t need to be collected at the same time; they can span tasks. Age can be collected at the point of registration, upon <em>consenting</em> to the experiment, and the other variables might be collected at some future point in time or conditionally based on their past activity.</p>
<p>We can think of the variables in this set as inputs to a <a href="http://en.wikipedia.org/wiki/Function_%28mathematics%29">function</a>. In the case of our <em>BMI</em>, the function might calculate your age/weight/height ratio. The output of this function can be called, in very, very general terms, a <em>trait</em>, something that the participant can be said to exhibit given the measures they have divulged.</p>
<p>As far as is possible the technical system does not interpret the trait as being positive or negative. If a participant has a BMI value of less than 16.5 then we present that figure to the front-end (editorial) system, which can choose to interpret the finding as it wishes and present it to the participant.</p>
<p>This is an important distinction. Our system does not know if a participant is morbidly obese or if they are severely emaciated, nor how to present that in a sensitive way to the participant; all these things are editorial judgements. Our software&#8217;s responsibility is to accurately calculate the BMI from a given set of inputs, not to return things like, &#8220;needs to diet&#8221;, &#8220;needs to see relationship counsellor&#8221;.</p>
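<p>Here is a minimal Python sketch of a variable set feeding a function that returns a trait. It assumes the standard BMI formula (weight in kilograms divided by height in metres squared), which uses only height and weight, so the age in the set is not used here. The function name and the input figures are made up for illustration.</p>

```python
def body_mass_index(height_m, weight_kg):
    """Standard BMI: weight in kilograms divided by height in metres squared."""
    if height_m == 0:
        raise ValueError("height must be non-zero")
    return weight_kg / height_m ** 2

# The measures may have been collected in different tasks; here they are
# just plain numbers gathered into one variable set.
measures = {"height_m": 1.75, "weight_kg": 70.0}

# The trait is a bare number. Deciding what it means is left to the caller.
trait = body_mass_index(**measures)
print(round(trait, 1))
```

<p>Note that the function returns only the figure and nothing about what it means, which keeps the editorial judgement out of the calculation.</p>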
<p>In the <a href="https://www.bbc.co.uk/labuk/experiments/personality/">personality experiment</a> the variable sets were based around the <a href="http://en.wikipedia.org/wiki/Big_Five_personality_traits">Big Five traits</a>. Once calculated, the participant&#8217;s traits were mapped to videos of Robert Winston explaining what they meant.</p>
<h3>Fin.</h3>
<p>When I find some time I&#8217;ll write some more notes on the technical components, the APIs and how the software evolves over the coming months.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=680</wfw:commentRss>
		</item>
		<item>
		<title>Quote from Bill Watterson (lesson for Quentin Blake et al.)</title>
		<link>http://matt.chadburn.co.uk/?p=666</link>
		<comments>http://matt.chadburn.co.uk/?p=666#comments</comments>
		<pubDate>Wed, 02 Sep 2009 06:25:07 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[quote]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=666</guid>
		<description><![CDATA[
Actually, I wasn&#8217;t against all merchandising when I started the strip, but each product I considered seemed to violate the spirit of the strip, contradict its message, and take me away from the work I loved. If my syndicate had let it go at that, the decision would have taken maybe 30 seconds of my [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>
Actually, I wasn&#8217;t against all merchandising when I started the strip, but each product I considered seemed to violate the spirit of the strip, contradict its message, and take me away from the work I loved. If my syndicate had let it go at that, the decision would have taken maybe 30 seconds of my life.
</p></blockquote>
<p>Bill Watterson, <a href="http://www.andrewsmcmeel.com/calvinandhobbes/interview.html">interview</a>, also <a href="http://ignatz.brinkster.net/cheapening.html">here</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=666</wfw:commentRss>
		</item>
		<item>
		<title>Heartbeat</title>
		<link>http://matt.chadburn.co.uk/?p=658</link>
		<comments>http://matt.chadburn.co.uk/?p=658#comments</comments>
		<pubDate>Tue, 25 Aug 2009 22:18:23 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=658</guid>
		<description><![CDATA[pip-4-months.mp3, with artwork &#038; composer credits.
]]></description>
			<content:encoded><![CDATA[<p><a href="/projects/pip/heartbeat-4-months.mp3" title="4 months heartbeat">pip-4-months.mp3</a>, with artwork &#038; composer credits.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=658</wfw:commentRss>
		</item>
		<item>
		<title>Bee-Sting of Lomie</title>
		<link>http://matt.chadburn.co.uk/?p=650</link>
		<comments>http://matt.chadburn.co.uk/?p=650#comments</comments>
		<pubDate>Sat, 08 Aug 2009 16:27:17 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[quote]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=650</guid>
		<description><![CDATA[This quote from a Simon Kuper book made me laugh a lot but I&#8217;ve no idea of its veracity,

The general director agreed to an interview (for free) and the next day I found him in his office. It is basic and battered and located in the basement of the Omnisports Stadium, just a few doors [...]]]></description>
			<content:encoded><![CDATA[<p>This quote from a Simon Kuper book made me laugh a lot but I&#8217;ve no idea of its veracity,</p>
<blockquote><p>
The general director agreed to an interview (for free) and the next day I found him in his office. It is basic and battered and located in the basement of the Omnisports Stadium, just a few doors down from the room where he kept 120 pygmies from the Cameroonian rainforests locked up last summer. Milla [a Cameroonian star at the 1990 World Cup] had invited the pygmies to play a few games at the Omnisports, to raise money for their health and education, but he imprisoned them there, issued them with guards (one of whom wore a Saddam Hussein T shirt) and seldom fed them. A tournament spokesman explained to Reuters: “They play better if they don’t eat too much”. As for the imprisonment: “You don’t know the pygmies. They are extremely difficult to keep in control”. The Omnisports cook concurred: “These pygmies can eat at any time of the day and night and never have enough”. The little hunters themselves were too frightened to comment. </p>
<p>Their tournament was a disaster. Team names included Bee-Sting of Lomie and the aptly named Ants of Salapoumbe, but only 50 fans bought tickets, and most of these came strictly to shout abuse at the pygmies.
</p></blockquote>
<p>From <a href="http://www.amazon.co.uk/Football-Against-Enemy-Simon-Kuper/dp/0752848771">Football Against The Enemy</a> - Simon Kuper </p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=650</wfw:commentRss>
		</item>
		<item>
		<title>Baby</title>
		<link>http://matt.chadburn.co.uk/?p=646</link>
		<comments>http://matt.chadburn.co.uk/?p=646#comments</comments>
		<pubDate>Wed, 29 Jul 2009 16:16:21 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[photo]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=646</guid>
		<description><![CDATA[
Due sometime in Feb. Woot!
Quite a good picture, definitely has a head, a heart, hands etc.
]]></description>
			<content:encoded><![CDATA[<p><img src="/projects/pip/pip-3-months.jpg" /></p>
<p>Due sometime in Feb. Woot!</p>
<p>Quite a good picture, definitely has a head, a heart, hands etc.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=646</wfw:commentRss>
		</item>
		<item>
		<title>More Timeplotting</title>
		<link>http://matt.chadburn.co.uk/?p=622</link>
		<comments>http://matt.chadburn.co.uk/?p=622#comments</comments>
		<pubDate>Mon, 20 Jul 2009 13:20:14 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[timeplot]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=622</guid>
		<description><![CDATA[I made another timeplot similar to the Wimbledon Singles thing I produced earlier this month.
This one shows the final day of England&#8217;s win at Lord&#8217;s.
It reminded me of Tufte&#8217;s Sparklines, so I plugged the data file into timetric, a Cambridge-based company who will plot your data to a pretty graph if you give [...]]]></description>
			<content:encoded><![CDATA[<p>I made <a href="http://matt.chadburn.co.uk/projects/timeplots/cricket/2009/ashes/2/">another timeplot</a> similar to the <a href="http://matt.chadburn.co.uk/?p=598">Wimbledon Singles</a> thing I produced earlier this month.</p>
<p>This one shows <a href="http://matt.chadburn.co.uk/projects/timeplots/cricket/2009/ashes/2/">the final day</a> of England&#8217;s win at Lord&#8217;s.</p>
<p>It reminded me of Tufte&#8217;s <a href="http://en.wikipedia.org/wiki/Sparkline">Sparklines</a>, so I plugged the <a href="http://matt.chadburn.co.uk/projects/timeplots/cricket/2009/ashes/2/100653157.timeseries.txt">data file</a> into <a href="http://timetric.com/">timetric</a>, a Cambridge-based company who will plot your data to a pretty graph if you give them a half-sensible URI.</p>
<p>Out popped this:</p>
<p>  <img src="http://timetric.com/series/GmVnWx-KSIedjp6vOBqvkg/sparkline/"/></p>
<p>The Sparkline shows England&#8217;s odds of winning, which went from &#8216;likely&#8217; to &#8216;very likely&#8217; shortly after 11am when Flintoff took the first wicket, indicated by the sudden drop in odds from 1.43 to 1.13 as the line starts. The visitors spent the next hour battling back only to find the game effectively over when Swann took the second, as seen in the cliff-like drop in odds to 1.01 towards the centre of the sparkline.</p>
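<p>To put rough numbers on &#8216;likely&#8217; and &#8216;very likely&#8217;, here is a small Python sketch that converts those odds into implied probabilities. It assumes the figures are decimal odds and ignores any bookmaker margin or commission; the function name is my own.</p>

```python
def implied_probability(decimal_odds):
    """Convert decimal odds to the probability they imply (ignores any margin)."""
    return 1 / decimal_odds

# The three odds read off the sparkline.
for odds in (1.43, 1.13, 1.01):
    print(odds, round(implied_probability(odds) * 100), "%")
```

<p>Under those assumptions the three figures work out at roughly 70%, 88% and 99%.</p>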
<p>You can visit the (bigger) graphs at <a href="http://timetric.com/series/GmVnWx-KSIedjp6vOBqvkg/">my timetric page</a>.</p>
<p>I think next time I&#8217;ll find a sport with multiple participants to see what lots of interspersed lines over the x-axis look like. The <a href="http://news.bbc.co.uk/sport1/hi/golf/8158376.stm">British Open golf</a> would have been a neat example of this but I wasn&#8217;t in front of a computer yesterday, alas.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=622</wfw:commentRss>
		</item>
		<item>
		<title>Sauce Labs &#8216;Not Found&#8217;</title>
		<link>http://matt.chadburn.co.uk/?p=612</link>
		<comments>http://matt.chadburn.co.uk/?p=612#comments</comments>
		<pubDate>Mon, 13 Jul 2009 21:12:18 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[code]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=612</guid>
		<description><![CDATA[Saucelabs is a cunning idea. Running Selenium RC (the Selenium HTTP API) off some EC2 instances means I don&#8217;t need to build and maintain my own functional test boxes.
It cost me about $0.02 to run my first simple test and when I have 50 tests of varying complexity running each night I guess it is [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://saucelabs.com/">Saucelabs</a> is a cunning idea. Running <a href="http://seleniumhq.org/projects/remote-control/">Selenium RC</a> (the Selenium HTTP API) off some EC2 instances means I don&#8217;t need to build and maintain my own functional test boxes.</p>
<p>It cost me about $0.02 to run my first simple test and when I have 50 tests of varying complexity running each night I guess it is going to cost perhaps $3 or $4 a day, which isn&#8217;t a lot for avoiding the hassle of purchasing and maintaining a little farm of selenium boxes. It&#8217;s peanuts compared to developer time, and I&#8217;d rather pay developers to write tests than maintain infrastructure.</p>
<p>The only problem I found was an unhelpful error message sent from the RC server when my Selenium client supplied it with bad credentials.</p>
<pre>Not Found
com.thoughtworks.selenium.HttpCommandProcessor.getCommandResponse(HttpCommandProcessor.java:124)
</pre>
<p>The problem in my case was twofold. Firstly, I&#8217;d cut and pasted the Saucelabs access-key incorrectly, causing an authentication error (my fault), and secondly, for some reason, I quoted the browser string in my Ant properties file with single quotes, which caused ResourceBundle to read and pass the complete string (quotes intact) to the RC server, i.e.</p>
<pre>
selenium.server=saucelabs
selenium.port=4444
selenium.browser='{"username": "bbc_labuk", "access-key": "xxx", "os": ... }'
...
</pre>
<p>Why did I do that? My fault again I suppose. Or maybe ResourceBundle should strip quoted properties?</p>
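<p>The effect is easy to reproduce. Below is a Python stand-in (not the Java ResourceBundle code) for a simple key=value reader that keeps values verbatim, plus a helper that strips one pair of surrounding quotes. The property name, the browser string and both function names are hypothetical.</p>

```python
import json

def parse_properties(text):
    """Parse simple key=value lines; values are kept verbatim, quotes included."""
    properties = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            properties[key.strip()] = value.strip()
    return properties

def strip_quotes(value):
    """Remove one pair of matching single or double quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value

# Hypothetical browser string; the real one is elided in this post.
props = parse_properties("example.browser='{\"username\": \"someone\", \"os\": \"Linux\"}'")
raw = props["example.browser"]

# The quotes survive parsing, so the raw value is not valid JSON as it stands.
print(raw.startswith("'"))
print(json.loads(strip_quotes(raw))["os"])
```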
<p>Either way, when the Saucelabs RC server doesn&#8217;t like the smell of your browser string it will give you a &#8216;Not Found&#8217; error, and it confused me for the best part of an afternoon - &#8216;Not Found&#8217; being synonymous with a <a href="http://en.wikipedia.org/wiki/HTTP_404">certain other</a> class of HTTP error.</p>
<p><strong>Updated.</strong></p>
<p>John from Saucelabs has been in touch. They are &#8216;investing more attention&#8217; in error messages in the near future, so stay tuned!</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=612</wfw:commentRss>
		</item>
		<item>
		<title>Timeplotting Betfair data</title>
		<link>http://matt.chadburn.co.uk/?p=598</link>
		<comments>http://matt.chadburn.co.uk/?p=598#comments</comments>
		<pubDate>Fri, 10 Jul 2009 22:04:59 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[timeplot]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=598</guid>
		<description><![CDATA[Betting odds are a great (if not the greatest) indicator of future truths.
Most online betting companies operate live markets that are left open to new bets as the event is taking place. For example, last weekend you could still bet on Andy Roddick to win Wimbledon right up until the final point of the final [...]]]></description>
			<content:encoded><![CDATA[<p>Betting odds are a great (if not the <a href="http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds">greatest</a>) indicator of future truths.</p>
<p>Most online betting companies operate live markets that are left open to new bets as the event is taking place. For example, last weekend you could still bet on Andy Roddick to win Wimbledon right up until the final point of the final game, the odds growing longer and longer as his chance of winning grew ever more remote.</p>
<p>This data about the <em>truth</em> can tell a fascinating retrospective story about the market. If there&#8217;s a lot of market movement during the game, with odds fluctuating between the eventual victor and loser, the event could be deemed more exciting as the collective wisdom couldn&#8217;t make up its mind as to who was going to win, the outcome only being discovered in the dying moments of the match.</p>
<p>In the dullest matches the odds flat-line, showing little movement in any direction or very quickly favouring one team over the other.</p>
<p>Similarly, points of high drama (a sending off in rugby, a tie break in tennis&#8230;) tend to swing the markets rapidly in one direction or another for a short period of time as the crowd herds towards a particular outcome.</p>
<p>So, if we have data that can determine <em>competitiveness</em> and amount of <em>drama</em> in a sporting event then these are at least two of the prerequisites that determine whether something is worth watching on the various online <a href="http://www.bbc.co.uk/iplayer/">catch-up</a> services. Furthermore, now that <a href="http://www.bbc.co.uk/blogs/bbcinternet/2009/07/bbc_iplayer_now_lets_you_link.html">iPlayer can link to time segments</a> within a show, the betting data could also be used to provide indexes to key moments in a match.</p>
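<p>One crude way to put a number on this is to add up how far the implied probability moves over a match. The Python sketch below does just that; the metric, the function name and the sample odds are all invented for illustration, and it assumes decimal odds.</p>

```python
def total_movement(decimal_odds):
    """Sum the absolute changes in implied probability across a series of odds."""
    probabilities = [1 / odds for odds in decimal_odds]
    return sum(abs(b - a) for a, b in zip(probabilities, probabilities[1:]))

# Made-up odds for one player, sampled at intervals over a match.
dull = [1.25, 1.25, 1.25, 1.25]
dramatic = [2.0, 4.0, 1.25, 4.0]

# A flat-lining market scores zero; a swinging one scores higher.
print(total_movement(dull))
print(total_movement(dramatic))
```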
<p>I thought it would be fun to log the live market movement of the 2009 Women&#8217;s singles final every few seconds to see what story it would tell. The image below uses <a href="http://www.simile-widgets.org/timeplot/">timeplot</a> from the MIT SIMILE project, showing each player&#8217;s odds along the y-axis and time along the x-axis,</p>
<p><a href="/projects/timeplots/wimbledon/2009/women_singles/"><img src="http://matt.chadburn.co.uk/projects/timeplots/wimbledon/2009/women_singles/timeplot.png" width="600"/></a></p>
<p><em><a href="/projects/timeplots/wimbledon/2009/women_singles/">demo here</a> (requires html canvas support)</em></p>
<p>The red line shows Venus the clear favourite (with lower odds on the y-axis) right up until the first set tie break at 3pm after which the odds were reversed as the match gradually slipped away from her over the next half-an-hour. It was an exciting match for an hour or so.</p>
<p>Looking at the corresponding <a href="http://news.bbc.co.uk/sport1/hi/tennis/8133985.stm">BBC Sport</a> live text reports there were two moments of drama. The first, around 30 minutes into the match, can be seen about a third of the way along the timeplot, where Serena&#8217;s odds jump sharply up for a couple of minutes. Here are the BBC Sport notes from around that time,</p>
<blockquote><p>14:38 Venus *4-4 Serena Venus ramps up the power on her return, jumping out to a 30-0 lead, and peppering the baseline with some ferocious groundstrokes, she earns two break points. Serena&#8217;s second serve kicks up viciously to force the error, before lil sis comes galloping to the net. Venus misses by inches with the pass and Serena comes through with two aces on the trot.</p></blockquote>
<p>It seems to describe the first important moment in the match, Serena nearly losing her serve.</p>
<p>The other obvious change in the time series, at 3pm, where the market swings rapidly between the two possible outcomes, is again described by the BBC,</p>
<blockquote><p>15:02 Venus 6-7 (3-7) Serena Serena nudges ahead, a rocketing forehand making Venus net for 3-1. HawkEye challenge on the next point, but Venus&#8217;s backhand brushes the baseline. A crunching off-forehand means Serena swaps sides at 4-2. At 5-2, Serena comes up with a brutal combination of groundstrokes, finding the angle to wrong-foot Venus - leaving her sister on the ground. Serena doesn&#8217;t see that, she has turned around and is pumping her fist. She misses the first set point but then produces a stunning, stunning backhand lob to claim it. More fist-pumping. Brilliant stuff from the younger Williams.</p></blockquote>
<p>After that the odds diverge to the point where the match is effectively over 15 minutes before the last point.</p>
<p>It would be neat if I could link the timeline up to the time stamps in the <a href="http://www.bbc.co.uk/i/p003n7bt/">archived Wimbledon final</a> on iPlayer so that people could dip in and watch the clips of these two important moments - <a href="http://www.bbc.co.uk/i/p003n7bt/?t=25m30s">bbc.co.uk/i/p003n7bt/?t=25m30s</a> - but it doesn&#8217;t seem that the feature is enabled for that video.</p>
<p>You can get the <a href="http://matt.chadburn.co.uk/projects/timeplots/wimbledon/2009/women_singles/data.txt">time series data</a> yourself, as well as download the <a href="http://code.google.com/p/betfairfree/">betfairfree</a> project that generated this data.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=598</wfw:commentRss>
		</item>
		<item>
		<title>Random (enough) strings on OS X</title>
		<link>http://matt.chadburn.co.uk/?p=594</link>
		<comments>http://matt.chadburn.co.uk/?p=594#comments</comments>
		<pubDate>Thu, 09 Jul 2009 16:02:12 +0000</pubDate>
		<dc:creator>mattc</dc:creator>
		
		<category><![CDATA[code]]></category>

		<guid isPermaLink="false">http://matt.chadburn.co.uk/?p=594</guid>
		<description><![CDATA[I needed some big random strings.

cat /dev/random &#124; hexdump &#124; tr -d ' \n'

Take your pick,

f8af39f196671bedac2a252a400219f905a40aea18a7c8460ef37e7a1993b51130219fa
06da0efd4e57c77f3ef898b93a08905b70219fb07451adff722635f0e9a91393f9046a9
b0219fc0a05f41d1225cc5544b5440926bb56dd50219fd0233f831154277b7c2f2ea6c2
e354f5690219fe0ee82dc88de5647dc1f68377af4f570890219ff0bcc39ff21e0108c85
535f4082aed3463021a000c3451ca6d31d6bb4eced539502fa6295021a01023432a6f94
dfb09eb0e2ad431ba399a1021a02088df34efac7cf34b3505d7a0db85e194021a030dfc
348f9f4ca7ac4cec11186c981da86021a04078bd504f169b45fc5ecadbf5c4bcb9cf021
a050136134d82552d6e66ed98df588a0a458021a060e6c545ee29682cec0220ef2d509c
9a13021a070e5530fd5f5111dc5a3c5caac7d7707a8021a0800aa56c58be02cd787ff67
a49da53adac021a09099cb0e2c7bd6c371b2583a346795128e021a0a0cb8e0c94a3fa89
e224d4ef4605c9dc9e021a0b0bcb328a854c94085b122db28fed33056021a0c0ba33066
3abd4e8c93dfb844638c8e39c021a0d02bdd93d26b01b12e864ee479d204d97a021a0e0
b34d88531075b4dd4743516d41e8a120021a0f072d129dde427facf3007c837202c8d25
021a10057d9eec29ffeaaaaad30e445b501cf44021a110372eb24a3fd950e2a4fcc2ad4
8a5f36a021a120937e4c3a5f8a1c2361c9686dd84a72f1021a1305eba3541beb6aaf543
ade2896466f0cf021a140eb357c1630dac7822de7cf314970e1a9021a150873934b7e02
cc986e6042f75778f7250021a16095f8936a531133dcea3ea1f74dec2286021a170c5d9
4f7a1ad2bd535a29019bd539b22a021a180f808b28cf6ef5998b1835e12997131d6021a
190ea14d048bc3dcc5602ca8ee034b6d219021a1a097870e6061d8d03f7de6a84858cd7
af9021a1b0f91a388d379dd71bfb2927468d643358021a1c0f49219d7b3f24d3389bd76
1f0742bce7021a1d022158a751e9f56e42be5621951bff4d9021a1e0471b33bcbe3bae8
63bec16954ae2bf28021a1f0d71a9e8dccf8b4fc3e7b6f235c2ba06e021a2007a443a07
7309130d2eaceafafd159b83021a210ff4db4075c4ba736b7aa318b009d2383021a220b

]]></description>
			<content:encoded><![CDATA[<p>I needed some big random strings.</p>
<pre name="code" class="perl">
cat /dev/random | hexdump | tr -d ' \n'
</pre>
<p>Take your pick,</p>
<pre>
f8af39f196671bedac2a252a400219f905a40aea18a7c8460ef37e7a1993b51130219fa
06da0efd4e57c77f3ef898b93a08905b70219fb07451adff722635f0e9a91393f9046a9
b0219fc0a05f41d1225cc5544b5440926bb56dd50219fd0233f831154277b7c2f2ea6c2
e354f5690219fe0ee82dc88de5647dc1f68377af4f570890219ff0bcc39ff21e0108c85
535f4082aed3463021a000c3451ca6d31d6bb4eced539502fa6295021a01023432a6f94
dfb09eb0e2ad431ba399a1021a02088df34efac7cf34b3505d7a0db85e194021a030dfc
348f9f4ca7ac4cec11186c981da86021a04078bd504f169b45fc5ecadbf5c4bcb9cf021
a050136134d82552d6e66ed98df588a0a458021a060e6c545ee29682cec0220ef2d509c
9a13021a070e5530fd5f5111dc5a3c5caac7d7707a8021a0800aa56c58be02cd787ff67
a49da53adac021a09099cb0e2c7bd6c371b2583a346795128e021a0a0cb8e0c94a3fa89
e224d4ef4605c9dc9e021a0b0bcb328a854c94085b122db28fed33056021a0c0ba33066
3abd4e8c93dfb844638c8e39c021a0d02bdd93d26b01b12e864ee479d204d97a021a0e0
b34d88531075b4dd4743516d41e8a120021a0f072d129dde427facf3007c837202c8d25
021a10057d9eec29ffeaaaaad30e445b501cf44021a110372eb24a3fd950e2a4fcc2ad4
8a5f36a021a120937e4c3a5f8a1c2361c9686dd84a72f1021a1305eba3541beb6aaf543
ade2896466f0cf021a140eb357c1630dac7822de7cf314970e1a9021a150873934b7e02
cc986e6042f75778f7250021a16095f8936a531133dcea3ea1f74dec2286021a170c5d9
4f7a1ad2bd535a29019bd539b22a021a180f808b28cf6ef5998b1835e12997131d6021a
190ea14d048bc3dcc5602ca8ee034b6d219021a1a097870e6061d8d03f7de6a84858cd7
af9021a1b0f91a388d379dd71bfb2927468d643358021a1c0f49219d7b3f24d3389bd76
1f0742bce7021a1d022158a751e9f56e42be5621951bff4d9021a1e0471b33bcbe3bae8
63bec16954ae2bf28021a1f0d71a9e8dccf8b4fc3e7b6f235c2ba06e021a2007a443a07
7309130d2eaceafafd159b83021a210ff4db4075c4ba736b7aa318b009d2383021a220b
</pre>
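<p>One caveat: hexdump&#8217;s default output starts every line with an offset column, which is why predictable runs like 0219f90, 0219fa0, 0219fb0 turn up in the strings above. If that matters, here is a rough sketch of an offset-free variant. It swaps hexdump for od -An, reads /dev/urandom so it never blocks, and wraps the pipeline in a helper called random_hex, which is a name made up for this example.</p>

```shell
# random_hex is a hypothetical helper, not part of the original one-liner.
# $1 is the number of random bytes to read (defaults to 32, i.e. 64 hex digits).
random_hex() {
  # head -c bounds the read; od -An drops the offset column, -v prints
  # repeated lines in full, -tx1 emits one two-digit hex value per byte.
  head -c "${1:-32}" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  # tr stripped every newline, so finish the line by hand.
  echo
}

random_hex 32
```

<p>Pass a different byte count for a longer or shorter string; the output is always twice as many hex digits as bytes requested.</p>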
]]></content:encoded>
			<wfw:commentRss>http://matt.chadburn.co.uk/?feed=rss2&amp;p=594</wfw:commentRss>
		</item>
	</channel>
</rss>
