If in the year 2600 archaeologists were to exhume the remains of, say, Margaret Thatcher, how would they go about verifying their find?

The story of the 'King in the car park' — the conclusion drawn by the University of Leicester that a skeleton dug up from a car park belongs to King Richard III — has briefly piqued the public's interest in British history. The event got me thinking about the future, too.

If in the year 2600 or so, archaeologists were to exhume the remains of, say, Margaret Thatcher in the 27th-century equivalent of a car park (tele-transportation hub?), how would they go about verifying their find?

When it came to Richard III's remains, in addition to not entirely conclusive DNA results (we'll probably be better at DNA testing by 2600) there were a few key texts — mainly Tudor propaganda — that historians could refer to.

It must have been a complicated balancing act — both Shakespeare's dramatic account and the true origins of the skeleton are historical uncertainties, and yet you hope that some agreement between the two will strengthen your confidence in both.

For instance, the skeleton had a curved spine. If you don't believe (for other reasons) the remains belong to Richard III, then this is of little significance. But if you do believe the remains belong to a former king, then the curved spine simultaneously strengthens your confidence in this belief and reinforces your faith in Tudor accounts of Richard III as having a twisted spine.

Assuming that data stored on the internet is preserved over the next 600 years, future archaeologists will have much more textual evidence to draw on when they are analysing remains and building up a picture of the person. They will have thousands of conflicting news articles, opinion pieces, blogs and tweets, particularly on a divisive character like Thatcher.

While trying to match her ancient human remains with these many, conflicting online accounts to generate an idea of Margaret Thatcher's character, they will have to decide which pieces of evidence carry the most weight. I don't envy them.

Nate Silver, the FiveThirtyEight blogger and forecaster who shot to fame for accurately predicting the Democrats' electoral victory, says we live in an era of 'Big Data' with 2.5 quintillion bytes of data produced each day.

And while you might think that this makes a forecaster's life easier – the more information you have, the more you have to go on when making a prediction — he says this isn't necessarily true. The vast majority of this data is just 'noise' — a distraction diverting your attention from the real issues at hand.

To use the famous example, you might expect some correlation between bikini wearing and shark attacks but this isn't causal — it isn't that sharks find bikinis particularly tasty, but on hot days, when people are more likely to wear bikinis, they are also more likely to go for a swim, which places them at greater risk of shark attacks.

A forecaster who used bikini-wearing patterns to predict shark attacks could easily be proved wrong by extraneous events, for instance if swimming costumes come into fashion. Bikini trends are just 'noise' when you're predicting shark attacks.

The job of a forecaster is to be discerning with the data they use, to be able to separate the 'signals' from the 'noise'. The problem for many forecasters — especially when it comes to political forecasting — is that the signals you select are often biased by your own theory or political standpoint.

When the archaeologists of the 27th century dig up Maggie's body, they will have to wade through an awful lot of internet noise to find a few 'signals' that might be able to tell them anything useful about the ancient politician. Will they be able to distinguish between tweets by informed political analysts and ranting bloggers who live in their pyjamas and never see daylight?

I'm fairly optimistic. We've clearly come a long way when it comes to forecasting in the last 600 years. Nate Silver's very aware of his limitations, unlike Nostradamus, and he's also more spot-on. Hopefully by 2600 we will be better at picking out 'signals'.

