What use is a personal tweet archive ?

A little while ago I wrote a post about the need to plan for archiving the digital “papers” of historians. In that post I talked about research data (what we used to call “notes”); about the systems that form the bridge between that data and the writing process; and about written outputs themselves, and their various iterations. It looked forward to a time when all these digital objects, in multiple formats but from one mind, are available to future students of the way the discipline has developed.

What that post neglected was data about the way I publicise my work. Perhaps one of the reasons we’ve been slow to think about this is that, at one time, most academics didn’t need to. Apart from giving papers at gatherings of the learned, the task of publicising one’s work belonged to the publisher. And if one’s publisher was the right one, then the work would inevitably end up in the hands of the small group of people who needed to know about it. And whilst the media don is not a new phenomenon, most historians might have thought such self-publicity outside the academy something of an embarrassment, even rather vulgar.

How times change. Universities are training their staff in dealing with the traditional media and in the most effective way of using social media. And this opens up a new category of data that ought to be archived, if only to understand how the push for ‘impact’ actually played out in these early years. And some of it is being archived. The Library of Congress are archiving every tweet, although it isn’t yet clear how that archive may be made available for use. The UK Web Archive, along with other national web archives, have been archiving selected blogs (including this one) for several years, and the EU-funded BlogForever project is looking to join those projects up. But this approach, valuable though it is, separates the content from the author, and from the rest of their digital archive. Whilst that link might be retrievable at a higher discovery layer, something important is still lost.

But now the helpful folk at Twitter, in a move that ought to be applauded, have made it very quick and easy to download an archive of one’s own tweets, right back to the beginning. And so I did: 1682 tweets, over 14.5 months. But what to do with it ?

Straight away, scrolling through a long CSV file starts to tell the story of the making of other things: the first retweet of someone else’s work which was subsequently to influence my own; the first traces of an idea, or even of a question I was beginning to ask, which spawned a blog post, and then a paper. I also find that I shared at least one link in more than two thirds of my tweets, which sounds public-spirited until I add that a good proportion were my own posts. I can start mining the data for key terms and themes, and how they ebbed and flowed.
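For anyone wanting to try the same sort of counting, here is a minimal sketch in Python rather than a record of how I actually did it. It assumes the downloaded CSV has a column holding the tweet text, which I have called `text` here; that column name, the function name `summarise_tweets` and the sample tweets are all made up for illustration, so check them against your own file.

```python
import csv
import io
import re
from collections import Counter

# Deliberately simple patterns: good enough for a first look, not a full parser.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def summarise_tweets(csv_file, text_column="text"):
    """Return (total tweets, tweets containing a link, hashtag counts).

    `text_column` is an assumed column name; adjust it to match the archive.
    """
    total = 0
    with_link = 0
    hashtags = Counter()
    for row in csv.DictReader(csv_file):
        text = row.get(text_column) or ""
        total += 1
        if URL_RE.search(text):
            with_link += 1
        # Fold case so that #WebArchive and #webarchive are counted together.
        hashtags.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    return total, with_link, hashtags


# A tiny invented sample standing in for the real download.
SAMPLE = """text
New post on web archives http://example.org/post #webarchive
Reading about #twitter and #WebArchive today
Another link https://example.org/other
"""

if __name__ == "__main__":
    total, with_link, hashtags = summarise_tweets(io.StringIO(SAMPLE))
    print(f"{with_link} of {total} tweets contain a link")
    print(hashtags.most_common(3))
```

To run it over a real download, pass an open file instead of the sample, for instance `open("tweets.csv", newline="", encoding="utf-8")`, where the filename is again only a guess at what your archive contains. Counting key terms rather than hashtags would follow the same pattern with a different regular expression.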

It would be useful if there was a way to keep this data fresh, of course, to avoid going back to Twitter for a new download every so often. And, thanks to @mhawksey, there is a simple way of doing this, using Google Drive. Martin explains all here, with a handy video set-up guide.

And so I now have a cloud-based archive of my tweets, complete with a basic search and browse web interface. This is now a lazy man’s look-up of old tweets and the resources they pointed to, searchable by handle, hashtag or key term.

But perhaps this is something about which most people are lazy. Social media provides us with an overwhelming stream of quite-interesting things, in amongst which are nuggets of gold. Those nuggets I can manage in the old way, by recording them properly, perhaps in a bibliography. I might even read them, one day. But the quite-interesting stuff, whilst being too much ever to record properly, will probably remain quite interesting. And so this provides a middle way between formal curation of a webliography and just searching the live web (which assumes I can remember enough about what I’m looking for.)

Might this archive now change my future tweeting ? Early days to judge perhaps. But I think it may, since I may now retweet and share in preference to using favourites, in order to get a link to a resource into the archive. I can also imagine starting to use personal hashtags, as a way of structuring my own archive at the same time as I tweet. Real-time curation perhaps ?

And I might share it too. Since this is now unambiguously my own data, rather than Twitter’s, I can license it for reuse by others in larger corpora for analysis. Imagine a pooled archive of the tweets of many historians. Now that would be interesting.

Religion, politics and law in contemporary Britain: a web archive

[This is an expanded version of a post first published in the UK Web Archive blog.]

It has been over two years in the making, but I am delighted to be able to say that my own special collection in the UK Web Archive is now online.

UKWA (for which I am engagement and liaison lead, based at the British Library) collects and preserves websites of scholarly and cultural importance for the UK web domain. Already UKWA collects some 11,000 sites, and has more than 50,000 instances in total, with series of snapshots of some sites going back the best part of a decade. That’s a lot of data, and so one of the ways into the archive is by means of the special collection, of sites on a particular theme.

A couple of years ago, long before coming to the BL, I joined a project at the Library which brought together a group of scholars to guest-curate special collections on our research interests. I had become interested in the sharpening of the terms of debate about the place of religion in British public life, particularly since 9/11 and the London bombings in 2005. I’ve long been interested in public debate about church and state; but until relatively recently this happened by means of the print press, public oratory, ephemeral publication and the broadcast media. It struck me that a good deal of this debate had already moved online, and so new ways of capturing and preserving it were going to be needed. And so, the ‘politics of religion collection’ (as it was then known) was born. (See these posts on my progress.)

I fairly soon realised why I’m not an archivist, since all sorts of unfamiliar questions hove into view. When archiving the web, what is the base unit ? A whole domain, such as www.bbc.co.uk ? Or a single URL ? Several sites, like those of the National Secular Society or the Christian Institute, were central to my concerns, and so could be included whole. But what does one do with a single post on a PR blog about the handling of the sharia law row by Rowan Williams and his staff ? In fact, the collection is a mixture of whole domains and individual directories or pages from larger sites; an uneasy compromise, but a necessary one.

Also (and I may as well come straight out with it), the collection is selective, and thus in a real sense subjective. As a watcher of contemporary religious politics, against the backdrop of recent history, my impression is that the place of religious ideas, symbols and organisations in public life is at its most contested for decades. Historians are traditionally wary of assessing the significance of present trends, since it leaves hostages to fortune and later events. Yet, all archival choices from a pool of material not defined in advance by provenance involve some judgements as to significance; and historians are as well suited as any to make those judgements. And so I have put the collection together now to enable future historians to begin to answer the questions which I anticipate will be significant. (See an older post on why I think historians should engage with this way of working.)

There were other issues. Were I the archivist for a particular organisation, I’d have no problem with getting permission to add material to my archive: everything produced in-house would be in view. The problem for web archiving is that we’re dealing with other people’s copyright work, and so an individual permission is needed for each site. I have a long list of sites which I would dearly love to add to the collection, but for which (for various reasons) we’ve had no response. So, if you are the owner of Protest the Pope, or Holy Redundant, or Christians in Politics, please get in touch. For now, even if the collection cannot be anything like comprehensive, I do hope that it is at least coherent.

There are particular strengths, and some gaps. It includes many campaigning organisations, both secularist and religious, and is heavy on the conservative Christian groups about which I myself know most. It is very light on non-Christian faiths, since I know the field much less well. It is still very much open, however, and so suggestions of sites that ought to be included are very welcome, via this blog or at the UKWA Nominate a Site page.

What can you do with it ? For now, there is a simple browse function; and the collection can be searched on its own. And over time, all sorts of uses will present themselves, which we can’t currently imagine. But the data is there: a growing longitudinal series of timed instances of websites, identified as thematically related; that is to say, an archive.

Why historians should care about web archiving

Someone said to me at a conference recently (not his exact words), “if we can’t get historians interested in web archives, then who can we reach ?” But so far, there hasn’t been much visible engagement between contemporary historians and web archives, even though those archives are now well established at national memory institutions such as the Library of Congress or the British Library. [Full disclosure: the latter employs me, but this post represents a personal view, not the Library’s.] And as an historian who has been involved with web archives since before coming to the BL, I think this needs to change.

The evidence is mounting of how vulnerable the web actually is. One study found that 11% of content shared via social media will have disappeared a year later, and another 7% each year after that – a startling rate. And since there was a time lag between the migration of the archival record into a digital-only mode and the establishment of web archives, there is already a large hole in the record from perhaps the mid-nineties to the mid-noughties. A recent post of mine over at the UK Web Archive blog showed just how significant are some of the sites that now exist only in web archives; and that’s only the ones the UKWA managed to capture in time. We can only guess at what is now lost forever.

So, in twenty or thirty years’ time, historians of the very late twentieth century will have reason to regret that no-one thought to keep their primary sources safe for them. But there is another problem. It is a brave historian who writes on the very recent past, a remote subject indeed; I myself wrote an article in 2004 that extended up to 1990, and not without some unease about the hostages to scholarly fortune it gave. And so most of the historians who have the greatest personal stake in archiving the web right now haven’t yet entered the profession. I would argue that historians are uniquely well-placed to view the present in relation to the past, and thus to anticipate those aspects of the present for which there is most need for a record. But it would take a significant change in culture for historians working now to start taking a hand in preserving sources for our successors.

“But this isn’t my job”, the response might be. “Surely this is what archivists are for ? (It always used to be.)” Granted, in a pre-digital world, institutional archivists in government, civil society and the churches concentrated on capturing unpublished materials produced in-house, took in those personal archives that were offered to them, and left the copyright libraries to pick up books and journals. If the ephemeral stuff in the cracks didn’t survive, then such was life. Now, the volume of words is so much greater, and the means of disseminating them so dispersed, that archivists as a profession (already an undervalued and underpaid one, I might add) can’t hope even to see, let alone arrange to capture, everything of note.

So: we need a new model of archival curation, based on a partnership between archivists, scholars and the public. The technical means are there; it simply needs a new form of engagement, and we historians can help make it happen.

A Heisenberg Principle of web archiving ?

Whatever it means to real scientists, the famous ‘uncertainty principle’ of Werner Heisenberg is sometimes popularly taken to mean that it is impossible closely to observe something without in some way altering it. It’s also a conundrum that has faced anthropologists when observing cultures far removed from their own: how far does the consciousness of being observed alter the behaviour of the subject ?

I’ve been publishing in print in the traditional way for some years now, and everyone knows that books are (in theory) permanent, that they find their way into libraries; and so one writes conscious that the words cannot be unwritten. Writing for the web, however, has had a more transient aesthetic: I can write with the freedom that comes from knowing that (in a site I control) I can retrospectively edit at will, should I choose to. There are good scholarly reasons not to, to do with making my work reliably citable; but in the final analysis I am not bound by them.

So far, the visibility of web archiving by national memory institutions is not high. In addition, if the UK Web Archive considers a site important enough to archive, then it must gain explicit permission; and by no means all website owners give that consent. This blog is already being archived by the UK Web Archive (last crawl in April 2012); but had I been at all concerned about the things I write having a permanent existence, then I could have withheld permission.

On the horizon is a major piece of legislation that could subtly but importantly change things: the Legal Deposit Libraries (Non-print Works) Regulations 2013 (see the most recent public consultations here.) As and when these successfully negotiate the passage through Parliament, any website in the .uk domain could be archived for posterity without the explicit consent of the owner.

The change in the law in itself isn’t my main point, however: the effect of increasing consciousness of it is. Put simply: will some words that might have been written in 2012 not be written in 2014 because the author was conscious that they could not later be retracted ? I think it likely. Would it be a ‘bad thing’ ? I don’t suppose we know yet; but we ought to be thinking about it.

Archiving the Jubilee: Part Two

I’ve been looking at some of the coverage of and reaction to the jubilee weekend, in order to suggest that the British Library archives them for safe keeping in the UK Web Archive. (See Part One.) My earlier post looked at some of the preparatory statements from official church sources, and some very early oppositional voices. Here are some examples of reportage and comment after the event.

Rowan Williams’ sermon at St Paul’s

Perhaps predictably, the archbishop did not allow the pieties of the situation to restrict his thinking on the subject, making some robust comments about aspects of current economic life. See the full text, and the reactions of the Daily Mail (negative) and the Guardian and Nelson Jones in the New Statesman (rather more positive).

Local events

The Church Times gave a useful digest of local events, including a street party in the nave of Ripon Cathedral and various sermons, including that of the Dean of Belfast. Events in local communities included an inter-faith Family Fun Day in Tooting, south London.

The ‘real meaning’ of Jubilee

A good few campaigning sites sought to draw a distinction between the biblical concept of jubilee and the pattern of the celebrations, often making a more or less explicit connection with the current climate of austerity. See Christianity Uncut, Ekklesia and Symon Hill. The work of the Jubilee Debt Campaign predates this year’s events, although their site did draw attention to the connection.

Archiving the Jubilee. Part One

The UK Web Archive are creating a special collection of sites relating to all aspects of the jubilee, and are inviting nominations.

Although we are still a couple of months away from the event itself, I thought it would be worth starting to pull together some of the various sites for the Queen’s jubilee that come from within or relate to the Christian churches. This will include press sources that the UKWA don’t ordinarily take. I thought I’d make a start with some of the more predictable and national ones. I would be delighted to add more if readers were to suggest them.

Official church resources

As you would expect, the several denominations have made various preliminary statements. The Church of England’s site refers to several linked ventures: the Big Jubilee Lunch, with a specially composed grace; a special service at St Paul’s on June 5th; and the Big Jubilee Thankyou, where Anglicans are invited to sign a copy letter displayed in churches, all of which will then be combined and presented to the Queen – a petition, as it were, without demands. The lunch is being coordinated by HOPE, a pan-church organisation which is evangelical in origin, but has partnerships in place with most of the Protestant denominations in the UK.

See also the Bishop of London’s sermon on the accession (Feb 6) as Dean of the Chapels Royal.

The Catholic bishops in England and Wales have urged parishes to pray for the Queen on Sunday June 3 (which is also Trinity Sunday), as reported in the Catholic Herald. (The press release is here.)

Churches Together in England are assembling resources as they appear here, and there is a joint presidential statement from Canterbury, Westminster, the Free Churches Group, and the Lutheran church, although it is rather lost amongst references to the Olympics.

The Jubilee Churches Festival is looking to co-ordinate celebrations at a local level.

Oppositional voices

One has to dig very deep to find many Christians voicing opinions critical of either the event or the monarchy itself. Ekklesia noted the beginnings of the campaign of protest by Republic, and complaints about the BBC’s coverage, but refrained from comment. (Incidentally, Republic’s position on the established church is also interesting.) However, one would expect this type of comment to appear more reactively, and nearer the event; and so watch this space for later posts.

St Paul’s and OccupyLSX: towards a web archive

I thought it would be worth beginning to assemble some of the coverage of the OccupyLSX encounter with St Paul’s. Once we got past the rather lame ‘moneylenders and the Temple’ comments, there has been some rather more trenchant and useful comment regarding the incident as an opportunity lost for the Church of England. This is necessarily just a snapshot, and I’d be more than happy to add to it if readers were to suggest further items.

  • Observer editorial, Sunday 30th October, and Andrew Rawnsley in the same issue.
  • Giles Fraser on his reasons for resigning as Canon Chancellor, in the Guardian, October 27.
  • George Carey’s intervention in the Telegraph, 27 Oct, and that of the local MP, the Conservative Mark Field.
  • Peter Stanford on the Church’s lost opportunity.
  • Reactions to the encounter between the Bishop and the Dean with the protesters on Sunday 29th, in the Telegraph, and video from the BBC of Chartres addressing the crowd.
  • Graeme Knowles’ statement on resigning as Dean, October 31st, along with the reaction of the Bishop of London and of the Chapter. It provoked a characteristically measured and lucid response from Damian Thompson in the Telegraph, and the first statement from Canterbury on the whole episode, which has been followed up with a fuller response to the broader issues on Nov 1st.

Geoffrey Robertson on the Papal visit

The work I’ve been doing for the web archiving project has necessarily picked up much material in the press, little of which can be added to the UK Web Archive for reasons of permissions. I thought therefore that it would be worth capturing some of that material here.

Readers may remember a campaign by Geoffrey Robertson against the Papal visit: see two pieces amongst several, on the legal basis of the Vatican’s statehood, in the Guardian, and earlier on the prospects for indicting the Pope for complicity in the child abuse scandal.

Bishops in the Lords

Whilst doing some work for the Politics of Religion project for the UK Web Archive, I came across a selection of interesting links, some of which will be submitted to the UKWA:

On the negative side: Heresy Corner, Johann Hari in the Independent, and Polly Toynbee in the Guardian.
There have also been in recent years ongoing campaigns for reform to change the composition of the House: from the British Humanist Association, and the Power 2010 campaign.

On the more positive side, see a debate from Theos.

Using the UK Web Archive

I have recently begun to take a part in a project at the British Library, the ‘Researchers and the UK Web Archive’ project. A group of scholars, including sociologists and experts in the built environment, politics and the visual arts, have embarked on a year-long project using the UK Web Archive as a source. Topics include sport, gender, elections and religious architecture, amongst others. More details are available on the project blog.

The project should provide the BL with valuable feedback on the archive itself; in addition, several of us will be helping to create new collections of previously unarchived material relating to our particular research projects. My own is on the politics of religion in Britain in the last five years, paying particular attention to events such as reactions to the recent visit of the Pope, and to Rowan Williams’ lecture on the place of sharia law in England. I’m particularly interested in the ways in which the communication of religious ideas has been affected by the web, and by social media in particular.

Of particular interest is how perennial themes in the church/state/law relationship are transmuted in a new information environment. Part of the endeavour is to suggest that contemporary historians are in a unique position to anticipate which present-day websites will be of interest to future historians, by virtue of understanding something about the pedigree of an issue, and the significance of a forthcoming event.