Peer Comment

Andrew Bevan

UCL Institute of Archaeology, UK. a.bevan@ucl.ac.uk

Cite this as: Bevan, A. (2015) Peer Comment, Internet Archaeology 39, http://dx.doi.org/10.11141/ia.39.1.com1

This interesting and provocative article considers the location of archaeological content in a wider online information landscape. 'Mapping' is the metaphor that the author, Shawn Graham, adopts to understand how people navigate to and from different archaeological resources in this online landscape, with a special interest in the public discoverability of archaeology blogs. The article's goal, on my reading, is partly to provide a fun discussion of the current structure of the web and its consequences for archaeological communication, and partly to offer archaeologists a methodological proof-of-concept of hyperlink network modelling. Hence, it lays down plenty of interesting trails, without necessarily offering a firm conclusion or prescribing a single road map for the future… but that's no bad thing. Overall, I enjoyed reading it and have just three general comments and a few other minor suggestions to make.

1. Central Tendencies

While the success of a 'random surfer' model for general web-search optimisation is undeniable, I still worry about how this article uses the idea of an 'ordinary interested member of the public'. For example, it is now widely acknowledged that a simple distinction between access and non-access to the Internet risks ignoring users' very variable web skills, Internet connectivity, access and online intentions (e.g. Richardson 2014a). These differences also vary from country to country so can we really profile an 'average' person for this analysis? The article also seems to be suggesting a distinction between topic specialists (e.g. vocational archaeologists who blog) and the rest, but isn't there more of a continuum of expertise and commitment? Similarly, the evocation of a 'layperson' as someone who 'might signal [a blog or website's] perceived value through linking, re-tweeting, commenting, and writing her own blog posts about it' risks implying that the majority of Internet users do these things regularly. Do they, or is that still a minority group (albeit a growing one) worldwide? Finally, would the hypothetical enthusiast for archaeology really begin satisfying their interest by tapping 'Roman archaeology' into a search engine? Maybe they would, but I wonder about alternative access routes (e.g. blog aggregators, bulletin boards such as Reddit, traditional news media, etc.) and, regardless, I think the article needs to be clearer about whether this assumption is merely an analytical convenience or whether it is meant as an explicit behavioural model about how archaeological content is typically found.

2. Hyperlink Network Models

A second point relates to the quantification of online networks. I completely agree with the author that visual network diagrams are an over-rated form of visualisation and it is nice to see that they are not the priority here. But even if summary network statistics are preferred, I still wonder about the overall fitness-for-purpose of hyperlink-based networks for understanding discoverability. For example, the article does a nice job of making clear that the hyperlink lists built by standard web-crawling software are potentially misleading because, unlike most link-mapping software, a real person looking for content on Roman archaeology (a) does not notice the web's technical plumbing in the same way, (b) knows from both webpage layout cues and received wisdom to de-prioritise advertising content, and (c) is increasingly given a geo-socially and historically personalised search experience. It would be worth adding that there are additional analytical challenges in including or excluding hidden client- or server-side content that is generated on the fly. Even if we retain the idea of hyperlink mapping as a useful way forward, I think there is still a need for more sensitive analysis of how the starting and stopping conditions for a depth-restricted crawl might affect its results. How was 20 minutes decided upon and how much difference does it make?
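To make the point about starting and stopping conditions concrete, here is a minimal sketch in Python. It does not reproduce the article's crawler: the link graph `TOY_WEB`, the function `crawl`, and the `max_pages` budget (a deterministic stand-in for a wall-clock limit such as 20 minutes) are all hypothetical, chosen only to illustrate how the same seed page can yield networks of different sizes.

```python
from collections import deque

# Hypothetical toy link graph: page -> pages it links to (not real data).
TOY_WEB = {
    "seed": ["blog_a", "blog_b", "museum"],
    "blog_a": ["blog_b", "journal"],
    "blog_b": ["blog_a", "forum"],
    "museum": ["journal", "shop"],
    "journal": ["archive"],
    "forum": ["blog_c"],
    "shop": [],
    "archive": [],
    "blog_c": ["seed"],
}


def crawl(web, start, max_depth, max_pages):
    """Breadth-first crawl from `start`, stopping at a link depth or page budget.

    Returns (visited pages, recorded links). `max_pages` stands in for a
    time limit: a real crawler would check the clock instead.
    """
    visited = {start}
    edges = set()
    queue = deque([(start, 0)])
    fetched = 0
    while queue and fetched < max_pages:
        page, depth = queue.popleft()
        fetched += 1
        if depth == max_depth:
            continue  # page is fetched, but its links are not followed
        for target in web.get(page, []):
            edges.add((page, target))
            if target not in visited:
                visited.add(target)
                queue.append((target, depth + 1))
    return visited, edges


if __name__ == "__main__":
    # Same seed, different stopping conditions.
    for max_depth, max_pages in [(1, 100), (2, 100), (3, 100), (3, 3)]:
        pages, links = crawl(TOY_WEB, "seed", max_depth, max_pages)
        print(f"depth<={max_depth}, budget={max_pages}: "
              f"{len(pages)} pages, {len(links)} links")
```

Even on this toy graph, the recorded network changes with both the depth limit and the budget, which is why reporting results for several settings would help readers judge how robust the summary statistics are.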

3. Blog Impact

This article also raises the notoriously thorny question of how to assess 'impact'. Do blogs matter and if so how? I would love to see a slightly tighter fit of this discussion to the formal 'impact' agenda in research institutions. For example, we could follow the UK's Economic and Social Research Council (ESRC 2014) and distinguish between impacts that are (a) 'instrumental' and can be concretely shown to influence policy, legislation or behaviour (e.g. with regard to archaeological issues such as the antiquities trade), (b) 'conceptual', where they visibly contribute to the reframing of key debates (e.g. about open access publication), or (c) 'capacity building' in their rapid dissemination of technical skills (e.g. with respect to scientific methods). To the last of these, I'd also add 'transparency building', especially for blogs, given their suggested ability to expose how knowledge is created and wrestled with (e.g. Deitering and Gronemyer 2011). Currently, the main way to document these different textures of impact is through densely described case studies (after the fact), and the main way to design such impact pathways (beforehand) is through qualitative identification of possible stakeholders or beneficiaries. Where the beneficiary to be considered and blog author are one and the same, it may be possible to quantify comparatively what blogging does for the popularity of their more traditional outputs (e.g. does blogging increase citation of their published papers: McKenzie and Özler 2011). Otherwise, we remain in something of an analytical vacuum and I was hoping the article might say a few words about whether network mapping might help. 

It may be an unfair request for this particular article, but I also suspect that, instead of using generic archaeology terms such as 'Roman archaeology', it might be better to disaggregate the analysis and distinguish between blog impact on fast- and slow-moving archaeological subject matter. This separation might isolate the situations in which blogs are more or less nodal as a form of communication, given that one of the strengths of blogs is their rapid response time.

Finally, the article raises the really interesting question of whether archaeology blogs are best seen as personal stream-of-consciousness, intra-tribal chat or wider public conversations. Personally, I am not sure an aggregate hyperlink mapping approach will help here, but network analysis might still be useful. The article raises the question but does not seem to come back to it directly. Given that complex 'ecologies' of information exist online (Nardi and O'Day 1999), we really need to understand patterns of niche construction and blog diversification as archaeological content is produced and reproduced. Network models might also be used to consider direct competition among archaeology blogs as information resources, and between blogs and other forms of discoverable archaeological content. Of course, these ecological relationships also co-evolve over time, and are affected by the size, composition and territoriality of different online archaeological communities. So far, there have also been some keystone species for archaeology blog production, with PhD students and early career researchers being particularly important, and a network mapping of how these people interact might address more directly the paper's nice query about 'shouting into the void'. Given the practical challenges of hyperlink listing noted above, perhaps a Share- and Like-based mapping of the archaeological web (Finin et al. 2007) might be both better and in closer step with the drive for alternative measures of research's societal impact (so-called 'altmetrics': Bornmann 2014). Anyway, these are a host of interested queries rather than major problems and all hopefully a good sign that this article is doing its job.

4. Other Minor Points

'for what else is an excavation but the careful recording of everything on the chance that it will be useful later on?' Most people would argue strongly that excavation is always selective, involving either implicit or (preferably) explicit decision-making about what to sample and record. I agree that certain kinds of excavation (e.g. rescue/salvage projects) are general-purpose collect-a-lot strategies, but it is perhaps not a good idea to imply they are 'collect-it-all' approaches.
'Site Spider II will crawl from a designated starting page, …. but it too can be echoed through Gephi's 'http graph generator' plug-in if so desired.' The discussion here is a bit too mechanistic. Instead of a workflow comment (e.g. 'I used the export function…') it would be more useful to have a clearer explanation of what certain analytical methods do (e.g. Gephi's modularity approach to cluster definition).
The article seems to agree with Lusher and Ackland's (2011) argument that social network analysis provides 'fundamentally different conclusions' about the social behaviour behind hyperlinks. If this is the case, it is worth giving an example of where different conclusions have been reached.
In the 'Searching and Mapping: Roman Archaeology' section, it might ease comparison if the PageRank, eigenvector and betweenness centrality results were tabulated for each different approach to mapping. In fact, overall, I think the summaries of network structure, run-time etc. would work better in a table.
It might be nice to have a few words in the article about the role played (or not played) by blog aggregators and portals in simplifying navigation to specialist topic blogs (e.g. Deitering and Gronemyer 2011, 497).
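
As an illustration of the tabulation suggested above, the following Python sketch computes PageRank by power iteration for two mappings and prints the scores side by side. The two link lists in `MAPPINGS`, the node names, and the helper `tabulate` are hypothetical and are not the article's data or software; only PageRank is computed here, and eigenvector or betweenness centrality columns could be added in the same way.

```python
# Hypothetical link lists from two ways of mapping the same topic (not real data).
MAPPINGS = {
    "crawl": {
        "museum": ["portal"],
        "portal": ["blog_a", "blog_b"],
        "blog_a": ["blog_b"],
        "blog_b": ["blog_a"],
    },
    "search": {
        "museum": ["blog_a"],
        "portal": ["blog_a"],
        "blog_a": ["museum"],
        "blog_b": ["blog_a"],
    },
}


def pagerank(graph, damping=0.85, iterations=100):
    """PageRank by power iteration on a dict mapping node -> list of targets.

    Nodes without outgoing links spread their score evenly over all nodes.
    """
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            targets = graph.get(node) or []
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


def tabulate(mappings):
    """Return text rows: one line per node, one PageRank column per mapping."""
    scores = {name: pagerank(graph) for name, graph in mappings.items()}
    names = list(scores)
    nodes = sorted({node for ranks in scores.values() for node in ranks})
    rows = ["node".ljust(10) + "".join(name.rjust(10) for name in names)]
    for node in nodes:
        cells = "".join(f"{scores[name].get(node, 0.0):10.3f}" for name in names)
        rows.append(node.ljust(10) + cells)
    return rows


if __name__ == "__main__":
    print("\n".join(tabulate(MAPPINGS)))
```

A table of this shape lets the reader see at a glance whether the same sites rank highly under each approach to mapping.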

Response to Peer Comment

I appreciate and am grateful for the care with which Andrew Bevan has read my article! I have re-written portions of the article to respond to Bevan's major points. Bevan's concerns regarding central tendencies are well taken. I have added to the discussion of my 'ideal surfer' to consider these issues. I think the work of Shawn Anctil will provide us all with a methodological guideline for considering how we can better understand the actual actions of interested individuals. Similarly, Lorna-Jane Richardson's periodic surveys of archaeologists' social media use (2014b) will give us the data to cross-reference against Anctil's intentionality research and stylistic inquiries such as Kjellberg's 2014 study (which, as it happens, includes some archaeology blogs). Bevan mentions sites such as Reddit as a place that a hypothetical individual might go to find out about archaeology. Reddit, it strikes me, has much untapped potential as a locus for our public archaeology. I have not scraped or otherwise mapped patterns in Reddit, but Matthew Taylor, a history student at Carleton University, has written a script that can be used to scrape all the comments in a subreddit for both structural and text analysis purposes (see https://github.com/Ottawagunner/RedditData).

Concerning Bevan's point about more sensitive analysis of the crawl's stopping conditions, the rationale behind the 20-minute cut-off is entirely pragmatic. I was able to run for longer periods on occasion, but found that the overall size and diameter of the network did not change that much. It is a point on which I hope someone will do further research. Ideally, one would interrogate the Common Crawl dataset, but this is not straightforward and would require the help of a data scientist or information specialist. As for blog impact, I do think that network mapping could play a role, but the structural patterns alone might not be enough to satisfy external assessors (like various government agencies). Or rather, I fear that without more nuance (the shortcomings of this present paper being a case in point) such mappings would quickly be gamed. I would surmise that the effect on how archaeological knowledge is discovered and constructed online might not be a good thing for the discipline as a whole.
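
For anyone wishing to take up the cut-off question, a comparison of network size and diameter across run lengths could be sketched in Python roughly as follows. The edge lists `SHORT_RUN` and `LONG_RUN` and the helper names are hypothetical, invented purely for illustration; they are not output from the crawls reported in the article, and the diameter here ignores link direction and assumes the network is connected.

```python
from collections import deque

# Hypothetical edge lists from a shorter and a longer crawl (not real data).
SHORT_RUN = [("seed", "a"), ("seed", "b"), ("a", "c"), ("b", "d")]
LONG_RUN = SHORT_RUN + [("a", "e"), ("b", "f")]


def diameter(edges):
    """Longest shortest path, in links, treating every link as undirected."""
    neighbours = {}
    for source, target in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)
    longest = 0
    for start in neighbours:
        # Breadth-first search gives shortest distances from `start`.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node]:
                if nxt not in distance:
                    distance[nxt] = distance[node] + 1
                    queue.append(nxt)
        longest = max(longest, max(distance.values()))
    return longest


def summarise(edges):
    """Return (number of nodes, number of links, diameter) for an edge list."""
    nodes = {node for edge in edges for node in edge}
    return len(nodes), len(edges), diameter(edges)


if __name__ == "__main__":
    for label, edges in [("short run", SHORT_RUN), ("long run", LONG_RUN)]:
        n_nodes, n_links, diam = summarise(edges)
        print(f"{label}: {n_nodes} nodes, {n_links} links, diameter {diam}")
```

Reporting such figures for several cut-offs would show directly how much, or how little, the longer runs add.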